SelectPercentile
Select the top percentile of features by a univariate statistical test.
SelectPercentile applies the same univariate scoring approach as
SelectKBest but expresses the number of features to retain as a
percentage of all available features rather than as an absolute count.
Each feature is scored independently against the target using a chosen
statistical function, and only the top-scoring share set by the percentile parameter is kept.
This makes the selector robust to datasets with varying numbers of input features, since the number of retained features scales automatically with the input dimensionality. It is particularly convenient for grid search experiments where the feature set size may change across cross-validation folds or preprocessing stages.
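The scaling behaviour described above can be sketched in plain Python. This is a simplified illustration, not the scikit-learn implementation: the helper name select_top_percentile and the example scores are hypothetical, and scikit-learn derives its cutoff from a percentile threshold on the scores with tie handling, so its retained count can differ slightly from the floor used here.

```python
def select_top_percentile(scores, percentile):
    """Return the indices of the features whose scores rank in the top
    `percentile` percent. Hypothetical helper for illustration only."""
    if not 1 <= percentile <= 100:
        raise ValueError("percentile must be an integer in [1, 100]")
    # The number of features to keep scales with the number of inputs.
    n_keep = len(scores) * percentile // 100
    # Rank feature indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Report the kept indices in their original column order.
    return sorted(ranked[:n_keep])


# Made-up univariate scores for 10 features.
scores = [0.1, 5.0, 2.5, 0.3, 9.1, 1.2, 0.7, 3.3, 0.2, 4.4]

top_30 = select_top_percentile(scores, 30)           # 3 of 10 features
top_30_wide = select_top_percentile(scores * 2, 30)  # 6 of 20 features
everything = select_top_percentile(scores, 100)      # all 10 features
```

The same percentile value keeps 3 features out of 10 and 6 out of 20, which is the property that distinguishes this selector from one configured with an absolute count.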
Key properties:
- Supervised: requires the target array y at fit time.
- percentile is an integer in [1, 100]; setting it to 100 passes all features through unchanged.
- Uses the same family of scoring functions as SelectKBest (f_classif, chi2, mutual_info_classif, etc.).
- Feature ranking is univariate and does not capture interactions.
Wraps scikit-learn's SelectPercentile.
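As a rough illustration of the properties listed above, the following sketch calls the underlying scikit-learn class directly rather than the DashAI wrapper, so it works on NumPy arrays instead of a DashAIDataset. The synthetic dataset, its sizes, and the choice of f_classif are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectPercentile, f_classif

# Synthetic data: 200 samples, 20 features, 5 of them informative.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5, random_state=0
)

# Keep roughly the top 10% of features, ranked by the ANOVA F-statistic.
# The target y is required at fit time because the scoring is supervised.
selector = SelectPercentile(score_func=f_classif, percentile=10)
X_top = selector.fit(X, y).transform(X)

# percentile=100 passes every feature through unchanged.
X_all = SelectPercentile(score_func=f_classif, percentile=100).fit_transform(X, y)

# Column indices of the retained features.
kept = selector.get_support(indices=True)
print(X.shape, X_top.shape, X_all.shape, kept)
```

Because each feature is scored on its own, two features that are only predictive in combination can both receive low scores and be dropped.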
References
- [1] https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectPercentile.html
Parameters
- percentile : integer, default=10
- Percent of features to keep.
Methods
fit(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'FeatureSelectionConverter'
FeatureSelectionConverter
Fit the selector while remembering the input column types.
Parameters
- x : DashAIDataset
- The input dataset to fit the selector on.
- y : DashAIDataset, optional
- Target values for the supervised selectors. Defaults to None.
Returns
- FeatureSelectionConverter
- The fitted selector instance (self).
get_metadata(cls) -> 'Dict[str, Any]'
BaseConverter
Get metadata for the converter, used by the DashAI frontend.
Parameters
- cls : type
- The converter class (injected automatically by Python for classmethods).
Returns
- Dict[str, Any]
- Dictionary containing display name, short description, image preview path, category, icon, color, and whether the converter is supervised.
get_output_type(self, column_name: str = None) -> DashAI.back.types.dashai_data_type.DashAIDataType
FeatureSelectionConverter
Return the original DashAI data type of a retained column.
Parameters
- column_name : str, optional
- The name of the retained column. Defaults to None.
Returns
- DashAIDataType
- The original type of the column. Falls back to float64 when the input type is unknown (feature selectors only operate on numbers).
get_schema(cls) -> dict
ConfigObject
Generates the component-related JSON Schema.
Returns
- dict
- Dictionary representing the JSON Schema of the component.
transform(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'DashAIDataset'
SklearnWrapper
Transform the data using the fitted scikit-learn transformer.
Parameters
- x : DashAIDataset
- The input dataset to transform.
- y : DashAIDataset, optional
- Not used. Present for API consistency. Defaults to None.
Returns
- DashAIDataset
- The transformed dataset with updated DashAI column types.
validate_and_transform(self, raw_data: dict) -> dict
ConfigObject
Takes the data given by the user to initialize the model and returns it with all the objects that the model needs to work.
Parameters
- raw_data : dict
- A dictionary with the data provided by the user to initialize the model.
Returns
- dict
- A validated dictionary with the necessary objects.