SelectFdr

Converter
DashAI.back.converters.scikit_learn.SelectFdr

Select features by controlling the expected False Discovery Rate (FDR).

SelectFdr applies the Benjamini-Hochberg procedure to the p-values produced by a univariate scoring function, retaining only those features whose (adjusted) p-value is at most alpha. The FDR criterion bounds the expected proportion of selected features that are actually uninformative, offering a less conservative rejection policy than Family-Wise Error control while still providing statistical guarantees.
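The Benjamini-Hochberg step-up rule described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the code that DashAI or scikit-learn runs; the function name `benjamini_hochberg_mask` and the sample p-values are made up for the example.

```python
def benjamini_hochberg_mask(pvalues, alpha=0.05):
    """Return one boolean per p-value: True if it is kept at FDR level alpha.

    Hypothetical helper written for illustration only.
    """
    m = len(pvalues)
    cutoff = None
    # Walk the p-values in ascending order; rank k starts at 1.
    for k, p in enumerate(sorted(pvalues), start=1):
        # Step-up rule: remember the largest p_(k) with p_(k) <= alpha * k / m.
        if p <= alpha * k / m:
            cutoff = p
    if cutoff is None:
        # No p-value passed its rank-dependent threshold: keep nothing.
        return [False] * m
    # Keep every feature whose p-value is at or below the cutoff.
    return [p <= cutoff for p in pvalues]


# Made-up p-values for six features.
pvalues = [0.30, 0.001, 0.04, 0.008, 0.60, 0.019]
mask = benjamini_hochberg_mask(pvalues, alpha=0.05)
print(mask)  # [False, True, False, True, False, True]
```

Note that the feature with p-value 0.04 is dropped even though 0.04 is below alpha: its rank-4 threshold is 0.05 * 4 / 6, roughly 0.033.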

This filter is well suited to high dimensional settings (e.g. genomics, metabolomics) where many features are tested simultaneously and a small fraction of false positives among the selected set is acceptable in exchange for higher sensitivity.

Key properties:

  • Supervised: requires the target array y at fit time.
  • alpha is the target FDR level in [0, 1]; typical values are 0.05 or 0.10.
  • Less conservative than FWE (Bonferroni) correction: retains more features at the same nominal alpha when the number of tests is large.
  • The number of retained features is data-driven and not fixed in advance.
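
To make the comparison with FWE control concrete, the sketch below counts how many features each rule keeps on the same set of p-values. The helper names and the p-values are hypothetical, and Bonferroni is used here as the FWE correction, as in the bullet above.

```python
def count_bonferroni(pvalues, alpha):
    """Features kept under FWE control: each p-value must be <= alpha / m."""
    m = len(pvalues)
    return sum(p <= alpha / m for p in pvalues)


def count_benjamini_hochberg(pvalues, alpha):
    """Features kept under FDR control: the largest k with p_(k) <= alpha * k / m."""
    m = len(pvalues)
    passing = [
        k for k, p in enumerate(sorted(pvalues), start=1) if p <= alpha * k / m
    ]
    return max(passing) if passing else 0


# Made-up p-values for ten features.
pvalues = [0.001, 0.004, 0.012, 0.019, 0.03, 0.2, 0.5, 0.7, 0.8, 0.9]
n_fwe = count_bonferroni(pvalues, alpha=0.05)
n_fdr = count_benjamini_hochberg(pvalues, alpha=0.05)
print(n_fwe, n_fdr)  # 2 4
```

Bonferroni compares every p-value against the single threshold alpha / m, while Benjamini-Hochberg relaxes the threshold as the rank grows, which is why it keeps two more features here.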

Wraps scikit-learn's SelectFdr.
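
Because this converter wraps scikit-learn's SelectFdr, the underlying behaviour can be explored directly with scikit-learn. The sketch below does not use the DashAI wrapper or DashAIDataset; it assumes scikit-learn is installed, uses a synthetic dataset, and passes `f_classif` (scikit-learn's default score function) explicitly. Which score function the DashAI converter uses is not stated on this page.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFdr, f_classif

# Synthetic classification data: 20 features, 5 of them informative.
X, y = make_classification(
    n_samples=200,
    n_features=20,
    n_informative=5,
    n_redundant=0,
    shuffle=False,
    random_state=0,
)

# alpha is the target FDR level, matching the converter's only parameter.
selector = SelectFdr(score_func=f_classif, alpha=0.05)
X_selected = selector.fit_transform(X, y)

# Boolean mask over the original columns: True for retained features.
mask = selector.get_support()
print(X.shape, "->", X_selected.shape)
print("kept columns:", [i for i, keep in enumerate(mask) if keep])
```

The number of columns in `X_selected` depends on the data, which reflects the point above that the number of retained features is not fixed in advance.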

Parameters

alpha : number, default=0.05
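Target FDR level, used as the upper bound on p-values in the Benjamini-Hochberg procedure. A feature is kept only if its p-value is at or below the largest sorted p-value p_(k) that satisfies p_(k) <= alpha * k / m, where m is the number of features.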

Methods

fit(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'FeatureSelectionConverter'

Defined on FeatureSelectionConverter

Fit the selector while remembering the input column types.

Parameters

x : DashAIDataset
The input dataset to fit the selector on.
y : DashAIDataset, optional
Target values. Required by supervised selectors such as SelectFdr. Defaults to None.

Returns

FeatureSelectionConverter
The fitted selector instance (self).

get_metadata(cls) -> 'Dict[str, Any]'

Defined on BaseConverter

Get metadata for the converter, used by the DashAI frontend.

Parameters

cls : type
The converter class (injected automatically by Python for classmethods).

Returns

Dict[str, Any]
Dictionary containing display name, short description, image preview path, category, icon, color, and whether the converter is supervised.

get_output_type(self, column_name: str = None) -> DashAI.back.types.dashai_data_type.DashAIDataType

Defined on FeatureSelectionConverter

Return the original DashAI data type of a retained column.

Parameters

column_name : str, optional
The name of the retained column. Defaults to None.

Returns

DashAIDataType
The original type of the column. Falls back to float64 when the input type is unknown, since feature selectors operate only on numeric columns.

get_schema(cls) -> dict

Defined on ConfigObject

Generate the JSON Schema for the component.

Returns

dict
Dictionary representing the JSON Schema of the component.

transform(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'DashAIDataset'

Defined on SklearnWrapper

Transform the data using the fitted scikit-learn transformer.

Parameters

x : DashAIDataset
The input dataset to transform.
y : DashAIDataset, optional
Not used. Present for API consistency. Defaults to None.

Returns

DashAIDataset
The transformed dataset with updated DashAI column types.

validate_and_transform(self, raw_data: dict) -> dict

Defined on ConfigObject

Validate the data supplied by the user to initialize the model and return it together with all the objects the model needs to work.

Parameters

raw_data : dict
A dictionary with the data provided by the user to initialize the model.

Returns

dict
A validated dictionary with the necessary objects.