VarianceThreshold

Converter
DashAI.back.converters.scikit_learn.VarianceThreshold

Remove features whose variance across the training set is below a threshold.

For each feature column, the variance over the training samples is computed during fitting (population form, normalized by n):

Var(x) = E[x^2] - (E[x])^2

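Feature columns for which Var(x) <= threshold are removed: the underlying scikit-learn selector keeps only columns with Var(x) > threshold, which is why the default threshold=0.0 drops constant features. Because the criterion depends only on the marginal variance of each feature, this selector requires no class labels and runs in O(n * p) time for n samples and p features.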
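The criterion can be illustrated with a minimal pure-Python sketch. It is not DashAI's or scikit-learn's implementation; the helper names `column_variances` and `kept_columns` are hypothetical, and the sketch assumes the scikit-learn rule of keeping only columns whose variance is strictly greater than the threshold.

```python
from statistics import fmean


def column_variances(rows):
    """Population variance of each column: E[x^2] - (E[x])^2."""
    columns = list(zip(*rows))  # transpose rows into columns
    return [fmean(v * v for v in col) - fmean(col) ** 2 for col in columns]


def kept_columns(rows, threshold=0.0):
    """Indices of columns whose variance is strictly greater than threshold."""
    return [i for i, var in enumerate(column_variances(rows)) if var > threshold]


# Toy data: column 1 is constant, column 2 is a mostly-zero binary flag.
rows = [
    [1.0, 5.0, 0.0],
    [2.0, 5.0, 0.0],
    [3.0, 5.0, 1.0],
    [4.0, 5.0, 0.0],
]

variances = column_variances(rows)  # one variance per column
kept = kept_columns(rows)           # default threshold drops only the constant column
kept_strict = kept_columns(rows, threshold=0.2)  # also drops the low-variance flag
```

With the default threshold only the constant column is dropped; raising the threshold to 0.2 also removes the binary flag, whose variance is 0.1875.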

Common use cases include:

  • Constant-feature removal: with the default threshold=0.0 any feature that takes the same value in every training sample is dropped.
  • Near-constant-feature removal: a binary feature that is True in a fraction p of samples has variance p * (1 - p), so a threshold of p * (1 - p) drops features that take the same value in nearly all samples (e.g. threshold=0.8 * 0.2 = 0.16 removes features that are True, or False, in more than 80 % of samples).
  • Pre-filtering before expensive selectors: quickly reducing dimensionality before applying supervised selection methods such as SelectKBest or RFECV.
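The arithmetic behind the binary-feature threshold can be checked with a short sketch. The function name `bernoulli_variance` and the sample fractions are illustrative assumptions, and the removal rule (variance not greater than the threshold) follows the scikit-learn behaviour that this converter is assumed to wrap.

```python
def bernoulli_variance(p):
    """Variance of a binary feature that is True in a fraction p of samples."""
    return p * (1 - p)


# Threshold corresponding to an 80 % / 20 % split.
threshold = 0.8 * (1 - 0.8)

# Hypothetical features, described by the fraction of samples in which they are True.
fractions_true = [0.05, 0.5, 0.95]

# A feature is removed when its variance does not exceed the threshold.
removed = [p for p in fractions_true if bernoulli_variance(p) <= threshold]
```

Both the rarely-True feature (5 %) and the almost-always-True feature (95 %) are removed, because p * (1 - p) is symmetric around p = 0.5; only the balanced feature survives.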

Parameters

threshold : number, default=0.0
Features whose training-set variance is not greater than this threshold are removed.

Methods

fit(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'VarianceThreshold'

Defined on VarianceThreshold

Fit the selector, remembering input types and tolerating empty output.

Parameters

x : DashAIDataset
The input dataset to fit the selector on.
y : DashAIDataset, optional
Ignored; present for API consistency.

Returns

VarianceThreshold
The fitted selector instance (self).

get_output_type(self, column_name: str = None) -> DashAI.back.types.dashai_data_type.DashAIDataType

Defined on VarianceThreshold

Return the original DashAI data type of a retained column.

Parameters

column_name : str, optional
The name of the retained column. Defaults to None.

Returns

DashAIDataType
The original type of the column. Falls back to float64 when the input type is unknown (the selector only operates on numbers).

get_metadata(cls) -> 'Dict[str, Any]'

Defined on BaseConverter

Get metadata for the converter, used by the DashAI frontend.

Parameters

cls : type
The converter class (injected automatically by Python for classmethods).

Returns

Dict[str, Any]
Dictionary containing display name, short description, image preview path, category, icon, color, and whether the converter is supervised.

get_schema(cls) -> dict

Defined on ConfigObject

Generate the JSON Schema for the component.

Returns

dict
Dictionary representing the JSON Schema of the component.

transform(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'DashAIDataset'

Defined on SklearnWrapper

Transform the data using the fitted scikit-learn transformer.

Parameters

x : DashAIDataset
The input dataset to transform.
y : DashAIDataset, optional
Not used. Present for API consistency. Defaults to None.

Returns

DashAIDataset
The transformed dataset with updated DashAI column types.

validate_and_transform(self, raw_data: dict) -> dict

Defined on ConfigObject

Validate the data supplied by the user to initialize the model and return it together with all the objects the model needs to work.

Parameters

raw_data : dict
A dictionary with the data provided by the user to initialize the model.

Returns

dict
A validated dictionary with the necessary objects.