OneHotEncoder

Converter

DashAI.back.converters.scikit_learn.OneHotEncoder

Encode categorical columns as binary indicator (one hot) vectors.

For each input feature column, every unique category value becomes a separate binary output column. Given a feature with k categories, the encoding produces k columns (or k - 1 when drop is set). In each row exactly one of those columns is 1 and the rest are 0, except that a row holding a dropped category is all zeros. Points to consider:

Nominal categories without order: one hot encoding treats all categories as equidistant, which is appropriate for unordered labels such as city names or product types.
Avoiding the dummy-variable trap: the drop parameter can remove one indicator column per feature so that the resulting matrix has full rank, which is required by unregularized linear models.
Infrequent categories: min_frequency and max_categories can group rare values into a single infrequent bin per feature, reducing dimensionality.

The total number of output columns equals the sum of unique category counts across all encoded input columns (minus dropped columns).
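The column counts described above can be illustrated with a minimal pure-Python sketch. It is not the scikit-learn or DashAI implementation; the helper `one_hot` and its `drop_first` flag are hypothetical names used only for this illustration, and dropping the first sorted category is an assumption that mimics one common drop strategy.

```python
def one_hot(values, drop_first=False):
    """Encode one categorical feature; returns (column_names, rows)."""
    categories = sorted(set(values))  # the k unique categories
    # Dropping one category leaves k - 1 indicator columns.
    kept = categories[1:] if drop_first else categories
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows


cities = ["Lima", "Quito", "Bogota", "Lima"]

# k = 3 categories -> 3 columns, exactly one 1 per row.
names, rows = one_hot(cities)

# With a dropped category -> 2 columns; the "Bogota" row is all zeros.
names_d, rows_d = one_hot(cities, drop_first=True)

print(names, rows)
print(names_d, rows_d)
```

With the dropped category, the remaining columns no longer sum to 1 in every row, which is what removes the linear dependence with an intercept term.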

References

[1] https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html

Parameters

categories : string, default=auto: The categories of each feature.
drop, default=None: Specifies a methodology to drop one of the categories per feature.
dtype : string, default=int64: Desired dtype of output.
handle_unknown : string, default=error: How to handle unknown categories during transform.
min_frequency, default=None: Minimum frequency of a category to be considered as frequent.
max_categories, default=None: Maximum number of categories to encode.
feature_name_combiner : string, default=concat: Method used to combine feature names.
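To make the interaction of min_frequency and handle_unknown more concrete, here is a hedged pure-Python sketch of the idea. The functions `fit` and `transform` below are hypothetical stand-ins, not the library's API. The sketch assumes an integer min_frequency (categories seen fewer times are grouped), and that an "ignore" setting for handle_unknown encodes unseen values as an all-zero row, as in scikit-learn; whether DashAI exposes that value is an assumption.

```python
from collections import Counter


def fit(values, min_frequency=None):
    """Split observed categories into frequent ones and an infrequent group."""
    counts = Counter(values)
    frequent = sorted(
        c for c, n in counts.items()
        if min_frequency is None or n >= min_frequency
    )
    infrequent = sorted(set(counts) - set(frequent))
    return frequent, infrequent


def transform(value, frequent, infrequent, handle_unknown="error"):
    """Encode one value; infrequent categories share the last column."""
    width = len(frequent) + (1 if infrequent else 0)
    row = [0] * width
    if value in frequent:
        row[frequent.index(value)] = 1
    elif value in infrequent:
        row[-1] = 1
    elif handle_unknown == "error":
        raise ValueError(f"unknown category: {value!r}")
    # Any other setting is treated like "ignore" here: the row stays all zeros.
    return row


train = ["cat"] * 4 + ["rabbit"] * 3 + ["dog", "snake"]
frequent, infrequent = fit(train, min_frequency=2)

row_cat = transform("cat", frequent, infrequent)
row_dog = transform("dog", frequent, infrequent)
row_new = transform("lizard", frequent, infrequent, handle_unknown="ignore")
print(frequent, infrequent, row_cat, row_dog, row_new)
```

Four observed categories collapse to three output columns here, because "dog" and "snake" each appear only once and share the infrequent column.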

Methods

get_output_type(self, column_name: str = None) -> DashAI.back.types.dashai_data_type.DashAIDataType

Defined on OneHotEncoder

Return the DashAI data type produced by this converter for a column.

Parameters

column_name : str, optional: Not used; all output columns share the same type. Defaults to None.

Returns

DashAIDataType: An Integer type backed by a pyarrow type, representing the binary indicator values (0 or 1).

fit(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> DashAI.back.converters.base_converter.BaseConverter

Defined on SklearnWrapper

Fit the scikit-learn transformer to the data.

Parameters

x : DashAIDataset: The input dataset to fit the transformer on.
y : DashAIDataset, optional: Target values for supervised transformers. Defaults to None.

Returns

BaseConverter: The fitted transformer instance (self).

get_metadata(cls) -> 'Dict[str, Any]'

Defined on BaseConverter

Get metadata for the converter, used by the DashAI frontend.

Parameters

cls : type: The converter class (injected automatically by Python for classmethods).

Returns

Dict[str, Any]: Dictionary containing display name, short description, image preview path, category, icon, color, and whether the converter is supervised.

get_schema(cls) -> dict

Defined on ConfigObject

Generate the JSON Schema for the component.

Returns

dict: Dictionary representing the JSON Schema of the component.

transform(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'DashAIDataset'

Defined on EncodingConverter

Encode x and append the result as new encoded_* columns.

Parameters

x : DashAIDataset: The dataset to transform (contains only the scope columns).
y : DashAIDataset, optional: Ignored. Defaults to None.

Returns

DashAIDataset: Dataset with the original scope columns preserved plus new encoded_* columns appended.

validate_and_transform(self, raw_data: dict) -> dict

Defined on ConfigObject

Validate the data supplied by the user to initialize the model and return it with all the objects the model needs to work.

Parameters

raw_data : dict: A dictionary with the data provided by the user to initialize the model.

Returns

dict: A validated dictionary with the necessary objects.
