SkewedChi2Sampler
Approximate the skewed chi-squared kernel feature map via random Fourier features.
The skewed chi-squared kernel is well-suited for histogram-based features (e.g. visual bag-of-words, colour histograms) and is defined as:
K(x, y) = prod_j 2 * sqrt(x_j + c) * sqrt(y_j + c) / (x_j + y_j + 2c)
where c is the skewedness parameter. The expression is defined only
when x_j + c > 0 and y_j + c > 0 for every feature j; larger values of c
reduce the sensitivity to small feature values.
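As a rough illustration of the formula above, the exact kernel can be evaluated directly for two small histograms. This is a minimal standard-library sketch; the function name `skewed_chi2_kernel` and the sample vectors are hypothetical and are not part of DashAI or scikit-learn.

```python
import math


def skewed_chi2_kernel(x, y, c=1.0):
    """Evaluate the skewed chi-squared kernel for two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    value = 1.0
    for xj, yj in zip(x, y):
        # The square roots need strictly positive arguments.
        if xj + c <= 0 or yj + c <= 0:
            raise ValueError("every entry must be greater than -c")
        value *= 2.0 * math.sqrt(xj + c) * math.sqrt(yj + c) / (xj + yj + 2.0 * c)
    return value


# Two made-up, normalised histograms with three bins each.
hist_a = [0.2, 0.5, 0.3]
hist_b = [0.1, 0.6, 0.3]

print(skewed_chi2_kernel(hist_a, hist_a))  # identical inputs: 1 up to rounding
print(skewed_chi2_kernel(hist_a, hist_b))  # similar inputs: slightly below 1
```

Each factor is at most 1 and reaches 1 only when x_j equals y_j, so the product behaves as a similarity score between the two histograms.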
This converter maps inputs to an n_components-dimensional random Fourier
feature space in which a dot product approximates the above kernel,
following the approach of Rahimi & Recht (2007) [2]. Training a linear
classifier on the resulting features approximates an SVM with the skewed
chi-squared kernel.
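The idea can be sketched in plain Python. In log space each kernel factor becomes sech((log(x_j + c) - log(y_j + c)) / 2), a shift-invariant function whose spectral density is sech(pi * w), so random frequencies can be drawn by inverting that density's CDF. The code below is a sketch under that assumption; the helper names (`sample_random_features`, `feature_map`, `exact_kernel`) are hypothetical, and the wrapped scikit-learn class may differ in its details.

```python
import math
import random


def sample_random_features(n_features, n_components=100, seed=None):
    """Draw random frequencies and phase offsets (hypothetical helper)."""
    rng = random.Random(seed)
    eps = 1e-12  # keep u away from 0 and 1, where log(tan(.)) diverges
    # Inverse-CDF sampling from the density sech(pi * w).
    weights = [
        [math.log(math.tan(0.5 * math.pi * rng.uniform(eps, 1.0 - eps))) / math.pi
         for _ in range(n_components)]
        for _ in range(n_features)
    ]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    return weights, offsets


def feature_map(x, weights, offsets, c=1.0):
    """Map one input vector to len(offsets) random Fourier features."""
    log_x = [math.log(xj + c) for xj in x]
    scale = math.sqrt(2.0 / len(offsets))
    features = []
    for k, offset in enumerate(offsets):
        projection = sum(lx * row[k] for lx, row in zip(log_x, weights))
        features.append(scale * math.cos(projection + offset))
    return features


def exact_kernel(x, y, c=1.0):
    """Exact skewed chi-squared kernel, used only for comparison."""
    return math.prod(
        2.0 * math.sqrt(xj + c) * math.sqrt(yj + c) / (xj + yj + 2.0 * c)
        for xj, yj in zip(x, y)
    )


# Made-up histograms; a fixed seed makes the draw reproducible.
hist_a = [0.2, 0.5, 0.3]
hist_b = [0.1, 0.6, 0.3]
weights, offsets = sample_random_features(n_features=3, n_components=2000, seed=0)
z_a = feature_map(hist_a, weights, offsets)
z_b = feature_map(hist_b, weights, offsets)

print("exact :", exact_kernel(hist_a, hist_b))
print("approx:", sum(p * q for p, q in zip(z_a, z_b)))
```

The dot product is a Monte Carlo estimate, so its error shrinks roughly as 1 / sqrt(n_components); a larger n_components gives a closer approximation at the cost of wider output.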
Output columns are typed as Float64 in DashAI.
Wraps sklearn.kernel_approximation.SkewedChi2Sampler.
References
- [1] https://scikit-learn.org/stable/modules/generated/sklearn.kernel_approximation.SkewedChi2Sampler.html
- [2] Rahimi, A. & Recht, B. (2007). Random Features for Large-Scale Kernel Machines. Advances in Neural Information Processing Systems, 20.
Parameters
- skewedness : number, default=1.0 - The skewedness parameter of the chi-squared kernel.
- n_components : integer, default=100 - Number of Monte Carlo samples per original feature. Equals the dimensionality of the computed feature space.
- random_state, default=None - Pseudo-random number generator to control the generation of the random weights and random offset when fitting the training data. Pass an int for reproducible output across multiple function calls.
Methods
get_output_type(self, column_name: str = None) -> DashAI.back.types.dashai_data_type.DashAIDataType
Defined in SkewedChi2Sampler. Return the DashAI data type produced by this converter for a column.
Parameters
- column_name : str, optional
- Not used; all output columns share the same type. Defaults to None.
Returns
- DashAIDataType
- A Float type backed by pyarrow.float64().
changes_row_count(self) -> 'bool'
Defined in BaseConverter. Indicate whether this converter changes the number of dataset rows.
Returns
- bool
- True if the converter may add or remove rows, False otherwise.
fit(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> DashAI.back.converters.base_converter.BaseConverter
Defined in SklearnWrapper. Fit the scikit-learn transformer to the data.
Parameters
- x : DashAIDataset
- The input dataset to fit the transformer on.
- y : DashAIDataset, optional
- Target values for supervised transformers. Defaults to None.
Returns
- BaseConverter
- The fitted transformer instance (self).
get_metadata(cls) -> 'Dict[str, Any]'
Defined in BaseConverter. Get metadata for the converter, used by the DashAI frontend.
Parameters
- cls : type
- The converter class (injected automatically by Python for classmethods).
Returns
- Dict[str, Any]
- Dictionary containing display name, short description, image preview path, category, icon, color, and whether the converter is supervised.
get_schema(cls) -> dict
Defined in ConfigObject. Generate the JSON Schema for the component.
Returns
- dict
- Dictionary representing the JSON Schema of the component.
transform(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'DashAIDataset'
Defined in SklearnWrapper. Transform the data using the fitted scikit-learn transformer.
Parameters
- x : DashAIDataset
- The input dataset to transform.
- y : DashAIDataset, optional
- Not used. Present for API consistency. Defaults to None.
Returns
- DashAIDataset
- The transformed dataset with updated DashAI column types.
validate_and_transform(self, raw_data: dict) -> dict
Defined in ConfigObject. Take the data provided by the user to initialize the model and return it together with all the objects the model needs to work.
Parameters
- raw_data : dict
- A dictionary with the data provided by the user to initialize the model.
Returns
- dict
- A validated dictionary with the necessary objects.