Nystroem
Approximate a kernel feature map using the Nystroem method.
The Nystroem method constructs an explicit low-dimensional feature map
phi(x) that approximates an arbitrary kernel k(x, x') = <phi(x), phi(x')>,
enabling the use of kernel methods with linear-complexity training algorithms.
It works by sub-sampling n_components landmark points from the training
data, evaluating the kernel between all training samples and these landmarks,
and then normalising the resulting matrix with the inverse square root of the
kernel matrix evaluated on the landmarks alone (scikit-learn computes this
factor from a singular value decomposition).
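The three steps above can be sketched in plain NumPy. This is a minimal illustration, not the wrapped implementation: the helper names rbf_kernel and nystroem_features, the gamma value, and the toy data are all assumptions made for the example. The normalisation here uses an SVD-based inverse square root; any factor M with M Mᵀ equal to the inverse of the landmark kernel matrix yields the same approximate kernel.

```python
import numpy as np


def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel matrix k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def nystroem_features(x, n_components, gamma=0.5, seed=0):
    """Return an (n_samples, n_components) approximate feature map."""
    rng = np.random.default_rng(seed)
    # 1. Sub-sample landmark points without replacement.
    idx = rng.choice(x.shape[0], size=n_components, replace=False)
    landmarks = x[idx]
    # 2. Kernel on the landmarks alone, and its inverse square root via SVD.
    k_mm = rbf_kernel(landmarks, landmarks, gamma)
    u, s, vt = np.linalg.svd(k_mm)
    s = np.maximum(s, 1e-12)  # guard against division by near-zero values
    normalization = (u / np.sqrt(s)) @ vt
    # 3. Kernel between all samples and the landmarks, then normalise.
    k_nm = rbf_kernel(x, landmarks, gamma)
    return k_nm @ normalization.T


rng = np.random.default_rng(42)
x = rng.normal(size=(50, 3))  # toy data, purely illustrative
exact = rbf_kernel(x, x)

z_small = nystroem_features(x, n_components=10)
z_full = nystroem_features(x, n_components=50)  # every sample is a landmark

# Largest entrywise gap between the approximate and exact kernel matrices.
err_small = np.abs(z_small @ z_small.T - exact).max()
err_full = np.abs(z_full @ z_full.T - exact).max()
```

With every sample used as a landmark, err_full should sit at the level of floating-point noise, while err_small reflects the information lost by keeping only 10 landmarks.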
The approximation quality generally improves with n_components: when
n_components equals n_samples, the kernel matrix on the training data is
reproduced exactly (up to numerical error). In practice a small number of
landmarks (e.g. a few hundred) is often sufficient.
Combining Nystroem with a linear model (e.g. SGDClassifier) provides a
scalable alternative to kernel SVMs for large datasets.
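As a rough illustration of that combination, the sketch below chains scikit-learn's Nystroem with SGDClassifier directly, without going through the DashAI wrapper. The dataset (make_moons), the gamma value, and the number of components are arbitrary choices for the example, not recommended settings.

```python
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# Toy, non-linearly separable data (illustrative only).
X, y = make_moons(n_samples=500, noise=0.1, random_state=0)

model = make_pipeline(
    # Explicit approximate RBF feature map with 100 landmarks.
    Nystroem(kernel="rbf", gamma=2.0, n_components=100, random_state=0),
    # Linear classifier trained with stochastic gradient descent.
    SGDClassifier(max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(X, y)

train_accuracy = model.score(X, y)
features = model.named_steps["nystroem"].transform(X)
```

The linear model only ever sees the 100-column feature matrix, so its training cost grows with the number of samples times n_components rather than with the square of the number of samples.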
Key properties:
- Supports any kernel available in scikit-learn (RBF, polynomial, sigmoid, chi2, linear, etc.) as well as callable kernels, whose extra keyword arguments are supplied via kernel_params.
- Unsupervised: no labels required at fit time.
- Output dimensionality equals n_components, which is independent of the number of input features.
- The gamma, coef0, and degree parameters are passed directly to the chosen kernel function.
Wraps scikit-learn's Nystroem.
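The properties listed above can be checked against the underlying scikit-learn class. The sketch below is an assumption-laden illustration: it calls sklearn.kernel_approximation.Nystroem directly rather than the DashAI converter, and l1_laplacian is a hypothetical custom kernel written only for this example.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem

rng = np.random.default_rng(0)
X_narrow = rng.normal(size=(200, 5))   # 5 input features
X_wide = rng.normal(size=(200, 40))    # 40 input features

# No labels are passed: fitting only needs the input data.
Z_narrow = Nystroem(kernel="rbf", n_components=20, random_state=0).fit_transform(X_narrow)
Z_wide = Nystroem(kernel="rbf", n_components=20, random_state=0).fit_transform(X_wide)


def l1_laplacian(a, b, scale=1.0):
    """Hypothetical custom kernel evaluated on two 1-D sample vectors."""
    return np.exp(-scale * np.abs(a - b).sum())


# A callable goes in `kernel`; its keyword arguments go in `kernel_params`.
Z_custom = Nystroem(
    kernel=l1_laplacian,
    kernel_params={"scale": 0.5},
    n_components=20,
    random_state=0,
).fit_transform(X_narrow)
```

All three outputs have 20 columns, regardless of whether the input had 5 or 40 features.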
References
- [1] https://scikit-learn.org/stable/modules/generated/sklearn.kernel_approximation.Nystroem.html
- [2] Williams, C. K. I. & Seeger, M. (2001). "Using the Nyström method to speed up kernel machines." Advances in Neural Information Processing Systems 13 (NIPS 2000), 682-688.
Parameters
- kernel, default=rbf - The kernel to use for the approximation.
- gamma, default=None - Gamma parameter for the RBF, laplacian, polynomial, exponential chi2 and sigmoid kernels.
- coef0, default=None - The coef0 parameter for polynomial and sigmoid kernels.
- degree, default=None - The degree of the polynomial kernel.
- kernel_params, default=None - Additional parameters (kwargs) for the kernel function.
- n_components : integer, default=2 - The number of features to construct.
- random_state, default=None - Seed of the pseudo random number generator used when sampling the landmark points from the training data.
- n_jobs, default=None - Number of parallel jobs to run.
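To show how the kernel parameters reach the kernel function, the sketch below passes gamma, coef0 and degree to scikit-learn's Nystroem with a polynomial kernel and compares the result with polynomial_kernel called with the same values. It uses the scikit-learn class directly (not the DashAI converter), and the parameter values and data are arbitrary assumptions for the example.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.metrics.pairwise import polynomial_kernel

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))  # toy data, purely illustrative

# Same values are handed to the feature map and to the exact kernel.
params = {"gamma": 0.1, "coef0": 1.0, "degree": 2}

# n_components equals n_samples, so every row of X is a landmark.
feature_map = Nystroem(kernel="polynomial", n_components=12, random_state=0, **params)
Z = feature_map.fit_transform(X)

exact = polynomial_kernel(X, X, **params)
max_abs_error = np.abs(Z @ Z.T - exact).max()
```

Because all samples serve as landmarks here, the inner products of the transformed rows should match the exact polynomial kernel up to numerical error; with fewer components the match becomes approximate.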
Methods
get_output_type(self, column_name: str = None) -> DashAI.back.types.dashai_data_type.DashAIDataType
Defined on Nystroem. Return the DashAI data type produced by this converter for a column.
Parameters
- column_name : str, optional
- Not used; all output columns share the same type. Defaults to None.
Returns
- DashAIDataType
- A Float type backed by pyarrow.float64().
changes_row_count(self) -> 'bool'
Defined on BaseConverter. Indicate whether this converter changes the number of dataset rows.
Returns
- bool
- True if the converter may add or remove rows, False otherwise.
fit(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> DashAI.back.converters.base_converter.BaseConverter
Defined on SklearnWrapper. Fit the scikit-learn transformer to the data.
Parameters
- x : DashAIDataset
- The input dataset to fit the transformer on.
- y : DashAIDataset, optional
- Target values for supervised transformers. Defaults to None.
Returns
- BaseConverter
- The fitted transformer instance (self).
get_metadata(cls) -> 'Dict[str, Any]'
Defined on BaseConverter. Get metadata for the converter, used by the DashAI frontend.
Parameters
- cls : type
- The converter class (injected automatically by Python for classmethods).
Returns
- Dict[str, Any]
- Dictionary containing display name, short description, image preview path, category, icon, color, and whether the converter is supervised.
get_schema(cls) -> dict
Defined on ConfigObject. Generate the JSON Schema for the component.
Returns
- dict
- Dictionary representing the JSON Schema of the component.
transform(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'DashAIDataset'
Defined on SklearnWrapper. Transform the data using the fitted scikit-learn transformer.
Parameters
- x : DashAIDataset
- The input dataset to transform.
- y : DashAIDataset, optional
- Not used. Present for API consistency. Defaults to None.
Returns
- DashAIDataset
- The transformed dataset with updated DashAI column types.
validate_and_transform(self, raw_data: dict) -> dict
Defined on ConfigObject. Take the data given by the user to initialize the model and return it with all the objects that the model needs to work.
Parameters
- raw_data : dict
- A dictionary with the data provided by the user to initialize the model.
Returns
- dict
- A validated dictionary with the necessary objects.