Nystroem

Converter
DashAI.back.converters.scikit_learn.Nystroem

Approximate a kernel feature map using the Nystroem method.

The Nystroem method constructs an explicit low-dimensional feature map phi(x) that approximates an arbitrary kernel, k(x, x') ≈ <phi(x), phi(x')>, so that kernel methods can be trained with linear models whose cost scales linearly with the number of samples. It works by sub-sampling n_components landmark points from the training data, evaluating the kernel between all training samples and these landmarks, and then normalising the resulting matrix with the inverse square root of the kernel matrix evaluated on the landmarks alone (scikit-learn computes this from a singular value decomposition).
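The steps described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the code DashAI or scikit-learn runs: the helper names `rbf_kernel` and `nystroem_features` are hypothetical, the kernel is fixed to RBF, and the normalisation uses an SVD-based inverse square root of the landmark kernel matrix.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix with entries exp(-gamma * ||a_i - b_j||^2)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def nystroem_features(x, n_components, gamma, seed=0):
    """Map x of shape (n_samples, n_features) to (n_samples, n_components)."""
    rng = np.random.default_rng(seed)
    # 1. Sub-sample landmark points without replacement.
    idx = rng.choice(x.shape[0], size=n_components, replace=False)
    landmarks = x[idx]
    # 2. Evaluate the kernel on the landmarks alone.
    k_mm = rbf_kernel(landmarks, landmarks, gamma)
    # 3. Build the inverse square root of k_mm from its SVD,
    #    clamping tiny singular values to keep the division stable.
    u, s, vt = np.linalg.svd(k_mm)
    s = np.maximum(s, 1e-12)
    normalization = (u / np.sqrt(s)) @ vt
    # 4. Evaluate the kernel between all samples and the landmarks,
    #    then normalise to obtain the explicit feature map.
    k_nm = rbf_kernel(x, landmarks, gamma)
    return k_nm @ normalization.T


rng = np.random.default_rng(42)
x = rng.normal(size=(200, 5))
phi = nystroem_features(x, n_components=50, gamma=0.1)
exact = rbf_kernel(x, x, gamma=0.1)
rel_error = np.linalg.norm(phi @ phi.T - exact) / np.linalg.norm(exact)
print(phi.shape, rel_error)
```

The inner products of the rows of `phi` approximate the exact kernel matrix, and `rel_error` measures how far off they are in the Frobenius norm.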

The approximation quality improves with n_components: as n_components → n_samples the approximation becomes exact. In practice a small number of landmarks (e.g. a few hundred) is often sufficient. Combining Nystroem with a linear model (e.g. SGDClassifier) provides a scalable alternative to kernel SVMs for large datasets.
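As a rough illustration of the combination mentioned above, the following sketch chains scikit-learn's `Nystroem` with `SGDClassifier` directly, without going through the DashAI wrapper. The dataset (`make_moons`), the value of `gamma`, and the number of landmarks are arbitrary choices made for the example.

```python
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Toy non-linearly-separable data (an assumption made for illustration).
X, y = make_moons(n_samples=1000, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Approximate RBF feature map followed by a linear classifier.
model = make_pipeline(
    Nystroem(kernel="rbf", gamma=2.0, n_components=100, random_state=0),
    SGDClassifier(max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
features = model.named_steps["nystroem"].transform(X_test)
print(features.shape, accuracy)
```

The linear model only ever sees the `n_components`-dimensional features, so its training cost does not depend on building the full kernel matrix.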

Key properties:

  • Supports any kernel available in scikit-learn (RBF, polynomial, sigmoid, chi2, linear, etc.). In scikit-learn, a callable can also be passed as kernel, with its extra keyword arguments supplied via kernel_params.
  • Unsupervised: no labels required at fit time.
  • Output dimensionality equals n_components, which is independent of the number of input features.
  • The gamma, coef0, and degree parameters are passed directly to the chosen kernel function.

Wraps scikit-learn's Nystroem.
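Two of the properties listed above (no labels at fit time, and an output width fixed by n_components) can be checked against the wrapped scikit-learn transformer. The sketch below uses scikit-learn directly rather than the DashAI converter, and the array sizes are arbitrary.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem

rng = np.random.default_rng(0)
shapes = {}
for n_features in (3, 30, 300):
    X = rng.normal(size=(120, n_features))
    # No labels are passed: fitting only needs the samples themselves.
    mapped = Nystroem(
        kernel="rbf", n_components=20, random_state=0
    ).fit_transform(X)
    shapes[n_features] = mapped.shape

print(shapes)
```

Whatever the number of input columns, each result has exactly `n_components` output columns.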

Parameters

kernel, default=rbf
The kernel to use for the approximation.
gamma, default=None
Gamma parameter for the RBF, laplacian, polynomial, exponential chi2 and sigmoid kernels.
coef0, default=None
The coef0 parameter for polynomial and sigmoid kernels.
degree, default=None
The degree of the polynomial kernel.
kernel_params, default=None
Additional parameters (kwargs) for the kernel function.
n_components : integer, default=2
The number of features to construct.
random_state, default=None
Seed of the pseudo-random number generator used to sample the n_components landmark points from the training data.
n_jobs, default=None
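Number of parallel jobs used to compute the kernel matrix. In scikit-learn, None means 1 unless inside a joblib.parallel_backend context, and -1 uses all processors.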

Methods

get_output_type(self, column_name: str = None) -> DashAI.back.types.dashai_data_type.DashAIDataType

Defined on Nystroem

Return the DashAI data type produced by this converter for a column.

Parameters

column_name : str, optional
Not used; all output columns share the same type. Defaults to None.

Returns

DashAIDataType
A Float type backed by pyarrow.float64().

changes_row_count(self) -> 'bool'

Defined on BaseConverter

Indicate whether this converter changes the number of dataset rows.

Returns

bool
True if the converter may add or remove rows, False otherwise.

fit(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> DashAI.back.converters.base_converter.BaseConverter

Defined on SklearnWrapper

Fit the scikit-learn transformer to the data.

Parameters

x : DashAIDataset
The input dataset to fit the transformer on.
y : DashAIDataset, optional
Target values for supervised transformers. Defaults to None.

Returns

BaseConverter
The fitted transformer instance (self).

get_metadata(cls) -> 'Dict[str, Any]'

Defined on BaseConverter

Get metadata for the converter, used by the DashAI frontend.

Parameters

cls : type
The converter class (injected automatically by Python for classmethods).

Returns

Dict[str, Any]
Dictionary containing display name, short description, image preview path, category, icon, color, and whether the converter is supervised.

get_schema(cls) -> dict

Defined on ConfigObject

Generate the JSON Schema for the component.

Returns

dict
Dictionary representing the JSON Schema of the component.

transform(self, x: 'DashAIDataset', y: Optional[ForwardRef('DashAIDataset')] = None) -> 'DashAIDataset'

Defined on SklearnWrapper

Transform the data using the fitted scikit-learn transformer.

Parameters

x : DashAIDataset
The input dataset to transform.
y : DashAIDataset, optional
Not used. Present for API consistency. Defaults to None.

Returns

DashAIDataset
The transformed dataset with updated DashAI column types.
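The fit/transform contract can be illustrated with the underlying scikit-learn transformer. This sketch deliberately bypasses the DashAI wrapper, so the conversion to and from DashAIDataset is not shown, and NumPy arrays stand in for the datasets.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem

rng = np.random.default_rng(3)
X_train = rng.normal(size=(100, 4))
X_new = rng.normal(size=(10, 4))

mapper = Nystroem(kernel="rbf", gamma=0.5, n_components=25, random_state=0)

# fit selects the landmarks from the training data and returns the
# transformer itself, so calls can be chained.
fitted = mapper.fit(X_train)

# transform reuses the stored landmarks for any data with the same columns.
train_features = mapper.transform(X_train)
new_features = mapper.transform(X_new)
print(fitted is mapper, train_features.shape, new_features.shape)
```

Each input row maps to exactly one output row, so the row count is unchanged while the column count becomes `n_components`.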

validate_and_transform(self, raw_data: dict) -> dict

Defined on ConfigObject

Take the data given by the user to initialize the model and return it together with all the objects that the model needs to work.

Parameters

raw_data : dict
A dictionary with the data provided by the user to initialize the model.

Returns

dict
A validated dictionary with the necessary objects.