
SGDClassifier

Model
DashAI.back.models.scikit_learn.SGDClassifier

SGD classifier with probability calibration for consistent predict_proba output.

SGDClassifier supports multiple loss functions that correspond to different linear models (SVM with 'hinge', logistic regression with 'log_loss', etc.). Stochastic Gradient Descent allows efficient training on large datasets. Because not all loss functions expose predict_proba natively, this wrapper consistently calibrates the model with CalibratedClassifierCV.

Key hyperparameters include loss, alpha, max_iter, tol, and learning_rate. The implementation wraps scikit-learn's SGDClassifier.
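The calibration pattern described above can be sketched directly with scikit-learn. This is an illustration of why the wrapper calibrates, not the wrapper's actual code: with 'hinge' loss the raw estimator exposes no predict_proba, while wrapping it in CalibratedClassifierCV always provides one.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=200, random_state=0)

# With 'hinge' loss the raw estimator has no predict_proba ...
raw = SGDClassifier(loss="hinge", random_state=0).fit(X, y)
print(hasattr(raw, "predict_proba"))  # False

# ... so wrapping it in CalibratedClassifierCV guarantees probabilities.
calibrated = CalibratedClassifierCV(SGDClassifier(loss="hinge", random_state=0), cv=3)
calibrated.fit(X, y)
proba = calibrated.predict_proba(X)
print(proba.shape)  # (200, 2)
```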

References

Parameters

loss : string, default=hinge
The loss function to use. 'hinge' gives a linear SVM; 'log_loss' gives logistic regression; 'modified_huber' is smoother; 'squared_hinge' is like hinge but quadratically penalised; 'perceptron' is the linear loss used by the perceptron algorithm.
alpha : number, default=0.0001
Regularisation parameter. Higher values result in stronger regularisation.
max_iter : integer, default=1000
The maximum number of passes over the training data (epochs).
tol : number, default=0.001
The stopping criterion. Training stops when loss > best_loss - tol for n_iter_no_change consecutive epochs.
learning_rate : string, default=optimal
The learning rate schedule. 'optimal' uses 1/(alpha*(t+t0)); 'constant' keeps eta0 fixed; 'invscaling' decreases as eta0/t^power_t; 'adaptive' keeps eta0 while the training loss keeps improving, then divides the rate by 5 each time the stopping criterion is hit.
random_state : integer or None, default=None
The seed of the pseudo-random number generator. Pass an int for reproducible output, or None to leave the seed unset.

Methods

predict(self, x_pred) -> 'ndarray'

Defined on SGDClassifier

Return the class-probability matrix using the calibrated model.

Parameters

x_pred : DashAIDataset or pd.DataFrame
Input data.

Returns

np.ndarray
Class probability matrix.
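To consume the returned matrix: each row holds one sample's probabilities over the classes, so rows sum to 1 and an argmax over columns recovers hard class indices. A standalone sketch with a made-up two-class matrix:

```python
import numpy as np

# One row per sample, one column per class; rows sum to 1.
proba = np.array([[0.2, 0.8],
                  [0.9, 0.1]])
labels = proba.argmax(axis=1)  # hard class index per sample
print(labels)  # [1 0]
```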

train(self, x_train, y_train, x_validation=None, y_validation=None)

Defined on SGDClassifier

Train using CalibratedClassifierCV to guarantee predict_proba availability.

Parameters

x_train : DashAIDataset
The input features for training.
y_train : DashAIDataset
The target labels for training.
x_validation : DashAIDataset, optional
Unused (sklearn models ignore validation split).
y_validation : DashAIDataset, optional
Unused.

Returns

self

calculate_metrics(self, split: DashAI.back.core.enums.metrics.SplitEnum = <SplitEnum.VALIDATION: 'validation'>, level: DashAI.back.core.enums.metrics.LevelEnum = <LevelEnum.LAST: 'last'>, log_index: int = None, x_data: 'DashAIDataset' = None, y_data: 'DashAIDataset' = None)

Defined on BaseModel

Calculate and save metrics for a given data split and level.

Parameters

split : SplitEnum
The data split to evaluate (TRAIN, VALIDATION, or TEST). Defaults to SplitEnum.VALIDATION.
level : LevelEnum
The metric granularity level (LAST, TRIAL, STEP, or BATCH). Defaults to LevelEnum.LAST.
log_index : int, optional
Explicit step index for the metric entry. If None, the next step index is computed automatically. Defaults to None.
x_data : DashAIDataset, optional
Input features. If None, the dataset stored in the model for the given split is used. Defaults to None.
y_data : DashAIDataset, optional
Target labels. If None, the labels stored in the model for the given split are used. Defaults to None.

get_metadata(cls) -> Dict[str, Any]

Defined on BaseModel

Get metadata values for the current model.

Returns

Dict[str, Any]
Dictionary containing UI metadata such as the model icon used in the DashAI frontend.

get_schema(cls) -> dict

Defined on ConfigObject

Generates the JSON Schema associated with the component.

Returns

dict
Dictionary representing the JSON Schema of the component.

load(filename: str) -> None

Defined on SklearnLikeModel

Deserialise a model from disk using joblib.

Parameters

filename : str
Path to the file previously written by save().

Returns

SklearnLikeModel
The loaded model instance.

prepare_dataset(self, dataset: 'DashAIDataset', is_fit: bool = False) -> 'DashAIDataset'

Defined on SklearnLikeModel

Apply the model transformations to the dataset.

Parameters

dataset : DashAIDataset
The dataset to be transformed.
is_fit : bool, optional
If True, the method will fit encoders on the data. If False, will apply previously fitted encoders.

Returns

DashAIDataset
The prepared dataset ready to be converted to an accepted format in the model.

prepare_output(self, dataset: 'DashAIDataset', is_fit: bool = False) -> 'DashAIDataset'

Defined on SklearnLikeModel

Prepare output targets using Label encoding.

Parameters

dataset : DashAIDataset
The output dataset to be transformed.
is_fit : bool, optional
If True, fit the encoder. If False, use existing encodings.

Returns

DashAIDataset
Dataset with categorical columns converted to integers.
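The fit-once, transform-later behaviour described by is_fit corresponds to scikit-learn's LabelEncoder. A sketch of the pattern, not the wrapper's actual code:

```python
from sklearn.preprocessing import LabelEncoder

# Fit on the training targets (the is_fit=True case) ...
encoder = LabelEncoder()
y_train = ["cat", "dog", "cat", "bird"]
encoded = encoder.fit_transform(y_train)  # classes sorted: bird=0, cat=1, dog=2
print(encoded)  # [1 2 1 0]

# ... then reuse the fitted encoder on new targets (the is_fit=False case).
print(encoder.transform(["dog", "bird"]))  # [2 0]
```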

save(self, filename: str) -> None

Defined on SklearnLikeModel

Serialise the model to disk using joblib.

Parameters

filename : str
Destination file path where the model will be written.
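Since both save and load are documented to use joblib, a round trip looks like the following sketch (a plain scikit-learn estimator stands in for the wrapped model):

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=100, random_state=0)
model = SGDClassifier(random_state=0).fit(X, y)

# Serialise to disk, then deserialise into a fresh object.
path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, path)
restored = joblib.load(path)
print((restored.predict(X) == model.predict(X)).all())  # True
```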

validate_and_transform(self, raw_data: dict) -> dict

Defined on ConfigObject

Takes the data provided by the user to initialize the model and returns it with all the objects the model needs to work.

Parameters

raw_data : dict
A dictionary with the data provided by the user to initialize the model.

Returns

dict
A validated dictionary with the necessary objects.
