GradientBoostingR

Model
DashAI.back.models.scikit_learn.GradientBoostingR

Gradient boosting regressor that builds an ensemble of decision trees sequentially.

Gradient Boosting builds an additive model in a forward stage-wise fashion. At each stage a shallow decision tree is fitted to the negative gradient of the chosen loss function with respect to the current ensemble prediction. A learning_rate shrinkage factor scales the contribution of each new tree, trading a slower learning process for better generalisation.

Key hyperparameters include n_estimators (number of boosting stages), learning_rate, max_depth, subsample (fraction of training samples per tree, enabling stochastic gradient boosting), loss, and min_samples_split. The implementation wraps scikit-learn's GradientBoostingRegressor.
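The stage-wise, shrunken-additive behaviour described above can be observed directly with scikit-learn's GradientBoostingRegressor, which this model wraps. The sketch below uses synthetic data; `staged_predict` exposes the prediction after each boosting stage.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A smaller learning_rate shrinks each tree's contribution, so more boosting
# stages are needed, but generalisation often improves.
# subsample < 1.0 turns this into stochastic gradient boosting.
model = GradientBoostingRegressor(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=3,
    subsample=0.8,
    random_state=0,
).fit(X_train, y_train)

# staged_predict exposes the additive, forward stage-wise nature of the model:
# the prediction after k stages is the sum of the first k shrunken trees.
errors = [mean_squared_error(y_test, y_pred)
          for y_pred in model.staged_predict(X_test)]
print(f"test MSE after 10 stages:  {errors[9]:.1f}")
print(f"test MSE after 200 stages: {errors[-1]:.1f}")
```

The error curve over stages illustrates why n_estimators and learning_rate are tuned jointly: each extra stage adds a small correction scaled by the learning rate.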

Parameters

loss : string, default=squared_error
Loss function to be optimized.
learning_rate : number, default=0.1
Learning rate shrinks the contribution of each tree.
n_estimators : integer, default=100
The number of boosting stages to be run.
subsample : number, default=1.0
The fraction of samples to be used for fitting the individual base learners.
criterion : string, default=friedman_mse
The function to measure the quality of a split.
min_samples_split : number, default=0.5
The minimum number of samples required to split an internal node. A fraction between 0 and 1 is interpreted as a proportion of the training samples.
min_samples_leaf : number, default=1
The minimum number of samples required to be at a leaf node.
min_weight_fraction_leaf : number, default=0.0
The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node.
max_depth : integer, default=3
The maximum depth of the individual regression estimators.
min_impurity_decrease : number, default=0.0
A node will be split if this split induces a decrease of the impurity greater than or equal to this value.
random_state : integer, default=None
Controls the random seed given to each tree estimator at each boosting iteration.
max_features : integer, number or string, default=None
The number of features to consider when looking for the best split.
alpha : number, default=0.9
The alpha-quantile of the Huber loss function and the quantile loss function. Only used if loss is 'huber' or 'quantile'.
verbose : integer, default=0
Enable verbose output.
max_leaf_nodes : integer, default=None
Grow trees with max_leaf_nodes in best-first fashion.
warm_start : boolean, default=False
When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble.
validation_fraction : number, default=0.1
The proportion of training data to set aside as validation set for early stopping.
n_iter_no_change : integer, default=None
The number of iterations with no improvement to wait before stopping the training.
tol : number, default=0.0001
Tolerance for the early stopping.
ccp_alpha : number, default=0.0
Complexity parameter used for Minimal Cost-Complexity Pruning.
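Several of the parameters above interact: alpha only matters for the huber and quantile losses, while validation_fraction, n_iter_no_change, and tol together control early stopping. The sketch below (synthetic data, illustrative values) fits a 0.9-quantile model with early stopping enabled.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X.ravel()) + rng.normal(scale=0.3, size=500)

# loss="quantile" with alpha=0.9 fits the 90th conditional quantile;
# n_iter_no_change enables early stopping on an internal validation split
# of size validation_fraction, using tol as the improvement threshold.
model = GradientBoostingRegressor(
    loss="quantile",
    alpha=0.9,
    n_estimators=500,
    validation_fraction=0.1,
    n_iter_no_change=10,
    tol=1e-4,
    random_state=0,
).fit(X, y)

pred = model.predict(X)
# Most targets should fall below the predicted 0.9-quantile.
coverage = np.mean(y <= pred)
print(f"stages used: {model.n_estimators_}, coverage: {coverage:.2f}")
```

`n_estimators_` reports how many of the 500 requested stages were actually fitted before early stopping kicked in.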

Methods

calculate_metrics(self, split: DashAI.back.core.enums.metrics.SplitEnum = <SplitEnum.VALIDATION: 'validation'>, level: DashAI.back.core.enums.metrics.LevelEnum = <LevelEnum.LAST: 'last'>, log_index: int = None, x_data: 'DashAIDataset' = None, y_data: 'DashAIDataset' = None)

Defined on BaseModel

Calculate and save metrics for a given data split and level.

Parameters

split : SplitEnum
The data split to evaluate (TRAIN, VALIDATION, or TEST). Defaults to SplitEnum.VALIDATION.
level : LevelEnum
The metric granularity level (LAST, TRIAL, STEP, or BATCH). Defaults to LevelEnum.LAST.
log_index : int, optional
Explicit step index for the metric entry. If None, the next step index is computed automatically. Defaults to None.
x_data : DashAIDataset, optional
Input features. If None, the dataset stored in the model for the given split is used. Defaults to None.
y_data : DashAIDataset, optional
Target labels. If None, the labels stored in the model for the given split are used. Defaults to None.
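DashAI's metric-logging machinery is not reproduced here, but conceptually calculate_metrics evaluates the chosen split and records the result under a step index. The following is a minimal sketch of that fallback logic in plain Python, assuming MSE as the metric and a dictionary of stored splits standing in for the model's attached DashAIDataset objects.

```python
from sklearn.metrics import mean_squared_error

# Hypothetical stand-ins for the model's stored splits; in DashAI these
# would be DashAIDataset objects attached to the model.
splits = {
    "train":      ([[0.0], [1.0], [2.0]], [0.0, 1.1, 1.9]),
    "validation": ([[3.0], [4.0]],        [3.2, 3.8]),
}

metric_log = []  # (split, step_index, value) records

def calculate_metrics(split="validation", log_index=None, x_data=None, y_data=None):
    """Sketch: fall back to the stored split when x_data/y_data are None."""
    if x_data is None or y_data is None:
        x_data, y_data = splits[split]
    if log_index is None:
        log_index = len(metric_log)    # next step index, computed automatically
    y_pred = [x[0] for x in x_data]    # trivial identity "model" for illustration
    metric_log.append((split, log_index, mean_squared_error(y_data, y_pred)))
    return metric_log[-1]

print(calculate_metrics())                         # stored validation split
print(calculate_metrics(split="train", log_index=5))  # explicit step index
```

The key behaviours mirrored here are the defaults: evaluating the validation split when no data is passed, and computing the next step index automatically when log_index is None.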

get_metadata(cls) -> Dict[str, Any]

Defined on BaseModel

Get metadata values for the current model.

Returns

Dict[str, Any]
Dictionary containing UI metadata such as the model icon used in the DashAI frontend.

get_schema(cls) -> dict

Defined on ConfigObject

Generates the component's JSON Schema.

Returns

dict
Dictionary representing the JSON Schema of the component.

load(filename: str) -> 'SklearnLikeModel'

Defined on SklearnLikeModel

Deserialise a model from disk using joblib.

Parameters

filename : str
Path to the file previously written by save().

Returns

SklearnLikeModel
The loaded model instance.

predict(self, x_pred: 'DashAIDataset') -> 'ndarray'

Defined on SklearnLikeRegressor

Make a prediction with the model.

Parameters

x_pred : DashAIDataset
Dataset with the input data columns.

Returns

np.ndarray
Array with the predicted target values for x_pred.

prepare_dataset(self, dataset: 'DashAIDataset', is_fit: bool = False) -> 'DashAIDataset'

Defined on SklearnLikeModel

Apply the model transformations to the dataset.

Parameters

dataset : DashAIDataset
The dataset to be transformed.
is_fit : bool, optional
If True, the method will fit encoders on the data. If False, will apply previously fitted encoders.

Returns

DashAIDataset
The prepared dataset ready to be converted to an accepted format in the model.

prepare_output(self, dataset: 'DashAIDataset', is_fit: bool = False) -> 'DashAIDataset'

Defined on SklearnLikeModel

Prepare output targets using Label encoding.

Parameters

dataset : DashAIDataset
The output dataset to be transformed.
is_fit : bool, optional
If True, fit the encoder. If False, use existing encodings.

Returns

DashAIDataset
Dataset with categorical columns converted to integers.
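Label encoding of output targets can be illustrated with scikit-learn's LabelEncoder, the standard way to map categorical labels to integers; whether DashAI uses this exact class internally is an assumption. The fit/reuse split mirrors the is_fit flag above.

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()

# is_fit=True: fit the encoder on the training targets.
train_targets = ["low", "high", "medium", "low"]
encoded_train = encoder.fit_transform(train_targets)

# is_fit=False: reuse the already-fitted encoder on new targets.
test_targets = ["medium", "high"]
encoded_test = encoder.transform(test_targets)

print(list(encoder.classes_))   # classes are stored in sorted order
print(list(encoded_train))
print(list(encoded_test))
```

Reusing the fitted encoder (rather than refitting) is what keeps the integer codes consistent between training and inference.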

save(self, filename: str) -> None

Defined on SklearnLikeModel

Serialise the model to disk using joblib.

Parameters

filename : str
Destination file path where the model will be written.
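Both save and load rely on joblib serialisation. A round trip with a plain scikit-learn estimator looks like this; the path and data are illustrative, and joblib.dump/joblib.load stand in for what save() and load() do under the hood.

```python
import os
import tempfile

import joblib
from sklearn.ensemble import GradientBoostingRegressor

X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 2.0, 3.0]
model = GradientBoostingRegressor(n_estimators=20, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gbr.joblib")
    joblib.dump(model, path)      # what save() does under the hood
    restored = joblib.load(path)  # what load() does under the hood

# The restored estimator predicts identically to the original.
print((restored.predict(X) == model.predict(X)).all())
```

joblib is preferred over plain pickle for estimators because it handles the large NumPy arrays inside fitted models efficiently.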

train(self, x_train, y_train, x_validation=None, y_validation=None)

Defined on SklearnLikeModel

Train the sklearn model on the provided dataset.

Parameters

x_train : DashAIDataset
The input features for training.
y_train : DashAIDataset
The target labels for training.
x_validation : DashAIDataset, optional
Validation input features (unused in sklearn models). Defaults to None.
y_validation : DashAIDataset, optional
Validation target labels (unused in sklearn models). Defaults to None.

Returns

BaseModel
The fitted scikit-learn estimator (self).

validate_and_transform(self, raw_data: dict) -> dict

Defined on ConfigObject

Takes the data provided by the user to initialize the model and returns it together with all the objects the model needs to work.

Parameters

raw_data : dict
A dictionary with the data provided by the user to initialize the model.

Returns

dict
A validated dictionary with the necessary objects.

Compatible with