ExtraTreesRegression
Extra-Trees regressor using randomised decision tree splits.
Extremely Randomised Trees draw a random threshold for each candidate feature and keep the best of those random splits, instead of searching for the optimal threshold. This extra randomness further reduces variance. Combined with averaging over many trees, Extra-Trees can achieve low generalisation error on regression tasks while being fast to train.
Key hyperparameters include n_estimators, max_depth, min_samples_split, and bootstrap. The implementation wraps scikit-learn's ExtraTreesRegressor.
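The split rule described above can be sketched in a few lines of NumPy. This is an illustrative sketch of the idea from Geurts et al., not the code scikit-learn or DashAI runs; the helper names variance_reduction and random_split are hypothetical.

```python
import numpy as np


def variance_reduction(y, mask):
    """Drop in target variance from splitting y into y[mask] and y[~mask]."""
    n, n_left = len(y), int(mask.sum())
    n_right = n - n_left
    if n_left == 0 or n_right == 0:
        return 0.0
    return float(
        np.var(y)
        - (n_left / n) * np.var(y[mask])
        - (n_right / n) * np.var(y[~mask])
    )


def random_split(X, y, n_candidate_features, rng):
    """Draw one uniform random threshold per candidate feature, keep the best."""
    features = rng.choice(X.shape[1], size=n_candidate_features, replace=False)
    best = None
    for j in features:
        lo, hi = X[:, j].min(), X[:, j].max()
        if lo == hi:  # constant feature, nothing to split on
            continue
        threshold = rng.uniform(lo, hi)  # random cut point, no search
        score = variance_reduction(y, X[:, j] < threshold)
        if best is None or score > best[2]:
            best = (int(j), float(threshold), score)
    return best


# Synthetic data (an assumption for illustration): only feature 0 matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

feature, threshold, score = random_split(X, y, n_candidate_features=3, rng=rng)
```

A classic decision tree would scan every possible threshold of each candidate feature; here only one threshold per feature is scored, which is where the training speed-up comes from.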
References
- [1] Geurts, P., Ernst, D. & Wehenkel, L. (2006). "Extremely randomized trees." Machine Learning, 63(1), 3-42.
- [2] https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesRegressor.html
Parameters
- n_estimators : integer, default=100 - The number of trees in the forest.
- max_depth, default=None - The maximum depth of the tree. If None, nodes are expanded until all leaves are pure or fewer than min_samples_split samples remain.
- min_samples_split : integer, default=2 - Minimum number of samples required to split an internal node.
- min_samples_leaf : integer, default=1 - Minimum number of samples required to be at a leaf node.
- bootstrap : boolean, default=False - Whether bootstrap samples are used when building trees. If False, the whole dataset is used for each tree.
- random_state, default=None - The seed of the pseudo-random number generator. Pass an int for reproducible output, or None to not set a specific seed.
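As a minimal sketch, the parameters above map directly onto the constructor of scikit-learn's ExtraTreesRegressor, which this component wraps. The example calls scikit-learn directly rather than the DashAI wrapper, and the synthetic data is an assumption for illustration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(300, 4))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.05, size=300)

# Same names and defaults as listed above, with a fixed seed for reproducibility.
model = ExtraTreesRegressor(
    n_estimators=100,
    max_depth=None,
    min_samples_split=2,
    min_samples_leaf=1,
    bootstrap=False,
    random_state=0,
)
model.fit(X[:200], y[:200])

predictions = model.predict(X[200:])
r2 = model.score(X[200:], y[200:])  # coefficient of determination on held-out rows
```

With random_state set to an int, refitting on the same data yields the same forest; with None, each fit draws different random thresholds.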
Methods
calculate_metrics(self, split: DashAI.back.core.enums.metrics.SplitEnum = <SplitEnum.VALIDATION: 'validation'>, level: DashAI.back.core.enums.metrics.LevelEnum = <LevelEnum.LAST: 'last'>, log_index: int = None, x_data: 'DashAIDataset' = None, y_data: 'DashAIDataset' = None)
Defined in BaseModel. Calculate and save metrics for a given data split and level.
Parameters
- split : SplitEnum
- The data split to evaluate (TRAIN, VALIDATION, or TEST). Defaults to SplitEnum.VALIDATION.
- level : LevelEnum
- The metric granularity level (LAST, TRIAL, STEP, or BATCH). Defaults to LevelEnum.LAST.
- log_index : int, optional
- Explicit step index for the metric entry. If None, the next step index is computed automatically. Defaults to None.
- x_data : DashAIDataset, optional
- Input features. If None, the dataset stored in the model for the given split is used. Defaults to None.
- y_data : DashAIDataset, optional
- Target labels. If None, the labels stored in the model for the given split are used. Defaults to None.
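The sketch below shows what evaluating a regressor on a validation split amounts to, using scikit-learn directly. Which metrics DashAI records, and how it stores them, is not stated here; the three metrics and the dictionary layout are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 3))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.05, size=400)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = ExtraTreesRegressor(n_estimators=50, random_state=0).fit(X_train, y_train)

# Evaluate on the validation split only, mirroring split=VALIDATION.
y_pred = model.predict(X_val)
validation_metrics = {
    "mae": float(mean_absolute_error(y_val, y_pred)),
    "rmse": float(np.sqrt(mean_squared_error(y_val, y_pred))),
    "r2": float(r2_score(y_val, y_pred)),
}
```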
get_metadata(cls) -> Dict[str, Any]
Defined in BaseModel. Get metadata values for the current model.
Returns
- Dict[str, Any]
- Dictionary containing UI metadata such as the model icon used in the DashAI frontend.
get_schema(cls) -> dict
Defined in ConfigObject. Generate the JSON Schema of the component.
Returns
- dict
- Dictionary representing the JSON Schema of the component.
load(filename: str) -> None
Defined in SklearnLikeModel. Deserialise a model from disk using joblib.
Parameters
- filename : str
- Path to the file previously written by save.
Returns
- SklearnLikeModel
- The loaded model instance.
predict(self, x_pred: 'DashAIDataset') -> 'ndarray'
Defined in SklearnLikeRegressor. Make a prediction with the model.
Parameters
- x_pred : DashAIDataset
- Dataset with the input data columns.
Returns
- np.ndarray
- Array with the predicted target values for x_pred.
prepare_dataset(self, dataset: 'DashAIDataset', is_fit: bool = False) -> 'DashAIDataset'
Defined in SklearnLikeModel. Apply the model transformations to the dataset.
Parameters
- dataset : DashAIDataset
- The dataset to be transformed.
- is_fit : bool, optional
- If True, the method will fit encoders on the data. If False, will apply previously fitted encoders.
Returns
- DashAIDataset
- The prepared dataset ready to be converted to an accepted format in the model.
prepare_output(self, dataset: 'DashAIDataset', is_fit: bool = False) -> 'DashAIDataset'
Defined in SklearnLikeModel. Prepare output targets using label encoding.
Parameters
- dataset : DashAIDataset
- The output dataset to be transformed.
- is_fit : bool, optional
- If True, fit the encoder. If False, use existing encodings.
Returns
- DashAIDataset
- Dataset with categorical columns converted to integers.
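The is_fit flag follows the usual fit-once, reuse-later pattern for encoders. The sketch below illustrates that pattern with scikit-learn's LabelEncoder on a plain NumPy array; whether DashAI uses this exact class internally is an assumption, and the column values are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

train_column = np.array(["low", "high", "medium", "low"])
new_column = np.array(["medium", "low"])

encoder = LabelEncoder()

# Comparable to is_fit=True: learn the category-to-integer mapping, then encode.
train_codes = encoder.fit_transform(train_column)

# Comparable to is_fit=False: reuse the mapping learned above.
new_codes = encoder.transform(new_column)

# Classes are stored in sorted order: ["high", "low", "medium"].
mapping = {label: int(code) for code, label in enumerate(encoder.classes_)}
```

Reusing the fitted encoder keeps the integer codes consistent between training and prediction data; fitting a second encoder on new_column alone would assign different codes.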
save(self, filename: str) -> None
Defined in SklearnLikeModel. Serialise the model to disk using joblib.
Parameters
- filename : str
- Destination file path where the model will be written.
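A save and load round trip with joblib can be sketched as follows. The example persists a bare scikit-learn estimator rather than the DashAI wrapper, and the file name is hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.1, size=100)

model = ExtraTreesRegressor(n_estimators=10, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "extra_trees.joblib")  # hypothetical file name
    joblib.dump(model, path)       # serialise the fitted estimator
    restored = joblib.load(path)   # deserialise it again

# The restored estimator should reproduce the original predictions exactly.
same_predictions = bool(np.array_equal(model.predict(X), restored.predict(X)))
```

joblib files are pickle-based, so they should only be loaded from trusted sources and ideally with the same library versions that wrote them.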
train(self, x_train, y_train, x_validation=None, y_validation=None)
Defined in SklearnLikeModel. Train the sklearn model on the provided dataset.
Parameters
- x_train : DashAIDataset
- The input features for training.
- y_train : DashAIDataset
- The target labels for training.
- x_validation : DashAIDataset, optional
- Validation input features (unused in sklearn models). Defaults to None.
- y_validation : DashAIDataset, optional
- Validation target labels (unused in sklearn models). Defaults to None.
Returns
- BaseModel
- The fitted scikit-learn estimator (self).
validate_and_transform(self, raw_data: dict) -> dict
Defined in ConfigObject. Validate the data given by the user to initialize the model and return it with all the objects that the model needs to work.
Parameters
- raw_data : dict
- A dictionary with the data provided by the user to initialize the model.
Returns
- dict
- A validated dictionary with the necessary objects.