DashAI.back.models.BagOfWordsTextClassificationModel
- class BagOfWordsTextClassificationModel(**kwargs)
Text classification meta-model.
The meta-model has two main components:
- Tabular classification model: the underlying model that processes the data and
provides the prediction.
- Vectorizer: a bag-of-words vectorizer that converts the text into a sparse matrix,
the input format the underlying model expects.
The tabular_model and the vectorizer are created in the __init__ method and stored in the model.
To train the tabular_model, the vectorizer is fitted on the training dataset and then used to transform it.
To predict with the tabular_model, the fitted vectorizer is used to transform the input dataset.
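The fit/transform flow described above can be illustrated with a minimal, standard-library-only sketch. Every class name here (TinyBagOfWords, TinyCentroidClassifier, TextClassifierSketch) is hypothetical and stands in for the real components; the nearest-centroid classifier is chosen only to keep the example short, and none of this is DashAI's actual implementation.

```python
from collections import Counter


class TinyBagOfWords:
    """Hypothetical vectorizer: learns a vocabulary, then counts tokens."""

    def fit(self, texts):
        vocab = sorted({tok for text in texts for tok in text.lower().split()})
        self.index = {tok: i for i, tok in enumerate(vocab)}
        return self

    def transform(self, texts):
        rows = []
        for text in texts:
            row = [0] * len(self.index)
            for tok, count in Counter(text.lower().split()).items():
                if tok in self.index:  # tokens unseen during fit are ignored
                    row[self.index[tok]] = count
            rows.append(row)
        return rows


class TinyCentroidClassifier:
    """Hypothetical tabular model: predicts the label of the nearest class centroid."""

    def fit(self, rows, labels):
        sums, counts = {}, Counter(labels)
        for row, label in zip(rows, labels):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        self.centroids = {
            label: [v / counts[label] for v in total] for label, total in sums.items()
        }
        return self

    def predict(self, rows):
        def sq_dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))

        return [
            min(self.centroids, key=lambda label: sq_dist(row, self.centroids[label]))
            for row in rows
        ]


class TextClassifierSketch:
    """Hypothetical meta-model: both components are created in __init__."""

    def __init__(self):
        self.vectorizer = TinyBagOfWords()
        self.tabular_model = TinyCentroidClassifier()

    def fit(self, texts, labels):
        # Training: fit the vectorizer, then transform the training texts.
        self.vectorizer.fit(texts)
        self.tabular_model.fit(self.vectorizer.transform(texts), labels)
        return self

    def predict(self, texts):
        # Prediction: transform only, reusing the vocabulary learned in fit.
        return self.tabular_model.predict(self.vectorizer.transform(texts))


model = TextClassifierSketch().fit(
    ["good great film", "great acting", "bad boring film", "boring plot"],
    ["pos", "pos", "neg", "neg"],
)
predictions = model.predict(["great film", "boring"])
```

The point of the sketch is the asymmetry between the two paths: the vocabulary is learned once during fit, and predict only applies it, so tokens never seen in training contribute nothing to the feature vector.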
- __init__(**kwargs) -> None
Initialize the BagOfWordsTextClassificationModel.
- Parameters:
kwargs (dict) – A dictionary containing the parameters for the model, including:
- tabular_classifier: Configuration for the underlying classifier.
- ngram_min_n: Minimum n-gram value.
- ngram_max_n: Maximum n-gram value.
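As a rough illustration of what the two n-gram parameters control, the sketch below extracts word n-grams of every size between the two bounds. It assumes that ngram_min_n and ngram_max_n are inclusive bounds on the n-gram size and that tokens are lowercased words split on whitespace; the source does not confirm either detail, and word_ngrams is a hypothetical helper, not part of DashAI.

```python
def word_ngrams(text, ngram_min_n=1, ngram_max_n=1):
    """Return all word n-grams with ngram_min_n <= n <= ngram_max_n (assumed inclusive)."""
    tokens = text.lower().split()
    grams = []
    for n in range(ngram_min_n, ngram_max_n + 1):
        # Slide a window of n tokens across the text.
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams


unigrams_and_bigrams = word_ngrams("The movie was great", ngram_min_n=1, ngram_max_n=2)
```

Under these assumptions, raising ngram_max_n enlarges the vocabulary with multi-word features, which captures some word order at the cost of a wider and sparser matrix.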
Methods
- __init__(**kwargs): Initialize the BagOfWordsTextClassificationModel.
- fit(x, y): Fit the estimator.
- get_schema(): Generate the component-related JSON Schema.
- get_vectorizer(input_column[, output_column]): Factory that returns a function to transform a text classification dataset into a tabular classification dataset.
- load(filename): Load the model from the specified path.
- predict(x)
- save(filename): Save the model to the specified path.
- validate_and_transform(raw_data): Take the data given by the user to initialize the model and return it with all the objects that the model needs to work.
Attributes
COMPATIBLE_COMPONENTS
TYPE