DashAI.back.models.BagOfWordsTextClassificationModel

class BagOfWordsTextClassificationModel(**kwargs)

Text classification meta-model.

The meta-model has two main components:

- Tabular classification model: the underlying model that processes the data and
  provides the prediction.
- Vectorizer: a BagOfWords that vectorizes the text into a sparse matrix, so that
  the underlying model receives input in the form it expects.

The tabular_model and vectorizer are created in the __init__ method and stored in the model.

To train the tabular_model, the vectorizer is fitted on the training dataset and then used to transform it.

To predict with the tabular_model, the fitted vectorizer is used to transform the input dataset.
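The fit and predict flow described above can be sketched in plain Python. This is a minimal illustration and not DashAI's implementation: `TinyBagOfWords`, `CentroidClassifier`, and `TextMetaModel` are hypothetical names, and dense count lists stand in for the sparse matrix.

```python
from collections import Counter


class TinyBagOfWords:
    """Hypothetical stand-in for the vectorizer: unigram word counts."""

    def fit(self, texts):
        # The vocabulary is learned from the training texts only.
        words = sorted({w for t in texts for w in t.lower().split()})
        self.vocabulary_ = {w: i for i, w in enumerate(words)}
        return self

    def transform(self, texts):
        rows = []
        for t in texts:
            row = [0] * len(self.vocabulary_)
            for w in t.lower().split():
                if w in self.vocabulary_:  # words unseen during fit are ignored
                    row[self.vocabulary_[w]] += 1
            rows.append(row)
        return rows


class CentroidClassifier:
    """Hypothetical stand-in for the tabular model: best dot product with a class centroid."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        self.centroids_ = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def score(row, centroid):
            return sum(a * b for a, b in zip(row, centroid))

        return [
            max(self.centroids_, key=lambda label: score(row, self.centroids_[label]))
            for row in X
        ]


class TextMetaModel:
    """Both components are created in __init__ and stored on the instance."""

    def __init__(self):
        self.vectorizer = TinyBagOfWords()
        self.tabular_model = CentroidClassifier()

    def fit(self, texts, labels):
        # Training: fit the vectorizer, then transform the training texts.
        X = self.vectorizer.fit(texts).transform(texts)
        self.tabular_model.fit(X, labels)
        return self

    def predict(self, texts):
        # Prediction: only transform, never refit the vectorizer.
        return self.tabular_model.predict(self.vectorizer.transform(texts))


model = TextMetaModel().fit(
    ["great movie", "loved this film", "terrible movie", "hated this film"],
    ["pos", "pos", "neg", "neg"],
)
predictions = model.predict(["great film", "terrible film"])
print(predictions)
```

The point of the sketch is the asymmetry between the two paths: `fit` learns the vocabulary and `predict` reuses it, so training and prediction inputs share the same feature columns.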

__init__(**kwargs) → None

Initialize the BagOfWordsTextClassificationModel.

Parameters:
    kwargs (dict) – A dictionary containing the parameters for the model, including:
    - tabular_classifier: Configuration for the underlying classifier.
    - ngram_min_n: Minimum n-gram value.
    - ngram_max_n: Maximum n-gram value.
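To illustrate what a minimum and maximum n-gram value typically control in a bag-of-words vectorizer, here is a small sketch. It assumes word-level n-grams with inclusive bounds, which is the common convention but is not confirmed by this page; `word_ngrams` is a hypothetical helper, not part of DashAI.

```python
def word_ngrams(text, min_n, max_n):
    """Return every word n-gram of `text` with min_n <= n <= max_n."""
    if not 1 <= min_n <= max_n:
        raise ValueError("expected 1 <= min_n <= max_n")
    tokens = text.lower().split()
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of n tokens across the text.
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams


features = word_ngrams("the cat sat", 1, 2)
print(features)
```

With bounds of 1 and 2, each single word and each pair of adjacent words becomes a candidate feature. Raising the maximum captures longer phrases at the cost of a larger vocabulary.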

Methods

`__init__`(**kwargs)	Initialize the BagOfWordsTextClassificationModel.
`fit`(x, y)	Fit the estimator.
`get_schema`()	Generate the component-related JSON Schema.
`get_vectorizer`(input_column[, output_column])	Factory that returns a function to transform a text classification dataset into a tabular classification dataset.
`load`(filename)	Load the model from the specified path.
`predict`(x)
`save`(filename)	Save the model to the specified path.
`validate_and_transform`(raw_data)	Take the data given by the user to initialize the model and return it with all the objects that the model needs to work.
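The factory pattern behind `get_vectorizer` can be sketched as a closure. This is an assumption-laden illustration, not DashAI's code: `make_vectorizer`, the list-of-dicts dataset, the explicit `vocabulary` argument, and the `f0`, `f1`, ... feature names are all hypothetical.

```python
def make_vectorizer(vocabulary, input_column, output_column=None):
    """Hypothetical factory: returns a function that turns text rows into count rows."""
    index = {word: i for i, word in enumerate(vocabulary)}

    def vectorize(rows):
        tabular_rows = []
        for row in rows:
            counts = [0] * len(index)
            for word in str(row[input_column]).lower().split():
                if word in index:  # words outside the vocabulary are dropped
                    counts[index[word]] += 1
            # One numeric column per vocabulary entry.
            new_row = {f"f{i}": c for i, c in enumerate(counts)}
            if output_column is not None:
                # Carry the label column over unchanged.
                new_row[output_column] = row[output_column]
            tabular_rows.append(new_row)
        return tabular_rows

    return vectorize


to_tabular = make_vectorizer(["good", "bad"], "text", "label")
table = to_tabular([{"text": "good good bad", "label": 1}])
print(table)
```

Returning a function rather than the transformed data lets the same transformation be applied to each split of a dataset.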

Attributes

`COMPATIBLE_COMPONENTS`
`TYPE`