
TextClassificationTask

Task
DashAI.back.tasks.TextClassificationTask

Task for classifying a single text column into discrete categories.

Text classification takes one input column of type Text and maps it to one categorical output column. The task covers any NLP scenario where a raw or pre-processed text sequence must be assigned to one of a fixed set of labels, such as sentiment analysis, spam detection, topic labelling, and intent recognition. Compatible models consume the text directly and output a predicted class label for each sample.

Methods

prepare_for_task(self, dataset: Union['DatasetDict', 'DashAIDataset'], input_columns: List[str], output_columns: List[str]) -> 'DashAIDataset'

Defined on TextClassificationTask

Convert the dataset to a DashAIDataset and check the column types.

Parameters

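dataset : Union[DatasetDict, DashAIDataset]
Dataset to convert.
input_columns : List[str]
Names of the input columns.
output_columns : List[str]
Names of the output columns.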

Returns

DashAIDataset
Dataset with the new types

get_metadata(cls) -> Dict[str, Any]

Defined on BaseTask

Return serialisable metadata for the current task.

Parameters

cls : type
The task class (injected automatically by Python for classmethods).

Returns

Dict[str, Any]
Dictionary with keys "inputs_types", "outputs_types", "inputs_cardinality", and "outputs_cardinality".
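
As a rough illustration of the shape of this return value, here is a minimal sketch using a hypothetical stand-in class. `SketchTextTask`, the string type names, and the attribute layout are assumptions made for the example and are not taken from the DashAI source; only the four dictionary keys and the one-text-input, one-categorical-output description come from this page.

```python
import json
from typing import Any, Dict, List


class SketchTextTask:
    """Hypothetical stand-in for a task class (not the DashAI implementation)."""

    # Assumed values: one Text input column and one categorical output column.
    inputs_types: List[str] = ["Text"]
    outputs_types: List[str] = ["Categorical"]
    inputs_cardinality: int = 1
    outputs_cardinality: int = 1

    @classmethod
    def get_metadata(cls) -> Dict[str, Any]:
        # Only plain lists and ints are returned, so the result is JSON-serialisable.
        return {
            "inputs_types": list(cls.inputs_types),
            "outputs_types": list(cls.outputs_types),
            "inputs_cardinality": cls.inputs_cardinality,
            "outputs_cardinality": cls.outputs_cardinality,
        }


metadata = SketchTextTask.get_metadata()
serialised = json.dumps(metadata)  # succeeds because every value is a JSON type
```

Because it is a classmethod, the metadata can be read without instantiating the task.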

num_labels(self, dataset: 'DashAIDataset', output_column: str) -> int | None

Defined on ClassificationTask

Get the number of unique labels in the output column.

Parameters

dataset : DashAIDataset
Dataset used for training
output_column : str
Output column

Returns

int | None
Number of unique labels or None if not applicable
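
The idea of counting unique labels can be sketched without DashAI at all. In the example below the dataset is represented as a plain dictionary of column lists, and `count_labels` is a hypothetical helper; treating a missing column as the "not applicable" case is an assumption, since this page does not say when None is returned.

```python
from typing import Dict, List, Optional


def count_labels(columns: Dict[str, List[object]], output_column: str) -> Optional[int]:
    """Return the number of distinct values in output_column, or None if it is absent."""
    if output_column not in columns:
        # Assumed meaning of "not applicable" for this sketch.
        return None
    return len(set(columns[output_column]))


toy_columns = {
    "review": ["great film", "dull plot", "loved it", "not for me"],
    "sentiment": ["pos", "neg", "pos", "neg"],
}
n_labels = count_labels(toy_columns, "sentiment")
```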

process_manual_input(self, manual_input: List[dict], dataset_path: str) -> 'DashAIDataset'

Defined on BaseTask

Process manual input data into a DashAIDataset with type validation.

Parameters

manual_input : List[dict]
List of dictionaries representing manual input data.
dataset_path : str
Path to the training dataset (used to get column specs for validation)

Returns

DashAIDataset
Processed DashAIDataset from manual input.
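
To make the validation step more concrete, the sketch below checks a list of dictionaries against a set of expected column types. `validate_rows` and the `column_specs` mapping are hypothetical; how DashAI actually stores and reads column specs from `dataset_path` is not described on this page.

```python
from typing import Dict, List


def validate_rows(rows: List[dict], column_specs: Dict[str, type]) -> List[dict]:
    """Check that every row has each expected column with the expected Python type."""
    for index, row in enumerate(rows):
        missing = set(column_specs) - set(row)
        if missing:
            raise ValueError(f"Row {index} is missing columns: {sorted(missing)}")
        for name, expected_type in column_specs.items():
            if not isinstance(row[name], expected_type):
                raise TypeError(
                    f"Row {index}, column {name!r}: expected {expected_type.__name__}, "
                    f"got {type(row[name]).__name__}"
                )
    return rows


# Assumed spec for a text classification dataset with a single text input.
specs = {"review": str}
validated = validate_rows([{"review": "great film"}, {"review": "dull plot"}], specs)
```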

process_predictions(self, dataset: 'DashAIDataset', predictions: 'ndarray', output_column: str) -> 'ndarray'

Defined on ClassificationTask

Process the predictions to return the class labels.

Parameters

dataset : DashAIDataset
Dataset used for training
predictions : np.ndarray
Predictions from the model (probabilities for each class)
output_column : str
Output column

Returns

np.ndarray
Processed predictions with class labels
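
One common way to turn per-class probabilities into class labels is to take the highest-scoring class for each sample. The sketch below shows that approach with plain Python lists instead of NumPy arrays so that it has no dependencies. `probabilities_to_labels` and the explicit `class_names` argument are assumptions for illustration; the actual method presumably derives the label names from the dataset's output column.

```python
from typing import List, Sequence


def probabilities_to_labels(
    probabilities: Sequence[Sequence[float]], class_names: Sequence[str]
) -> List[str]:
    """Map each row of class probabilities to the name of its most likely class."""
    labels = []
    for row in probabilities:
        if len(row) != len(class_names):
            raise ValueError("Each row needs one probability per class.")
        # Index of the largest probability; ties resolve to the first maximum.
        best_index = max(range(len(row)), key=row.__getitem__)
        labels.append(class_names[best_index])
    return labels


predicted = probabilities_to_labels(
    [[0.1, 0.9], [0.7, 0.3], [0.5, 0.5]],
    ["ham", "spam"],
)
```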

validate_dataset_for_task(self, dataset: 'DashAIDataset', dataset_name: str, input_columns: List[str], output_columns: List[str]) -> None

Defined on BaseTask

Validate a dataset for the current task.

Parameters

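dataset : DashAIDataset
Dataset to be validated.
dataset_name : str
Name of the dataset.
input_columns : List[str]
Names of the input columns.
output_columns : List[str]
Names of the output columns.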
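
A validation of this kind might look roughly like the following sketch, which checks the selected columns against the names available in a dataset. `check_columns` is hypothetical, and both the one-input, one-output rule and the use of ValueError are assumptions based on the task description on this page, not on the DashAI source.

```python
from typing import List


def check_columns(
    available_columns: List[str],
    dataset_name: str,
    input_columns: List[str],
    output_columns: List[str],
) -> None:
    """Raise ValueError if the column selection does not fit a text classification task."""
    # Assumed cardinality: exactly one input column and one output column.
    if len(input_columns) != 1 or len(output_columns) != 1:
        raise ValueError(
            f"Dataset {dataset_name!r}: expected one input and one output column."
        )
    for name in [*input_columns, *output_columns]:
        if name not in available_columns:
            raise ValueError(f"Dataset {dataset_name!r}: column {name!r} not found.")


# Passes silently and returns None when the selection is valid.
result = check_columns(["review", "sentiment"], "reviews", ["review"], ["sentiment"])
```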

Compatible with