
TranslationTask

Task
DashAI.back.tasks.TranslationTask

Task for sequence-to-sequence machine translation between languages.

Translation tasks take a single Text input column (source language) and produce a single Text output column (target language). The compatible metrics are BLEU, which measures n-gram overlap with reference translations, and TER (translation edit rate), which measures how many edits a translation needs to match its reference.
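To make the two metric families concrete, here is a minimal pure-Python sketch of the quantities they are built on. It is not DashAI's implementation: the function names `ngram_precision` and `word_edit_rate` are hypothetical, whitespace tokenisation is an assumption, BLEU's brevity penalty and multi-order averaging are left out, and the edit rate uses plain word-level Levenshtein distance without the block shifts that full TER allows.

```python
from collections import Counter


def ngram_precision(hypothesis: str, reference: str, n: int = 1) -> float:
    """Clipped n-gram precision, the building block of BLEU (simplified)."""
    hyp = hypothesis.split()
    ref = reference.split()
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(hyp_ngrams.values())
    if total == 0:
        return 0.0
    # Credit each hypothesis n-gram at most as often as it occurs in the reference.
    matched = sum(min(count, ref_ngrams[gram]) for gram, count in hyp_ngrams.items())
    return matched / total


def word_edit_rate(hypothesis: str, reference: str) -> float:
    """Word-level edit distance divided by reference length (TER without shifts)."""
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Standard dynamic-programming Levenshtein distance over words.
    prev = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)


hypothesis = "the cat sat on the mat"
reference = "the cat is on the mat"
unigram = ngram_precision(hypothesis, reference, n=1)
bigram = ngram_precision(hypothesis, reference, n=2)
edit_rate = word_edit_rate(hypothesis, reference)
```

Higher n-gram precision indicates a closer match, whereas a lower edit rate is better, so the two scores move in opposite directions as translation quality improves.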

Methods

num_labels(self, dataset: 'DashAIDataset', output_column: str) -> int | None

Defined on TranslationTask

Get the number of unique labels in the output column.

Parameters

dataset : DashAIDataset
Dataset used for training
output_column : str
Output column

Returns

int | None
Number of unique labels or None if not applicable

prepare_for_task(self, dataset: Union[ForwardRef('DatasetDict'), ForwardRef('DashAIDataset')], input_columns: List[str], output_columns: List[str]) -> 'DashAIDataset'

Defined on TranslationTask

Convert the dataset to a DashAIDataset and check the column types.

Parameters

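dataset : Union[DatasetDict, DashAIDataset]
Dataset to convert
input_columns : List[str]
Names of the input columns
output_columns : List[str]
Names of the output columns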

Returns

DashAIDataset
Dataset with the new types

process_predictions(self, dataset: 'DashAIDataset', predictions: 'ndarray', output_column: str)

Defined on TranslationTask

Process the predictions

Parameters

dataset : DashAIDataset
Dataset used for training
predictions : np.ndarray
Predictions from the model
output_column : str
Output column

Returns

Processed predictions

get_metadata(cls) -> Dict[str, Any]

Defined on BaseTask

Return serialisable metadata for the current task.

Parameters

cls : type
The task class (injected automatically by Python for classmethods).

Returns

Dict[str, Any]
Dictionary with keys "inputs_types", "outputs_types", "inputs_cardinality", and "outputs_cardinality".
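As an illustration only, the returned dictionary for this task might look like the sketch below. The four keys come from the description above; the values are assumptions inferred from the statement that the task takes a single Text input column and produces a single Text output column, and the real representation (for example, how type names are encoded) may differ.

```python
import json
from typing import Any, Dict

# Hypothetical metadata for a translation task: keys are documented,
# values are assumed from the "single Text in, single Text out" description.
example_metadata: Dict[str, Any] = {
    "inputs_types": ["Text"],
    "outputs_types": ["Text"],
    "inputs_cardinality": 1,
    "outputs_cardinality": 1,
}

# "Serialisable" here is taken to mean the dictionary survives a JSON round trip.
round_tripped = json.loads(json.dumps(example_metadata))
```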

process_manual_input(self, manual_input: List[dict], dataset_path: str) -> 'DashAIDataset'

Defined on BaseTask

Process manual input data into a DashAIDataset with type validation.

Parameters

manual_input : List[dict]
List of dictionaries representing manual input data.
dataset_path : str
Path to the training dataset (used to get column specs for validation)

Returns

DashAIDataset
Processed DashAIDataset from manual input.

validate_dataset_for_task(self, dataset: 'DashAIDataset', dataset_name: str, input_columns: List[str], output_columns: List[str]) -> None

Defined on BaseTask

Validate a dataset for the current task.

Parameters

dataset : DashAIDataset
Dataset to be validated
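dataset_name : str
Dataset name
input_columns : List[str]
Names of the input columns
output_columns : List[str]
Names of the output columns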
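The kind of check this method implies for a translation task can be sketched as follows. This is a stand-in, not the DashAI code: the helper `validate_translation_columns` is hypothetical, the dataset is modelled as a plain dictionary of column name to list of values rather than a DashAIDataset, and the exception types raised by the real method are not stated in this page.

```python
from typing import Dict, List


def validate_translation_columns(
    columns: Dict[str, list],
    dataset_name: str,
    input_columns: List[str],
    output_columns: List[str],
) -> None:
    """Raise if the selected columns do not fit a one-text-in, one-text-out task."""
    # A translation task expects exactly one source and one target column.
    if len(input_columns) != 1 or len(output_columns) != 1:
        raise ValueError(
            f"{dataset_name}: expected 1 input and 1 output column, "
            f"got {len(input_columns)} and {len(output_columns)}"
        )
    for name in input_columns + output_columns:
        if name not in columns:
            raise ValueError(f"{dataset_name}: column '{name}' not found")
        # Assumed stand-in for the Text type check: every value must be a string.
        if not all(isinstance(value, str) for value in columns[name]):
            raise TypeError(f"{dataset_name}: column '{name}' must contain only text")


demo = {"en": ["Hello", "Thank you"], "es": ["Hola", "Gracias"], "id": [1, 2]}
validate_translation_columns(demo, "demo", ["en"], ["es"])  # passes silently
```

Like the documented method, the sketch returns None on success and signals problems by raising, so callers can wrap it in a try/except block before starting a run.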

Compatible with