TabularClassificationTask
Task for classifying structured tabular data into discrete categories.
Tabular classification predicts categorical labels from structured feature
tables (rows of observations, columns of features). It accepts numeric
(Float, Integer) and categorical (Categorical) inputs, requires
a single categorical output column, and is compatible with all sklearn-based
and DashAI tabular classifier models.
Methods
prepare_for_task(self, dataset: Union['DatasetDict', 'DashAIDataset'], input_columns: List[str], output_columns: List[str]) -> 'DashAIDataset'
[TabularClassificationTask] Convert the dataset to a DashAIDataset and check the column types.
Parameters
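- dataset : Union[DatasetDict, DashAIDataset]
- Dataset to convert
- input_columns : List[str]
- Names of the input columns
- output_columns : List[str]
- Names of the output columns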
Returns
- DashAIDataset
- Dataset with the new types
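The column type check described above can be illustrated with a minimal sketch. It does not use DashAI itself: the function `check_column_types`, the plain dictionary standing in for a dataset schema, and the string type names are all assumptions made for illustration, based only on the accepted types listed in the task description.

```python
from typing import Dict, List

# Hypothetical type names, mirroring the types named in the task description.
ALLOWED_INPUT_TYPES = {"Float", "Integer", "Categorical"}
ALLOWED_OUTPUT_TYPES = {"Categorical"}


def check_column_types(
    column_types: Dict[str, str],
    input_columns: List[str],
    output_columns: List[str],
) -> None:
    """Raise if the selected columns do not fit a tabular classification task.

    `column_types` is a stand-in schema mapping column name to type name;
    it is not a DashAI object.
    """
    if len(output_columns) != 1:
        raise ValueError("Tabular classification requires exactly one output column.")

    for name in input_columns + output_columns:
        if name not in column_types:
            raise KeyError(f"Column {name!r} is not in the dataset.")

    for name in input_columns:
        if column_types[name] not in ALLOWED_INPUT_TYPES:
            raise TypeError(f"Input column {name!r} has type {column_types[name]!r}.")

    for name in output_columns:
        if column_types[name] not in ALLOWED_OUTPUT_TYPES:
            raise TypeError(f"Output column {name!r} has type {column_types[name]!r}.")


schema = {"age": "Integer", "income": "Float", "label": "Categorical", "note": "Text"}
check_column_types(schema, ["age", "income"], ["label"])  # passes silently
```

Raising on the first offending column keeps the sketch short; a real implementation might instead collect every problem before reporting.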
get_metadata(cls) -> Dict[str, Any]
[BaseTask] Return serialisable metadata for the current task.
Parameters
- cls : type
- The task class (injected automatically by Python for classmethods).
Returns
- Dict[str, Any]
- Dictionary with keys "inputs_types", "outputs_types", "inputs_cardinality", and "outputs_cardinality".
num_labels(self, dataset: 'DashAIDataset', output_column: str) -> int | None
[ClassificationTask] Get the number of unique labels in the output column.
Parameters
- dataset : DashAIDataset
- Dataset used for training
- output_column : str
- Output column
Returns
- int | None
- Number of unique labels or None if not applicable
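As a rough illustration of what counting unique labels involves, the sketch below works on a plain dictionary of columns rather than a DashAIDataset. The helper name `count_labels` is hypothetical, and returning None for a missing column is an assumption: the reference does not say when the real method treats the count as "not applicable".

```python
from typing import Any, Dict, List, Optional


def count_labels(table: Dict[str, List[Any]], output_column: str) -> Optional[int]:
    """Count distinct values in `output_column` of a column-oriented table.

    Assumption: a missing column is treated as "not applicable" and yields None.
    """
    values = table.get(output_column)
    if values is None:
        return None
    return len(set(values))


table = {"species": ["setosa", "virginica", "setosa", "versicolor"]}
n_classes = count_labels(table, "species")
```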
process_manual_input(self, manual_input: List[dict], dataset_path: str) -> 'DashAIDataset'
[BaseTask] Process manual input data into a DashAIDataset with type validation.
Parameters
- manual_input : List[dict]
- List of dictionaries representing manual input data.
- dataset_path : str
- Path to the training dataset (used to get column specs for validation)
Returns
- DashAIDataset
- Processed DashAIDataset from manual input.
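The idea of validating manually entered rows against column specifications can be sketched without DashAI. In the example below, `validate_rows` and the `column_specs` mapping (column name to a Python type) are hypothetical stand-ins; the real method reads its column specs from the training dataset at `dataset_path`, in a format this reference does not describe.

```python
from typing import Any, Dict, List


def validate_rows(
    rows: List[Dict[str, Any]],
    column_specs: Dict[str, type],
) -> List[Dict[str, Any]]:
    """Coerce each row to the expected column types, raising on bad input.

    Keys that are not listed in `column_specs` are dropped.
    """
    validated = []
    for index, row in enumerate(rows):
        missing = set(column_specs) - set(row)
        if missing:
            raise ValueError(f"Row {index} is missing columns: {sorted(missing)}")

        clean = {}
        for name, expected_type in column_specs.items():
            try:
                clean[name] = expected_type(row[name])
            except (TypeError, ValueError) as exc:
                raise ValueError(
                    f"Row {index}, column {name!r}: cannot convert {row[name]!r}"
                ) from exc
        validated.append(clean)
    return validated


specs = {"age": int, "income": float, "city": str}
rows = validate_rows([{"age": "31", "income": 1200, "city": "Lima"}], specs)
```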
process_predictions(self, dataset: 'DashAIDataset', predictions: 'ndarray', output_column: str) -> 'ndarray'
[ClassificationTask] Process the predictions to return the class labels.
Parameters
- dataset : DashAIDataset
- Dataset used for training
- predictions : np.ndarray
- Predictions from the model (probabilities for each class)
- output_column : str
- Output column
Returns
- np.ndarray
- Processed predictions with class labels
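One common way to turn per-class probabilities into class labels is to take the highest-scoring class in each row. The sketch below shows that approach with NumPy. It is an assumption that the method works this way, and both `probabilities_to_labels` and the explicit `class_names` list are hypothetical; the real method presumably derives the label names from the output column of the dataset.

```python
import numpy as np


def probabilities_to_labels(probabilities: np.ndarray, class_names: list) -> np.ndarray:
    """Map each row of class probabilities to the name of its most likely class.

    Assumption: column i of `probabilities` corresponds to `class_names[i]`.
    """
    class_indices = np.argmax(probabilities, axis=1)
    return np.asarray(class_names)[class_indices]


probabilities = np.array([[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]])
labels = probabilities_to_labels(probabilities, ["cat", "dog", "bird"])
```

When two classes tie, `np.argmax` returns the first of them, so the ordering of `class_names` decides the result.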
validate_dataset_for_task(self, dataset: 'DashAIDataset', dataset_name: str, input_columns: List[str], output_columns: List[str]) -> None
[BaseTask] Validate a dataset for the current task.
Parameters
- dataset : DashAIDataset
- Dataset to be validated
- dataset_name : str
- Dataset name