CohenKappa
Agreement between classifier predictions and true labels beyond chance.
Cohen's Kappa (κ) measures the extent to which two raters (here, the model and ground truth) agree on class assignments, correcting for the probability of agreement occurring by chance. Unlike raw accuracy, Kappa accounts for class imbalance and is particularly useful when evaluating classifiers on unbalanced datasets.
::
κ = (p_o - p_e) / (1 - p_e)
where p_o is the observed agreement (accuracy) and p_e is the expected agreement by chance.
Range: [-1, 1]. Interpretation: < 0 worse than chance; 0.00-0.20 slight; 0.21-0.40 fair; 0.41-0.60 moderate; 0.61-0.80 substantial; 0.81-1.0 almost perfect.
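The formula can be illustrated with a minimal, standard-library-only sketch. The helper name `cohen_kappa` and the sample labels are made up for this illustration and are not part of the DashAI API; p_e is computed in the usual way, as the sum over classes of the product of the two marginal label frequencies.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length label sequences (illustrative helper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)

    # p_o: fraction of examples on which both label sequences agree (accuracy).
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # p_e: agreement expected by chance, from the marginal class frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n**2

    if p_e == 1.0:
        # Both sequences contain one and the same class: kappa is 0/0.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Imbalanced toy data: a model that always predicts the majority class.
y_true = ["a"] * 90 + ["b"] * 10
y_pred = ["a"] * 100
kappa = cohen_kappa(y_true, y_pred)
print(kappa)  # accuracy is 0.9, yet kappa is 0.0
```

In this toy case the observed agreement equals the chance agreement (both 0.9), so Kappa reports no agreement beyond chance even though accuracy looks high.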
References
- [1] Cohen, J. (1960). "A coefficient of agreement for nominal scales." Educational and Psychological Measurement, 20(1), 37-46.
- [2] https://scikit-learn.org/stable/modules/generated/sklearn.metrics.cohen_kappa_score.html
Methods
score(true_labels: 'DashAIDataset', probs_pred_labels: 'np.ndarray', multiclass: Optional[bool] = None) -> float
CohenKappa: Calculate the Cohen's Kappa score between true labels and predicted labels.
Parameters
- true_labels : DashAIDataset
- A DashAI dataset with labels.
- probs_pred_labels : np.ndarray
- A two-dimensional matrix in which each column represents a class and the row values represent the probability that an example belongs to the class associated with the column.
- multiclass : bool, optional
- Whether the task is a multiclass classification task. If None, this is determined automatically from the number of unique labels.
Returns
- float
- Cohen Kappa score between true labels and predicted labels
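As a rough sketch of how a probability matrix of this shape can be turned into a Kappa score, the example below assumes the predicted label of each row is the column with the highest probability (argmax). That conversion step, the helper `kappa_from_labels`, and the sample arrays are assumptions made for illustration; they are not taken from the DashAI source, and plain integer arrays stand in for a DashAIDataset.

```python
import numpy as np

# Hypothetical probability matrix: 4 examples (rows) x 3 classes (columns).
probs_pred_labels = np.array(
    [
        [0.7, 0.2, 0.1],
        [0.1, 0.8, 0.1],
        [0.2, 0.3, 0.5],
        [0.6, 0.3, 0.1],
    ]
)
true_labels = np.array([0, 1, 2, 1])  # stand-in for the dataset's labels

# Assumed conversion: pick the most probable class for each example.
pred_labels = np.argmax(probs_pred_labels, axis=1)


def kappa_from_labels(y_true, y_pred, n_classes):
    """Cohen's kappa computed from a confusion matrix (illustrative helper)."""
    cm = np.zeros((n_classes, n_classes))
    np.add.at(cm, (y_true, y_pred), 1)  # count (true, predicted) pairs
    n = cm.sum()
    p_o = np.trace(cm) / n  # observed agreement: diagonal mass
    p_e = (cm.sum(axis=1) @ cm.sum(axis=0)) / n**2  # chance agreement
    return (p_o - p_e) / (1 - p_e)


kappa = kappa_from_labels(true_labels, pred_labels, n_classes=3)
print(pred_labels, kappa)
```

Here three of the four predictions match, giving p_o = 0.75 and p_e = 5/16, so Kappa is 7/11 (about 0.64).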
get_metadata(cls: 'BaseMetric') -> Dict[str, Any]
BaseMetric: Get metadata values for the current metric.
Returns
- Dict[str, Any]
- Dictionary with the metadata
is_multiclass(true_labels: 'np.ndarray') -> bool
ClassificationMetric: Determine whether the classification problem is multiclass (more than 2 classes).
Parameters
- true_labels : np.ndarray
- Array of true labels.
Returns
- bool
- True if the problem has more than 2 unique classes, False otherwise.
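The documented behavior (more than 2 unique classes means multiclass) can be sketched in a few lines. The function name `is_multiclass_sketch` is hypothetical and the body is only one plausible way to implement the check, not the library's actual code.

```python
import numpy as np


def is_multiclass_sketch(true_labels: np.ndarray) -> bool:
    """Return True when the labels contain more than two distinct classes."""
    # np.unique returns the sorted distinct values; only their count matters here.
    return np.unique(true_labels).size > 2


binary = is_multiclass_sketch(np.array([0, 1, 1, 0]))
multi = is_multiclass_sketch(np.array(["cat", "dog", "bird", "cat"]))
print(binary, multi)
```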
Compatible with
- TabularClassificationTask
- ImageClassificationTask
- TextClassificationTask