KernelShap
Model-agnostic local explainer that estimates SHAP values via a weighted linear model.
Kernel SHAP (SHapley Additive exPlanations) unifies LIME and classic Shapley values from cooperative game theory. For each instance to explain, it samples coalitions (subsets of features) and fits a weighted linear model to the model's outputs on them, with sample weights derived from the Shapley kernel. The resulting coefficients are the SHAP values: each one represents the contribution of a feature to the model's prediction relative to a background (reference) distribution.
Because it treats the model as a black box (querying only predict_proba),
Kernel SHAP works with any classifier. The trade-off is higher computational
cost compared to model-specific SHAP implementations (Tree SHAP, Deep SHAP).
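To illustrate the weighting described above, the Shapley kernel from [1] gives each coalition a weight that depends only on how many of the M features it contains. The sketch below is standalone Python, not this component's implementation, and the function name `shapley_kernel_weight` is hypothetical.

```python
import math


def shapley_kernel_weight(num_features: int, coalition_size: int) -> float:
    """Shapley kernel weight for a coalition with `coalition_size` features present."""
    m, s = num_features, coalition_size
    if s == 0 or s == m:
        # The empty and full coalitions have infinite weight; implementations
        # typically enforce them as constraints instead of sampling them.
        return math.inf
    return (m - 1) / (math.comb(m, s) * s * (m - s))


# Weights for every coalition size when there are 4 features.
weights = {s: shapley_kernel_weight(4, s) for s in range(5)}
print(weights)
```

Very small and very large coalitions receive the most weight, because they isolate the effect of individual features most clearly.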
References
- [1] Lundberg, S.M. & Lee, S.I. (2017). "A Unified Approach to Interpreting Model Predictions." NeurIPS 30. https://arxiv.org/abs/1705.07874
- [2] SHAP documentation, shap.KernelExplainer. https://shap.readthedocs.io/en/latest/generated/shap.KernelExplainer.html
Parameters
- link : string, default='identity' - Link function that connects the feature importance values to the model's outputs. Options are 'identity' (identity function) or 'logit' (log-odds).
- fit_parameter_sample_background_data : boolean, default=True - Parameter used when fitting the explainer. If true, the background data is sampled from the training set; otherwise the entire training set is used. A smaller background set shortens the run time.
- fit_parameter_background_fraction : number, default=0.2 - Parameter used when fitting the explainer. If the background data is sampled, this is the fraction of the training set to draw as background samples.
- fit_parameter_sampling_method : string, default='shuffle' - Parameter used when fitting the explainer. If the background data is sampled, this selects the sampling method: 'shuffle' draws random instances and 'kmeans' summarizes the dataset. If there are categorical features, 'shuffle' is used by default.
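The two link options listed above correspond to the following transformations of a predicted probability. This is a minimal standalone sketch, not this component's code; the function names `identity_link` and `logit_link` are hypothetical.

```python
import math


def identity_link(p: float) -> float:
    """'identity': attributions are expressed in probability units."""
    return p


def logit_link(p: float) -> float:
    """'logit': attributions are expressed in log-odds units (requires 0 < p < 1)."""
    return math.log(p / (1.0 - p))


for p in (0.5, 0.9):
    print(p, identity_link(p), logit_link(p))
```

With 'logit', the SHAP values add up in log-odds space rather than in probability space, which is often a better fit for a linear approximation of classifier outputs close to 0 or 1.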
Methods
explain_instance(self, instances)
KernelShap: Method for explaining the model prediction of an instance using the Kernel SHAP method.
Parameters
- instances: DatasetDict
- Instances to be explained.
Returns
- dict
- Dictionary with the SHAP values for each instance.
fit(self, background_dataset, sample_background_data='false', background_fraction=None, sampling_method=None)
KernelShap: Method to train the KernelShap explainer.
Parameters
- background_data: Tuple[DatasetDict, DatasetDict]
- Tuple with (input_samples, targets). Input samples are used to estimate feature attributions and establish a baseline for the calculation of SHAP values.
- sample_background_data: bool
- True if the background data must be sampled. Smaller data sets speed up the algorithm run time. False by default.
- background_fraction: float
- Proportion of the training samples used as background data to estimate SHAP values if sample_background_data=True.
- sampling_method: str
- Sampling method used to select the background samples if sample_background_data=True. Options are 'shuffle' to select random samples or 'kmeans' to summarise the data set. The 'kmeans' option can only be used if there are no categorical features.
Returns
- KernelShap object
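The effect of background_fraction with the 'shuffle' option can be pictured as drawing a random subset of the training rows. The following is a minimal sketch, assuming plain Python lists in place of a DatasetDict; `sample_background` is a hypothetical helper and not part of this class.

```python
import random


def sample_background(rows, background_fraction=0.2, seed=0):
    """Return a random subset of `rows`, drawn without replacement."""
    if not 0.0 < background_fraction <= 1.0:
        raise ValueError("background_fraction must be in (0, 1]")
    # Keep at least one row so the explainer always has a baseline.
    n_keep = max(1, round(len(rows) * background_fraction))
    rng = random.Random(seed)
    return rng.sample(rows, n_keep)


# Toy training set: 50 rows with two numeric features each.
training_rows = [[float(i), float(i % 3)] for i in range(50)]
background = sample_background(training_rows, background_fraction=0.2)
print(len(background))
```

Because Kernel SHAP evaluates the model on every sampled coalition combined with the background rows, a smaller background set reduces the number of model calls roughly in proportion.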
plot(self, explanation: list[dict])
KernelShap: Method to create the explanation plot using Plotly.
Parameters
- explanation: dict
- dictionary with the explanation generated by the explainer.
Returns
- List[dict]
- list of JSONs containing the information of the explanation plot to be rendered.
get_schema(cls) -> dict
ConfigObject: Generates the JSON Schema of the component.
Returns
- dict
- Dictionary representing the JSON Schema of the component.
validate_and_transform(self, raw_data: dict) -> dict
ConfigObject: Takes the data given by the user to initialize the model and returns it with all the objects that the model needs to work.
Parameters
- raw_data : dict
- A dictionary with the data provided by the user to initialize the model.
Returns
- dict
- A validated dictionary with the necessary objects.