Consistent Individualized Feature Attribution for Tree Ensembles
Authors: Scott M. Lundberg, Gabriel G. Erion, Su-In Lee
Published: 2018 (Preprint)
Source: arXiv
Algorithm: TreeSHAP
arXiv: 1802.03888
Summary
Derives a polynomial-time exact algorithm (TreeSHAP) for computing SHAP values, which are Shapley-value-based feature attributions, on tree ensembles. The algorithm exploits tree structure to avoid enumerating the exponentially many feature subsets. Demonstrates that common tree feature importance heuristics are inconsistent, whereas SHAP values are the unique consistent and locally accurate attributions; TreeSHAP computes them exactly for tree models such as XGBoost and random forests.
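To make the exponential baseline concrete, here is a minimal sketch of exact Shapley values computed by brute-force enumeration of feature subsets on a toy tree. It is not the paper's TreeSHAP algorithm, which reaches the same values without the loop over subsets. The tree, its cover counts, and the names `TREE`, `expected_value`, and `shapley_values` are assumptions made up for illustration. When a split feature is outside the subset, the sketch averages the two children in proportion to their training cover.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy tree (illustrative only).
# Internal node: (feature, threshold, left, right, cover); leaf: (value, cover).
# "cover" is the number of training samples that reached the node.
TREE = (0, 0.5,
        (0.0, 50),                                  # x0 <= 0.5
        (1, 0.5, (10.0, 25), (20.0, 25), 50),       # x0 > 0.5, then split on x1
        100)


def expected_value(node, x, subset):
    """Estimate E[f(x) | x_S] for the feature subset S on a single tree."""
    if len(node) == 2:                 # leaf: return its value
        return node[0]
    feature, threshold, left, right, _ = node
    if feature in subset:              # feature known: follow the path x takes
        child = left if x[feature] <= threshold else right
        return expected_value(child, x, subset)
    # Feature unknown: average both children, weighted by training cover.
    lc, rc = left[-1], right[-1]
    return (lc * expected_value(left, x, subset)
            + rc * expected_value(right, x, subset)) / (lc + rc)


def shapley_values(tree, x):
    """Brute-force Shapley values; cost grows exponentially in len(x)."""
    m = len(x)
    phi = [0.0] * m
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (expected_value(tree, x, s | {i})
                                    - expected_value(tree, x, s))
    return phi


x = [1.0, 1.0]
phi = shapley_values(TREE, x)
print(phi)  # → [8.75, 3.75]
```

In this toy case the attributions sum to 12.5, which equals the prediction for `x` (20.0) minus the expected output with no features known (7.5). That is the local accuracy property the summary refers to.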
Abstract
Interpreting predictions from tree ensemble methods such as gradient boosting machines and random forests is important, yet feature attribution for trees is often heuristic and not individualized for each prediction. Here we show that popular feature attribution methods are inconsistent, meaning they can lower a feature's assigned importance when the true impact of that feature actually increases. This is a fundamental problem that casts doubt on any comparison between features. To address it we turn to recent applications of game theory and develop fast exact tree solutions for SHAP (SHapley Additive exPlanation) values, which are the unique consistent and locally accurate attribution values. We then extend SHAP values to interaction effects and define SHAP interaction values.
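For reference, the quantities named in the abstract can be written out as below. This is a sketch in assumed notation, which should be checked against the paper before use: $N$ is the set of all $M$ input features, $f_x(S)$ is the expected model output when only the features in $S$ are known for input $x$, $\phi_i$ is the SHAP value of feature $i$, and $\Phi_{i,j}$ is the SHAP interaction value of features $i$ and $j$.

```latex
\begin{align*}
% SHAP value of feature i: weighted average marginal contribution over subsets
\phi_i &= \sum_{S \subseteq N \setminus \{i\}}
          \frac{|S|!\,(M - |S| - 1)!}{M!}
          \bigl[ f_x(S \cup \{i\}) - f_x(S) \bigr] \\[4pt]
% Local accuracy: attributions sum to the prediction
f(x) &= f_x(\emptyset) + \sum_{i=1}^{M} \phi_i \\[4pt]
% SHAP interaction value for i \neq j
\Phi_{i,j} &= \sum_{S \subseteq N \setminus \{i,j\}}
          \frac{|S|!\,(M - |S| - 2)!}{2\,(M - 1)!}\,
          \nabla_{ij}(S), \qquad i \neq j \\
\nabla_{ij}(S) &= f_x(S \cup \{i,j\}) - f_x(S \cup \{i\})
                - f_x(S \cup \{j\}) + f_x(S) \\[4pt]
% Main effect: what remains of \phi_i after removing interactions
\Phi_{i,i} &= \phi_i - \sum_{j \neq i} \Phi_{i,j}
\end{align*}
```

Consistency, in these terms, means that if a model changes so that $f_x(S \cup \{i\}) - f_x(S)$ does not decrease for any subset $S$, then $\phi_i$ does not decrease either.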
Tags
- Explainability
- Feature attribution
- SHAP
- Tree ensembles
- TreeSHAP
- Gradient boosting
- Random forests
- Shapley values