Comments (6)
Unfortunately, SHAP interactions are not available via TreeSHAP in LightGBM. But you could compute them with the {treeshap} package.
from shapviz.
I assume it involves hacking C++ code, which I can't help with :/
microsoft/LightGBM#6058 @hbaniecki :-)
Oh, hmm...
I tried to use treeshap but got an error:
#> Error in S_inter[, v, color_var]: subscript out of bounds
Here's the code:
library(tidymodels)
library(shapviz)
library(treeshap)
library(lightgbm)
#> Loading required package: R6
#>
#> Attaching package: 'lightgbm'
#> The following object is masked from 'package:dplyr':
#>
#> slice
library(datasets)
library(bonsai)
# Use the fifa20 dataset (shipped with {treeshap})
fifa20 <- fifa20$data %>%
  select(-work_rate) %>%
  bind_cols(data.frame(target = fifa20$target))
# Split the data
set.seed(123)
split <- initial_split(fifa20)
train <- training(split)
test <- testing(split)
# Recipe
rec <- recipe(target ~ ., data = train)
# Model specification
boost_spec <- boost_tree(
  mode = "regression",
  trees = 200,
  tree_depth = 6
) %>%
  set_engine("lightgbm") %>%
  set_mode("regression")
# Workflow
workflow <- workflow() %>%
  add_recipe(rec) %>%
  add_model(boost_spec)
# Fit the model
boost_model <- workflow %>% fit(data = train)
# Create shap object with shapviz
shap_lgbm <- shapviz(
  extract_fit_engine(boost_model),
  as.matrix(test %>% select(-target)),
  test %>% select(-target)
)
# Create unified model representation
unified_lgbm <- treeshap::lightgbm.unify(
  extract_fit_engine(boost_model),
  train %>% select(-target)
)
# Derive interactions
interactions_lgbm <- treeshap::treeshap(
  unified_lgbm,
  test %>% select(-target),
  interactions = TRUE,
  verbose = 0
)
# Plot dependences
shap_lgbm$S_inter <- interactions_lgbm$interactions
sv_dependence(shap_lgbm, v = "overall", interactions = TRUE, color_var = "height_cm")
#> Error in S_inter[, v, color_var]: subscript out of bounds
dim(shap_lgbm$S_inter)
#> [1] 54 54 4570
Interaction values cannot be assigned to an existing shapviz object like this, so the following line is wrong:
shap_lgbm$S_inter <- interactions_lgbm$interactions
This works, but I would decompose fewer rows and divide the response by 1e6 (or so):
shap_lgbm <- shapviz(interactions_lgbm)
top4 <- names(head(sv_importance(shap_lgbm, kind = "no"), 4))
sv_interaction(shap_lgbm[1:1000, top4])
sv_dependence(shap_lgbm, v = "overall", color_var = top4, interactions = TRUE)
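As for the "subscript out of bounds" error: the reported dim() of 54 54 4570 looks like features x features x rows, while the failing index S_inter[, v, color_var] puts the feature names in the second and third positions. A toy sketch in base R of that mismatch, using a made-up array and made-up values (an illustration only, not the shapviz or treeshap internals):

```r
# Toy stand-in for an interaction array laid out as features x features x rows.
# Feature names and values are made up for illustration.
feats <- c("overall", "height_cm", "age")
inter <- array(
  seq_len(3 * 3 * 5),
  dim = c(3, 3, 5),
  dimnames = list(feats, feats, NULL)
)

# Indexing features in dimensions 2 and 3 fails, because dimension 3
# holds the rows and has no names: "subscript out of bounds"
res <- try(inter[, "overall", "height_cm"], silent = TRUE)
failed <- inherits(res, "try-error")

# Moving the row dimension to the front gives rows x features x features
inter_rows_first <- aperm(inter, c(3, 1, 2))
dim(inter_rows_first)
#> [1] 5 3 3

# Now the same index returns one value per row
one_pair <- inter_rows_first[, "overall", "height_cm"]
length(one_pair)
#> [1] 5
```

Calling shapviz() on the treeshap result, as above, avoids having to rearrange the array by hand.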