Hi Neeraj,
Do you have the full stack trace? Where exactly does it throw the error? From the error itself it seems that somewhere in the code where I expect a type that resolves to a bool, I get a dataframe instead.
I didn't make any recent releases, so I would guess it's probably an incompatibility with your environment. Did you update any packages in the last two days? What OS are you using? What kind of model are you using, and which version? Which version of shap?
But happy that overall you find it useful! Let's get this error fixed! :)
from explainerdashboard.
Hi Oege,
Wow, thank you for replying so fast, I was not expecting it.
Attached is the error I get when I run the titanic example.
I have run the individual components of the explainer in a Jupyter notebook and they work fine; the error is only thrown when I call ExplainerDashboard.
Below is the error when I run explainer on my dataset. It is a binary classification model and I am using lightgbm for that purpose.
the explainer object has no decision_trees property. so setting decision_trees=False...:
ValueError Traceback (most recent call last)
in
----> 1 ExplainerDashboard(explainer, mode='inline').run(8052)
~\Anaconda3\lib\site-packages\explainerdashboard\dashboards.py in __init__(self, explainer, tabs, title, hide_header, header_hide_title, header_hide_selector, block_selector_callbacks, pos_label, fluid, mode, width, height, external_stylesheets, server, url_base_pathname, importances, model_summary, contributions, shap_dependence, shap_interaction, decision_trees, **kwargs)
364 block_selector_callbacks=block_selector_callbacks,
365 pos_label=pos_label,
--> 366 fluid=fluid)
367 else:
368 tabs = self._convert_str_tabs(tabs)
~\Anaconda3\lib\site-packages\explainerdashboard\dashboards.py in __init__(self, explainer, tabs, title, hide_title, hide_selector, block_selector_callbacks, pos_label, fluid, **kwargs)
104
105 self.selector = PosLabelSelector(explainer, pos_label=pos_label)
--> 106 self.tabs = [instantiate_component(tab, explainer, **kwargs) for tab in tabs]
107 assert len(self.tabs) > 0, 'When passing a list to tabs, need to pass at least one valid tab!'
108
~\Anaconda3\lib\site-packages\explainerdashboard\dashboards.py in (.0)
104
105 self.selector = PosLabelSelector(explainer, pos_label=pos_label)
--> 106 self.tabs = [instantiate_component(tab, explainer, **kwargs) for tab in tabs]
107 assert len(self.tabs) > 0, 'When passing a list to tabs, need to pass at least one valid tab!'
108
~\Anaconda3\lib\site-packages\explainerdashboard\dashboards.py in instantiate_component(component, explainer, **kwargs)
48
49 if inspect.isclass(component) and issubclass(component, ExplainerComponent):
---> 50 return component(explainer, **kwargs)
51 elif isinstance(component, ExplainerComponent):
52 return component
~\Anaconda3\lib\site-packages\explainerdashboard\dashboard_tabs.py in __init__(self, explainer, title, name, hide_selector, importance_type, depth, cats)
38
39 self.importances = ImportancesComponent(explainer, hide_selector=hide_selector,
---> 40 importance_type=importance_type, depth=depth, cats=cats)
41
42 self.register_components(self.importances)
~\Anaconda3\lib\site-packages\explainerdashboard\dashboard_components\overview_components.py in __init__(self, explainer, title, name, hide_type, hide_depth, hide_cats, hide_title, hide_selector, pos_label, importance_type, depth, cats)
140 self.hide_title = hide_title
141 self.hide_selector = hide_selector
--> 142 if self.explainer.cats is None or not self.explainer.cats:
143 self.hide_cats = True
144
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in __nonzero__(self)
2148 def __nonzero__(self):
2149 raise ValueError(
-> 2150 f"The truth value of a {type(self).__name__} is ambiguous. "
2151 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
2152 )
ValueError: The truth value of a Index is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I am working in a Jupyter notebook with Python 3.7 on a Windows machine. I installed the package on Monday of last week, ran a sample test, and it worked fine. Yesterday it stopped working, so I uninstalled and reinstalled it, but that did not fix it.
from explainerdashboard.
Hi Neeraj,
The first error could be related to running an old version of dash (the line `self.server.register_blueprint(` is line 402 in the current dash version: https://github.com/plotly/dash/blob/dev/dash/dash.py, but seems to be line 154 in your version).
When the `server` parameter is kept at its default (`server=True`), `dash.Dash()` should instantiate a flask app in line 279: `self.server = flask.Flask(name) if server else None`, in which case self.server is no longer a bool.
So I'm not sure what's going on in your case, but I guess it's an old version. Could you run `pip install -U dash` and see if it helps?
How did you construct the explainer for the second error? explainer.cats should be a list of strings that you passed to the constructor. But from the error it seems that in your example explainer.cats is either a pd.Series or a pd.DataFrame?
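For anyone hitting the same message, here is a minimal sketch (assuming only pandas; the column names are made up for illustration) of why a pandas object in `cats` trips a truthiness check like the `self.explainer.cats is None or not self.explainer.cats` line in the traceback, while a plain list of strings does not:

```python
import pandas as pd

# Hypothetical categorical column names, for illustration only.
cats_as_series = pd.Series(["Sex", "Deck", "Embarked"])
cats_as_list = cats_as_series.tolist()


def is_missing(cats):
    """Mimics the check from the traceback: `cats is None or not cats`."""
    return cats is None or not cats


# A plain list has an unambiguous truth value.
list_result = is_missing(cats_as_list)

# A pandas Series (likewise an Index or DataFrame) refuses to act as a bool.
try:
    is_missing(cats_as_series)
    series_error = None
except ValueError as exc:
    series_error = str(exc)

print(list_result)
print(series_error)
```

Converting with `.tolist()` (or `list(...)`) before passing the names to the constructor avoids the ambiguity.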
from explainerdashboard.
Hi Oege,
Thanks again!
Indeed, installing dash resolved the issue for the first example. And you were right about the second one: I had passed a pd.Series and did not check the exact format that was needed. Passing a list of strings resolved it. Thank you so very much!
I have one more question, though. The LIME explainer needs the data in a certain format: for example, when the categories have been label-encoded, it needs a dictionary of the label encodings so that it can determine that "sex" is a category and that, say, 1 means Male.
Do we need something like that here as well? Do the train or test sets need to be in np.array format, or does it not matter?
In the titanic example, the categories seem to be one-hot encoded. Do we need to do that in all cases?
PS: do you also know how to share the dashboard quickly? Is it supposed to be deployed on heroku or something?
Thank you for responding so fast!
Best,
Neeraj
from explainerdashboard.
Yeah, the `cats` parameter assumes that you have already one-hot encoded your variables with underscores (`varname_category`), e.g. `sex_male`, `sex_female`, etc., and then autodetects the categories.
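As a rough sketch of that naming convention (assuming pandas; the toy data is made up), `pd.get_dummies` produces `<column>_<value>` names by default, which matches the `varname_category` pattern described above:

```python
import pandas as pd

# Toy data; column names and values are made up for illustration.
df = pd.DataFrame({
    "sex": ["male", "female", "female"],
    "age": [22, 38, 26],
})

# get_dummies names each new column "<column>_<value>" by default
# (prefix_sep="_"), and leaves the other columns untouched.
encoded = pd.get_dummies(df, columns=["sex"])
encoded_columns = list(encoded.columns)

print(encoded_columns)
```

As I read the comment above, the base name (here `sex`) is what would then go into `cats`; this is a sketch of the column naming only, not of the explainer's internals.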
In order to share the dashboard you need to deploy it somewhere. You should talk to IT within your organization to see if they have a server available to host it. The deployment section of the docs gives some info on how to do it. Otherwise, see the dash deployment documentation.
You'd probably also want to think about adding authentication of some kind (I will probably add this to the package in the near future as well): https://dash.plotly.com/authentication
from explainerdashboard.
I have a simple dataset and an SVR. However, `explainer = RegressionExplainer(model, X_test, pd.Series(y_test))` yields
ValueError Traceback (most recent call last)
<ipython-input-14-d5cec15fc8f7> in <module>
----> 1 explainer = RegressionExplainer(model, X_test, pd.Series(y_test))
c:\users\...\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
237 name = ibase.maybe_extract_name(name, data, type(self))
238
--> 239 if is_empty_data(data) and dtype is None:
240 # gh-17261
241 warnings.warn(
c:\users\...\pandas\core\construction.py in is_empty_data(data)
626 is_none = data is None
627 is_list_like_without_dtype = is_list_like(data) and not hasattr(data, "dtype")
--> 628 is_simple_empty = is_list_like_without_dtype and not data
629 return is_none or is_simple_empty
630
c:\users\...\pandas\core\generic.py in __nonzero__(self)
1438 def __nonzero__(self):
1439 raise ValueError(
-> 1440 f"The truth value of a {type(self).__name__} is ambiguous. "
1441 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
1442 )
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What does it mean? What could be the problem?
from explainerdashboard.
Do you have a deeper stack trace? It's not clear to me where this error actually originates... does it happen when you just wrap `y_test` in a `pd.Series`: `pd.Series(y_test)`?
from explainerdashboard.
That's everything. Without the wrap (X and y both come from pd.read_csv()) it is:
ValueError Traceback (most recent call last)
<ipython-input-4-29b3018c6d0d> in <module>
1 # Generate explainer object
----> 2 explainer = RegressionExplainer(model, X_test, y_test, cats=['Kurs', 'FTF', 'Wochentag'])
c:\users\...\explainerdashboard\explainers.py in __init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, idxs, index_name, target, descriptions, n_jobs, permutation_cv, na_fill, precision, units)
2451 shap, X_background, model_output,
2452 cats, idxs, index_name, target, descriptions,
-> 2453 n_jobs, permutation_cv, na_fill, precision)
2454
2455 self._params_dict = {**self._params_dict, **dict(units=units)}
c:\users\...\explainerdashboard\explainers.py in __init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, idxs, index_name, target, descriptions, n_jobs, permutation_cv, na_fill, precision)
160
161 if y is not None:
--> 162 self.y = pd.Series(y).astype(precision)
163 self.y_missing = False
164 else:
c:\users\...\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
229 name = ibase.maybe_extract_name(name, data, type(self))
230
--> 231 if is_empty_data(data) and dtype is None:
232 # gh-17261
233 warnings.warn(
c:\users\...\pandas\core\construction.py in is_empty_data(data)
589 is_none = data is None
590 is_list_like_without_dtype = is_list_like(data) and not hasattr(data, "dtype")
--> 591 is_simple_empty = is_list_like_without_dtype and not data
592 return is_none or is_simple_empty
593
c:\users\...\pandas\core\generic.py in __nonzero__(self)
1325 def __nonzero__(self):
1326 raise ValueError(
-> 1327 f"The truth value of a {type(self).__name__} is ambiguous. "
1328 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
1329 )
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
from explainerdashboard.
Ah, so the issue is in the line `self.y = pd.Series(y).astype(precision)`. The `precision` parameter is set to `'float64'` by default, but can be set to `'float32'` to save on memory use. (The 0.3 release is all about saving on memory usage in production.) But clearly there is something about your `y_test` that does not allow it to be cast as a `float64` `dtype`.
Are there any NaNs in your `y`? What is the `dtype`?
from explainerdashboard.
No NaNs.
`y_test.dtypes` returns float64 only. `pd.Series(y_test).dtypes` throws the same ambiguity error...
from explainerdashboard.
... ah, hence I have to use `np.array(y_test)[:,0]`.
from explainerdashboard.
Ah, so your y_test was not one-dimensional? Usually you would get a dimensionality error though:
pd.Series(np.ones((1, 10))).astype('float32')
---------------------------------------------------------------------------
ValueError: Data must be 1-dimensional
Is there something about your input data that I could autodetect and then correct for?
from explainerdashboard.
It's a dataframe of shape (1000, 1), i.e. np.array interprets it as a 1000x1 matrix. Maybe `pandas.DataFrame.squeeze` is the way to go here?
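A small sketch of that idea, assuming pandas and NumPy and using a made-up single-column frame as a stand-in for a target read with `pd.read_csv()`:

```python
import numpy as np
import pandas as pd

# Stand-in for a target read with pd.read_csv(): one column, many rows.
y_test = pd.DataFrame({"target": np.arange(1000, dtype="float64")})

# Squeeze only the column axis, so the result stays a Series
# even if the frame happens to contain a single row.
y_squeezed = y_test.squeeze(axis="columns")

# Alternatives that pick the first column by position.
y_iloc = y_test.iloc[:, 0]
y_numpy = np.array(y_test)[:, 0]

print(y_test.shape, y_squeezed.shape, type(y_squeezed).__name__)
```

Passing `axis="columns"` is a deliberate choice here: a bare `squeeze()` on a 1x1 frame would collapse all the way to a scalar.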
from explainerdashboard.
Hmm. I could just put in an assertion: `assert isinstance(y, pd.Series) or isinstance(y, np.ndarray) or isinstance(y, list)`. Or just wrap it in a `try ... except` block and give a more useful error message...
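One possible shape for such a check, purely as a sketch: `coerce_target` is a hypothetical helper name, not something from the package, and the choice to accept single-column frames and arrays is an assumption based on the case reported above.

```python
import numpy as np
import pandas as pd


def coerce_target(y, precision="float64"):
    """Hypothetical helper: return y as a 1-D Series of dtype `precision`,
    or raise an error that says what was wrong with the input."""
    if isinstance(y, pd.DataFrame):
        if y.shape[1] != 1:
            raise ValueError(
                f"y must be one-dimensional, got a DataFrame with "
                f"{y.shape[1]} columns"
            )
        y = y.iloc[:, 0]  # single-column frame: take that column
    elif isinstance(y, np.ndarray):
        if y.ndim == 2 and y.shape[1] == 1:
            y = y[:, 0]  # (n, 1) array: flatten to (n,)
        if y.ndim != 1:
            raise ValueError(
                f"y must be one-dimensional, got an array of shape {y.shape}"
            )
    elif not isinstance(y, (pd.Series, list)):
        raise TypeError(
            f"y must be a Series, ndarray, list or single-column DataFrame, "
            f"got {type(y).__name__}"
        )
    return pd.Series(y).astype(precision)


y_from_frame = coerce_target(pd.DataFrame({"y": [1, 2, 3]}))
y_from_array = coerce_target(np.ones((3, 1)), precision="float32")
print(y_from_frame.dtype, y_from_array.dtype)
```

Compared with a bare assertion, this both fixes the harmless (n, 1) case and names the offending type or shape in the error message.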
from explainerdashboard.