Comments (14)

oegedijk commented on May 22, 2024

Hi Neeraj,

Do you have the full stack trace? Where exactly does it throw the error? From the error itself it seems that somewhere in the code where I expect a type that resolves to a bool, I get a dataframe instead.

I haven't made any recent releases, so my guess is that it's an incompatibility with your environment. Did you update any packages in the last two days? What OS are you using? What kind of model are you using, and which version? Which version of shap?

But happy that overall you find it useful! Let's get this error fixed! :)

from explainerdashboard.

neerajnj10 commented on May 22, 2024

Hi Oege,
Wow, thank you for replying so fast, I was not expecting it.
[Image: screenshot of the error from the titanic example]

Attached is the error I get when I run the titanic example.
I have run the individual components of the explainer in the Jupyter notebook and they work fine; the error is only thrown when I call ExplainerDashboard.



Below is the error when I run explainer on my dataset. It is a binary classification model and I am using lightgbm for that purpose.

the explainer object has no decision_trees property. so setting decision_trees=False...:
ValueError Traceback (most recent call last)
in
----> 1 ExplainerDashboard(explainer, mode='inline').run(8052)

~\Anaconda3\lib\site-packages\explainerdashboard\dashboards.py in __init__(self, explainer, tabs, title, hide_header, header_hide_title, header_hide_selector, block_selector_callbacks, pos_label, fluid, mode, width, height, external_stylesheets, server, url_base_pathname, importances, model_summary, contributions, shap_dependence, shap_interaction, decision_trees, **kwargs)
364 block_selector_callbacks=block_selector_callbacks,
365 pos_label=pos_label,
--> 366 fluid=fluid)
367 else:
368 tabs = self._convert_str_tabs(tabs)

~\Anaconda3\lib\site-packages\explainerdashboard\dashboards.py in __init__(self, explainer, tabs, title, hide_title, hide_selector, block_selector_callbacks, pos_label, fluid, **kwargs)
104
105 self.selector = PosLabelSelector(explainer, pos_label=pos_label)
--> 106 self.tabs = [instantiate_component(tab, explainer, **kwargs) for tab in tabs]
107 assert len(self.tabs) > 0, 'When passing a list to tabs, need to pass at least one valid tab!'
108

~\Anaconda3\lib\site-packages\explainerdashboard\dashboards.py in <listcomp>(.0)
104
105 self.selector = PosLabelSelector(explainer, pos_label=pos_label)
--> 106 self.tabs = [instantiate_component(tab, explainer, **kwargs) for tab in tabs]
107 assert len(self.tabs) > 0, 'When passing a list to tabs, need to pass at least one valid tab!'
108

~\Anaconda3\lib\site-packages\explainerdashboard\dashboards.py in instantiate_component(component, explainer, **kwargs)
48
49 if inspect.isclass(component) and issubclass(component, ExplainerComponent):
---> 50 return component(explainer, **kwargs)
51 elif isinstance(component, ExplainerComponent):
52 return component

~\Anaconda3\lib\site-packages\explainerdashboard\dashboard_tabs.py in __init__(self, explainer, title, name, hide_selector, importance_type, depth, cats)
38
39 self.importances = ImportancesComponent(explainer, hide_selector=hide_selector,
---> 40 importance_type=importance_type, depth=depth, cats=cats)
41
42 self.register_components(self.importances)

~\Anaconda3\lib\site-packages\explainerdashboard\dashboard_components\overview_components.py in __init__(self, explainer, title, name, hide_type, hide_depth, hide_cats, hide_title, hide_selector, pos_label, importance_type, depth, cats)
140 self.hide_title = hide_title
141 self.hide_selector = hide_selector
--> 142 if self.explainer.cats is None or not self.explainer.cats:
143 self.hide_cats = True
144

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in __nonzero__(self)
2148 def __nonzero__(self):
2149 raise ValueError(
-> 2150 f"The truth value of a {type(self).__name__} is ambiguous. "
2151 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
2152 )

ValueError: The truth value of a Index is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().


I am working in a Jupyter notebook with Python 3.7 on a Windows machine. I installed the package on Monday last week, ran a sample test, and it worked fine. Yesterday it stopped working, so I uninstalled and reinstalled it, which did not fix the problem.


oegedijk commented on May 22, 2024

Hi Neeraj,

The first error could be related to running an old version of dash: the line self.server.register_blueprint( is line 402 in the current dash version (https://github.com/plotly/dash/blob/dev/dash/dash.py), but seems to be line 154 in your version.

When the server parameter is kept to the default (server=True) then dash.Dash() should instantiate a flask app in line 279: self.server = flask.Flask(name) if server else None, in which case self.server is no longer bool.

So I'm not sure what's going on in your case, but my guess is that it's an old version. Could you run pip install -U dash and see if that helps?

How did you construct the explainer for the second error? explainer.cats should be a list of strings that you passed to the constructor. But from the error it seems that in your example explainer.cats is either a pd.Series or a pd.DataFrame?
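To illustrate the point about explainer.cats, here is a minimal sketch using only pandas. The column names are made up for the example. It shows why a check such as `not cats` fails on a pandas Index (or Series) and works once the same values are passed as a plain list of strings:

```python
import pandas as pd

# Hypothetical column names, for illustration only.
cats_index = pd.Index(["Sex", "Deck", "Embarked"])

# Evaluating a pandas Index in a boolean context raises ValueError,
# which is what a check like `not self.explainer.cats` runs into.
try:
    bool(cats_index)
    ambiguous = False
except ValueError:
    ambiguous = True

# Converting to a plain list of strings avoids the problem:
# an empty list is falsy, a non-empty list is truthy.
cats_list = list(cats_index)
is_empty = not cats_list
```

Passing `cats_list` rather than `cats_index` to the explainer constructor should then avoid the ambiguity error.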


neerajnj10 commented on May 22, 2024

Hi Oege,

Thanks again!
Indeed, installing dash resolved the issue for the first example. And you were right about the second one: I had passed cats as a pd.Series and had not checked the exact format that was needed. Passing it as a list of strings resolved it. Thank you so very much!

I have one more question, though. The LIME explainer needs the data in a certain format. For example, when the categories have been label encoded, it needs a dictionary of the label encodings so that it can tell that "sex" is categorical and that 1 means male.
Do we need something like that here as well? Does the train or test set need to be in np.array format, or does it not matter?

In the titanic example the categories seem to be one-hot encoded. Do we need to do that in all cases?

PS: do you also know how to share the dashboard quickly? Is it supposed to be deployed on Heroku or something similar?
Thank you for responding so fast!

Best,
Neeraj


oegedijk commented on May 22, 2024

Yeah, the cats parameter assumes that you have already one-hot encoded your variables with underscores (varname_category), e.g. sex_male, sex_female, etc., and then autodetects the categories.
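As a rough sketch of what that naming convention looks like in practice, pandas can produce varname_category columns with get_dummies. The toy data below is invented for illustration, and how the resulting columns are then passed to the explainer is not shown here:

```python
import pandas as pd

# Toy data, made up for this example.
df = pd.DataFrame({
    "sex": ["male", "female", "female"],
    "age": [22, 38, 26],
})

# One-hot encode "sex" using an underscore separator, which gives
# columns named sex_female and sex_male.
encoded = pd.get_dummies(df, columns=["sex"], prefix_sep="_")

# Collect the one-hot columns that belong to the "sex" variable.
onehot_cols = [c for c in encoded.columns if c.startswith("sex_")]
```

With columns named this way, the base name ("sex") can be recovered from everything before the underscore.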

In order to share the dashboard you need to deploy it somewhere. You should talk to IT within your organization to see if they have a server available to host it. The deployment section of the docs gives some info on how to do it. Otherwise, see the dash deployment documentation.

You'd probably also want to think about adding authentication of some kind (I will probably add this to the package in the near future as well): https://dash.plotly.com/authentication


hkoppen commented on May 22, 2024

I have a simple dataset & a SVR. However, explainer = RegressionExplainer(model, X_test, pd.Series(y_test)) yields

ValueError                                Traceback (most recent call last)
<ipython-input-14-d5cec15fc8f7> in <module>
----> 1 explainer = RegressionExplainer(model, X_test, pd.Series(y_test))

c:\users\...\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    237             name = ibase.maybe_extract_name(name, data, type(self))
    238 
--> 239             if is_empty_data(data) and dtype is None:
    240                 # gh-17261
    241                 warnings.warn(

c:\users\...\pandas\core\construction.py in is_empty_data(data)
    626     is_none = data is None
    627     is_list_like_without_dtype = is_list_like(data) and not hasattr(data, "dtype")
--> 628     is_simple_empty = is_list_like_without_dtype and not data
    629     return is_none or is_simple_empty
    630 

c:\users\...\pandas\core\generic.py in __nonzero__(self)
   1438     def __nonzero__(self):
   1439         raise ValueError(
-> 1440             f"The truth value of a {type(self).__name__} is ambiguous. "
   1441             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1442         )

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

What does this mean, and what could be the problem?


oegedijk commented on May 22, 2024

Do you have a deeper stack trace? It's not clear to me where this error actually originates. Does it also happen when you just wrap y_test in a pd.Series on its own: pd.Series(y_test)?


hkoppen commented on May 22, 2024

That's everything. Without the wrap (X and y both come from pd.read_csv()) it is

ValueError                                Traceback (most recent call last)
<ipython-input-4-29b3018c6d0d> in <module>
      1 # Generate explainer object
----> 2 explainer = RegressionExplainer(model, X_test, y_test, cats=['Kurs', 'FTF', 'Wochentag'])

c:\users\...\explainerdashboard\explainers.py in __init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, idxs, index_name, target, descriptions, n_jobs, permutation_cv, na_fill, precision, units)
   2451                             shap, X_background, model_output,
   2452                             cats, idxs, index_name, target, descriptions,
-> 2453                             n_jobs, permutation_cv, na_fill, precision)
   2454 
   2455         self._params_dict = {**self._params_dict, **dict(units=units)}

c:\users\...\explainerdashboard\explainers.py in __init__(self, model, X, y, permutation_metric, shap, X_background, model_output, cats, idxs, index_name, target, descriptions, n_jobs, permutation_cv, na_fill, precision)
    160 
    161         if y is not None:
--> 162             self.y = pd.Series(y).astype(precision)
    163             self.y_missing = False
    164         else:

c:\users\...\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    229             name = ibase.maybe_extract_name(name, data, type(self))
    230 
--> 231             if is_empty_data(data) and dtype is None:
    232                 # gh-17261
    233                 warnings.warn(

c:\users\...\pandas\core\construction.py in is_empty_data(data)
    589     is_none = data is None
    590     is_list_like_without_dtype = is_list_like(data) and not hasattr(data, "dtype")
--> 591     is_simple_empty = is_list_like_without_dtype and not data
    592     return is_none or is_simple_empty
    593 

c:\users\...\pandas\core\generic.py in __nonzero__(self)
   1325     def __nonzero__(self):
   1326         raise ValueError(
-> 1327             f"The truth value of a {type(self).__name__} is ambiguous. "
   1328             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1329         )

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().


oegedijk commented on May 22, 2024

Ah, so the issue is in the line self.y = pd.Series(y).astype(precision). The precision parameter is set to 'float64' by default, but can be set to 'float32' to save on memory use. (The 0.3 release is all about saving on memory usage in production.) But clearly there is something about your y_test that does not allow it to be cast to a float64 dtype.

Are there any nan's in your y? What is the dtype?
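For reference, a small pandas-only sketch of those checks and of what the precision cast does. The target values are made up, and this only mirrors the pd.Series(y).astype(precision) line quoted above rather than the package internals:

```python
import numpy as np
import pandas as pd

# Made-up 1-dimensional target values.
y = pd.Series(np.arange(1000, dtype="float64"))

# The two things asked about: missing values and the dtype.
has_nan = bool(y.isna().any())
y_dtype = y.dtype

# The same cast as in the quoted line, for both precision settings.
y64 = y.astype("float64")
y32 = y.astype("float32")

# Bytes used by the values alone (index excluded).
bytes64 = y64.memory_usage(index=False)
bytes32 = y32.memory_usage(index=False)
```

On a one-dimensional float Series both casts succeed, and the float32 version uses half the memory of the float64 one.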


hkoppen commented on May 22, 2024

No nan's.

y_test.dtypes returns float64 only. pd.Series(y_test).dtypes throws the same ambiguity error...


hkoppen commented on May 22, 2024

... ah, hence I have to use np.array(y_test)[:,0].


oegedijk commented on May 22, 2024

Ah, so your y_test was not one-dimensional? Usually you would get a dimensionality error though:

pd.Series(np.ones((1, 10))).astype('float32')

---------------------------------------------------------------------------
ValueError: Data must be 1-dimensional

Is there something about your input data that I could autodetect and then correct for?


hkoppen commented on May 22, 2024

It's a dataframe of shape (1000, 1) i.e. np.array interprets it as 1000x1-matrix. Maybe pandas.DataFrame.squeeze is the way to go here?
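A minimal sketch of that idea, with made-up data standing in for the real y_test. It compares squeezing the single column with the np.array(y_test)[:, 0] workaround:

```python
import numpy as np
import pandas as pd

# Stand-in for a target read with pd.read_csv(): one column, 1000 rows.
y_df = pd.DataFrame({"target": np.linspace(0.0, 1.0, 1000)})

# Squeeze only the column axis, so the result is always a Series
# (a bare .squeeze() on a 1x1 frame would return a scalar instead).
y_series = y_df.squeeze("columns")

# The workaround from above: take the first column of the 1000x1 array.
y_alt = pd.Series(np.array(y_df)[:, 0])
```

Both give a one-dimensional object of length 1000 with the same values.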


oegedijk commented on May 22, 2024

Hmm. I could just put in an assertion: assert isinstance(y, pd.Series) or isinstance(y, np.ndarray) or isinstance(y, list)

Or just wrap it in a try ... except block and give a more useful error message...
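Something along these lines, perhaps. This is only a sketch of the idea: coerce_target is a hypothetical helper name, not something that exists in the package, and accepting single-column DataFrames is an assumption on top of what was discussed above:

```python
import pandas as pd


def coerce_target(y, precision="float64"):
    """Hypothetical helper: return y as a 1-D Series of the given dtype."""
    # Accept a single-column DataFrame by taking its only column.
    if isinstance(y, pd.DataFrame):
        if y.shape[1] != 1:
            raise ValueError(
                f"y has {y.shape[1]} columns; expected a pd.Series, "
                "a 1-dimensional array or list, or a single-column DataFrame."
            )
        y = y.iloc[:, 0]
    try:
        return pd.Series(y).astype(precision)
    except (ValueError, TypeError) as e:
        # Re-raise with a message that points at the likely cause.
        raise ValueError(
            "Could not convert y to a 1-dimensional pd.Series of dtype "
            f"{precision}. Check the shape and dtype of y."
        ) from e


y_ok = coerce_target(pd.DataFrame({"y": [1, 2, 3]}))
```

A single-column frame is converted silently, while a frame with several columns gets an explicit error instead of the ambiguous truth-value message.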

