
microsoft / responsible-ai-toolbox


Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.

Home Page: https://responsibleaitoolbox.ai/

License: MIT License

Topics: ui, responsible-ai, data-science, fairness, fairness-ml, fairness-ai, explainable-ai, explainable-ml, explainability, machinelearning

responsible-ai-toolbox's Introduction

PyPI packages: raiwidgets, responsibleai, erroranalysis, raiutils, rai_test_utils. npm package: model-assessment.

Responsible AI Toolbox

Responsible AI is an approach to assessing, developing, and deploying AI systems in a safe, trustworthy, and ethical manner, and to making responsible decisions and taking responsible actions.

Responsible AI Toolbox is a suite of tools providing a collection of model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.

[Diagram: Responsible AI Toolbox overview]

The Toolbox consists of four repositories:

 

Repository | Tools covered
Responsible-AI-Toolbox Repository (Here) This repository contains four visualization widgets for model assessment and decision making:
1. Responsible AI dashboard, a single pane of glass bringing together several mature Responsible AI tools from the toolbox for a holistic responsible assessment and debugging of models and making informed business decisions. With this dashboard, you can identify model errors, diagnose why those errors are happening, and mitigate them. Moreover, the causal decision-making capabilities provide actionable insights to your stakeholders and customers.
2. Error Analysis dashboard, for identifying model errors and discovering cohorts of data for which the model underperforms.
3. Interpretability dashboard, for understanding model predictions. This dashboard is powered by InterpretML.
4. Fairness dashboard, for understanding a model's fairness issues using various group-fairness metrics across sensitive features and cohorts. This dashboard is powered by Fairlearn.
Responsible-AI-Toolbox-Mitigations Repository The Responsible AI Mitigations Library helps AI practitioners explore different measurements and mitigation steps that may be most appropriate when the model underperforms for a given data cohort. The library currently has three modules:
1. DataProcessing, which offers mitigation techniques for improving model performance for specific cohorts.
2. DataBalanceAnalysis, which provides metrics for diagnosing errors that originate from data imbalance either on class labels or feature values.
3. Cohort, which provides classes for handling and managing cohorts, allowing the creation of custom pipelines for each cohort through an easy and intuitive interface. The module also provides techniques for learning different decoupled estimators (models) for different cohorts and combining them in a way that optimizes different definitions of group fairness.
Responsible-AI-Tracker Repository Responsible AI Toolbox Tracker is a JupyterLab extension for managing, tracking, and comparing results of machine learning experiments for model improvement. Using this extension, users can view models, code, and visualization artifacts within the same framework, thereby enabling fast model iteration and evaluation processes. Main functionalities include:
1. Managing and linking model improvement artifacts
2. Disaggregated model evaluation and comparisons
3. Integration with the Responsible AI Mitigations library
4. Integration with mlflow
Responsible-AI-Toolbox-GenBit Repository The Responsible AI Gender Bias (GenBit) Library helps AI practitioners measure gender bias in Natural Language Processing (NLP) datasets. The main goal of GenBit is to analyze your text corpora and compute metrics that give insights into the gender bias present in a corpus.

Introducing Responsible AI dashboard

Responsible AI dashboard is a single pane of glass, enabling you to easily flow through different stages of model debugging and decision-making. This customizable experience can be taken in a multitude of directions, from analyzing the model or data holistically, to conducting a deep dive or comparison on cohorts of interest, to explaining and perturbing model predictions for individual instances, and to informing users on business decisions and actions.

[Screenshot: Responsible AI dashboard]

In order to achieve these capabilities, the dashboard integrates ideas and technologies from several open-source toolkits in the areas of

  • Error Analysis powered by Error Analysis, which identifies cohorts of data with higher error rate than the overall benchmark. These discrepancies might occur when the system or model underperforms for specific demographic groups or infrequently observed input conditions in the training data.

  • Fairness Assessment powered by Fairlearn, which identifies which groups of people may be disproportionately negatively impacted by an AI system and in what ways.

  • Model Interpretability powered by InterpretML, which explains black-box models, helping users understand their model's global behavior or the reasons behind individual predictions.

  • Counterfactual Analysis powered by DiCE, which shows feature-perturbed versions of the same datapoint that would have received a different prediction outcome, e.g., Taylor's loan was rejected by the model, but they would have received the loan if their income had been higher by $10,000.

  • Causal Analysis powered by EconML, which focuses on answering What If-style questions to apply data-driven decision-making – how would revenue be affected if a corporation pursues a new pricing strategy? Would a new medication improve a patient’s condition, all else equal?

  • Data Balance powered by Responsible AI, which helps users gain an overall understanding of their data, identify features receiving the positive outcome more than others, and visualize feature distributions.

Responsible AI dashboard is designed to achieve the following goals:

  • To help further accelerate engineering processes in machine learning by enabling practitioners to design customizable workflows and tailor Responsible AI dashboards that best fit with their model assessment and data-driven decision making scenarios.
  • To help model developers create end-to-end, fluid debugging experiences and navigate seamlessly through error identification and diagnosis by using interactive visualizations that identify errors, inspect the data, generate global and local model explanations, and potentially inspect problematic examples.
  • To help business stakeholders explore causal relationships in the data and take informed decisions in the real world.

This repository contains Jupyter notebooks with examples showcasing how to use these widgets. Get started here.

Installation

Use the following pip command to install the Responsible AI Toolbox.

If running in Jupyter, please make sure to restart the Jupyter kernel after installing.

pip install raiwidgets
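Once installed, a typical entry point is to build an RAIInsights object and hand it to the dashboard widget. The snippet below is a minimal, hedged sketch using a scikit-learn toy dataset; the exact constructor arguments and component names may vary across raiwidgets/responsibleai versions.

    # Minimal sketch: train a small model and open the Responsible AI dashboard for it.
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from responsibleai import RAIInsights
    from raiwidgets import ResponsibleAIDashboard

    data = load_breast_cancer(as_frame=True)
    df = data.frame  # features plus a 'target' label column
    train_df, test_df = train_test_split(df, test_size=0.2, random_state=0)

    model = RandomForestClassifier().fit(
        train_df.drop(columns=['target']), train_df['target'])

    rai_insights = RAIInsights(model=model, train=train_df, test=test_df,
                               target_column='target', task_type='classification')
    rai_insights.explainer.add()        # model interpretability component
    rai_insights.error_analysis.add()   # error analysis component
    rai_insights.compute()              # run the analyses

    ResponsibleAIDashboard(rai_insights)  # serves the widget, e.g. inline in Jupyter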

Responsible AI dashboard Customization

The Responsible AI Toolbox’s strength lies in its customizability. It empowers users to design tailored, end-to-end model debugging and decision-making workflows that address their particular needs. Need some inspiration? Here are some examples of how Toolbox components can be put together to analyze scenarios in different ways:

Please note that model overview (including fairness analysis) and data explorer components are activated by default!  

Responsible AI dashboard flow | Use case
Model Overview -> Error Analysis -> Data Explorer: to identify model errors and diagnose them by understanding the underlying data distribution
Model Overview -> Fairness Assessment -> Data Explorer: to identify model fairness issues and diagnose them by understanding the underlying data distribution
Model Overview -> Error Analysis -> Counterfactuals Analysis and What-If: to diagnose errors in individual instances with counterfactual analysis (the minimum change that leads to a different model prediction)
Model Overview -> Data Explorer -> Data Balance: to understand the root cause of errors and fairness issues introduced via data imbalances or lack of representation of a particular data cohort
Model Overview -> Interpretability: to diagnose model errors through understanding how the model has made its predictions
Data Explorer -> Causal Inference: to distinguish between correlations and causations in the data, or to decide the best treatments to apply to see a positive outcome
Interpretability -> Causal Inference: to learn whether the factors the model has used for decision making have any causal effect on the real-world outcome
Data Explorer -> Counterfactuals Analysis and What-If: to address customer questions about what they can do next time to get a different outcome from an AI system
Data Explorer -> Data Balance: to gain an overall understanding of the data, identify features receiving the positive outcome more than others, and visualize feature distributions
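The flows above are assembled by choosing which components to add before computing insights. Continuing the hedged sketch from the Installation section (an RAIInsights object named rai_insights, before any components were added), something along these lines selects the pieces for a counterfactual- and causal-oriented flow; argument names such as total_CFs, desired_class, and treatment_features reflect common usage and may differ by version:

    # Sketch: enable only the components needed for a particular flow, e.g.
    # Model Overview -> Error Analysis -> Counterfactuals Analysis and What-If.
    rai_insights.error_analysis.add()
    rai_insights.counterfactual.add(total_CFs=10, desired_class='opposite')

    # For decision-making flows, causal analysis needs the features you can act on;
    # 'mean radius' is only a placeholder column from the toy dataset above.
    rai_insights.causal.add(treatment_features=['mean radius'])

    rai_insights.compute()
    ResponsibleAIDashboard(rai_insights)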

Useful Links

Tabular Examples:

Text Examples:

Vision Examples:

Supported Models

This Responsible AI Toolbox API supports models that are trained on datasets in Python numpy.ndarray, pandas.DataFrame, iml.datatypes.DenseData, or scipy.sparse.csr_matrix format.

The explanation functions of Interpret-Community accept both models and pipelines as input as long as the model or pipeline implements a predict or predict_proba function that conforms to the Scikit convention. If not compatible, you can wrap your model's prediction function into a wrapper function that transforms the output into the format that is supported (predict or predict_proba of Scikit), and pass that wrapper function to your selected interpretability techniques.
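For example, a model whose prediction method does not follow the scikit-learn convention can be adapted with a small wrapper. The sketch below is illustrative only; the wrapped model and the shape of its output are assumptions, not part of the toolbox API.

    import numpy as np

    class ProbaWrapper:
        """Adapts a model with a non-scikit prediction API to predict/predict_proba."""

        def __init__(self, model):
            self.model = model

        def predict_proba(self, X):
            # Assumption: the wrapped model returns an (n_samples, n_classes)
            # array of class probabilities from its own prediction method.
            return np.asarray(self.model.predict(np.asarray(X)))

        def predict(self, X):
            # Hard labels derived from the probabilities.
            return np.argmax(self.predict_proba(X), axis=1)

    # wrapped_model = ProbaWrapper(my_keras_model)  # hypothetical; pass `wrapped_model`
    # to the explainer or dashboard in place of the original model.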

If a pipeline script is provided, the explanation function assumes that the running pipeline script returns a prediction. The repository also supports models trained via PyTorch, TensorFlow, and Keras deep learning frameworks.

Other Use Cases

Tools within the Responsible AI Toolbox can also be used with AI models offered as APIs by providers such as Azure Cognitive Services. To see example use cases, see the folders below:

Maintainers


responsible-ai-toolbox's Issues

Fairness: Configurable single model view table

We'd like the table in the single model view to be configurable so that we can select additional metrics to show in the table. For example, in addition to the previously selected performance (e.g. accuracy) and fairness metrics (e.g. demographic parity difference) it may make sense to compare other performance metrics (e.g. F1 score) or fairness metrics (e.g. false positive rate difference).

This builds on #63 as an extension that's nice to have but not essential for getting this to users.
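For context, the kind of disaggregated values such a table would surface can be computed on the Python side with Fairlearn's MetricFrame. The snippet below is only a sketch with placeholder data, not the dashboard implementation:

    import numpy as np
    from sklearn.metrics import accuracy_score, f1_score
    from fairlearn.metrics import MetricFrame, false_positive_rate, demographic_parity_difference

    # Placeholder labels, predictions, and sensitive feature.
    y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
    y_pred = np.array([0, 1, 0, 0, 1, 1, 1, 1])
    sex = np.array(['F', 'F', 'M', 'M', 'F', 'M', 'F', 'M'])

    mf = MetricFrame(
        metrics={'accuracy': accuracy_score,
                 'f1 score': f1_score,
                 'false positive rate': false_positive_rate},
        y_true=y_true, y_pred=y_pred, sensitive_features=sex)

    print(mf.by_group)  # one row per group, one column per selected metric
    print(demographic_parity_difference(y_true, y_pred, sensitive_features=sex))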

Export atomic pages

Pages in dashboards should be exportable and then consumable as standalone components.
Currently, they rely on data processing that happens in the main component; this should be extracted into static methods so that exportable page components can have interfaces that are a subset of the main page interface and reuse its helper functions for processing.

Azure ML VM environment

The environment detection test for determining whether the code is running on an Azure ML VM returns a false positive when the user runs the Azure ML SDK on a local machine, because the SDK sets the os.environ variable we check against.
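To illustrate the failure mode only (the variable and function names below are hypothetical, not the repository's actual detection code): a check based solely on the presence of an SDK-set environment variable cannot distinguish an Azure ML VM from a local machine that has loaded the SDK.

    import os

    def looks_like_azureml_vm():
        # Hypothetical key for illustration; the SDK can set such variables on a
        # local machine too, so presence alone is a false-positive-prone signal.
        return 'AZUREML_SOME_SDK_VARIABLE' in os.environ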

Fairness: surface errors properly as opposed to showing spinner

Example: balanced accuracy as performance metric, where one group has only a single class (say, all labels = 0, none with 1)

Error from python metrics code:
error=Only one class present in y_true. ROC AUC score is not defined in that case.

However, the spinner is shown instead of an explanation of why the result will never show up.

[screenshot]
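A hedged sketch of one possible direction for the fix on the Python side: catch the metric failure and return it in the response so the UI can render the message instead of spinning indefinitely. Function and field names below are illustrative, not the actual raiwidgets code.

    def compute_metric_safely(metric_fn, y_true, y_pred):
        # Illustrative only: report failures to the frontend instead of swallowing them.
        try:
            return {"data": metric_fn(y_true, y_pred)}
        except ValueError as exc:
            # e.g. "Only one class present in y_true. ROC AUC score is not defined
            # in that case." -> surface this text in place of the spinner.
            return {"error": str(exc)}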

Fairness: download report should have more information

Currently the download button functionality in Insights.tsx is disabled. @nessamilan's idea for that functionality:

If I'm thinking from the perspective of a user printing a full report, let's say, to email someone else or add to a document, it would be nice to include:

  • model name
  • data set specs (when applicable)
  • date report is generated
  • drop down category label: selection
  • visualization

Alternatively, if the chart is all some users may want, we could offer the option to download visualization -or- full report (we would need to tweak the call to action UI slightly).

Error Analysis: Widget crash, both in full screen and inline in Jupyter

This happened on two occasions:
1. When clicking the first cell in a 1-dimensional heatmap.
2. When clicking Cohort Info.

This is the last part of the error in console.

Uncaught TypeError: Cannot read property 'label' of undefined
at (index):371456
at Array.map ()
at ErrorCohort.cohortFiltersToString ((index):371453)
at (index):371503
at Array.map ()
at ErrorCohort.cohortCompositeFiltersToString ((index):371501)
at (index):371506
at Array.map ()
at ErrorCohort.cohortCompositeFiltersToString ((index):371501)
at (index):371506
(index):338687 Warning: Can't perform a React state update on an unmounted component. This is a no-op, but it indicates a memory leak in your application. To fix, cancel all subscriptions and asynchronous tasks in the componentWillUnmount method.
in TreeViewRenderer (created by ErrorAnalysisView)

=====================================
This is the full error

Download the React DevTools for a better development experience: https://fb.me/react-devtools
(index):338687 Warning: Using UNSAFE_componentWillReceiveProps in strict mode is not recommended and may indicate bugs in your code. See https://fb.me/react-unsafe-component-lifecycles for details.

  • Move data fetching code or side effects to componentDidUpdate.
  • If you're updating state whenever props change, refactor your code to use memoization techniques or move it to static getDerivedStateFromProps. Learn more at: https://fb.me/react-derived-state

Please update the following components: DropdownBase, ResizeGroupBase
printWarning @ (index):338687
(index):338687 Warning: Using UNSAFE_componentWillUpdate in strict mode is not recommended and may indicate bugs in your code. See https://fb.me/react-unsafe-component-lifecycles for details.

  • Move data fetching code or side effects to componentDidUpdate.

Please update the following components: OverflowSetBase
printWarning @ (index):338687
(index):338687 Warning: findDOMNode is deprecated in StrictMode. findDOMNode was passed an instance of WithResponsiveMode which is inside StrictMode. Instead, add a ref directly to the element you want to reference. Learn more about using refs safely here: https://fb.me/react-strict-mode-find-node
in div (created by DropdownBase)
in DropdownBase (created by WithResponsiveMode)
in WithResponsiveMode (created by StyledWithResponsiveMode)
in StyledWithResponsiveMode (created by commandBarButtonAs)
in commandBarButtonAs (created by OuterWithDefaultRender)
in OuterWithDefaultRender (created by OverflowSetBase)
in div (created by OverflowSetBase)
in div (created by OverflowSetBase)
in OverflowSetBase (created by StyledOverflowSetBase)
in StyledOverflowSetBase (created by ResizeGroupBase)
in div (created by FocusZone)
in FocusZone (created by ResizeGroupBase)
in div (created by ResizeGroupBase)
in div (created by ResizeGroupBase)
in div (created by ResizeGroupBase)
in ResizeGroupBase (created by CommandBarBase)
in CommandBarBase (created by StyledCommandBarBase)
in StyledCommandBarBase (created by MainMenu)
in div (created by MainMenu)
in div (created by MainMenu)
in MainMenu (created by ErrorAnalysisDashboard)
in div (created by ErrorAnalysisDashboard)
in ErrorAnalysisDashboard (created by ErrorAnalysis)
in ErrorAnalysis (created by App)
in App
in StrictMode
printWarning @ (index):338687
(index):364550 Warning: Can't call setState on a component that is not yet mounted. This is a no-op, but it might indicate a bug in your application. Instead, assign to this.state directly or define a state = {}; class property with the desired state in the TreeViewRenderer component.
printWarning @ (index):364550
(index):364550 Warning: Can't call forceUpdate on a component that is not yet mounted. This is a no-op, but it might indicate a bug in your application. Instead, assign to this.state directly or define a state = {}; class property with the desired state in the TreeViewRenderer component.
printWarning @ (index):364550
(index):338687 Warning: Using UNSAFE_componentWillMount in strict mode is not recommended and may indicate bugs in your code. See https://fb.me/react-unsafe-component-lifecycles for details.

  • Move code with side effects to componentDidMount, and set initial state in the constructor.

Please update the following components: CalloutContentBase, Popup
printWarning @ (index):338687
(index):338687 Warning: Using UNSAFE_componentWillUpdate in strict mode is not recommended and may indicate bugs in your code. See https://fb.me/react-unsafe-component-lifecycles for details.

  • Move data fetching code or side effects to componentDidUpdate.

Please update the following components: CalloutContentBase
printWarning @ (index):338687
(index):338687 Warning: Using UNSAFE_componentWillReceiveProps in strict mode is not recommended and may indicate bugs in your code. See https://fb.me/react-unsafe-component-lifecycles for details.

  • Move data fetching code or side effects to componentDidUpdate.
  • If you're updating state whenever props change, refactor your code to use memoization techniques or move it to static getDerivedStateFromProps. Learn more at: https://fb.me/react-derived-state

Please update the following components: Autofill, ComboBox
printWarning @ (index):338687
(index):364550 Warning: Can't call setState on a component that is not yet mounted. This is a no-op, but it might indicate a bug in your application. Instead, assign to this.state directly or define a state = {}; class property with the desired state in the MatrixArea component.
printWarning @ (index):364550
(index):371456 Uncaught TypeError: Cannot read property 'label' of undefined
at (index):371456
at Array.map ()
at ErrorCohort.cohortFiltersToString ((index):371453)
at (index):371503
at Array.map ()
at ErrorCohort.cohortCompositeFiltersToString ((index):371501)
at (index):371506
at Array.map ()
at ErrorCohort.cohortCompositeFiltersToString ((index):371501)
at (index):371506
(index):338687 Warning: Using UNSAFE_componentWillReceiveProps in strict mode is not recommended and may indicate bugs in your code. See https://fb.me/react-unsafe-component-lifecycles for details.

  • Move data fetching code or side effects to componentDidUpdate.
  • If you're updating state whenever props change, refactor your code to use memoization techniques or move it to static getDerivedStateFromProps. Learn more at: https://fb.me/react-derived-state

Please update the following components: FocusTrapZone
printWarning @ (index):338687
(index):358126 The above error occurred in the component:
in PredictionPath (created by CohortInfo)
in div (created by CohortInfo)
in div (created by CohortInfo)
in div (created by PanelBase)
in div (created by PanelBase)
in div (created by PanelBase)
in div (created by FocusTrapZone)
in FocusTrapZone (created by PanelBase)
in div (created by PanelBase)
in div (created by Popup)
in Popup (created by PanelBase)
in div (created by FabricBase)
in FabricBase (created by StyledFabricBase)
in StyledFabricBase (created by LayerBase)
in span (created by LayerBase)
in LayerBase (created by Context.Consumer)
in CustomizedLayer (created by StyledCustomizedLayer)
in StyledCustomizedLayer (created by PanelBase)
in PanelBase (created by StyledPanelBase)
in StyledPanelBase (created by CohortInfo)
in CohortInfo (created by ErrorAnalysisDashboard)
in div (created by FabricBase)
in FabricBase (created by StyledFabricBase)
in StyledFabricBase (created by LayerBase)
in span (created by LayerBase)
in LayerBase (created by Context.Consumer)
in CustomizedLayer (created by StyledCustomizedLayer)
in StyledCustomizedLayer (created by ErrorAnalysisDashboard)
in Customizer (created by ErrorAnalysisDashboard)
in div (created by ErrorAnalysisDashboard)
in div (created by ErrorAnalysisDashboard)
in ErrorAnalysisDashboard (created by ErrorAnalysis)
in ErrorAnalysis (created by App)
in App
in StrictMode

Consider adding an error boundary to your tree to customize error handling behavior.
Visit https://fb.me/react-error-boundaries to learn more about error boundaries.
logCapturedError @ (index):358126
(index):349701 Uncaught TypeError: Cannot read property 'label' of undefined
at (index):371456
at Array.map ()
at ErrorCohort.cohortFiltersToString ((index):371453)
at (index):371503
at Array.map ()
at ErrorCohort.cohortCompositeFiltersToString ((index):371501)
at (index):371506
at Array.map ()
at ErrorCohort.cohortCompositeFiltersToString ((index):371501)
at (index):371506
(index):338687 Warning: Can't perform a React state update on an unmounted component. This is a no-op, but it indicates a memory leak in your application. To fix, cancel all subscriptions and asynchronous tasks in the componentWillUnmount method.
in TreeViewRenderer (created by ErrorAnalysisView)
printWarning @ (index):338687
DevTools failed to load SourceMap: Could not load content for http://localhost:5000/runtime.js.map: HTTP error: status code 404, net::ERR_HTTP_RESPONSE_CODE_FAILURE
DevTools failed to load SourceMap: Could not load content for http://localhost:5000/polyfills.js.map: HTTP error: status code 404, net::ERR_HTTP_RESPONSE_CODE_FAILURE
DevTools failed to load SourceMap: Could not load content for http://localhost:5000/vendor.js.map: HTTP error: status code 404, net::ERR_HTTP_RESPONSE_CODE_FAILURE
DevTools failed to load SourceMap: Could not load content for http://localhost:5000/main.js.map: HTTP error: status code 404, net::ERR_HTTP_RESPONSE_CODE_FAILURE

Upgrade FairlearnDashboard in python package to work with fairlearn version vNext (>0.4.6)

The next version adds several important capabilities that we require including additional metrics. Those capabilities are currently commented out. This issue tracks the work required to uncomment and thereby enable the new metrics.

Missing metrics in v0.4.6:

  • log loss
  • F1 score
  • several more missing but the capabilities in Fairlearn need to be set up as well (e.g. for parity metrics)

Lots of missing disparity metrics in FairlearnDashboard

Currently only four are shown, and their descriptions are identical to the titles, which isn't helpful.

These should be identical to the ones from the metrics proposal, and vary depending on the task (regression, classification, etc.): https://github.com/fairlearn/fairlearn-proposals/blob/master/api/METRICS.md

This also means we need to add descriptions for all the metrics.

For classification we'll have at least around 16 metrics, so the dropdown by itself may not be a good solution long term.

Extract common components

Any components that could have use across projects should be moved to the common-ui package, so that code is shared. This will require defining interfaces for things that up until now have been defined by concrete classes (e.g. Cohort, dataset).

Fairness: key insights for single model view not defined yet

The insights work only for model comparison right now, because we haven't defined what they should be for a single model. Probably something along the lines of the following (a sketch of how these values could be computed follows the list):

  • disparity in <chosen performance metric> is <disparity>. Max value <val> is from group <max group>, min value <val> is from group <min group>
  • disparity in <chosen parity metric> is <disparity>. Max value in the underlying metric <metric> is <val> from group <max group>, min value <val> is from group <min group>
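The quantities in those templates (the disparity plus the groups contributing the max and min values) could be derived with Fairlearn; the snippet below is only a sketch with placeholder inputs, not the dashboard's implementation.

    import numpy as np
    from sklearn.metrics import recall_score
    from fairlearn.metrics import MetricFrame

    # Placeholder predictions and sensitive feature.
    y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
    y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 1])
    group = np.array(['A', 'A', 'A', 'B', 'B', 'B', 'B', 'A'])

    mf = MetricFrame(metrics=recall_score, y_true=y_true, y_pred=y_pred,
                     sensitive_features=group)

    disparity = mf.difference(method='between_groups')  # max minus min across groups
    print(f"disparity in recall is {disparity:.2f}; "
          f"max value {mf.group_max():.2f} is from group {mf.by_group.idxmax()}, "
          f"min value {mf.group_min():.2f} is from group {mf.by_group.idxmin()}")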

Error Analysis: Shifting a cohort does not refresh the tree or the heatmap

See error below:

Found array with 0 sample(s) (shape=(0, 6)) while a minimum of 1 is required.

===================

Traceback (most recent call last):
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\raiwidgets\error_analysis_dashboard_input.py", line 398, in debug_ml
    diff = self._model.predict(input_data) != true_y
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\utils\metaestimators.py", line 119, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\pipeline.py", line 407, in predict
    Xt = transform.transform(Xt)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\compose\_column_transformer.py", line 604, in transform
    Xs = self._fit_transform(X, None, _transform_one, fitted=True)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\compose\_column_transformer.py", line 467, in _fit_transform
    self._iter(fitted=fitted, replace_strings=True), 1))
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\joblib\parallel.py", line 1048, in __call__
    if self.dispatch_one_batch(iterator):
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\joblib\parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\joblib\parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\joblib\_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\joblib\_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\joblib\parallel.py", line 263, in __call__
    for func, args, kwargs in self.items]
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\joblib\parallel.py", line 263, in <listcomp>
    for func, args, kwargs in self.items]
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\pipeline.py", line 719, in _transform_one
    res = transformer.transform(X)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\pipeline.py", line 549, in _transform
    Xt = transform.transform(Xt)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\impute\_base.py", line 415, in transform
    X = self._validate_input(X, in_fit=False)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\impute\_base.py", line 251, in _validate_input
    raise ve
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\impute\_base.py", line 244, in _validate_input
    copy=self.copy)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\base.py", line 420, in _validate_data
    X = check_array(X, **check_params)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\utils\validation.py", line 72, in inner_f
    return f(**kwargs)
  File "C:\Users\benushi.REDMOND\.conda\envs\ea\lib\site-packages\sklearn\utils\validation.py", line 653, in check_array
    context))
ValueError: Found array with 0 sample(s) (shape=(0, 6)) while a minimum of 1 is required.
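For illustration only (not the repository's actual fix), the failing call in error_analysis_dashboard_input.py could guard against an empty cohort before invoking the pipeline, so the user gets an explicit message rather than the opaque scikit-learn error above. The helper below is a hypothetical sketch reusing the names visible in the traceback.

    import numpy as np

    def predict_diff(model, input_data, true_y):
        # Fail early with a clear message when the shifted cohort has no samples.
        input_data = np.asarray(input_data)
        if input_data.shape[0] == 0:
            raise ValueError("The shifted cohort contains no samples; cannot "
                             "recompute the error tree or heatmap.")
        return model.predict(input_data) != true_y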

Fairness: roc_auc_score failure should be caught

Example: balanced accuracy score (perhaps also ROC AUC score)

{"error":"Only one class present in y_true. ROC AUC score is not defined in that case.","locals":"{'data': {'binVector': [1, 1, 2, 2, 1, 1, 1, 1, 0, 0, 2, 2, 0, 3, 2, 2, 0, 4, 4, 4, 1, 2, 1, 1, 4, 1, 1, 2, 1, 3, 1, 4, 1, 2, 0, 1, 1, 3, 2, 2, 2, 0, 2, 0, 1, 4, 0, 2, 1, 2, 3, 2, 3, 1, 1, 1, 1, 1, 1, 4, 2, 2, 1, 1, 3, 1, 4, 3, 4, 0, 2, 2, 1, 2, 3, 1, 0, 1, 1, 0, 3, 4, 2, 1, 2, 1, 1, 0, 3, 4, 0, 2, 2, 2, 0, 0, 2, 1, 1, 1, 0, 1, 2, 2, 3, 0, 3, 1, 2, 3, 1, 4, 3, 2], 'metricKey': 'balanced_accuracy_score', 'modelIndex': 0, 'true_y': [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1], 'predicted_ys': [[0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1], [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1]], 'dataset': [[13.4], [13.21], [14.02], [14.26], [13.03], [11.34], [12.05], [11.7], [7.729], [10.26], [14.69], [14.62], [9.397], [16.84], [14.64], [15.46], [9.042], [20.51], [19.55], [20.94], [11.84], [16.24], [13.47], [11.84], [21.37], [11.93], [10.9], [13.73], [13.4], [18.77], [12.81], [20.57], [13.14], [16.16], [9.738], [12.68], [13.27], [19.0], [14.86], [15.08], [15.13], [8.878], [16.02], [10.32], [13.24], [21.56], [8.671], [14.03], [11.81], [14.54], [19.17], [14.86], [18.45], [11.6], [12.46], [11.47], [11.32], [12.27], [12.18], [21.75], [13.77], [16.07], [11.14], [12.87], [17.6], [12.04], [22.27], [17.46], [20.58], [9.465], [14.42], [16.13], [11.52], [14.6], [18.65], [13.0], [8.571], [13.05], [11.66], [8.734], [16.6], [20.18], [15.78], [12.43], [16.03], [12.32], [13.5], [8.196], [17.19], [20.6], [9.504], [15.12], [14.99], [15.22], [9.405], [9.876], [15.5], [12.89], [11.57], [11.43], [8.888], [12.36], [14.97], [14.05], [18.46], [9.268], [17.91], [12.88], [15.61], [17.42], [12.75], [20.18], [18.31], [15.04]], 'classification_methods': ['accuracy_score', 'balanced_accuracy_score', 'precision_score', 'recall_score', 'f1_score'], 'regression_methods': ['root_mean_squared_error', 'mean_squared_error', 'mean_absolute_error', 'r2_score'], 'probability_methods': ['auc', 'root_mean_squared_error', 'balanced_root_mean_squared_error', 'mean_squared_error', 'mean_absolute_error', 'log_loss'], 'model_names': ['a', 'b']}, 'metric_method': <function roc_auc_score at 0x000002A424AF6678>, 'ex': ValueError('Only one class present in y_true. ROC AUC score is not defined in that case.'), 'sys': <module 'sys' (built-in)>, 'traceback': <module 'traceback' from 'C:\\\\Anaconda3\\\\lib\\\\traceback.py'>, 'exc_type': <class 'ValueError'>, 'exc_value': ValueError('Only one class present in y_true. 
ROC AUC score is not defined in that case.'), 'exc_traceback': <traceback object at 0x000002A42C7705C8>, 'self': <raiwidgets.fairness_dashboard.FairnessDashboard object at 0x000002A406220D88>}","stacktrace":"['Traceback (most recent call last):\\n', '  File \"c:\\\\git\\\\responsible-ai-core\\\\raiwidgets\\\\raiwidgets\\\\fairness_dashboard.py\", line 123, in fairness_metrics_calculation\\n    sensitive_features=data[\"binVector\"])\\n', '  File \"C:\\\\Anaconda3\\\\lib\\\\site-packages\\\\fairlearn\\\\metrics\\\\_metric_frame.py\", line 151, in __init__\\n    self._by_group = self._compute_by_group(func_dict, y_t, y_p, sf_list, cf_list)\\n', '  File \"C:\\\\Anaconda3\\\\lib\\\\site-packages\\\\fairlearn\\\\metrics\\\\_metric_frame.py\", line 169, in _compute_by_group\\n    return self._compute_dataframe_from_rows(func_dict, y_true, y_pred, rows)\\n', '  File \"C:\\\\Anaconda3\\\\lib\\\\site-packages\\\\fairlearn\\\\metrics\\\\_metric_frame.py\", line 194, in _compute_dataframe_from_rows\\n    curr_metric = func_dict[func_name].evaluate(y_true, y_pred, mask)\\n', '  File \"C:\\\\Anaconda3\\\\lib\\\\site-packages\\\\fairlearn\\\\metrics\\\\_function_container.py\", line 103, in evaluate\\n    return self.func_(y_true[mask], y_pred[mask], **params)\\n', '  File \"C:\\\\Anaconda3\\\\lib\\\\site-packages\\\\sklearn\\\\utils\\\\validation.py\", line 73, in inner_f\\n    return f(**kwargs)\\n', '  File \"C:\\\\Anaconda3\\\\lib\\\\site-packages\\\\sklearn\\\\metrics\\\\_ranking.py\", line 393, in roc_auc_score\\n    sample_weight=sample_weight)\\n', '  File \"C:\\\\Anaconda3\\\\lib\\\\site-packages\\\\sklearn\\\\metrics\\\\_base.py\", line 77, in _average_binary_score\\n    return binary_metric(y_true, y_score, sample_weight=sample_weight)\\n', '  File \"C:\\\\Anaconda3\\\\lib\\\\site-packages\\\\sklearn\\\\metrics\\\\_ranking.py\", line 223, in _binary_roc_auc_score\\n    raise ValueError(\"Only one class present in y_true. ROC AUC score \"\\n', 'ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.\\n']"}

Replace user token in GitHub workflow

Currently @KeXu444's user token is used. This should probably be replaced with a more generic team token; otherwise we all have access to npm through @KeXu444's account.

Fairlearn Dashboard: single model view should have several charts

In v1 we had "fairness in accuracy" and "fairness in predictions" charts, which correspond to an over- and underprediction chart for binary classification and a selection rate chart.

In the current v2 there's only an equalized odds chart, and a table for related metrics.

What it should look like:

  • table with performance and fairness metrics at the top.
  • dropdown or other selection mechanism to select which chart to show from
    • over- and underprediction chart (binary classification)
    • selection rate chart (binary classification) to show demographic disparity
    • error rate chart (for regression)
    • other charts in the future (e.g. calibration)

Adding new charts is not part of this feature, just the general setup to allow for switching between charts.

Fairness Dashboard: table should have the right performance and fairness metrics

It should by default show the chosen performance and fairness metrics. The fairness metric won't have individual values per group since it's an aggregate value. It may be nice to show relevant metrics, though, e.g. selection rate when the fairness metric is demographic parity.

There should be a description for how it calculates these metrics including which groups contributed min and max value.

#65 goes a step further by making the columns configurable but that's out of scope for this initial change.

ACTION REQUIRED: Microsoft needs this private repository to complete compliance info

There are open compliance tasks that need to be reviewed for your responsible-ai-widgets repo.

Action required: 4 compliance tasks

To bring this repository to the standard required for 2021, we require administrators of this and all Microsoft GitHub repositories to complete a small set of tasks within the next 60 days. This is critical work to ensure the compliance and security of your Microsoft GitHub organization.

Please take a few minutes to complete the tasks at: https://repos.opensource.microsoft.com/orgs/microsoft/repos/responsible-ai-widgets/compliance

  • The GitHub AE (GitHub inside Microsoft) migration survey has not been completed for this private repository
  • No Service Tree mapping has been set for this repo. If this team does not use Service Tree, they can also opt-out of providing Service Tree data in the Compliance tab.
  • No repository maintainers are set. The Open Source Maintainers are the decision-makers and actionable owners of the repository, irrespective of administrator permission grants on GitHub.
  • Classification of the repository as production/non-production is missing in the Compliance tab.

You can close this work item once you have completed the compliance tasks, or it will automatically close within a day of taking action.

If you no longer need this repository, it might be quickest to delete the repo, too.

GitHub inside Microsoft program information

More information about GitHub inside Microsoft and the new GitHub AE product can be found at https://aka.ms/gim or by contacting [email protected]

FYI: current admins at Microsoft include @xuke444, @romanlutz, @chnldw, @gregorybchris, @imatiach-msft, @riedgar-ms

Error Analysis: Can't seem to change axis in explanation view

I went to the following page with the breast cancer example, and clicked on the indicated axis label (side note, it wasn't particularly obvious that this was clickable):

[screenshot]

Should there be an "OK" or "Apply" button in the pane popping out on the right? I can't seem to get the axis to update.

Shifting a cohort on the tree map view is not working

I shifted the cohort from all data (2 filters) to all data on the tree map view and the tree view did not get updated. I was expecting the tree view to show the root node selected, but the tree kept its previous node selection.

Fairness: points can be in the same spot and unreachable

If two models are in the same spot (as defined by performance and parity metrics) in the multimodel view of the fairness dashboard we can't click on both of them to get the single model view, but rather only the one on top.

Move legacy interpret dashboard

Interpret dashboard has two top-level components, ExplanationDashboard and NewExplanationDashboard. When the new dashboard is determined to be sufficient, the old dashboard and its components should be moved to a legacy folder.

Fairness: more intuitive selection of performance and fairness metric in model comparison view

Long-term we want the dropdowns / selectors to be more intuitive in the model comparison view. Perhaps this could be handled similarly to what the ExplanationDashboard does, i.e. the selector is at the corresponding axis. This won't work for sensitive features, but it will for the performance metric and fairness metric.

Given that they'll have quite a few entries we may want to make them searchable as well.

Related to #59
