Code Monkey home page Code Monkey logo

yellowbrick's Introduction

Yellowbrick

Build Status Coverage Status Code Health Documentation Status Stories in Ready

Visual analysis and diagnostic tools to facilitate machine learning model selection.

Follow the yellow brick road Image by Quatro Cinco, used with permission, Flickr Creative Commons.

This README is a guide for developers, if you're new to Yellowbrick, get started at our documentation.

What is Yellowbrick?

Yellowbrick is a suite of visual diagnostic tools called "Visualizers" that extend the Scikit-Learn API to allow human steering of the model selection process. In a nutshell, Yellowbrick combines Scikit-Learn with Matplotlib in the best tradition of the Scikit-Learn documentation, but to produce visualizations for your models!

Visualizers

Visualizers

Visualizers are estimators (objects that learn from data) whose primary objective is to create visualizations that allow insight into the model selection process. In Scikit-Learn terms, they can be similar to transformers when visualizing the data space or wrap an model estimator similar to how the "ModelCV" (e.g. RidgeCV, LassoCV) methods work. The primary goal of Yellowbrick is to create a sensical API similar to Scikit-Learn. Some of our most popular visualizers include:

Feature Visualization

  • Rank2D: pairwise ranking of features to detect relationships
  • Parallel Coordinates: horizontal visualization of instances
  • Radial Visualization: separation of instances around a circular plot

Classification Visualization

  • Class Balance: see how the distribution of classes affects the model
  • Classification Report: visual representation of precision, recall, and F1
  • ROC/AUC Curves: receiver operator characteristics and area under the curve
  • Confusion Matrices: visual description of class decision making

Regression Visualization

  • Prediction Error Plots: find model breakdowns along the domain of the target
  • Residuals Plot: show the difference in residuals of training and test data
  • Alpha Selection: show how the choice of alpha influences regularization

Clustering Visualization

  • K-Elbow Plot: select k using the elbow method and various metrics
  • Silhouette Plot: select k by visualizing silhouette coefficient values

Text Visualization

  • Term Frequency: visualize the frequency distribution of terms in the corpus
  • TSNE: use stochastic neighbor embedding to project documents.

And more! Visualizers are being added all the time, be sure to check the examples (or even the develop branch) and feel free to contribute your ideas for Visualizers!

Using Yellowbrick

The Yellowbrick API is specifically designed to play nicely with Scikit-Learn. Here is an example of a typical workflow sequence with Scikit-Learn and Yellowbrick:

Feature Visualization

In this example, we see how Rank2D performs pairwise comparisons of each feature in the data set with a specific metric or algorithm, then returns them ranked as a lower left triangle diagram.

from yellowbrick.features import Rank2D

visualizer = Rank2D(features=features, algorithm='covariance')
visualizer.fit(X, y)                # Fit the data to the visualizer
visualizer.transform(X)             # Transform the data
visualizer.poof()                   # Draw/show/poof the data

Model Visualization

In this example, we instantiate a Scikit-Learn classifier, and then we use Yellowbrick's ROCAUC class to visualize the tradeoff between the classifier's sensitivity and specificity.

from sklearn.svm import LinearSVC
from yellowbrick.classifier import ROCAUC

model = LinearSVC()
model.fit(X,y)
visualizer = ROCAUC(model)
visualizer.score(X,y)
visualizer.poof()

For additional information on getting started with Yellowbrick, check out our examples notebook.

We also have a quick start guide.

Contributing to Yellowbrick

Yellowbrick is an open source project that is supported by a community who will gratefully and humbly accept any contributions you might make to the project. Large or small, any contribution makes a big difference; and if you've never contributed to an open source project before, we hope you will start with Yellowbrick!

Principally, Yellowbrick development is about the addition and creation of visualizers --- objects that learn from data and create a visual representation of the data or model. Visualizers integrate with Scikit-Learn estimators, transformers, and pipelines for specific purposes and as a result, can be simple to build and deploy. The most common contribution is therefore a new visualizer for a specific model or model family. We'll discuss in detail how to build visualizers later.

Beyond creating visualizers, there are many ways to contribute:

  • Submit a bug report or feature request on GitHub Issues.
  • Contribute a Jupyter notebook to our examples gallery.
  • Assist us with user testing.
  • Add to the documentation or help with our website, scikit-yb.org.
  • Write unit or integration tests for our project.
  • Answer questions on our issues, mailing list, Stack Overflow, and elsewhere.
  • Translate our documentation into another language.
  • Write a blog post, tweet, or share our project with others.
  • Teach someone how to use Yellowbrick.

As you can see, there are lots of ways to get involved and we would be very happy for you to join us! The only thing we ask is that you abide by the principles of openness, respect, and consideration of others as described in the Python Software Foundation Code of Conduct.

For more information, checkout CONTRIBUTING.md.

yellowbrick's People

Contributors

balavenkatesan avatar bbengfort avatar jkeung avatar lauralorenz avatar lcombs avatar mariusvniekerk avatar mattandahalfew avatar morganmendis avatar naturallogofx avatar ndanielsen avatar nealhumphrey avatar ojedatony1616 avatar pbwitt avatar pdamodaran avatar pvomelveny avatar rebeccabilbro avatar stampedpassp0rt avatar tsterbak avatar tuulihill avatar waffle-iron avatar xerebz avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.