datarobot_clustering's Introduction

DataRobot Prediction Explanation Clustering

This project demonstrates how to take a DataRobot model and build clusters from its prediction explanations.

Status: Functional

Todo: Generate a downloadable dataset with the cluster labels added

Dependencies

You will need a DataRobot account and access to a dedicated prediction server.

You will also need several Python libraries, including the DataRobot package:

pip install numpy
pip install pandas
pip install scikit-learn
pip install matplotlib
pip install hdbscan
pip install datarobot

functools ships with the Python standard library and needs no separate install. scikit-learn is imported as sklearn.

To run this application you will need a YAML configuration file that authenticates you against your DataRobot instance when you use the DataRobot Python package. Please follow these guidelines to set this up.
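As a rough illustration of what such a file contains, the sketch below writes one using only the Python standard library. It assumes the two keys the DataRobot Python client commonly reads from a configuration file, `endpoint` and `token`, and the helper name `write_drconfig` is hypothetical. Check the key names and the expected file location against the guidelines for your installation.

```python
from pathlib import Path


def write_drconfig(path, endpoint, token):
    """Write a minimal DataRobot client config file and return its path.

    Assumption: the client reads the keys `endpoint` and `token`.
    Refuses to overwrite an existing file so a working config is not lost.
    """
    path = Path(path).expanduser()
    if path.exists():
        raise FileExistsError(f"{path} already exists")
    path.parent.mkdir(parents=True, exist_ok=True)
    # Values are quoted so that URLs and tokens are read as plain strings.
    path.write_text(f'endpoint: "{endpoint}"\ntoken: "{token}"\n')
    return path
```

A call might look like `write_drconfig("~/.config/datarobot/drconfig.yaml", "https://app.datarobot.com/api/v2", "YOUR_API_TOKEN")`, where the path, endpoint, and token are placeholders to replace with the values for your own instance.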

About

The core functions that retrieve the predictions and their explanations are in the file drpredexplanations.py.

The results they return can then be clustered using one of several functions in the file drclustering.py.

Both the example script and the web application example use these functions.

Caveats

Currently the implementation allows you to build either K-Means or HDBSCAN clusters. The clustering is done on a sparse matrix representation of the prediction explanation strengths.
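To make the sparse representation concrete, here is a minimal sketch. It is not the code in drclustering.py. It assumes each scored row comes with a short list of (feature, strength) pairs for its top explanations, and the function names and sample data are hypothetical. A feature that does not appear in a row's explanations is treated as zero strength, which is why most cells are empty.

```python
def build_sparse_strengths(explanations):
    """Map per-row explanation lists to a sparse dict-of-keys matrix.

    explanations: list where item i is a list of (feature, strength)
    pairs for row i. Returns (cells, features), where cells maps
    (row_index, column_index) -> strength and features lists the
    column names in first-seen order.
    """
    features = []
    column_of = {}
    cells = {}
    for row_index, row in enumerate(explanations):
        for feature, strength in row:
            if feature not in column_of:
                column_of[feature] = len(features)
                features.append(feature)
            cells[(row_index, column_of[feature])] = float(strength)
    return cells, features


def to_dense(cells, n_rows, n_cols):
    """Expand the sparse cells into a list of rows, filling gaps with 0.0."""
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for (r, c), value in cells.items():
        dense[r][c] = value
    return dense


# Hypothetical explanations for three scored rows.
sample = [
    [("age", 0.8), ("income", -0.3)],
    [("income", 0.5)],
    [("age", -0.6), ("tenure", 0.2)],
]
cells, features = build_sparse_strengths(sample)
matrix = to_dense(cells, len(sample), len(features))
```

The resulting rows could then be passed to a clustering routine such as scikit-learn's KMeans or the hdbscan package's HDBSCAN class.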

Additional algorithms, features and distance metrics will be added given time.

Usage

The script example.py shows how to create clusters by specifying a DataRobot project, model, and dataset in an interactive Python session.

The file app.py and the contents of the templates directory are a Python Flask web application you can use to run the clustering on any of your DataRobot projects, provided that you supply a dataset to score against.

The application stores the generated plots in the static folder so that they do not need to be regenerated.
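The reuse idea can be sketched as follows. This is an assumption about how such caching might work, not the logic in app.py: the helper names, the file-naming scheme, and the `render` callback are all hypothetical.

```python
import hashlib
from pathlib import Path


def plot_path(static_dir, project_id, model_id, dataset_name):
    """Return a deterministic file path for one project/model/dataset combination."""
    key = f"{project_id}|{model_id}|{dataset_name}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()[:16]
    return Path(static_dir) / f"clusters_{digest}.png"


def get_or_create_plot(static_dir, project_id, model_id, dataset_name, render):
    """Reuse a stored plot if present; otherwise call render(path) once.

    `render` is a hypothetical callback that writes the image to `path`.
    """
    path = plot_path(static_dir, project_id, model_id, dataset_name)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        render(path)
    return path
```

Because the file name is derived only from the inputs, a second request for the same combination finds the existing file and skips the plotting step.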

To run:

python app.py

Then follow the prompts.
