Comments (6)
As for 2. I can do two things. Easy:
- create an extra function, .data_from_cluster_id(id), where id is an int (or maybe a list of ints) holding the cluster ID you read from the tooltip. It returns the cluster's data as CSV, or maybe as a dataframe that you can store with to_csv.
Harder:
- create a dynamic keplermapper application that runs in the browser. Then add the ability to download data as .csv from the tooltips, or create lasso tools to select nodes or a subset of the network.
As for 1. I'll implement this soon.
As for "what drives the separation of such clusters", I am coding up a decision-tree-based method that finds decision rules separating a "random sample" negative class from a "member of cluster" positive class.
I have also looked at providing statistics on each cluster compared to the entire dataset, for example "age = 3 STD over the dataset mean", for every cluster: http://mlwave.github.io/tda/bake.html
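A rough sketch of the decision-tree idea, not the actual implementation: it labels cluster members as the positive class, draws a random sample of other rows as the negative class, and prints the learned rules. The function name cluster_decision_rules, its parameters, and the choice to sample negatives only from rows outside the cluster are assumptions made for illustration; only scikit-learn's DecisionTreeClassifier and export_text are real library calls.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text


def cluster_decision_rules(data, member_idx, feature_names,
                           n_negative=200, max_depth=3, seed=0):
    """Hypothetical helper: fit a shallow tree separating cluster members
    (positive class) from a random sample of other rows (negative class)."""
    rng = np.random.default_rng(seed)
    member_idx = np.asarray(member_idx)
    positives = data[member_idx]

    # Assumption: negatives are sampled from rows that are NOT in the cluster.
    outside = np.setdiff1d(np.arange(len(data)), member_idx)
    n_neg = min(n_negative, len(outside))
    negatives = data[rng.choice(outside, size=n_neg, replace=False)]

    X = np.vstack([positives, negatives])
    y = np.concatenate([np.ones(len(positives), dtype=int),
                        np.zeros(n_neg, dtype=int)])

    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=seed)
    tree.fit(X, y)
    return tree, export_text(tree, feature_names=list(feature_names))


# Toy data: the first 30 rows play the role of a cluster with a much higher "age".
demo_rng = np.random.default_rng(1)
data = demo_rng.normal(size=(300, 2))
data[:30, 0] += 10.0

tree, rules = cluster_decision_rules(data, range(30), ["age", "income"])
print(rules)
```

A shallow max_depth keeps the rules short enough to read in a tooltip; deeper trees separate better but are harder to interpret.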
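To make the "3 STD over the dataset mean" statistic concrete, here is a minimal sketch that expresses each feature's cluster mean as a number of dataset standard deviations away from the dataset mean. The function name cluster_vs_dataset and the toy data are assumptions for illustration; this is not how the linked page computes its numbers.

```python
import numpy as np


def cluster_vs_dataset(data, member_idx, feature_names):
    """Hypothetical helper: per-feature distance of the cluster mean from the
    dataset mean, measured in dataset standard deviations."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    cluster_mean = data[np.asarray(member_idx)].mean(axis=0)

    # Constant features have std == 0; dividing by 1 instead yields 0 for them,
    # because their cluster mean equals the dataset mean.
    safe_std = np.where(std > 0, std, 1.0)
    z = (cluster_mean - mean) / safe_std
    return {name: float(value) for name, value in zip(feature_names, z)}


# Toy data: one "old" row forms the cluster; the second column is constant.
data = [[20, 1], [22, 1], [21, 1], [60, 1]]
stats = cluster_vs_dataset(data, [3], ["age", "constant"])
for name, value in stats.items():
    print(f"{name} = {value:+.2f} STD vs. dataset mean")
```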
Any feedback on this?
from kepler-mapper.
import km

# Load digits data (load once, then split into features and labels)
from sklearn import datasets
digits = datasets.load_digits()
data, labels = digits.data, digits.target

# Initialize
mapper = km.KeplerMapper(verbose=2)

# Fit and transform data
projected_data = mapper.fit_transform(data,
                                      projection=km.manifold.TSNE(random_state=1))

# Create the graph (we cluster on the projected data and suffer projection loss)
graph = mapper.map(projected_data,
                   clusterer=km.cluster.DBSCAN(eps=0.3, min_samples=15),
                   nr_cubes=35,
                   overlap_perc=0.9)

# Create the visualizations (increased the graph_gravity for a tighter graph-look.)
mapper.visualize(graph,
                 path_html="keplermapper_digits_ylabel_tooltips.html",
                 graph_gravity=0.25,
                 custom_tooltips=labels)

# Collect cluster data
X_cluster = mapper.data_from_cluster_id(430, graph, data)
y_cluster = mapper.data_from_cluster_id(430, graph, labels)

print(X_cluster)
print(X_cluster.shape)
print(y_cluster)
print(y_cluster.shape)
[[ 0. 0. 1. ..., 3. 0. 0.]
[ 0. 0. 7. ..., 0. 0. 0.]
[ 0. 0. 1. ..., 8. 0. 0.]
...,
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 0. 0. 0.]
[ 0. 0. 0. ..., 2. 0. 0.]]
(24, 64)
[1 8 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
(24,)
We might be able to quickly build a JavaScript function that could do most of this from the visualization.
The HTML already has all of the graph metadata, which includes index information. I could see a right-click on a node providing options to save the data, or copy it to the clipboard.
Otherwise, this kind of exploration loop would be best done inside a notebook, where the mapper is persistent.
@MLWave @sauln 👍 Perhaps selecting multiple nodes using a lasso tool, and then exporting them. This feature would help in studying and understanding cluster/groups of nodes with similar features and color.
A lasso tool is a great idea. I've been working on a few updates to the visualize parts and will take a look at incorporating something like this.
I've been having trouble myself in trying to extract the data of multiple nodes. Going node by node can be tedious.
Do you use mapper within Jupyter or open the html in a browser?
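Until a lasso tool exists, extracting several nodes at once can be approximated in a notebook. The sketch below assumes the graph is a dict whose "nodes" entry maps each node ID to a list of row indices; the helper names members_of_nodes and rows_to_csv and the toy graph are hypothetical, not part of the KeplerMapper API.

```python
import csv
import io


def members_of_nodes(graph, node_ids):
    """Sorted, de-duplicated row indices that belong to any of the given nodes.

    Assumes graph["nodes"] maps node ID -> list of row indices."""
    members = set()
    for node_id in node_ids:
        members.update(graph["nodes"].get(node_id, []))
    return sorted(members)


def rows_to_csv(data, row_indices, header=None):
    """Return the selected rows as CSV text, with the original row index first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if header is not None:
        writer.writerow(["row"] + list(header))
    for i in row_indices:
        writer.writerow([i] + list(data[i]))
    return buffer.getvalue()


# Toy graph: nodes "a" and "b" overlap on row 2, as Mapper nodes often do.
graph = {"nodes": {"a": [0, 2], "b": [2, 3], "c": [1]}}
data = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]

rows = members_of_nodes(graph, ["a", "b"])
csv_text = rows_to_csv(data, rows, header=["x0", "x1"])
print(csv_text)
```

Taking the union through a set matters because overlapping cubes put the same row in several nodes, and the export should not duplicate it.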
KeplerMapper is great! Definitely interested in having a lasso tool (or another method of extracting multiple nodes) as part of the visualization.
Have there been any updates on this since last spring?
Thanks,
Jackson