Hi there, thanks for this amazing fancy version of mapper! After wor


suggestion: download row ID in clusters (kepler-mapper, open, 6 comments)

scikit-tda commented on May 21, 2024

suggestion: download row ID in clusters

from kepler-mapper.

Comments (6)

MLWave commented on May 21, 2024

As for 2, I can do two things.

Easy: create an extra function, .data_from_cluster_id(id), where id is an int (or maybe a list of ints) holding the cluster ID you read from the tooltip. It returns the cluster's rows as CSV data, or maybe as a dataframe that you can store with to_csv.

Harder: create a dynamic KeplerMapper application that runs in the browser, then add the ability to download a .csv from the tooltips, or add lasso tools to select nodes or a subset of the network.
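The "easy" option can be sketched in a few lines. This is only an illustration, not KeplerMapper's actual implementation: it assumes the graph is a dict whose "nodes" entry maps each cluster ID to the list of row indices in that cluster, and the toy graph and data are made up for the example.

```python
import numpy as np
import pandas as pd


def data_from_cluster_id(cluster_id, graph, data):
    """Return the rows of `data` that belong to one cluster, indexed by row ID.

    Assumption: graph["nodes"] maps cluster ID -> list of row indices.
    An unknown cluster ID yields an empty dataframe.
    """
    members = np.asarray(graph["nodes"].get(cluster_id, []), dtype=int)
    index = pd.Index(members, name="row_id")
    return pd.DataFrame(np.asarray(data)[members], index=index)


# Hypothetical toy graph and data, for illustration only.
toy_graph = {"nodes": {430: [0, 2, 3], 431: [1, 3]}}
toy_data = np.arange(12).reshape(4, 3)

cluster_df = data_from_cluster_id(430, toy_graph, toy_data)
csv_text = cluster_df.to_csv()  # pass a file path instead to write to disk
print(csv_text)
```

Keeping the row IDs as the dataframe index means the exported CSV can be joined back to the original dataset.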

As for 1, I'll implement this soon.

As for "what drives the separation of such clusters", I am coding up a decision-tree-based method that finds decision rules separating a "member of cluster" positive class from a "random sample" negative class.
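One way such a method could look, as a rough sketch using scikit-learn rather than the code being written for KeplerMapper: fit a shallow decision tree on cluster members (label 1) versus a random sample (label 0) and read the rules off the tree. The function name, the choice to draw negatives only from rows outside the cluster, and the synthetic data are all assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text


def cluster_decision_tree(data, member_idx, max_depth=2, seed=0):
    """Fit a shallow tree separating cluster members (1) from a random sample (0)."""
    rng = np.random.default_rng(seed)
    member_idx = np.asarray(member_idx, dtype=int)
    # Assumption: negatives are sampled from rows outside the cluster,
    # one negative per cluster member where enough rows exist.
    outside = np.setdiff1d(np.arange(len(data)), member_idx)
    n_negative = min(len(member_idx), len(outside))
    negative_idx = rng.choice(outside, size=n_negative, replace=False)

    X = np.vstack([data[member_idx], data[negative_idx]])
    y = np.concatenate([
        np.ones(len(member_idx), dtype=int),
        np.zeros(n_negative, dtype=int),
    ])
    return DecisionTreeClassifier(max_depth=max_depth, random_state=seed).fit(X, y)


# Made-up data: 200 rows, 3 features. The "cluster" is rows 0..19,
# which are shifted upward on feature f1 only.
rng = np.random.default_rng(42)
toy_data = rng.normal(size=(200, 3))
toy_members = np.arange(20)
toy_data[toy_members, 1] += 10.0

tree = cluster_decision_tree(toy_data, toy_members)
print(export_text(tree, feature_names=["f0", "f1", "f2"]))
```

On this synthetic data the printed rules split on f1, the only feature that distinguishes the cluster. Real clusters will rarely separate this cleanly, so a depth limit keeps the rules readable.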

I have also looked at providing statistics on each cluster compared to the entire dataset, for example "age = 3 STD over dataset mean", for every cluster: http://mlwave.github.io/tda/bake.html
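A statistic of the "age = 3 STD over dataset mean" kind can be computed as the cluster's feature means expressed in dataset standard deviations from the dataset mean. The sketch below is a hypothetical helper with made-up numbers, not the code behind the linked page.

```python
import numpy as np


def cluster_deviation(data, member_idx):
    """Per feature: (cluster mean - dataset mean) / dataset standard deviation."""
    data = np.asarray(data, dtype=float)
    dataset_mean = data.mean(axis=0)
    dataset_std = data.std(axis=0)
    diff = data[member_idx].mean(axis=0) - dataset_mean
    # Constant features have std 0; report 0 for them instead of dividing by zero.
    return np.divide(diff, dataset_std, out=np.zeros_like(diff), where=dataset_std > 0)


# Made-up dataset with two features; the "cluster" is the last row only.
toy = np.array([
    [0.0, 1.0],
    [0.0, 2.0],
    [0.0, 3.0],
    [0.0, 4.0],
    [10.0, 5.0],
])
deviation = cluster_deviation(toy, [4])
print(deviation)  # first feature is 2 STD over the dataset mean
```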

Any feedback on this?


MLWave commented on May 21, 2024

import km

# Load the digits data once and reuse it for features and labels
from sklearn import datasets
digits = datasets.load_digits()
data, labels = digits.data, digits.target

# Initialize
mapper = km.KeplerMapper(verbose=2)

# Fit and transform data
projected_data = mapper.fit_transform(data,
                                      projection=km.manifold.TSNE(random_state=1))

# Create the graph (we cluster on the projected data and suffer projection loss)
graph = mapper.map(projected_data,
                   clusterer=km.cluster.DBSCAN(eps=0.3, min_samples=15),
                   nr_cubes=35,
                   overlap_perc=0.9)

# Create the visualizations (increased the graph_gravity for a tighter graph-look.)
mapper.visualize(graph,
                 path_html="keplermapper_digits_ylabel_tooltips.html",
                 graph_gravity=0.25,
                 custom_tooltips=labels)

# Collect the data rows and labels for cluster 430 (ID read from the tooltip)
X_cluster = mapper.data_from_cluster_id(430, graph, data)
y_cluster = mapper.data_from_cluster_id(430, graph, labels)

print(X_cluster)
print(X_cluster.shape)
print(y_cluster)
print(y_cluster.shape)

Output:

[[ 0.  0.  1. ...,  3.  0.  0.]
 [ 0.  0.  7. ...,  0.  0.  0.]
 [ 0.  0.  1. ...,  8.  0.  0.]
 ...,
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  0.  0.  0.]
 [ 0.  0.  0. ...,  2.  0.  0.]]
(24, 64)
[1 8 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
(24,)


sauln commented on May 21, 2024

We might be able to quickly build a JavaScript function that could do most of this from the visualization.

The HTML already has all of the graph metadata, which includes index information. I could see a right-click on a node providing options to save the data or copy it to the clipboard.

Otherwise, this kind of exploration loop would be best done inside a notebook, where the mapper is persistent.


commented on May 21, 2024

@MLWave @sauln 👍 Perhaps we could select multiple nodes with a lasso tool and then export them. This feature would help in studying and understanding clusters or groups of nodes with similar features and color.


sauln commented on May 21, 2024

A lasso tool is a great idea. I've been working on a few updates to the visualize parts and will take a look at incorporating something like this.

I've been having trouble myself extracting the data of multiple nodes. Going node by node can be tedious.
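Until a lasso tool exists, collecting several nodes at once from Python could look roughly like the sketch below. It assumes the graph is a dict whose "nodes" entry maps node IDs to lists of row indices; the helper name and the toy graph are hypothetical. Because cover cubes overlap, the same row can sit in more than one node, so the row IDs are de-duplicated.

```python
import numpy as np


def rows_from_nodes(node_ids, graph, data):
    """Return (row_ids, rows) for the union of members of several nodes.

    Assumption: graph["nodes"] maps node ID -> list of row indices.
    Unknown node IDs are skipped.
    """
    data = np.asarray(data)
    members = [graph["nodes"][n] for n in node_ids if n in graph["nodes"]]
    if not members:
        return np.array([], dtype=int), data[:0]
    # np.unique both sorts the row IDs and removes rows shared between nodes.
    row_ids = np.unique(np.concatenate(members)).astype(int)
    return row_ids, data[row_ids]


# Hypothetical toy graph: row 2 belongs to both node "a" and node "b".
toy_graph = {"nodes": {"a": [0, 2], "b": [2, 5], "c": [7]}}
toy_data = np.arange(16).reshape(8, 2)

row_ids, rows = rows_from_nodes(["a", "b"], toy_graph, toy_data)
print(row_ids)
print(rows)
```

The returned row IDs can be saved next to the rows (for example with np.savetxt) so a selection can be traced back to the original dataset.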

Do you use mapper within Jupyter or open the HTML in a browser?


totport commented on May 21, 2024

KeplerMapper is great! I'm definitely interested in having a lasso tool (or another method of extracting multiple nodes) as part of the visualization.

Have there been any updates on this since last spring?

Thanks,

Jackson
