

🌟DyGETViz

Our framework is DyGETViz, which stands for Dynamic Graph Embedding Trajectories Visualization.

[Project Page] [Demo] [Data]

Installation

Automatic Installation

conda create -n dygetviz python=3.9 -y
conda activate dygetviz
pip install --upgrade pip  # enable PEP 660 support
pip install -e .

Manual Installation

If you want to manually install the dependencies, run:

conda install scikit-learn pandas numpy matplotlib plotly
conda install -c conda-forge dash dash-daq dash-bootstrap-components biopython
pip install umap-learn  # the PyPI package that provides `import umap`

Please refer to the homepages of PyTorch, PyTorch Geometric, and PyTorch Geometric Temporal for instructions on installing these three packages.

Upgrade to the latest code base

git pull
pip install -e .

Demo

Please check our demo at our website.

Download the data

  • Download all the data from Google Drive
  • Put both data/ and outputs/ under the root directory of this repo.

Getting Started

Procedure for Generating the Visualization

Step 1: Discrete-Time Dynamic Graph (DTDG) embedding training

  • We use the GConvGRU model from PyTorch Geometric Temporal to train embeddings for all datasets.

  • We extended the dataloader to accept a wider variety of input formats. The original dataloader only used static input at each snapshot.

  • Note: This part is not included in the code yet. For now, we provide the embeddings directly.

  • Output: DTDG embeddings of shape (T, N, D)

    • T: The number of timestamps / snapshots
    • N: The number of nodes
    • D: Embedding dimension

Step 2: Embedding Trajectories Generation

  • Input: DTDG embeddings of shape (T, N, D)

  • Output: JSON file that stores the embedding trajectories for Dash
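
A minimal sketch of what Step 2 might look like, assuming the embeddings are available as a nested list of shape (T, N, D). The helper name `build_trajectories`, the JSON layout, and the use of the first two embedding dimensions as 2D coordinates are assumptions made for illustration only. The actual pipeline would normally apply a dimensionality-reduction method and may store the result differently.

```python
import json


def build_trajectories(embeds, node_names):
    """Regroup (T, N, D) embeddings into one 2D trajectory per node.

    Hypothetical helper: the first two embedding dimensions stand in
    for a real 2D projection of the D-dimensional embeddings.
    """
    num_snapshots = len(embeds)
    trajectories = {}
    for n, name in enumerate(node_names):
        trajectories[name] = {
            # Position of node n at every snapshot, in time order.
            "x": [embeds[t][n][0] for t in range(num_snapshots)],
            "y": [embeds[t][n][1] for t in range(num_snapshots)],
            "t": list(range(num_snapshots)),
        }
    return trajectories


# Toy input: T=3 snapshots, N=2 nodes, D=3 dimensions.
embeds = [
    [[0.0, 0.1, 0.2], [1.0, 1.1, 1.2]],
    [[0.5, 0.6, 0.7], [1.5, 1.6, 1.7]],
    [[0.9, 1.0, 1.1], [1.9, 2.0, 2.1]],
]
trajectories = build_trajectories(embeds, ["Anna", "Bob"])

# Serialize to a JSON string that a Dash app could load later.
payload = json.dumps(trajectories)
```

Keying the JSON by node name means a front end can look up a single node's trajectory without scanning every snapshot.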

Step 3: Visualizing in a Dash app interactively using the JSON file

  • Users should be able to incrementally add individual node trajectories, or all nodes in a given category (e.g., normal users vs. anomalous users), to the visualization

  • highlighted_nodes: List of nodes to be highlighted in the visualization. We need to specify these nodes because we only show the names of a small number of nodes in the plotly visualization. Otherwise, the generated plot will be too messy.

  • plot_dtdg.py: Script for generating the visualization
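
The effect of highlighted_nodes can be sketched as follows. This is an illustration, not the code in plot_dtdg.py: the helper `make_traces` and the trajectory layout are hypothetical, and the dictionaries only mirror the `x`, `y`, `name`, `mode`, and `text` attributes of a Plotly scatter trace without importing Plotly.

```python
def make_traces(trajectories, highlighted_nodes):
    """Build one scatter-trace dict per node; only highlighted nodes get text."""
    highlighted = set(highlighted_nodes)
    traces = []
    for name, traj in trajectories.items():
        show_label = name in highlighted
        trace = {
            "name": name,
            "x": traj["x"],
            "y": traj["y"],
            # Text is drawn only when the mode string includes "text".
            "mode": "lines+markers+text" if show_label else "lines+markers",
        }
        if show_label:
            # Label only the last point so the plot stays readable.
            trace["text"] = [""] * (len(traj["x"]) - 1) + [name]
        traces.append(trace)
    return traces


# Toy trajectories for two nodes over two snapshots.
trajectories = {
    "Anna": {"x": [0.0, 0.5], "y": [0.1, 0.6]},
    "Bob": {"x": [1.0, 1.5], "y": [1.1, 1.6]},
}
traces = make_traces(trajectories, highlighted_nodes=["Bob"])
```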

Generate the visualization using the command:

python dygetviz/plot_dtdg.py --dataset_name <DATASET_NAME> --model GConvGRU

Currently, DATASET_NAME can be one of: Ant, Chickenpox, DGraphFin, Reddit. For example:

python dygetviz/plot_dtdg.py --dataset_name Chickenpox --model GConvGRU

python dygetviz/plot_dash.py --dataset_name Chickenpox --model GConvGRU

Data Format

dygetviz supports all temporal networks in the Stanford Large Network Dataset Collection (https://snap.stanford.edu/data/index.html). Each row is a (source, target, timestamp) tuple representing an edge in a graph snapshot.

edges.tsv

SRC	DST	TIME
1	2	1082040961
3	4	1082155839
5	2	1082414391
6	7	1082439619
8	7	1082439756
9	10	1082440403
...
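
As a rough sketch of how such an edge list could be turned into snapshots, the example below reads tab-separated rows and bins them into equal-width time windows. The function `read_snapshots` and the equal-width binning are assumptions for illustration. This README does not specify how dygetviz assigns edges to snapshots.

```python
import csv
import io
from collections import defaultdict


def read_snapshots(tsv_file, num_snapshots):
    """Group (SRC, DST, TIME) rows into equal-width time windows."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    edges = [(row["SRC"], row["DST"], int(row["TIME"])) for row in reader]
    t_min = min(t for _, _, t in edges)
    t_max = max(t for _, _, t in edges)
    # Fall back to width 1 if every edge has the same timestamp.
    width = (t_max - t_min) / num_snapshots or 1
    snapshots = defaultdict(list)
    for src, dst, t in edges:
        # Clamp so the latest edge lands in the final window.
        idx = min(int((t - t_min) / width), num_snapshots - 1)
        snapshots[idx].append((src, dst))
    return [snapshots[i] for i in range(num_snapshots)]


# The sample rows shown above, held in memory instead of a file.
sample = io.StringIO(
    "SRC\tDST\tTIME\n"
    "1\t2\t1082040961\n"
    "3\t4\t1082155839\n"
    "5\t2\t1082414391\n"
    "6\t7\t1082439619\n"
    "8\t7\t1082439756\n"
    "9\t10\t1082440403\n"
)
snapshots = read_snapshots(sample, num_snapshots=2)
```

With a real file, `open("edges.tsv", newline="")` can be passed in place of the `io.StringIO` object.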

An optional nodes.tsv can be provided to indicate the node names. If not provided, the node names will be automatically generated as integers starting from 0.

ID  NAME
0   Anna
1   Bob
2   Charlie
3   David
4   Emma
...

You can also specify an additional column to indicate the node label, such as whether the user is a normal user or an anomalous user.

ID  NAME    LABEL
0   Anna    0
1   Bob     1
2   Charlie 0
3   David   0
4   Emma    1
...
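
The node file can be read in a similar way. The sketch below is hypothetical (`read_nodes` is not a dygetviz function) and assumes tab-separated columns named ID, NAME, and optionally LABEL. It returns a name-to-index mapping plus any labels, and falls back to integer names starting from 0 when no file is given.

```python
import csv
import io


def read_nodes(tsv_file=None, num_nodes=0):
    """Return (node2idx, labels) from an optional nodes.tsv file object."""
    if tsv_file is None:
        # No nodes.tsv: name the nodes "0", "1", ... as described above.
        return {str(i): i for i in range(num_nodes)}, {}
    node2idx, labels = {}, {}
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        idx = int(row["ID"])
        node2idx[row["NAME"]] = idx
        # LABEL is optional, so skip it when the column is absent or empty.
        if row.get("LABEL") not in (None, ""):
            labels[idx] = int(row["LABEL"])
    return node2idx, labels


# First three rows of the labeled example, held in memory.
sample = io.StringIO(
    "ID\tNAME\tLABEL\n"
    "0\tAnna\t0\n"
    "1\tBob\t1\n"
    "2\tCharlie\t0\n"
)
node2idx, labels = read_nodes(sample)

# Without a file, names are generated from the node count.
auto_node2idx, auto_labels = read_nodes(num_nodes=3)
```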

Terminology

  • DG: Dynamic Graphs, which can be categorized into DTDG and CTDG
  • DTDG: Discrete-Time Dynamic Graphs (the type of graphs we are dealing with)
  • CTDG: Continuous-Time Dynamic Graphs
  • Embedding Trajectories: Please refer to the JODIE paper (KDD2019) for more details

Datasets

We provide the following datasets for viewing in our visualization tool: Ant, Chickenpox, DGraphFin, and Reddit.

Explanation of Each Data File

  • node2idx: A dictionary that maps node names to node indices (usually starting from 0 to #nodes-1).
  • embeds_<DATASET_NAME>.npy: The node embeddings generated by DyGET. The shape of the embeddings is #nodes x #time_steps x #embedding_dim.

Note

  • The Reddit dataset is a bit special because it is the only dataset that describes a bipartite graph. The first 60 snapshots are for each of the 60 snapshots. The last snapshot is for the background nodes. The shape of the embeddings is ``

Acknowledgments

We thank members of the CLAWS Lab and SRI International for their feedback and support.
