GODM

GODM is a data augmentation package for supervised graph outlier detection. It generates synthetic graph outliers with latent diffusion models. This is the official implementation of Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models.

[Figure: model architecture]

Installation

It is recommended to use pip for installation:

pip install godm

Alternatively, you can build from source by cloning this repository:

git clone https://github.com/kayzliu/godm.git
cd godm
pip install .

If you have trouble installing metis, take a look at this tutorial.

Usage

from pygod.utils import load_data
data = load_data('weibo') # load data

from godm import GODM     # import GODM
godm = GODM(lr=0.004)     # init. GODM
aug_data = godm(data)     # augment data

detector(aug_data)        # train your own detector on the augmented data

The input data should be a torch_geometric.data.Data object with the following keys:

  • x: node features,
  • edge_index: edge index,
  • edge_time: edge times (optional, name can be changed by time_attr),
  • edge_type: edge types (optional, name can be changed by type_attr),
  • y: node labels,
  • train_mask: training node mask,
  • val_mask: validation node mask,
  • test_mask: testing node mask.

So far, no additional keys are allowed. We may support more keys by padding in the future.
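Because extra keys are rejected, it can help to check the attribute names of a graph before passing it to GODM. The snippet below is a minimal sketch of such a check. `check_keys` and `REQUIRED_KEYS` are hypothetical names used for illustration and are not part of the godm package; the key sets simply mirror the list above.

```python
# Hypothetical helper, NOT part of godm: compares attribute names against
# the key list documented above.
REQUIRED_KEYS = {"x", "edge_index", "y", "train_mask", "val_mask", "test_mask"}


def check_keys(keys, time_attr="edge_time", type_attr="edge_type"):
    """Return (missing, unexpected) key sets for a collection of attribute names."""
    keys = set(keys)
    optional = {time_attr, type_attr}           # optional edge attributes
    missing = REQUIRED_KEYS - keys              # required keys that are absent
    unexpected = keys - REQUIRED_KEYS - optional  # keys GODM does not accept yet
    return missing, unexpected


missing, unexpected = check_keys(
    ["x", "edge_index", "y", "train_mask", "val_mask", "test_mask", "edge_attr"]
)
print(sorted(missing))     # []
print(sorted(unexpected))  # ['edge_attr']
```

For a torch_geometric graph, the names to pass in are the attribute names of the Data object (exposed through `data.keys`, which is a property or a method depending on the PyTorch Geometric version). Any key reported as unexpected would need to be removed before augmentation.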

Parameters

  • hid_dim (type: int, default: None): hidden dimension for VAE, i.e., latent embedding dimension. None means the largest power of 2 that is less than or equal to the feature dimension divided by two.
  • diff_dim (type: int, default: None): hidden dimension for denoiser. None means twice hid_dim.
  • vae_epochs (type: int, default: 100): number of epochs for training VAE.
  • diff_epochs (type: int, default: 100): number of epochs for training diffusion model.
  • patience (type: int, default: 50): patience for early stopping.
  • lr (type: float, default: 0.001): learning rate.
  • wd (type: float, default: 0.): weight decay.
  • batch_size (type: int, default: 2048): batch size.
  • threshold (type: float, default: 0.75): threshold for edge generation.
  • wx (type: float, default: 1.): weight for node feature reconstruction loss.
  • we (type: float, default: 0.5): weight for edge reconstruction loss.
  • beta (type: float, default: 0.001): weight for KL divergence loss.
  • wt (type: float, default: 1.): weight for time prediction loss.
  • time_attr (type: str, default: edge_time): attribute name for edge time.
  • type_attr (type: str, default: edge_type): attribute name for edge type.
  • wp (type: float, default: 0.3): weight for node prediction loss.
  • gen_nodes (type: int, default: None): number of nodes to generate. None means the same as the number of outliers in the original graph.
  • sample_steps (type: int, default: 50): number of steps for diffusion model sampling.
  • device (type: int, default: 0): GPU index, set to -1 for CPU.
  • verbose (type: bool, default: False): verbose mode, enable for logging.
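
The None defaults for hid_dim and diff_dim are described in words above. As a rough illustration, the arithmetic can be sketched as follows. The function names are hypothetical and this is not the package's own code; it only reproduces the documented rule, and the feature dimension of 400 is an arbitrary example value.

```python
def default_hid_dim(feat_dim: int) -> int:
    """Largest power of 2 that is <= feat_dim / 2 (documented hid_dim=None rule)."""
    half = feat_dim // 2
    if half < 1:
        raise ValueError("feat_dim must be at least 2")
    # bit_length() - 1 is the exponent of the highest set bit of `half`
    return 1 << (half.bit_length() - 1)


def default_diff_dim(hid_dim: int) -> int:
    """Twice hid_dim (documented diff_dim=None rule)."""
    return 2 * hid_dim


hid = default_hid_dim(400)
print(hid, default_diff_dim(hid))  # 128 256
```

With 400 input features, half the feature dimension is 200, the largest power of 2 not exceeding it is 128, and the denoiser dimension is therefore 256.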

Cite Us:

Our paper is publicly available. If you use GODM in a scientific publication, we would appreciate your citations:

@article{liu2023data,
  title={Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models},
  author={Liu, Kay and Zhang, Hengrui and Hu, Ziqing and Wang, Fangxin and Yu, Philip S.},
  journal={arXiv preprint arXiv:2312.17679},
  year={2023}
}

or:

Liu, K., Zhang, H., Hu, Z., Wang, F., and Yu, P.S. 2023. Data Augmentation for Supervised Graph Outlier Detection with Latent Diffusion Models. arXiv preprint arXiv:2312.17679.

godm's Issues

Training with NaN loss

Hello! I have a problem training GODM on the Cora dataset. During training, the loss is always NaN. While debugging, I found that the node embeddings become NaN tensors after passing through some of the layers, with no apparent pattern. Could you help me?
