Code Monkey home page Code Monkey logo

han_baseline's Introduction

This is only Baseline code!

by WangYC @NWPU changan Apr.18th 2022

baseline使用方法:

训练

环境要求:tensorflow-gpu=1.15.0 cuda=10.0 python=3.6 首先找到对应的ex_dataset.py文件

  1. 修改要用来训练的gpu编号
  2. python ex_dataset.py argv[1] 其中的argv是patience

测试

环境要求:pytorch=1.7.0 python=3.8 evaluate.py 修改:

  1. 数据集名称
  2. metapath集合
  3. 分类数量 python evaluate.py argv[1] 其中的argv还是patience

Paper

https://github.com/Jhy1993/Representation-Learning-on-Heterogeneous-Graph

HAN

The source code of Heterogeneous Graph Attention Network (WWW-2019).

The source code is based on GAT

Reference

If you make advantage of the HAN model or use the datasets released in our paper, please cite the following in your manuscript:

@article{han2019,
title={Heterogeneous Graph Attention Network},
author={Xiao, Wang and Houye, Ji and Chuan, Shi and  Bai, Wang and Peng, Cui and P. , Yu and Yanfang, Ye},
journal={WWW},
year={2019}
}

How to preprocess DBLP?

Demo: preprocess_dblp.py

Slides

https://github.com/Jhy1993/HAN/blob/master/0516纪厚业%20www%20ppt%20copy.pdf

Q&A

  1. ACM_3025 in our experiments is based on the preprocessed version ACM in other paper (\data\ACM\ACM.mat). Subject is just like Neural Network, Multi-Object Optimization and Face Recognition. In ACM3025, PLP is actually PSP. You can find it in our code.
  2. In ACM, train+val+test < node_num. That is because our model is a semi-supervised model which only need a few labels to optimize our model. The num of node can be found in meta-path based adj mat.
  3. "the model can generate node embeddings for previous unseen nodes or even unseen graph" means the propose HAN can do inductive experiments. However, we cannot find such heterogeneous graph dataset. See experiments setting in Graphsage and GAT for details, especially on PPI dataset.
  4. meta-path can be symmetric or asymmetric. HAN can deal with different types of nodes via project them into the same space.
  5. Can we change the split of dataset and re-conduct some experiments? of course, you can split the dataset by yourself, as long as you use the same split for all models.
  6. How to run baseline (e.g., GCN) and report the best performance of baselines? Taking ACM as an example, we translate heterogenesous graph into two homogeneous graphs via meta-path PAP&PSP. For PAP based homogeneous graph, it only has one type of node paper and two paper connected via PAP. Then, we run GCN on two graphs and report the best performance. Ref https://arxiv.org/pdf/1902.01475v1.pdf and http://web.cs.wpi.edu/~xkong/publications/papers/www18.pdf
  7. Several principles for preprocess data. 1)Extract nodes which have all meta-path based neighbors. 2)Extract features which may meaningful in identifying the characteristics of nodes. For example, if all nodes have one feature, this feature is not meaningful. If only several nodes have one feature, this feature is not meaningful. 3) Extract balanced node label which means different classes should have almost the same number of node. For k classes, each class should select 500 nodes and label them, so we get 500*k labeled nodes.

Datasets

Preprocessed ACM can be found in: https://pan.baidu.com/s/1V2iOikRqHPtVvaANdkzROw 提取码:50k2

https://bupteducn-my.sharepoint.com/:u:/g/personal/jhy1993_bupt_edu_cn/EfLZcHE2e4xBplCVnzcJbQYBurNVOCk7ZIne2YsO3jKbSw?e=vMQ18v

Preprocessed DBLP can be found in: https://pan.baidu.com/s/1Qr2e97MofXsBhUvQqgJqDg 提取码:6b3h

https://bupteducn-my.sharepoint.com/:u:/g/personal/jhy1993_bupt_edu_cn/Ef6A6m2njZ5CqkTN8QcwU8QBuENpB7eDVJRnsV9cWXWmsA?e=wlErKk

Preprocessed IMDB can be found in: 链接:https://pan.baidu.com/s/199LoAr5WmL3wgx66j-qwaw 密码:qkec

Run

Download preprocessed data and modify data path in def load_data_dblp(path='/home/jhy/allGAT/acm_hetesim/ACM3025.mat'):

python ex_acm3025.py

HAN in DGL

https://github.com/dmlc/dgl/tree/master/examples/pytorch/han

han_baseline's People

Contributors

wangyc-99 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.