
HINormer: Representation Learning On Heterogeneous Information Networks with Graph Transformer

We provide the implementation of HINormer, based on the official PyTorch implementation of HGB (https://github.com/THUDM/HGB).

1. Descriptions

The repository is organised as follows:

  • dataset/: the original data of the four benchmark datasets.
  • run.py: multi-class node classification with HINormer.
  • run_multi.py: multi-label node classification of HINormer on IMDB.
  • model.py: implementation of HINormer.
  • utils/: contains tool functions.
  • HGB-output/: contains test files on HGB.

2. Requirements

  • Python==3.9.0
  • Pytorch==1.12.0
  • Networkx==2.8.4
  • numpy==1.22.3
  • dgl==0.9.0
  • scikit-learn==1.1.1
  • scipy==1.7.3
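Assuming pip and conda are available, the pinned versions above can be installed into a fresh environment roughly as follows (the CUDA-specific dgl package name is an assumption; adjust it to your CUDA version):

```shell
# Create an isolated Python 3.9 environment (venv works equally well)
conda create -n hinormer python=3.9.0 -y
conda activate hinormer

# Install the pinned dependencies from the list above
pip install torch==1.12.0 networkx==2.8.4 numpy==1.22.3 \
    scikit-learn==1.1.1 scipy==1.7.3
pip install dgl==0.9.0   # for a CUDA build use the matching wheel, e.g. dgl-cu102
```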

3. Running experiments

We train our model on an NVIDIA TITAN Xp GPU with CUDA 10.2.

For node classification with offline evaluation:

  • python run.py --dataset DBLP --len-seq 50 --dropout 0.5 --beta 0.1 --temperature 2
  • python run_multi.py --dataset IMDB --len-seq 20 --beta 0.1 --temperature 0.1
  • python run.py --dataset Freebase --num-gnns 3 --len-seq 30 --num-layers 3 --dropout 0 --beta 0.5 --temperature 0.2
  • python run.py --dataset AMiner --len-seq 80 --num-gnns 3 --num-layers 4 --temperature 0.5

For node classification with online evaluation on HGB:

  • python run.py --dataset DBLP-HGB --len-seq 50 --num-heads 2 --dropout 0.5 --beta 0.1 --temperature 0.1 --mode 1
  • python run_multi.py --dataset IMDB-HGB --len-seq 150 --beta 0.5 --temperature 1 --mode 1

We provide our test files on DBLP-HGB and IMDB-HGB in 'HGB-output/'.

To reproduce the results in the paper, or to apply HINormer to other datasets, tune the key hyper-parameters 'num-gnns', 'num-layers', 'len-seq', 'dropout', 'temperature' and 'beta' for your experimental environment.
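One common way to tune these is a simple grid sweep. The sketch below only enumerates the grid and builds the corresponding run.py flag lists; the value ranges are illustrative assumptions, and actually launching each run (e.g. via subprocess) and scoring it on a validation split is left to the reader:

```python
import itertools

# Hypothetical grid over the key hyper-parameters named above;
# the candidate values are examples, not recommendations.
grid = {
    "num-gnns": [2, 3],
    "num-layers": [2, 3, 4],
    "len-seq": [30, 50, 80],
    "dropout": [0.0, 0.5],
    "temperature": [0.1, 0.2, 0.5],
    "beta": [0.1, 0.5],
}

def as_args(combo):
    """Turn one grid point into run.py command-line flags."""
    return [f"--{k}={v}" for k, v in zip(grid, combo)]

# Enumerate every combination; in practice you would run
# `python run.py --dataset ...` once per argument list and
# keep the setting with the best validation score.
combos = list(itertools.product(*grid.values()))
print(len(combos))        # 2*3*3*2*3*2 = 216 settings
print(as_args(combos[0]))
```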

4. Citation

hinormer's Issues

I cannot reproduce the results of Freebase and AMiner

Hi, Ffffffffire
I am trying to reproduce the results of HINormer on Freebase and AMiner using your code and scripts, but there is a huge gap between my results and those published in your paper. Here are the logs:

python run.py --dataset AMiner --num-gnns 3 --len-seq 100 --beta 0.1 --temperature 0.1
repeat: 1
{'micro-f1': 0.8541893362350381, 'macro-f1': 0.7472750110308501}
repeat: 2
{'micro-f1': 0.8524483133841132, 'macro-f1': 0.7513401519532572}
repeat: 3
{'micro-f1': 0.8559303590859632, 'macro-f1': 0.7567903249575831}
repeat: 4
{'micro-f1': 0.8550598476605006, 'macro-f1': 0.7548341771060799}
repeat: 5
{'micro-f1': 0.8596300326441784, 'macro-f1': 0.7515768889725416}
Micro-f1: 0.8555, std: 0.0027
Macro-f1: 0.7524, std: 0.0036
python run.py --dataset Freebase --num-gnns 3 --len-seq 30 --num-layers 3 --dropout 0 --beta 1 --temperature 0.2
repeat: 1
{'micro-f1': 0.6764826175869121, 'macro-f1': 0.6157334967628284}
repeat: 2
{'micro-f1': 0.6666666666666666, 'macro-f1': 0.62344618380074}
repeat: 3
{'micro-f1': 0.6785276073619632, 'macro-f1': 0.5977709087708544}
repeat: 4
{'micro-f1': 0.6768916155419223, 'macro-f1': 0.6315030856090981}
repeat: 5
{'micro-f1': 0.665439672801636, 'macro-f1': 0.6003430074946984}
Micro-f1: 0.6728, std: 0.0062
Macro-f1: 0.6138, std: 0.0146

How can I reproduce the results shown in your paper?

The results are unstable, and it is hard to make them stable.

Hi,
I found that there is no seed in your original code, so I added one as follows:

import random
import numpy as np
import torch

def set_seed(seed):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.enabled = False
        torch.backends.cudnn.benchmark = False

ap.add_argument('--seed', type=int, default=10)  # registered before parsing args
set_seed(args.seed)

Unfortunately, I still got different results in every repeat when I repeated 5 times with this seed. In my opinion, I should get the same result in all 5 repeats if the seed is set correctly.

So I have no idea how to reproduce the results stably. Could you give me some suggestions? Or how did you ensure stable results without setting a seed?
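For what it's worth, the seeding logic can be checked in isolation. The sketch below uses only numpy and random (the torch and cudnn lines from the snippet above would go in the same helper) and illustrates a likely explanation: seeding once at startup makes the whole run reproducible end to end, but each of the 5 repeats continues the same random stream, so the repeats still differ from one another unless the seed is re-set before each repeat:

```python
import random
import numpy as np

def set_seed(seed):
    # Same idea as the helper in the issue; torch.manual_seed and
    # the cudnn flags would be added here for a real training run.
    random.seed(seed)
    np.random.seed(seed)

# Re-seeding before each repeat reproduces identical draws ...
set_seed(10)
a = np.random.rand(3)
set_seed(10)
b = np.random.rand(3)
print(np.allclose(a, b))   # True

# ... but seeding once and then drawing repeatedly does not:
# each repeat consumes the next part of the stream, so per-repeat
# results differ even though the full run is reproducible.
set_seed(10)
c = np.random.rand(3)
d = np.random.rand(3)
print(np.allclose(c, d))   # False
```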

Experiment on vanilla Transformer

Hello, authors.
I am curious about the experiment with a vanilla Transformer for HGNN. Could you share the training details, or some part of the code? Thanks a lot 😆

About Freebase dataset

Hi,
I tried to reproduce the result on the Freebase dataset and noticed that the Freebase dataset used in your paper seems to differ from the one in the original HGB paper. The Freebase dataset in your paper is smaller than HGB's, and I think the data distribution is also different.

But with your Freebase setting, I cannot reproduce the reported result. Could you give me more details about the hyper-parameters for the Freebase dataset?

Thanks!
