
mixhop-and-n-gcn's Introduction

MixHop and N-GCN


A PyTorch implementation of "MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing" (ICML 2019) and "A Higher-Order Graph Convolutional Layer" (NeurIPS 2018).


Abstract

Recent methods generalize convolutional layers from Euclidean domains to graph-structured data by approximating the eigenbasis of the graph Laplacian. The computationally-efficient and broadly-used Graph ConvNet of Kipf & Welling over-simplifies the approximation, effectively rendering graph convolution as a neighborhood-averaging operator. This simplification restricts the model from learning delta operators, the very premise of the graph Laplacian. In this work, we propose a new Graph Convolutional layer which mixes multiple powers of the adjacency matrix, allowing it to learn delta operators. Our layer exhibits the same memory footprint and computational complexity as a GCN. We illustrate the strength of our proposed layer on both synthetic graph datasets and several real-world citation graphs, setting the record state-of-the-art on Pubmed.

This repository provides a PyTorch implementation of MixHop and N-GCN as described in the papers:

MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing Sami Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Hrayr Harutyunyan, Nazanin Alipourfard, Kristina Lerman, Greg Ver Steeg, and Aram Galstyan. ICML, 2019. [Paper]

A Higher-Order Graph Convolutional Layer. Sami A Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Nazanin Alipourfard, Hrayr Harutyunyan. NeurIPS, 2018. [Paper]

The original TensorFlow implementation of MixHop is available [Here].

Requirements

The codebase is implemented in Python 3.5.2. The package versions used for development are listed below.

networkx          2.4
tqdm              4.28.1
numpy             1.15.4
pandas            0.23.4
texttable         1.5.0
scipy             1.1.0
argparse          1.1.0
torch             1.1.0
torch-sparse      0.3.0

Datasets

The code takes the **edge list** of the graph in a CSV file. Every row describes one edge as two node identifiers separated by a comma. The first row is a header. Nodes should be indexed starting with 0. A sample graph for `Cora` is included in the `input/` directory. In addition to the edge list, there is a JSON file with the sparse features and a CSV file with the target variable.
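As a rough illustration of the edge-list layout described above, the following sketch parses a small in-memory CSV with the standard library. The header names `id_1` and `id_2` and the helper `read_edges` are assumptions made for the example; only the layout (one header row, then comma-separated node pairs indexed from 0) comes from the description.

```python
import csv
import io

# Hypothetical sample; the header names are assumed, not taken from the repository.
SAMPLE_EDGES = "id_1,id_2\n0,1\n0,2\n1,2\n2,3\n"

def read_edges(handle):
    """Return a list of (source, target) integer pairs, skipping the header row."""
    reader = csv.reader(handle)
    next(reader)  # the first row is a header
    return [(int(row[0]), int(row[1])) for row in reader if row]

edges = read_edges(io.StringIO(SAMPLE_EDGES))
node_count = max(max(u, v) for u, v in edges) + 1  # valid because ids start at 0
print(edges, node_count)
```

Deriving the node count from the largest identifier only works when the identifiers are contiguous and zero-based, which is why the indexing requirement matters.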

The **feature matrix** is a sparse binary matrix stored as a JSON file. Node identifiers are the keys and lists of feature indices are the values: for each node, the ids of its non-zero feature columns are stored as elements of a list. The feature matrix is structured as:

{ 0: [0, 1, 38, 1968, 2000, 52727],
  1: [10000, 20, 3],
  2: [],
  ...
  n: [2018, 10000]}
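To make the layout concrete, here is a minimal sketch that turns such a mapping into coordinate-format (COO) row, column, and value lists. The sample string and the helper `features_to_coo` are hypothetical; how the repository itself builds its sparse matrix may differ.

```python
import json

# Hypothetical sample in the layout shown above; JSON keys are always strings.
SAMPLE_FEATURES = '{"0": [0, 1, 38], "1": [20, 3], "2": []}'

def features_to_coo(raw):
    """Convert {node: [feature ids]} into COO-style row, column and value lists."""
    rows, cols = [], []
    for node, feature_ids in raw.items():
        for feature_id in feature_ids:
            rows.append(int(node))
            cols.append(int(feature_id))
    values = [1.0] * len(rows)  # binary matrix: every stored entry is 1
    return rows, cols, values

rows, cols, values = features_to_coo(json.loads(SAMPLE_FEATURES))
print(rows, cols, values)
```

Note that a node with an empty feature list, such as node 2 here, contributes no entries at all, so the number of rows cannot be inferred from the stored coordinates alone.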

The **target vector** is a CSV file with a header row and two columns: the first contains the node identifiers and the second the targets. The file is sorted by node identifier, and the target column contains the class memberships indexed from zero.

NODE ID Target
0 3
1 1
2 0
3 1
... ...
n 3
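A short sketch of reading such a file with the standard library follows. The column names in the sample and the helper `read_targets` are assumptions for illustration; the sketch reads columns by position so it does not depend on them.

```python
import csv
import io

# Hypothetical sample; the column names are assumed for illustration.
SAMPLE_TARGET = "id,target\n0,3\n1,1\n2,0\n3,1\n"

def read_targets(handle):
    """Return class labels ordered by node identifier."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    pairs = sorted((int(node), int(label)) for node, label in reader)
    return [label for _, label in pairs]

targets = read_targets(io.StringIO(SAMPLE_TARGET))
class_count = max(targets) + 1  # classes are indexed from zero
print(targets, class_count)
```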

Options

Training an N-GCN/MixHop model is handled by the `src/main.py` script which provides the following command line arguments.

Input and output options

  --edge-path       STR    Edge list csv.         Default is `input/cora_edges.csv`.
  --features-path   STR    Features json.         Default is `input/cora_features.json`.
  --target-path     STR    Target classes csv.    Default is `input/cora_target.csv`.

Model options

  --model             STR     Model variant.                 Default is `mixhop`.               
  --seed              INT     Random seed.                   Default is 42.
  --epochs            INT     Number of training epochs.     Default is 2000.
  --early-stopping    INT     Early stopping rounds.         Default is 10.
  --training-size     INT     Training set size.             Default is 1500.
  --validation-size   INT     Validation set size.           Default is 500.
  --learning-rate     FLOAT   Adam learning rate.            Default is 0.01.
  --dropout           FLOAT   Dropout rate value.            Default is 0.5.
  --lambd             FLOAT   Regularization coefficient.    Default is 0.0005.
  --layers-1          LST     Layer sizes (upstream).        Default is [200, 200, 200]. 
  --layers-2          LST     Layer sizes (bottom).          Default is [200, 200, 200].
  --cut-off           FLOAT   Norm cut-off for pruning.      Default is 0.1.
  --budget            INT     Architecture neuron budget.    Default is 60.
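For readers unfamiliar with how such flags map to values, the sketch below declares a few of the options listed above with `argparse`, using the stated defaults. It is an illustration only: the function `build_parser` and the choice of `nargs="+"` for the list options are assumptions, and the parser in `src/` may be written differently.

```python
import argparse

def build_parser():
    """Declare a handful of the options listed above with their stated defaults."""
    parser = argparse.ArgumentParser(description="Sketch of a MixHop/N-GCN CLI.")
    parser.add_argument("--edge-path", default="input/cora_edges.csv")
    parser.add_argument("--model", default="mixhop")
    parser.add_argument("--epochs", type=int, default=2000)
    parser.add_argument("--learning-rate", type=float, default=0.01)
    parser.add_argument("--layers-1", nargs="+", type=int, default=[200, 200, 200])
    parser.add_argument("--layers-2", nargs="+", type=int, default=[200, 200, 200])
    return parser

args = build_parser().parse_args(["--epochs", "100", "--layers-1", "64", "64"])
print(args.epochs, args.layers_1, args.layers_2, args.learning_rate)
```

Under a declaration like this one, `argparse` would reject a bare `--layers` as an ambiguous prefix of `--layers-1` and `--layers-2`, so the full flag names need to be spelled out.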

Examples

The following commands learn a neural network and score on the test set. Training a model on the default dataset.

$ python src/main.py

Training a MixHop model for 100 epochs.

$ python src/main.py --epochs 100

Increasing the learning rate and the dropout.

$ python src/main.py --learning-rate 0.1 --dropout 0.9

Training a model with diffusion order 2:

$ python src/main.py --layers-1 64 64 --layers-2 64 64

Training an N-GCN model:

$ python src/main.py --model ngcn

License


mixhop-and-n-gcn's People

Contributors

benedekrozemberczki


mixhop-and-n-gcn's Issues

About "torch_scatter"

I cannot successfully install "torch_scatter".
When I run `pip3 install torch_scatter`, an error always occurs, as shown below. I tried to solve the problem but could not find a fix. Could you help me? Thanks a lot!

The error:
...
cpu/scatter.cpp:1:29: fatal error: torch/extension.h: No such file or directory
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1


Failed building wheel for torch-scatter
Running setup.py clean for torch-scatter
Failed to build torch-scatter
Installing collected packages: torch-scatter
Running setup.py install for torch-scatter ... error
Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-ijd9s63n/torch-scatter/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-4tkx7v_s-record/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.5
creating build/lib.linux-x86_64-3.5/torch_scatter
copying torch_scatter/mul.py -> build/lib.linux-x86_64-3.5/torch_scatter
copying torch_scatter/mean.py -> build/lib.linux-x86_64-3.5/torch_scatter
copying torch_scatter/sub.py -> build/lib.linux-x86_64-3.5/torch_scatter
copying torch_scatter/min.py -> build/lib.linux-x86_64-3.5/torch_scatter
copying torch_scatter/std.py -> build/lib.linux-x86_64-3.5/torch_scatter
copying torch_scatter/__init__.py -> build/lib.linux-x86_64-3.5/torch_scatter
copying torch_scatter/max.py -> build/lib.linux-x86_64-3.5/torch_scatter
copying torch_scatter/div.py -> build/lib.linux-x86_64-3.5/torch_scatter
copying torch_scatter/add.py -> build/lib.linux-x86_64-3.5/torch_scatter
creating build/lib.linux-x86_64-3.5/test
copying test/test_multi_gpu.py -> build/lib.linux-x86_64-3.5/test
copying test/utils.py -> build/lib.linux-x86_64-3.5/test
copying test/test_std.py -> build/lib.linux-x86_64-3.5/test
copying test/__init__.py -> build/lib.linux-x86_64-3.5/test
copying test/test_forward.py -> build/lib.linux-x86_64-3.5/test
copying test/test_backward.py -> build/lib.linux-x86_64-3.5/test
creating build/lib.linux-x86_64-3.5/torch_scatter/utils
copying torch_scatter/utils/ext.py -> build/lib.linux-x86_64-3.5/torch_scatter/utils
copying torch_scatter/utils/__init__.py -> build/lib.linux-x86_64-3.5/torch_scatter/utils
copying torch_scatter/utils/gen.py -> build/lib.linux-x86_64-3.5/torch_scatter/utils
running build_ext
building 'torch_scatter.scatter_cpu' extension
creating build/temp.linux-x86_64-3.5
creating build/temp.linux-x86_64-3.5/cpu
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/zgh/.local/lib/python3.5/site-packages/torch/lib/include -I/home/zgh/.local/lib/python3.5/site-packages/torch/lib/include/TH -I/home/zgh/.local/lib/python3.5/site-packages/torch/lib/include/THC -I/usr/include/python3.5m -c cpu/scatter.cpp -o build/temp.linux-x86_64-3.5/cpu/scatter.o -Wno-unused-variable -DTORCH_EXTENSION_NAME=scatter_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
cpu/scatter.cpp:1:29: fatal error: torch/extension.h: No such file or directory
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

----------------------------------------

Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-ijd9s63n/torch-scatter/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-4tkx7v_s-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-ijd9s63n/torch-scatter/
You are using pip version 8.1.1, however version 19.1.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Higher Powers Implementation

image

for iteration in range(self.iterations-1):
    base_features = spmm(normalized_adjacency_matrix["indices"], normalized_adjacency_matrix["values"], base_features.shape[0], base_features)
return base_features

To me it looks like this is implementing
H(l+1) = σ(A^j * H(l) * W(l))

image

Can you explain where W_j and the concatenation are taking place?
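For context, the MixHop paper defines a layer as the column-wise concatenation, over a set of adjacency powers j, of σ(Â^j H W_j). The loop quoted above covers only the repeated multiplication by the adjacency matrix for a single power. The sketch below is one way to read the full layer, written with plain Python lists so it runs without dependencies. It is not the repository's code: `matmul`, `propagate`, and `mixhop_layer` are hypothetical names, and the toy matrices are made up.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Apply max(0, v) element-wise."""
    return [[max(0.0, v) for v in row] for row in m]

def propagate(adjacency, features, weight, power):
    """Compute relu(A^power (H W)): transform once, then apply A `power` times."""
    hidden = matmul(features, weight)
    for _ in range(power):
        hidden = matmul(adjacency, hidden)
    return relu(hidden)

def mixhop_layer(adjacency, features, weights):
    """Concatenate column-wise the outputs for powers 0..len(weights)-1, one W_j each."""
    blocks = [propagate(adjacency, features, w, j) for j, w in enumerate(weights)]
    return [sum((block[i] for block in blocks), []) for i in range(len(features))]

A = [[0.5, 0.5], [0.5, 0.5]]          # toy normalized adjacency
H = [[1.0, 0.0], [0.0, 1.0]]          # toy input features
W = [[[1.0], [2.0]], [[2.0], [4.0]]]  # W_0 and W_1, each 2x1
print(mixhop_layer(A, H, W))
```

In a design along these lines, each power would own its weight matrix W_j, and the concatenation would happen one level up, where the per-power outputs are combined.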

FileNotFoundError: [Errno 2] No such file or directory: './input/cora_edges.csv'

Hello,
When I run src/main.py, the following error message appears:
File "pandas\_libs\parsers.pyx", line 361, in pandas._libs.parsers.TextReader.__cinit__
File "pandas\_libs\parsers.pyx", line 653, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: './input/cora_edges.csv'

Do you know how to solve it?
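One possible cause, offered as a guess rather than a diagnosis: the path in the error is relative, so it resolves against the current working directory, and launching the script from inside `src/` would look for the file under `src/input/`. The sketch below shows the difference with `pathlib`; the directories `/repo` and `/repo/src` and the helper `resolve` are hypothetical.

```python
from pathlib import PurePosixPath

def resolve(working_dir, relative_path):
    """Show where a relative path points for a given working directory."""
    return PurePosixPath(working_dir) / relative_path

default_path = "input/cora_edges.csv"
print(resolve("/repo", default_path))      # launched from the repository root
print(resolve("/repo/src", default_path))  # launched from inside src/
```

If that is the cause, running `python src/main.py` from the repository root, or passing `--edge-path`, `--features-path`, and `--target-path` explicitly, should avoid it.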

IndexError:

Hello, when I run main.py, the following error occurs:

File "D:\anaconda3.4\lib\site-packages\torch_sparse\spmm.py", line 30, in spmm
out = matrix[col]
IndexError: index 10241 is out of bounds for dimension 0 with size 10241

The content of spmm.py:
`# import torch
from torch_scatter import scatter_add

def spmm(index, value, m, n, matrix):
    """Matrix product of sparse matrix with dense matrix.

    Args:
        index (:class:`LongTensor`): The index tensor of sparse matrix.
        value (:class:`Tensor`): The value tensor of sparse matrix.
        m (int): The first dimension of corresponding dense matrix.
        n (int): The second dimension of corresponding dense matrix.
        matrix (:class:`Tensor`): The dense matrix.

    :rtype: :class:`Tensor`
    """

    assert n == matrix.size(0)

    row, col = index

    matrix = matrix if matrix.dim() > 1 else matrix.unsqueeze(-1)

    out = matrix[col]
    out = out * value.unsqueeze(-1)
    out = scatter_add(out, row, dim=0, dim_size=m)

    return out
`
By the way, I am using my own dataset, which has 10242 nodes. Do you know how to solve it?
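Reading the message literally, index 10241 was requested from a matrix with 10241 rows, so the dense matrix appears to have one row fewer than the graph has nodes. One possible explanation, stated here as an assumption, is that the edge list refers to a node for which the feature matrix has no row (for example, because identifiers are not zero-based and contiguous, or the highest-numbered node has no stored features). The sketch below checks for that mismatch on toy data; `find_out_of_range_nodes` is a hypothetical helper, not part of this repository.

```python
def find_out_of_range_nodes(edges, feature_row_count):
    """Return node ids used in the edge list that have no row in the feature matrix."""
    used = {node for edge in edges for node in edge}
    return sorted(node for node in used if node < 0 or node >= feature_row_count)

# Toy data: four nodes (0..3), but the feature matrix has only three rows.
edges = [(0, 1), (1, 2), (2, 3)]
print(find_out_of_range_nodes(edges, feature_row_count=3))
```

An empty result would mean every node in the edge list has a matching feature row, which would rule this explanation out.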

Citeseer and Pubmed Datasets

Hi Benedek,

Thank you so much for the code. I want to run it on the Citeseer and Pubmed datasets. Would you mind providing the Citeseer and Pubmed data in this format? By the way, after running the MixHop model with the default parameters, I got a test accuracy of 0.7867. Does the accuracy depend on the system the code runs on?

Thanks in advance

running error

Hello, when I run main.py, the following error occurs:
File "/home/tj/anaconda3/lib/python3.6/site-packages/torch_sparse/__init__.py", line 22, in <module>
raise OSError(e)
OSError: libcusparse.so.10.0: cannot open shared object file: No such file or directory

My setup: Python 3.6, CUDA 10.0, torch 1.0.1, torch-sparse 0.5.1.
Do you know how to solve it?

some problem about codes

When I run the code, the following error occurs:
MixHop-and-N-GCN-master\src\utils.py", line 45, in feature_reader
out_features["indices"] = torch.LongTensor(np.concatenate([features.row.reshape(-1,1), features.col.reshape(-1,1)],axis=1).T)
TypeError: can't convert np.ndarray of type numpy.int32. The only supported types are: float64, float32, float16, int64, int32, int16, int8, and uint8.
I searched the Internet and found suggestions that this is caused by a list of lists that are not all the same length. I am stuck and do not know how to fix it. Looking forward to your help. Thanks!
