rampasek / graphgps


Recipe for a General, Powerful, Scalable Graph Transformer

License: MIT License

Python 98.24% Shell 1.76%

graph-representation-learning long-range-dependence graph-neural-network graph-transformer

graphgps's Issues

Results on PATTERN

Hi,

Thank you for your impressive work. I tried to reproduce the results of SAN and GatedGCN on PATTERN using the GraphGPS framework. The results are all around 89.9 test accuracy, which is significantly higher than the results reported in the paper (~85.6 test accuracy).

I understand that you report the results from Benchmarking GNNs. May I ask whether you encountered the same issue during your experiments?

The config file I use is in config.zip

Thank you!

Yiming

How to apply GraphGPS to node classification

How can GraphGPS be applied to node classification? Is there an example?

About Multiple GPUs Running GraphGPS in Parallel

Hello GraphGPS team, have you tried running GraphGPS on multiple GPUs? If so, how exactly did you set it up? Thank you.

Questions about the output channel

Hi, I notice that PyG also has an implementation of the layer:
https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.GPSConv.html#torch_geometric.nn.conv.GPSConv
I wonder why it does not have an output channel size. I think the paper mentions that the model updates the dimensions of the embeddings.

Thanks a lot.

Implementation on custom dataset

Thank you for the good architecture. I have not been able to apply it to our own dataset; is there a way to do that?

pretrained models

Hi @rampasek ,
Do you have any plan to release your checkpoints for the PCQM4M dataset?

Is there a vanilla implementation without graphgym

The use of GraphGym makes the code very difficult to understand and change. Is there a vanilla implementation that does not use GraphGym?

Where is the "batch.node_label_index" property set

When I set cfg.dataset.task=node, cfg.model.type=gnn, cfg.gnn.stage_type=stack, execution reaches self.post_mp = GNNHead(dim_in=d_in, dim_out=dim_out) in gnn, which uses:

class GNNNodeHead(nn.Module):
    '''Head of GNN, node prediction'''
    def __init__(self, dim_in, dim_out):
        super(GNNNodeHead, self).__init__()
        self.layer_post_mp = MLP(dim_in,
                                 dim_out,
                                 num_layers=cfg.gnn.layers_post_mp,
                                 bias=True)

    def _apply_index(self, batch):
        if batch.node_label_index.shape[0] == batch.node_label.shape[0]:
            return batch.node_feature[batch.node_label_index], batch.node_label
        else:
            return batch.node_feature[batch.node_label_index], \
                   batch.node_label[batch.node_label_index]

    def forward(self, batch):
        batch = self.layer_post_mp(batch)
        pred, label = self._apply_index(batch)
        return pred, label

I want to know where the "batch.node_label_index" and "batch.node_label" properties are set.

Why can't I import create_optimizer, create_scheduler, OptimizerConfig, SchedulerConfig from torch_geometric.graphgym.optimizer?

I want to run your code on Windows with PyCharm. (On Linux it works without any problem.)
I installed torch_geometric on the Windows PC.

I can successfully import these packages from a console (torch 1.10.0):
torch
torch_geometric 2.0.4
torch_scatter 2.0.9
torch_sparse 0.6.13
torch_spline_conv 1.2.1
torch_cluster 1.6.0

But this line
from torch_geometric.graphgym.optimizer import create_optimizer, \
    create_scheduler, OptimizerConfig, SchedulerConfig
fails. PyCharm marks .graphgym.optimizer as an unrecognized module, and the import also fails from the console:

    module = self._system_import(name, *args, **kwargs)
ModuleNotFoundError: No module named 'torch_geometric.graphgym.optimizer'

Why can Linux import it while my Windows machine cannot?

I tried replacing
from torch_geometric.graphgym.optimizer import ...
with
from torch_geometric.graphgym.optim import ...

Then, when execution reaches

optimizer = create_optimizer(model.parameters(),
                             new_optimizer_config(cfg))

it raises:

    raise ValueError(f"'cfg.{arg_name}' undefined")
ValueError: 'cfg.optimizer_config' undefined

How can I solve this problem? Thank you very much.
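
Since the same import works on one machine and fails on the other, a reasonable first check is whether both environments have the same package versions installed. The sketch below uses only the standard library; `report_versions` is a hypothetical helper, not part of GraphGPS or PyG, and the package list is just an example.

```python
from importlib import metadata


def report_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Example package list; adjust it to what the README of your checkout pins.
    wanted = ["torch", "torch_geometric", "torch_scatter", "torch_sparse", "yacs"]
    for name, version in report_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this on both the Linux and the Windows machine and comparing the output shows whether the two installations actually match.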

PCQM4Mv2 code instruction missing

question about computational complexity in global-attn

The paper claims that global attention has linear complexity in the number of nodes, but the code (https://github.com/rampasek/GraphGPS/blob/main/graphgps/layer/gps_layer.py#L201) appears to have quadratic complexity. Is it linear or quadratic?

Cannot launch experiments for some datasets (PATTERN/CLUSTER/Peptides-func/Peptides-struct)

Hi, thanks for your nice work.

I tried running GraphGPS on all datasets, but I ran into the following errors for the datasets below:

PATTERN (from configs/GPS/pattern-GPS.yaml) / CLUSTER (from configs/GPS/cluster-GPS.yaml)

Traceback (most recent call last):
  File "/usr2/wjeon/scratch/GraphGPS/main.py", line 143, in <module>
    loaders = create_loader()
  File "/opt/conda/lib/python3.10/site-packages/torch_geometric/graphgym/loader.py", line 311, in create_loader
    dataset = create_dataset()
  File "/opt/conda/lib/python3.10/site-packages/torch_geometric/graphgym/loader.py", line 242, in create_dataset
    dataset = load_dataset()
  File "/opt/conda/lib/python3.10/site-packages/torch_geometric/graphgym/loader.py", line 187, in load_dataset
    dataset = func(format, name, dataset_dir)
  File "/local/mnt/workspace/scratch/wjeon/GraphGPS/graphgps/loader/master_loader.py", line 110, in load_dataset_master
    dataset = preformat_GNNBenchmarkDataset(dataset_dir, name)
  File "/local/mnt/workspace/scratch/wjeon/GraphGPS/graphgps/loader/master_loader.py", line 287, in preformat_GNNBenchmarkDataset
    return dataset
UnboundLocalError: local variable 'dataset' referenced before assignment
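
`UnboundLocalError` is the generic Python error for using a local variable that no executed branch has assigned. The sketch below reproduces that pattern with a hypothetical loader (`load_by_name` and its branches are illustrative only, not the actual `preformat_GNNBenchmarkDataset` code) and shows a variant that fails early with an explicit message. Which branch is skipped in the real function is not established in this thread.

```python
def load_by_name(name):
    """Hypothetical loader: only some names lead to an assignment."""
    if name in ("MNIST", "CIFAR10"):
        dataset = f"loaded {name}"
    # No branch handles other names, so `dataset` may never be bound.
    return dataset


def load_by_name_checked(name):
    """Same loader, but unsupported names raise a descriptive error."""
    if name in ("MNIST", "CIFAR10"):
        return f"loaded {name}"
    raise ValueError(f"Loading dataset '{name}' is not supported by this loader.")


try:
    load_by_name("PATTERN")
except UnboundLocalError as err:
    print("reproduced:", err)
```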

peptides-func (from configs/GPS/peptides-func) / peptides-struct (from configs/GPS/peptides-struct)

Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/opt/conda/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/opt/conda/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/opt/conda/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/opt/conda/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/opt/conda/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/opt/conda/lib/python3.10/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/opt/conda/lib/python3.10/ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "/opt/conda/lib/python3.10/ssl.py", line 1071, in _create
    self.do_handshake()
  File "/opt/conda/lib/python3.10/ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr2/wjeon/scratch/GraphGPS/main.py", line 142, in <module>
    loaders = create_loader()
  File "/opt/conda/lib/python3.10/site-packages/torch_geometric/graphgym/loader.py", line 312, in create_loader
    dataset = create_dataset()
  File "/opt/conda/lib/python3.10/site-packages/torch_geometric/graphgym/loader.py", line 243, in create_dataset
    dataset = load_dataset()
  File "/opt/conda/lib/python3.10/site-packages/torch_geometric/graphgym/loader.py", line 188, in load_dataset
    dataset = func(format, name, dataset_dir)
  File "/local/mnt/workspace/scratch/wjeon/GraphGPS/graphgps/loader/master_loader.py", line 160, in load_dataset_master
    dataset = preformat_Peptides(dataset_dir, name)
  File "/local/mnt/workspace/scratch/wjeon/GraphGPS/graphgps/loader/master_loader.py", line 522, in preformat_Peptides
    dataset = PeptidesFunctionalDataset(dataset_dir)
  File "/local/mnt/workspace/scratch/wjeon/GraphGPS/graphgps/loader/dataset/peptides_functional.py", line 56, in __init__
    super().__init__(self.folder, transform, pre_transform)
  File "/opt/conda/lib/python3.10/site-packages/torch_geometric/data/in_memory_dataset.py", line 55, in __init__
    super().__init__(root, transform, pre_transform, pre_filter, log)
  File "/opt/conda/lib/python3.10/site-packages/torch_geometric/data/dataset.py", line 91, in __init__
    self._download()
  File "/opt/conda/lib/python3.10/site-packages/torch_geometric/data/dataset.py", line 185, in _download
    self.download()
  File "/local/mnt/workspace/scratch/wjeon/GraphGPS/graphgps/loader/dataset/peptides_functional.py", line 75, in download
    if decide_download(self.url):
  File "/opt/conda/lib/python3.10/site-packages/ogb/utils/url.py", line 12, in decide_download
    d = ur.urlopen(url)
  File "/opt/conda/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/conda/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/opt/conda/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/opt/conda/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/opt/conda/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/opt/conda/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)>
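
The message "self signed certificate in certificate chain" typically means that HTTPS traffic passes through a proxy whose root certificate Python does not trust. A minimal sketch, assuming that root certificate is available as a PEM file: `build_ssl_context` and `fetch` are hypothetical helpers, not part of GraphGPS or OGB, and certificate verification stays enabled.

```python
import ssl
import urllib.request


def build_ssl_context(extra_ca_file=None):
    """Default verifying context, optionally trusting one extra CA bundle."""
    context = ssl.create_default_context()
    if extra_ca_file is not None:
        # Adds the PEM certificates in the file to the trusted set.
        context.load_verify_locations(cafile=extra_ca_file)
    return context


def fetch(url, extra_ca_file=None, timeout=30):
    """Download `url` with certificate verification still enabled."""
    context = build_ssl_context(extra_ca_file)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

PyG datasets generally skip `download()` when the expected raw files already exist, so placing a manually downloaded file in the dataset's raw directory may be an alternative to patching the download code.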

May I ask how we can launch the code for those datasets?

Thanks!

Graph generation

I am attempting to use this to analyze chess games represented as graphs. Is it possible to modify this model for graph generation?

Question about maxfreqs

Dear contributors,
I wonder why you use max_freqs = 1 in e.g. https://github.com/rampasek/GraphGPS/blob/main/configs/GPS/zinc-GPS-LapPE%2BRWSE.yaml#L26
The eigenvector corresponding to the smallest eigenvalue is the all-ones vector, so the nodes are indistinguishable when only this one eigenvector is used.
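
The claim about the smallest eigenvalue can be checked directly for the unnormalized Laplacian L = D - A: every row of L sums to zero, so the constant vector is an eigenvector with eigenvalue 0. The sketch below verifies this on a small path graph in plain Python; the helper names and the toy graph are illustrative only.

```python
def laplacian(num_nodes, edges):
    """Dense unnormalized Laplacian L = D - A of an undirected graph."""
    lap = [[0.0] * num_nodes for _ in range(num_nodes)]
    for u, v in edges:
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0
        lap[u][u] += 1.0
        lap[v][v] += 1.0
    return lap


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Path graph 0 - 1 - 2 - 3 (connected, not regular).
lap = laplacian(4, [(0, 1), (1, 2), (2, 3)])
ones = [1.0, 1.0, 1.0, 1.0]

# L @ 1 = 0: the constant vector has eigenvalue 0 and gives every node the same value.
print(matvec(lap, ones))  # [0.0, 0.0, 0.0, 0.0]
```

For the symmetrically normalized Laplacian, the zero-eigenvalue eigenvector is proportional to the square root of the node degrees rather than constant, so how uninformative the first component is depends on which Laplacian normalization the config selects.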

Horrible and Unexpected Statistics While Training

Hello, your work is very good, and I learned a lot from your paper and your code.

But when I reproduce your experiment using python main.py --cfg configs/GPS/ogbg-ppa-GPS.yaml wandb.use False as instructed, the training statistics are very strange, or rather, very low for many epochs.

I checked the closed issues in this repository and found one similar to mine (this one), where the dataset in question was code2; here the issue is with ppa.

As for me, dependencies are as follows:

Here is part of my training process:

It stayed very low, below 0.1117, for a long time (about 160 epochs), but abruptly jumped to 0.78 at epoch 165 (which is no longer abnormal).

I don't think package versions should matter. Sorry to bother you, but could you tell me why this happens, or reproduce the issue and figure out the cause? I've tried but still cannot find what is wrong.

error when running the code

First of all, thank you for this wonderful code.
I get this error when running it:
(graphgps) abdel@abdel-Latitude-5300-2-in-1:~/Downloads/GraphGPS-main$ python main.py --cfg configs/GPS/zinc-GPS+RWSE.yaml wandb.use False
[*] Run ID 0: seed=0, split_index=0
Starting now: 2022-12-27 13:48:05.514160
[*] Loaded dataset 'Cora' from 'PyG':
Data(x=[2708, 1433], edge_index=[2, 10556], y=[2708], train_mask=[2708], val_mask=[2708], test_mask=[2708])
undirected: True
num graphs: 1
avg num_nodes/graph: 2708
num node features: 1433
num edge features: 0
num classes: 7
Traceback (most recent call last):
  File "/home/abdel/Downloads/GraphGPS-main/main.py", line 139, in <module>
    loaders = create_loader()
  File "/home/abdel/mambaforge/envs/graphgps/lib/python3.9/site-packages/torch_geometric/graphgym/loader.py", line 295, in create_loader
    dataset = create_dataset()
  File "/home/abdel/mambaforge/envs/graphgps/lib/python3.9/site-packages/torch_geometric/graphgym/loader.py", line 226, in create_dataset
    dataset = load_dataset()
  File "/home/abdel/mambaforge/envs/graphgps/lib/python3.9/site-packages/torch_geometric/graphgym/loader.py", line 171, in load_dataset
    dataset = func(format, name, dataset_dir)
  File "/home/abdel/Downloads/GraphGPS-main/graphgps/loader/master_loader.py", line 212, in load_dataset_master
    prepare_splits(dataset)
  File "/home/abdel/Downloads/GraphGPS-main/graphgps/loader/split_generator.py", line 21, in prepare_splits
    setup_random_split(dataset)
  File "/home/abdel/Downloads/GraphGPS-main/graphgps/loader/split_generator.py", line 118, in setup_random_split
    set_dataset_splits(dataset, [train_index, val_index, test_index])
  File "/home/abdel/Downloads/GraphGPS-main/graphgps/loader/split_generator.py", line 145, in set_dataset_splits
    mask = index2mask(split_index, size=dataset.data.y.shape[0])
  File "/home/abdel/mambaforge/envs/graphgps/lib/python3.9/site-packages/torch_geometric/utils/mask.py", line 15, in index_to_mask
    index = index.view(-1)
TypeError: Cannot interpret '-1' as a data type
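
The final `TypeError` matches what NumPy raises when `.view(-1)` is called on an `ndarray`: NumPy reads the argument as a dtype, whereas `torch.Tensor.view(-1)` flattens the tensor. That suggests, although the thread does not confirm it, that the split indices reach `index_to_mask` as NumPy arrays. A small sketch of the difference, with a hypothetical helper `to_index_tensor` that converts before the call:

```python
import numpy as np
import torch


def to_index_tensor(index):
    """Convert a NumPy array or sequence to a 1-D int64 tensor of indices."""
    return torch.as_tensor(index, dtype=torch.long).view(-1)


split_index = np.array([0, 2, 3])

try:
    split_index.view(-1)  # NumPy interprets -1 as a dtype here
except TypeError as err:
    print("NumPy:", err)

print("Torch:", to_index_tensor(split_index))
```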

Saving LapPE instead of precomputing every run

Hello Ladislav!

You may remember me from the LoGaG talk! Thanks for the session :D

I have been playing around with the GraphGPS configs and realised the LapPE precomputing process takes place from scratch every time I run the pcqm4m-GPS.yaml config with main.py. Would it be possible to add a patch that saves this pre-computed information locally so I can quickly access it without having to run the same operation again?

I'm benchmarking such models for my research so I'll be putting it up for training regularly, so was hoping to find ways to avoid the precomputing every time.

Appreciate your consideration, enjoyed reading the paper!

Implementing on our custom dataset

Hello,
First, Thanks for your great work.
Could you please elaborate on how I can use the functionality of this project on my attributed graphs?

Thanks

error when running ogbg-ppa

Hello,
Thank you very much for the great code. It is very well written and modular, making it easy to extend. I encountered an error when running ogbg-ppa
with python main.py --cfg configs/GPS/ogbg-ppa-GPS.yaml wandb.use False gt.layer_type CustomGatedGCN+Performer

I got many lines of

accuracy() missing 1 required positional argument: 'task'
accuracy() missing 1 required positional argument: 'task'
accuracy() missing 1 required positional argument: 'task'
accuracy() missing 1 required positional argument: 'task'
accuracy() missing 1 required positional argument: 'task'
accuracy() missing 1 required positional argument: 'task'
accuracy() missing 1 required positional argument: 'task'
accuracy() missing 1 required positional argument: 'task'

followed by

  File "/code/GraphGPS/main.py", line 174, in <module>
    train_dict[cfg.train.mode](loggers, loaders, model, optimizer,
  File "/code/GraphGPS/graphgps/train/custom_train.py", line 122, in custom_train
    perf[0].append(loggers[0].write_epoch(cur_epoch))
  File "/code/GraphGPS/graphgps/logger.py", line 245, in write_epoch
    task_stats = self.classification_multilabel()
  File "/code/GraphGPS/graphgps/logger.py", line 146, in classification_multilabel
    'accuracy': reformat(acc(pred_score, true)),
  File "/code/GraphGPS/graphgps/metric_wrapper.py", line 323, in __call__
    return self.compute(preds, target)
  File "/code/GraphGPS/graphgps/metric_wrapper.py", line 310, in compute
    metric_val = torch.nanmean(torch.stack(metric_val))  # PyTorch1.10
RuntimeError: stack expects a non-empty TensorList

Could you help identify the source of the error? I am using torchmetrics=0.11.0 and torch=1.10.2. Let me know if you need any further information. Thank you!

Run it on any graph dataset supported by PyG

According to the final conclusions published on TDS, we can easily plug in and "Run it on any graph dataset supported by PyG".
I still have not found in the documentation how to do so. Or perhaps I have not understood it yet, for which I apologize.
Could you please elaborate on that?
I have my own graph dataset and would like to test GraphGPS on it.
Thanks!

Feature Request: Caching of Precomputed Encodings

Hello,

Thank you for the excellent work! I wanted to inquire if there are any plans to implement saving/caching of the preprocessed graph. When working with new models, the pre-computation step can be quite cumbersome and time-consuming.
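
One generic way to avoid repeating an expensive precomputation is to pickle its result under a key derived from the settings that influence it. The sketch below is a standalone illustration using only the standard library; `cached_compute`, the file naming and the key layout are hypothetical, and how this would be wired into the GraphGPS loader is left open.

```python
import hashlib
import os
import pickle


def cached_compute(compute_fn, cache_dir, key_parts):
    """Run `compute_fn()` once per key and reuse the pickled result afterwards."""
    key = hashlib.sha256(repr(sorted(key_parts.items())).encode()).hexdigest()[:16]
    path = os.path.join(cache_dir, f"posenc_{key}.pkl")
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return pickle.load(handle)
    result = compute_fn()
    os.makedirs(cache_dir, exist_ok=True)
    # Write to a temporary file first so an interrupted run leaves no partial cache.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as handle:
        pickle.dump(result, handle)
    os.replace(tmp_path, path)
    return result
```

The key should include everything that changes the encodings (for example the dataset name, the encoding types and their hyperparameters), otherwise a stale cache entry could be returned for a different configuration.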

Link broken for PCQM-Contact

Hi.
I found that the link for downloading PCQM-Contact seems to be broken,
which is
self.url = 'https://datasets-public-research.s3.us-east-2.amazonaws.com/PCQM4M/pcqm4m-contact.tsv.gz'
at line 294 in pcqm4mv2_contact.py. Could you please check that? Thanks.

AssertionError: Invalid type <class 'NoneType'> for key layer_edge_indices_dir; valid types = {<class 'tuple'>, <class 'int'>, <class 'float'>, <class 'bool'>, <class 'list'>, <class 'str'>}

When I use yacs=0.1.8, I get an error KeyError: 'non-key config key: train.mode'
When I use yacs==0.1.6, I get an error AssertionError: Invalid type <class 'NoneType'> for key layer_edge_indices_dir; valid types = {<class 'tuple'>, <class 'int'>, <class 'float'>, <class 'bool'>, <class 'list'>, <class 'str'>}
I hope you can help to solve it

Where can I download pretrained checkpoints for inference?

Where can I download pretrained checkpoints for inference?
I just want to run inference without training.

ValueError: 'cfg.optimizer_config' undefined

Hi, I ran into this issue and tried changing the version of torch-geometric, but it had no effect.
I tried versions from 1.3.0 up to the latest.
Created a temporary directory at /tmp/tmpod96tla3
Writing /tmp/tmpod96tla3/_remote_module_non_scriptable.py
Traceback (most recent call last):
  File "/media/tdq/25962996-51be-4a1c-89ae-bbc730928dbb/tdq/lrgb-main/main.py", line 140, in <module>
    optimizer = create_optimizer(model.parameters(),
  File "/home/tdq/anaconda3/envs/graphgps/lib/python3.10/site-packages/torch_geometric/graphgym/optim.py", line 38, in create_optimizer
    return from_config(func)(params, cfg=cfg)
  File "/home/tdq/anaconda3/envs/graphgps/lib/python3.10/site-packages/torch_geometric/graphgym/config.py", line 588, in wrapper
    raise ValueError(f"'cfg.{arg_name}' undefined")
ValueError: 'cfg.optimizer_config' undefined

AttributeError: 'NoneType' object has no attribute 'mem'

When I tried to run python main.py --cfg configs/GPS/actor-GPS.yaml wandb.use False, I got the following error:

    register_act('swish', partial(SWISH, inplace=cfg.mem.inplace))
AttributeError: 'NoneType' object has no attribute 'mem'

Constructing custom GT+GNN+POSENC object on own graph

Hey

I am currently working on a problem where I would like to try out different sets of PE/SE encodings combined with different types of GT and GNNs.
From the wording of the GraphGPS paper and TDS post, I half expected that there would be a very clear way of constructing one or several objects that altogether would make up such a model.

In my specific case, I have tensors containing a predefined set of node features and node-node indices indicating edges.

I see that you provide config files specific to each of your datasets, and I see the object components that make up the final GPS model in each case. What I don't understand is how this would all be put together.

Say I wanted to construct an arbitrary model according to the recipe provided in the paper, how would I go about this given my node features and edge indices?

Concerns about time consumption of function graphormer_pre_processing()

I tried to use Graphormer to process a dataset much smaller than ZINC, namely ClinTox from MoleculeNet. I adopted the configuration
provided in configs/Graphormer/zinc-Graphormer.yaml. However, I found that running the function graphormer_pre_processing() takes too long: about 5 minutes to process 256 molecules (~6000 nodes) on CPU.

Most of the time is spent in this step:

graph_index = torch.empty(2, N ** 2, dtype=torch.long)

for i in tqdm(range(N)):
  for j in range(N):
    graph_index[0, i * N + j] = i
    graph_index[1, i * N + j] = j

Is this normal? Or could something be wrong on my side?
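
Element-wise assignment into a tensor from nested Python loops is slow, because every single write goes through Python-level tensor indexing. The same index tensor can be built with a few vectorized calls. The sketch below is a possible drop-in for the quoted loop only; `all_pairs_index` is a hypothetical name, not a function in the repository.

```python
import torch


def all_pairs_index(num_nodes):
    """Return a [2, N*N] tensor listing every ordered node pair (i, j)."""
    nodes = torch.arange(num_nodes, dtype=torch.long)
    rows = nodes.repeat_interleave(num_nodes)  # 0,0,...,0,1,1,...
    cols = nodes.repeat(num_nodes)             # 0,1,...,N-1,0,1,...
    return torch.stack([rows, cols], dim=0)


graph_index = all_pairs_index(4)
print(graph_index.shape)  # torch.Size([2, 16])
```

Column `i * N + j` of the result holds the pair `(i, j)`, matching the layout produced by the loop.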

mismatch shape on inference

Hi, thanks for the great project.
While trying to run inference after training the small model on the PCQ dataset, I got a shape mismatch error. I ran all the necessary commands from the README, and the error appeared after the inference command. I have added the error output below.

Precomputing Positional Encoding statistics: ['RWSE'] for all graphs...
...estimated to be undirected: True
77%|██████████████████████████ | 282370/368014 [05:18<01:37, 881.32it/s]
100%|██████████████████████████████████| 368014/368014 [06:54<00:00, 888.64it/s]
Done! Took 00:07:01.53
[*] Loading from pretrained model: pretrained/pcqm4m-GPS+RWSE.deep/0/ckpt/148.ckpt
Traceback (most recent call last):
  File "/home//GraphGPS/main.py", line 146, in <module>
    model = init_model_from_pretrained(
  File "/home//GraphGPS/graphgps/finetuning.py", line 142, in init_model_from_pretrained
    model.load_state_dict(model_dict)
  File "/home/almogben/miniconda3/envs/graphgps/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1671, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for GraphGymModule:
    size mismatch for model.encoder.node_encoder.encoder1.atom_embedding_list.1.weight: copying a param with shape torch.Size([4, 236]) from checkpoint, the shape in current model is torch.Size([5, 236]).

I also want to train a model on PCQ and then fine-tune it on ZINC. After training, where is the model saved? Do I need to change your code, or is fine-tuning automatically applied to the previously trained model after training?

Thanks!

error when installing pyg=2.0.4

Hi, @rampasek ,

While installing the prerequisite packages, I got the following error when running conda install pyg=2.0.4 -c pyg -c conda-forge

(graphgps) root@milton-WS2:/data/code13/GraphGPS# conda install pyg=2.0.4 -c pyg -c conda-forge
Collecting package metadata (current_repodata.json): done
Solving environment: failed with current_repodata.json, will retry with next repodata source.
Initial quick solve with frozen env failed.  Unfreezing env and trying again.
Solving environment: failed with current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed
Initial quick solve with frozen env failed.  Unfreezing env and trying again.
Solving environment: failed

UnsatisfiableError: The following specifications were found to be incompatible with each other:



Package cudatoolkit conflicts for:
pyg=2.0.4 -> pytorch-sparse -> pytorch-scatter -> cudatoolkit[version='10.2.*,11.3.*']
torchaudio -> cudatoolkit[version='>=11.1,<11.2']
torchvision -> pytorch[version='*,1.1.*,1.2.0.*,1.3.1.*,>=0.4',build=cpu*] -> magma[version='>=2.5.2,<2.5.3.0a0,>=2.5.4,<2.5.5.0a0'] -> cudatoolkit[version='10.0|10.0.*,10.1|10.1.*,10.2|10.2.*,9.2|9.2.*']
cudatoolkit
Package pytorch conflicts for:
pytorch=1.10

Any hints to fix this issue?

Thanks

BUG: adding graphormer graph token breaks node permutation invariance

I was experimenting with the graphormer model, specifically for graph classification using the virtual node for global pooling (graph_pooling: graph_token).

Problem

I noticed that the model was producing different outputs for the same input graph with permuted node order. The problem should be easy to replicate, here is an example:

import torch
from torch_geometric.data import Batch

# given some data batch, e.g. inside the training loop
# create a copy of the first graph
data = Batch.from_data_list([batch.get_example(0).clone()])
data_p = Batch.from_data_list([batch.get_example(0).clone()])

# and permute the nodes: 
# here we simply put the previously last node in first place of the first graph
n = data_p.x.size(0)
p = torch.arange(n, dtype=torch.long) - 1
p[0] = n - 1
data_p.x = data_p.x[p]
assert (data_p.x[0, :] == data.x[-1, :]).all()
assert (data_p.x[1:, :] == data.x[:-1, :]).all()

# make sure to permute the other node features as well
data_p.batch = data_p.batch[p]
data_p.in_degrees = data_p.in_degrees[p]
data_p.out_degrees = data_p.out_degrees[p]

# and change the indices accordingly (all increase by one, just the last one gets set to zero)
n = data_p.x.size(0)
data_p.edge_index += 1
data_p.edge_index[data_p.edge_index == n] = 0
data_p.graph_index += 1
data_p.graph_index[data_p.graph_index == n] = 0

# then get the model outputs for each graph
model.eval()
with torch.no_grad():
    output, _ = model(data)
    output_p, _ = model(data_p)

# check if outputs are equal
assert torch.allclose(output, output_p), "Permuted graph produces different output!"

This is unexpected (and worrisome) behavior. In theory, the model architecture should be invariant to such changes, as should any GNN.

Cause

The cause turned out to be in the add_graph_token function, in this line:

data.batch, sort_idx = torch.sort(data.batch)
data.x = data.x[sort_idx]

torch.sort is called to get all the newly concatenated virtual nodes neatly grouped together with their respective other batch nodes.

But it is called without the argument stable, which means the default stable=False is used. As a result the indices inside each graph (same batch index) don't stay in the same order as before. Rather, each graph gets its nodes permuted by the sorting algorithm. This by itself would not necessarily be a problem, as the model should be invariant to such permutations. However, all the indices used in the other data attributes (edge_index, in_degrees, att_bias, etc.) are still referencing the old node order and should then also get permuted/ remapped.

Fix

Of course the much simpler solution is to simply use the stable sorting, and change the line to:

data.batch, sort_idx = torch.sort(data.batch, stable=True)

When running the example from above again with this change the outputs are now indeed the same!

I haven't done any testing yet on how this bug fix affects the training and classification performance, but I could imagine that being node permutation invariant, and not having the node features "randomly" permuted would make things a bit easier for the model...
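
The guarantee that `stable=True` provides can be seen on a toy batch vector. In the sketch below, the layout (two virtual nodes concatenated after five real nodes) is an assumption made for illustration and does not claim to mirror the exact concatenation order inside `add_graph_token`.

```python
import torch

# Nodes 0-2 belong to graph 0, nodes 3-4 to graph 1; nodes 5 and 6 are the
# virtual nodes of graph 0 and graph 1 (toy layout, see the note above).
batch = torch.tensor([0, 0, 0, 1, 1, 0, 1])

sorted_batch, sort_idx = torch.sort(batch, stable=True)

# With stable=True, elements with equal keys keep their original relative
# order, so the real nodes of each graph are not shuffled among themselves.
print(sorted_batch.tolist())  # [0, 0, 0, 0, 1, 1, 1]
print(sort_idx.tolist())      # [0, 1, 2, 5, 3, 4, 6]
```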

Test ACC for PPA dataset

Hi,

I tried training GPS on the ppa dataset using configs/GPS/ogbg-ppa-GPS.yaml. The training accuracy gets close to 100%, but the validation and test accuracy are very low, near 1%. The environment is the same as in the README. Are there any potential issues? Thank you!

Configuration Training options

In the Configuration training options, there is no attribute called mode. However, you added this attribute to the configuration files:
training:
  mode: custom