
graphgym's Introduction

GraphGym

GraphGym is a platform for designing and evaluating Graph Neural Networks (GNNs). GraphGym was proposed in Design Space for Graph Neural Networks, Jiaxuan You, Rex Ying, Jure Leskovec, NeurIPS 2020 Spotlight.

Please also refer to PyG for a tightly integrated version of GraphGym and PyG.

Highlights

1. Highly modularized pipeline for GNN

  • Data: Data loading, data splitting
  • Model: Modularized GNN implementation
  • Tasks: Node / edge / graph level GNN tasks
  • Evaluation: Accuracy, ROC AUC, ...

2. Reproducible experiment configuration

  • Each experiment is fully described by a configuration file

3. Scalable experiment management

  • Easily launch thousands of GNN experiments in parallel
  • Auto-generate experiment analyses and figures across random seeds and experiments.

4. Flexible user customization

  • Easily register your own modules in graphgym/contrib/, such as data loaders, GNN layers, loss functions, etc.

News

  • GraphGym 0.3.0 has been released. You can now install the stable version of GraphGym via pip install graphgym.
  • GraphGym 0.2.0 has been released. GraphGym now supports a PyTorch Geometric backend, in addition to the default DeepSNAP backend. You may try it out via run_single_pyg.sh.
cd run
bash run_single_pyg.sh 

Example use cases

Why GraphGym?

TL;DR: GraphGym is great for GNN beginners, domain experts and GNN researchers.

Scenario 1: You are a GNN beginner who wants to understand how GNNs work.

You have probably read many exciting papers on GNNs and want to write your own GNN implementation. Even with existing GNN packages, you still have to code up the essential pipeline on your own. GraphGym is a perfect place for you to start learning standardized GNN implementation and evaluation.


Figure 1: Modularized GNN implementation.

Scenario 2: You want to apply GNN to your exciting applications.

You probably know that there are hundreds of possible GNN models, and selecting the best model is notoriously hard. Even worse, we have shown in our paper that the best GNN designs for different tasks differ drastically. GraphGym provides a simple interface to try out thousands of GNNs in parallel and understand the best designs for your specific task. GraphGym also recommends a "go-to" GNN design space, after investigating 10 million GNN model-task combinations.


Figure 2: A guideline for desirable GNN design choices.

(Sampling from 10 million GNN model-task combinations.)

Scenario 3: You are a GNN researcher, who wants to innovate GNN models / propose new GNN tasks.

Say you have proposed a new GNN layer ExampleConv. GraphGym can help you convincingly argue that ExampleConv is better than, say, GCNConv: when randomly sampling from 10 million possible model-task combinations, how often does ExampleConv outperform GCNConv when everything else (including the computational cost) is fixed? Moreover, GraphGym helps you easily run hyperparameter searches and visualize which design choices are better. In sum, GraphGym can greatly facilitate your GNN research.


Figure 3: Evaluation of a given GNN design dimension
(BatchNorm here).

Installation

Requirements

  • CPU or NVIDIA GPU, Linux, Python3
  • PyTorch, various Python packages; Instructions for installing these dependencies are found below

1. Python environment (Optional): We recommend using the Conda package manager

conda create -n graphgym python=3.7
source activate graphgym

2. PyTorch: Install PyTorch. We have verified GraphGym under PyTorch 1.8.0, and GraphGym should work with PyTorch 1.4.0+. For example:

# CUDA versions: cpu, cu92, cu101, cu102, cu110, cu111
pip install torch==1.8.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html

3. PyTorch Geometric: Install PyTorch Geometric, following their instructions. For example:

# CUDA versions: cpu, cu92, cu101, cu102, cu110, cu111
# TORCH versions: 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0
CUDA=cu101
TORCH=1.8.0
pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-cluster -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-${TORCH}+${CUDA}.html
pip install torch-geometric

4. GraphGym and other dependencies:

git clone https://github.com/snap-stanford/GraphGym
cd GraphGym
pip install -r requirements.txt
pip install -e .  # From the latest version
pip install graphgym  # (Optional) From the PyPI stable version

5. Test the installation

Run a single experiment. Run a test GNN experiment using GraphGym's run_single.sh. Configurations are specified in example.yaml. The experiment performs node classification on the Cora dataset (random 80/20 train/val split).

cd run
bash run_single.sh # run a single experiment

Run a batch of experiments. Run a batch of GNN experiments using GraphGym's run_batch.sh. Configurations are specified in example.yaml (controls the basic architecture) and example.txt (controls how to do grid search). The experiment examines 96 models in the recommended GNN design space, on 2 graph classification datasets. Each experiment is repeated 3 times, and up to 8 jobs run concurrently. Depending on your infrastructure, finishing all the experiments may take a long time; you can quit the experiment with Ctrl-C (GraphGym will properly kill all the processes).

cd run
bash run_batch.sh # run a batch of experiments 

(Optional) Run GraphGym with CPU backend. GraphGym supports a CPU backend as well -- you only need to add one line, device: cpu, to the .yaml file. Here we provide an example.

cd run
bash run_single_cpu.sh # run a single experiment using CPU backend

(Optional) Run GraphGym with PyG backend. Run GraphGym with the PyTorch Geometric (PyG) backend via run_single_pyg.sh and run_batch_pyg.sh, instead of the default DeepSNAP backend. The PyG backend follows the native PyG implementation and is slightly more efficient than the DeepSNAP backend. Currently the PyG backend only supports user-provided dataset splits, such as PyG native datasets or OGB datasets.

cd run
bash run_single_pyg.sh # run a single experiment using PyG backend
bash run_batch_pyg.sh # run a batch of experiments using PyG backend 

GraphGym In-depth Usage

1 Run a single GNN experiment

A full example is specified in run/run_single.sh.

1.1 Specify a configuration file. In GraphGym, an experiment is fully specified by a .yaml file. Unspecified configurations in the .yaml file will be populated by the default values in graphgym/config.py. For example, in run/configs/example.yaml, there are configurations on dataset, training, model, GNN, etc. A concrete description of each configuration option is given in graphgym/config.py.
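To see how the defaults and the experiment file interact, here is a minimal sketch; it assumes the global cfg object exported by graphgym/config.py is a yacs CfgNode (so merge_from_file is available), and the printed fields are just examples.

from graphgym.config import cfg

print(cfg.gnn.layers_mp)                     # default value from graphgym/config.py
cfg.merge_from_file('configs/example.yaml')  # values present in the file override the defaults
print(cfg.dataset.name, cfg.gnn.layers_mp)   # now reflect example.yaml where specified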

1.2 Launch an experiment. For example, in run/run_single.sh:

python main.py --cfg configs/example.yaml --repeat 3

You can specify the number of different random seeds to repeat via --repeat.

1.3 Understand the results. Experimental results will be automatically saved in directory run/results/${CONFIG_NAME}/; in the example above, it is run/results/example/. Results for different random seeds will be saved in different subdirectories, such as run/results/example/2. The aggregated results over all the random seeds are automatically generated into run/results/example/agg, including the mean and standard deviation (_std suffix) for each metric. Train/val/test results are further saved into subdirectories, such as run/results/example/agg/val; here, stats.json stores the results after each epoch, aggregated across random seeds, and best.json stores the results at the epoch with the highest validation accuracy.
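As a hedged convenience sketch (the directory layout follows the paths above; the exact JSON schema of best.json is an assumption, so adjust the parsing to your GraphGym version), you can inspect the aggregated validation results like this:

import json

with open('results/example/agg/val/best.json') as f:
    best = json.load(f)   # e.g. {'epoch': ..., 'accuracy': ..., 'accuracy_std': ...}

for metric, value in best.items():
    print(f'{metric}: {value}')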

2 Run a batch of GNN experiments

A full example is specified in run/run_batch.sh.

2.1 Specify a base file. GraphGym supports running a batch of experiments. To start, a user needs to select a base architecture --config. The batch of experiments will be created by perturbing certain configurations of the base architecture.

2.2 (Optional) Specify a base file for computational budget. Additionally, GraphGym allows a user to select a base architecture to control the computational budget for the grid search, --config_budget. The computational budget is currently measured by the number of trainable parameters; the control is achieved by auto-adjusting the GNN hidden dimension size. If no --config_budget is provided, GraphGym will not control the computational budget.
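The following is an illustrative sketch of the idea (not GraphGym's actual implementation): pick the hidden dimension whose trainable-parameter count lands closest to the budget. The toy MLP simply stands in for a GNN.

import torch.nn as nn

def count_params(model: nn.Module) -> int:
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

def match_budget(build_model, budget: int, dims=range(16, 513, 16)) -> int:
    # try a range of hidden dims and keep the one closest to the parameter budget
    return min(dims, key=lambda d: abs(count_params(build_model(d)) - budget))

dim = match_budget(lambda d: nn.Sequential(nn.Linear(64, d), nn.ReLU(), nn.Linear(d, 7)),
                   budget=100_000)
print(dim)  # hidden dimension that best matches the 100k-parameter budget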

2.3 Specify a grid file. A grid file describes how to perturb the base file in order to generate the batch of experiments. For example, the base file could specify an experiment of a 3-layer GCN for Cora node classification. Then, the grid file specifies how to perturb the experiment along different dimensions, such as the number of layers, model architecture, dataset, level of task, etc.
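Conceptually, the grid expansion is a Cartesian product over the listed options; the sketch below illustrates the idea (configs_gen.py implements the real logic, and the field names and values here are only examples):

import itertools

grid = {
    'gnn.layers_mp': [2, 4, 6, 8],
    'gnn.stage_type': ['stack', 'skipsum', 'skipconcat'],
    'dataset.task': ['node', 'graph'],
}

for combo in itertools.product(*grid.values()):
    overrides = dict(zip(grid, combo))
    print(overrides)  # each dict would be merged into the base .yaml to form one experiment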

2.4 Generate config files for the batch of experiments, based on the information specified above. For example, in run/run_batch.sh:

python configs_gen.py --config configs/${DIR}/${CONFIG}.yaml \
  --config_budget configs/${DIR}/${CONFIG}.yaml \
  --grid grids/${DIR}/${GRID}.txt \
  --out_dir configs

2.5 Launch the batch of experiments. For example, in run/run_batch.sh:

bash parallel.sh configs/${CONFIG}_grid_${GRID} $REPEAT $MAX_JOBS

Each experiment will be repeated $REPEAT times. We implemented a queue system to sequentially launch all the jobs, with at most $MAX_JOBS jobs running concurrently. In practice, our system works well when handling thousands of jobs.
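For intuition, here is a rough Python equivalent of that queue behaviour (the repository's parallel.sh does this in bash; the glob pattern below is illustrative): launch every generated config while keeping at most MAX_JOBS experiments running at once.

import glob
import subprocess
from concurrent.futures import ThreadPoolExecutor

MAX_JOBS, REPEAT = 8, 3
configs = sorted(glob.glob('configs/example_grid_example/*.yaml'))

def run(cfg_path: str) -> int:
    # each call blocks until the experiment finishes; the pool caps concurrency at MAX_JOBS
    return subprocess.call(['python', 'main.py', '--cfg', cfg_path, '--repeat', str(REPEAT)])

with ThreadPoolExecutor(max_workers=MAX_JOBS) as pool:
    exit_codes = list(pool.map(run, configs))
print(exit_codes)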

2.6 Understand the results. Experimental results will be automatically saved in directory run/results/${CONFIG_NAME}_grid_${GRID_NAME}/; in the example above, it is run/results/example_grid_example/. After running each experiment, GraphGym additionally automatically averages across different models, saved in run/results/example_grid_example/agg. There, val.csv represents validation accuracy for each model configuration at the final epoch; val_best.csv represents the results at the epoch with the highest average validation accuracy; val_best_epoch.csv represents the results at the epoch with the highest validation accuracy, averaged over different random seeds. When a test set split is provided, test.csv represents test accuracy for each model configuration at the final epoch; test_best.csv represents the test set results at the epoch with the highest average validation accuracy; test_best_epoch.csv represents the test set results at the epoch with the highest validation accuracy, averaged over different random seeds.

3 Analyze the results

We provide a handy tool that automatically gives an overview of a batch of experiments in analysis/example.ipynb.

cd analysis
jupyter notebook
example.ipynb   # automatically provide an overview of a batch of experiments

4 User customization

A highlight of GraphGym is that it allows users to easily register their customized modules. The supported customized modules are provided in the directory graphgym/contrib/, including data loaders, GNN layers, loss functions, and more.

Within each directory, (at least) an example is provided, showing how to register user customized modules. Note that new user customized modules may result in new configurations; in these cases, new configuration fields can be registered at graphgym/contrib/config/.
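A hedged sketch of registering a new configuration group, modeled on the example in graphgym/contrib/config/ (it assumes a register_config helper is exposed in graphgym.register alongside the other register_* helpers; the group and field names are placeholders):

from yacs.config import CfgNode as CN
from graphgym.register import register_config

def set_cfg_my_options(cfg):
    cfg.my_module = CN()        # new config group (placeholder name)
    cfg.my_module.alpha = 0.5   # new field with its default value

register_config('my_options', set_cfg_my_options)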

Note: Applying GraphGym to your own datasets. A common use case is applying GraphGym to your favorite datasets. To do so, you may follow our example in graphgym/contrib/loader/example.py. GraphGym currently accepts a list of NetworkX graphs or PyG datasets.
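A minimal sketch of such a loader, following the (format, name, dataset_dir) signature and register_loader call used in graphgym/contrib/loader/example.py; the dataset name, format value, and random graphs are placeholders, and in practice you also need to attach node features to each graph as the example file does:

import networkx as nx
from graphgym.register import register_loader

def load_my_dataset(format, name, dataset_dir):
    if format == 'nx' and name == 'my_dataset':
        # replace with real loading from dataset_dir; return a list of NetworkX graphs
        graphs = [nx.gnp_random_graph(50, 0.1) for _ in range(100)]
        return graphs

register_loader('my_dataset', load_my_dataset)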

Use case: Design Space for Graph Neural Networks (NeurIPS 2020 Spotlight)

Reproducing experiments in Design Space for Graph Neural Networks, Jiaxuan You, Rex Ying, Jure Leskovec, NeurIPS 2020 Spotlight. You may refer to the paper or project webpage for more details.

# NOTE: We include the raw results with GraphGym
# If you run the following code, the results will be overwritten.
cd run/scripts/design/
bash run_design_round1.sh   # first round experiments, on a design space of 315K GNN designs
bash run_design_round2.sh   # second round experiments, on a design space of 96 GNN designs
cd ../analysis
jupyter notebook
design_space.ipynb   # reproducing all the analyses in the paper

Figure 4: Overview of the proposed GNN design space and task space.

Use case: Identity-aware Graph Neural Networks (AAAI 2021)

Reproducing experiments in Identity-aware Graph Neural Networks, Jiaxuan You, Jonathan Gomes-Selman, Rex Ying, Jure Leskovec, AAAI 2021. You may refer to the paper or project webpage for more details.

# NOTE: We include the raw results for ID-GNN in analysis/idgnn.csv
cd run/scripts/IDGNN/
bash run_idgnn_node.sh   # Reproduce ID-GNN node-level results
bash run_idgnn_edge.sh   # Reproduce ID-GNN edge-level results
bash run_idgnn_graph.sh   # Reproduce ID-GNN graph-level results

Figure 5: Overview of Identity-aware Graph Neural Networks (ID-GNN).

Use case: Relational Multi-Task Learning: Modeling Relations between Data and Tasks (ICLR 2022 Spotlight)

Reproducing experiments in Relational Multi-Task Learning: Modeling Relations between Data and Tasks, Kaidi Cao*, Jiaxuan You*, Jure Leskovec, ICLR 2022.

git checkout meta_link
cd run/scripts/MetaLink/
bash run_metalink.sh   # Reproduce MetaLink results for graph classification

Figure 6: Overview of MetaLink.

Use case: ROLAND: Graph Learning Framework for Dynamic Graphs (KDD 2022)

ROLAND: Graph Learning Framework for Dynamic Graphs, Jiaxuan You, Tianyu Du, Jure Leskovec, KDD 2022. ROLAND is built on a fork of GraphGym. Please check out the corresponding ROLAND repository.

Contributors

Jiaxuan You initiated the project and is the main contributor to the GraphGym platform. Rex Ying contributed the feature augmentation modules. Jonathan Gomes Selman added OGB support to GraphGym.

GraphGym is inspired by the framework of pycls. GraphGym adopts DeepSNAP as the default data representation. Part of GraphGym relies on PyTorch Geometric functionalities.

Contributing

We warmly welcome the community to contribute to GraphGym. GraphGym is particularly designed to enable contribution / customization in a simple way. For example, you may contribute your modules to graphgym/contrib/ by creating pull requests.

Citing GraphGym

If you find GraphGym or our paper useful, please cite our paper:

@InProceedings{you2020design,
  title = {Design Space for Graph Neural Networks},
  author = {You, Jiaxuan and Ying, Rex and Leskovec, Jure},
  booktitle = {NeurIPS},
  year = {2020}
}


graphgym's Issues

Root node not considered for message-passing in GeneralConv

Hi,

this is not technically an issue but more of a general question, I apologize if this is not the best place to ask.

Looking at the GeneralConv layer, it seems that the representation for a given node is, by default, only computed from the neighbours and not the node itself:

return self.propagate(edge_index, x=x, norm=norm,
                      edge_feature=edge_feature)

This is only true if normalize=False in the layer, because the norm staticmethod has a call to add_remaining_self_loops that effectively adds nodes into their own neighbourhood:

edge_index, edge_weight = add_remaining_self_loops(
    edge_index, edge_weight, fill_value, num_nodes)

If normalize=False, the call to norm is skipped, and so a node is not a neighbour of itself.
I was wondering if this is intentional, and if so what is the rationale behind this design choice.
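For reference, the effect described above can be checked with a tiny snippet (a hedged sketch assuming torch_geometric's add_remaining_self_loops utility; the edges are arbitrary):

import torch
from torch_geometric.utils import add_remaining_self_loops

edge_index = torch.tensor([[0, 1], [1, 2]])   # edges 0->1 and 1->2, no self-loops yet
edge_index, edge_weight = add_remaining_self_loops(edge_index, num_nodes=3)
print(edge_index)   # now also contains the self-loop edges (0,0), (1,1), (2,2)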

I am re-implementing this layer for another GNN library and I need to make a choice about whether to process nodes as part of their neighbourhood.
I know that 1) in the paper and 2) in the code there is no ambiguity about this fact, but it sounds strange to me so I wanted to double-check with you :D

Thanks

run_batch.sh

(screenshot of the error message attached)

Hello!

I'm experiencing an issue when running a batch of experiments: after the end of each run I obtain only the stats.json file and not the best.json file.
This does not happen if I run a single experiment. I have attached the error message; hope this helps!

Thank you in advance :)

Readme

When installing PyTorch, no check for missing g++ compiler.

Loading the fixed data split from the original PyTorch Geometric dataset (masks)

Hello,

I am trying to load a dataset and to keep the dataset split, as masks already exist.

I realized there exists an argument that controls this:

@staticmethod
    def pyg_to_graphs( dataset, verbose: bool = False, fixed_split: bool = False, tensor_backend: bool = False, netlib=None ) -> List[Graph]:
        r"""
        Transform a :class: torch_geometric.data.Dataset object to a 
        list of :class:deepsnap.grpah.Graph  objects.

        Args:
            dataset (:class:`torch_geometric.data.Dataset`): A 
                :class:`torch_geometric.data.Dataset` object that will be 
                transformed to a list of :class:`deepsnap.grpah.Graph` 
                objects.
            verbose (bool): Whether to print information such as warnings.
            fixed_split (bool): Whether to load the fixed data split from 
                the original PyTorch Geometric dataset.
            tensor_backend (bool): `True` will use pure tensors for graphs.
            netlib (types.ModuleType, optional): The graph backend module. 
                Currently DeepSNAP supports the NetworkX and SnapX (for 
                SnapX only the undirected homogeneous graph) as the graph 
                backend. Default graph backend is the NetworkX.

        Returns:
            list: A list of :class:`deepsnap.graph.Graph` objects.
        """

However, when I run it, it always raises errors. Now I'm not sure whether it is implemented all the way through or is yet to be done.

I would appreciate your help and instructions on how I can accomplish this.
Best,

About the MetaLink code

Hi, in the paper "RELATIONAL MULTI-TASK LEARNING: MODELING RELATIONS BETWEEN DATA AND TASKS", the authors said the code has been released in this repository, but we could not find it. Could we know which part of the code is about MetaLink? Thanks.

error when running test script:

I installed GraphGym step by step as the README describes, but I got this error:
Traceback (most recent call last):
  File "main.py", line 5, in <module>
    from torch_geometric import seed_everything
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_geometric/__init__.py", line 1, in <module>
    import torch_geometric.utils
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_geometric/utils/__init__.py", line 3, in <module>
    from .scatter import scatter
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_geometric/utils/scatter.py", line 7, in <module>
    import torch_geometric.typing
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_geometric/typing.py", line 37, in <module>
    import torch_sparse  # noqa
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_sparse/__init__.py", line 40, in <module>
    from .tensor import SparseTensor  # noqa
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_sparse/tensor.py", line 13, in <module>
    class SparseTensor(object):
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch/jit/_script.py", line 974, in script
    _compile_and_register_class(obj, _rcb, qualified_name)
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch/jit/_script.py", line 67, in _compile_and_register_class
    torch._C._jit_script_class_compile(qualified_name, ast, defaults, rcb)
RuntimeError:
Tried to access nonexistent attribute or method 'crow_indices' of type 'Tensor'.:
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_sparse/tensor.py", line 109
    def from_torch_sparse_csr_tensor(self, mat: torch.Tensor,
                                     has_value: bool = True):
        rowptr = mat.crow_indices()
                 ~~~~~~~~~~~~~~~~ <--- HERE
        col = mat.col_indices()

How Can I solve this issue?

Just the validation performance reported?

I see in Section 7.2, Experimental Setup, of the paper:

For all the experiments in Sections 7.3 and 7.4, we use a consistent setup, where results on three random 80%/20% train/val splits are averaged, and the validation performance in the final epoch is reported.

In Sections 7.3 and 7.4, the performance used in the ranking analysis is always the validation performance.
First, does "the validation performance in the final epoch" mean that training runs through all epochs and the last one is the final epoch?
Second, I am wondering why we don't use early stopping and the test performance mentioned below, i.e., the test performance at the best validation epoch:

how to report the performance (e.g., final epoch or the best validation epoch) in section 7.1.

What's the problem?

(screenshot of the error attached)

What is the problem here?

Same problem with run_single.sh as well as other run.sh.

running simple example on CPU

Hi @JiaxuanYou,

Thank you for making this code base available!

I do not have an NVIDIA GPU, so I tried to run the example on CPU. However, I ran into errors because the logger was calling the get_current_gpu_usage() function in device.py.

To fix this, I wrapped this function in the same conditional as the auto_select_device() function like so:

def get_current_gpu_usage():
    if cfg.device != 'cpu' and torch.cuda.is_available():
        result = subprocess.check_output(
            [
                'nvidia-smi', '--query-compute-apps=pid,used_memory',
                '--format=csv,nounits,noheader'
            ], encoding='utf-8')
        current_pid = os.getpid()
        used_memory = 0
        for line in result.strip().split('\n'):
            line = line.split(', ')
            if current_pid == int(line[0]):
                used_memory += int(line[1])
        return used_memory
    else:
        return 0

Just wanted to let you know. You may have a better way to do it :)

Kyle

Issue with registering loader in separate py file

Hey there,

it seems like there is a weird flow-of-control issue when registering data loaders.

If you simply register another loader function in contrib/loader/example.py

def load_dataset_example2(format, name, dataset_dir):
    pass

register_loader('example2', load_dataset_example2)

everything works fine and the register.loader_dict is properly populated in loader.load_dataset.

However, if instead one creates a separate file contrib/loader/myloader.py with the same contents, the loader is not available in load_dataset.

Stepping through with the debugger shows that register_loader is indeed called both times for each loader, but loader_dict is actually empty when either loader is added.

A wild guess is that register.py is run twice and as such register_dict is being reset to {}.

Can anyone reproduce this? Or am I misunderstanding something?

Cheers,
Ben

Crash at end of regression runs

When I make a run with cfg.dataset.task_type = 'regression', the code crashes at the end of the run. The error message is:

Traceback (most recent call last):
 File "main_pyg.py", line 55, in <module>
   agg_runs(cfg.out_dir, cfg.metric_best)
 File "~/Code/GraphGym/graphgym/utils/agg_runs.py", line 100, in agg_runs
   [stats[metric] for stats in stats_list])
 File "~/Code/GraphGym/graphgym/utils/agg_runs.py", line 100, in <listcomp>
   [stats[metric] for stats in stats_list])
KeyError: 'accuracy'

The problem seems to be that accuracy is not a metric logged for regression tasks. Here are the relevant lines in agg_runs.py:

                if metric_best == 'auto':
                    metric = 'auc' if 'auc' in stats_list[0] else 'accuracy'

Here's a fix:

                if metric_best == 'auto':
                    if cfg.dataset.task_type == 'classification':
                        metric = 'auc' if 'auc' in stats_list[0] else 'accuracy'
                    elif cfg.dataset.task_type == 'regression':
                        metric = 'mse'

Testing a single experiment: tried to access nonexistent attribute or method.

-- EDIT --

Hi, I experienced an issue when following your instructions and testing the installation.
I'm using Ubuntu 20.04 and installed the CPU version of Pytorch.

When testing the single experiment (bash run_single_cpu.sh), I got the following error:

Traceback (most recent call last):
  File "main.py", line 5, in <module>
    from torch_geometric import seed_everything
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/__init__.py", line 4, in <module>
    import torch_geometric.data
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/data/data.py", line 9, in <module>
    from torch_sparse import SparseTensor
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_sparse/__init__.py", line 41, in <module>
    from .tensor import SparseTensor  # noqa
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_sparse/tensor.py", line 13, in <module>
    class SparseTensor(object):
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch/jit/_script.py", line 974, in script
    _compile_and_register_class(obj, _rcb, qualified_name)
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch/jit/_script.py", line 67, in _compile_and_register_class
    torch._C._jit_script_class_compile(qualified_name, ast, defaults, rcb)
RuntimeError: 
Tried to access nonexistent attribute or method 'crow_indices' of type 'Tensor'.:
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_sparse/tensor.py", line 109
    def from_torch_sparse_csr_tensor(self, mat: torch.Tensor,
                                     has_value: bool = True):
        rowptr = mat.crow_indices()
                 ~~~~~~~~~~~~~~~~ <--- HERE
        col = mat.col_indices()

I found a seemingly closely related problem, see this post: rusty1s/pytorch_sparse#207
and tried to resolve it by downgrading to an older torch-sparse version:

pip install torch-sparse==0.6.12

This seemed to fix the issue.

Error when run "base run_single.sh" at step 6 Test the installation

I have so far tried torch 1.4.0, 1.5.0, and lately 1.7.0 on my Ubuntu 18.04. I installed them successfully, but faced errors when I ran step 6, Test the installation. Version 1.7.0 gets a step closer, but it still produced the following error on 'bash run_single.sh'. I searched but could not find libcusparse.so anywhere.

Traceback (most recent call last):
  File "main.py", line 11, in <module>
    from graphgym.loader import create_dataset, create_loader
  File "/home/hdd2nd/dev/projects/gnn/GraphGym/graphgym/loader.py", line 6, in <module>
    from deepsnap.dataset import GraphDataset
  File "/home/hdd2nd/dev/projects/gnn/DeepSNAP/deepsnap/__init__.py", line 5, in <module>
    import deepsnap.graph
  File "/home/hdd2nd/dev/projects/gnn/DeepSNAP/deepsnap/graph.py", line 9, in <module>
    from torch_geometric.utils import to_undirected
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/__init__.py", line 5, in <module>
    import torch_geometric.data
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/data/data.py", line 8, in <module>
    from torch_sparse import coalesce, SparseTensor
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_sparse/__init__.py", line 15, in <module>
    f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch/_ops.py", line 105, in load_library
    ctypes.CDLL(path)
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.10: cannot open shared object file: No such file or directory

I also listed the installation steps as follows for reference. I successfully installed all by following closely the instructions on https://github.com/snap-stanford/GraphGym. Note that I installed torch-geometric following https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html because the commands from GraphGym didn't work.

>conda create -n graphgym python=3.7
>conda activate graphgym
>pip install torch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0
>(graphgym) $ pip install torch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0
>(graphgym) $ pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html
>(graphgym) $ pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html
>(graphgym) $ pip install torch-cluster -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html
>(graphgym) $ pip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html
>(graphgym) $ pip install torch-geometric
>(graphgym) $ git clone https://github.com/snap-stanford/DeepSNAP
>(graphgym) $ cd DeepSNAP
>(graphgym) $ pip install -e .
>(graphgym) $ cd ..
>(graphgym) $ git clone https://github.com/snap-stanford/GraphGym
>(graphgym) $ cd GraphGym/
>(graphgym) $ pip install -r requirements.txt
>(graphgym) $ pip install -e .

Thank you

Question about ID-GNN

Hello,

I am interested in reproducing the IDGNN's results on node classification. I looked at the code and had a few quick questions

  1. When doing node classification, all configurations are in (https://github.com/snap-stanford/GraphGym/blob/master/run/grids/IDGNN/graph.txt )?

  2. Is it correct that dataset.augment_feature = 'node_identity' means ID-GNN-Fast?

  3. If 2 is true, may I know where the node identity, i.e., cycle information, is used (its position in the code) and how it is used at the code level?

Thank you so much for taking the time out of your busy schedule.

Some questions in config.py and MetaLink's datasets

Thank you for the nice library, but I ran into some trouble with MetaLink and would appreciate your help:

  1. Some default settings in run/config/MetaLink/mol_classification.yaml cannot be found in graphgym/config.py, such as cfg.dataset.subgraph = False and cfg.kg.xxx = xxx. I copied some 'kg' settings from graphgym/contrib/config/metalink.py to graphgym/config.py for now, but I wonder if there is another config.py for MetaLink?

  2. I have downloaded several datasets like tox21 and toxcast, but they do not seem suitable; I am missing some csv files that are required in graphgym/contrib/loader/molecule.py, line 311, in the function load_mol_datasets(). May I ask if you have more details about the dataset processing, especially the csv file 'tox21.csv'? Where can I download the datasets in a suitable format?

Thank you so much, looking forward to hearing from you. @JiaxuanYou

Error producing results

The execution goes pretty smoothly and then at the very end outputs this

train: {'epoch': 399, 'eta': 0.0, 'loss': 0.6025, 'lr': 0.0, 'params': 265218, 'time_iter': 0.0837, 'accuracy': 0.688, 'precision': 0.5556, 'recall': 0.0324, 'f1': 0.0612, 'auc': 0.6239}
val: {'epoch': 399, 'loss': 0.6162, 'lr': 0, 'params': 265218, 'time_iter': 0.0465, 'accuracy': 0.6667, 'precision': 0.6, 'recall': 0.0361, 'f1': 0.0682, 'auc': 0.6527}
test: {'epoch': 399, 'loss': 0.6214, 'lr': 0, 'params': 265218, 'time_iter': 0.0448, 'accuracy': 0.6545, 'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'auc': 0.6245}
Check point saved: results/example/1/ckpt/399.ckpt
Task done, results saved in results/example/1
359
{'epoch': 359, 'loss': 0.6233, 'lr': 0, 'params': 265218, 'time_iter': 0.0455, 'accuracy': 0.6585, 'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'auc': 0.6213}
{'epoch': 359, 'eta': 210.5677, 'loss': 0.6055, 'lr': 0.0003, 'params': 265218, 'time_iter': 0.0848, 'accuracy': 0.689, 'precision': 0.6667, 'recall': 0.0194, 'f1': 0.0377, 'auc': 0.6138}
{'epoch': 359, 'loss': 0.6175, 'lr': 0, 'params': 265218, 'time_iter': 0.0472, 'accuracy': 0.6707, 'precision': 0.75, 'recall': 0.0361, 'f1': 0.069, 'auc': 0.655}
239
{'epoch': 239, 'eta': 2.2639, 'loss': 0.0281, 'lr': 0.0035, 'params': 632328, 'time_iter': 0.0143, 'accuracy': 0.9991}
{'epoch': 239, 'loss': 0.4795, 'lr': 0, 'params': 632328, 'time_iter': 0.0079, 'accuracy': 0.8819}
359
{'epoch': 359, 'eta': 0.573, 'loss': 0.0139, 'lr': 0.0003, 'params': 632328, 'time_iter': 0.0154, 'accuracy': 1.0}
{'epoch': 359, 'loss': 0.4917, 'lr': 0, 'params': 632328, 'time_iter': 0.0073, 'accuracy': 0.8875}
Traceback (most recent call last):
  File "main.py", line 60, in <module>
    agg_runs(get_parent_dir(out_dir_parent, args.cfg_file), cfg.metric_best)
  File "/nfs/data/patients_networks/olga_scripts/GraphGym/graphgym/utils/agg_runs.py", line 110, in agg_runs
    results[key][i] = agg_dict_list(results[key][i])
  File "/nfs/data/patients_networks/olga_scripts/GraphGym/graphgym/utils/agg_runs.py", line 47, in agg_dict_list
    value = np.array([dict[key] for dict in dict_list])
  File "/nfs/data/patients_networks/olga_scripts/GraphGym/graphgym/utils/agg_runs.py", line 47, in <listcomp>
    value = np.array([dict[key] for dict in dict_list])
KeyError: 'precision'

Results folder is produced, but there are only accuracy metrics and no "test" folder. I suspect that maybe it's related to the fact that the precision is equal to 0 on the test set.

Issue with create_dataset() in loader.py

Hi! Thank you for the great tool for working with GNN! However, it seems to me that there is probably an issue with the create_dataset() function in loader.py. Specifically, when calling GraphDataset(), it assigns "cfg.dataset.resample_disjoint" to "resample_disjoint". However, the GraphDataset does not have an attribute "resample_disjoint". I am wondering whether that should be "edge_train_mode" instead.

Lack of support for Generative Model

I want to apply GNN to new applications i.e. Scenario 2 (Scenario 2: You want to apply GNN to your exciting applications.)

I see all the models are for predictive tasks.

I am wondering whether you are planning to include generative models in the future?

For example, are there any plans to include any of the following models:

Also, it lacks examples of any Graph2Seq-based models. It would be awesome to consider including any of the following Graph2Seq-based generative models.

Including any other generative model would be a great starting point as well.

I am really interested to know your thoughts in this regard.

question about IDGNN

Hello,

I am interested in reproducing the IDGNN's results on graph classification. I looked at the code and had a few quick questions

  1. are all configurations listed in https://github.com/snap-stanford/GraphGym/blob/master/run/grids/IDGNN/graph_enzyme.txt? i.e., the major arguments I need to change is dataset.augment_feature? I am mainly interested in reproducing results at Table 6.

  2. there are ID-GNN and ID-GNN-Fast. Are both implemented in this repository?

  3. how is the heterogeneous message passing implemented?

Thank you very much!!

PRED vectors are wrongly flattened when there are trailing dimensions in TRUE vectors.

if true.ndim > 1 and cfg.model.loss_fun == 'cross_entropy':
    pred, true = torch.flatten(pred), torch.flatten(true)
pred = pred.squeeze(-1) if pred.ndim > 1 else pred
true = true.squeeze(-1) if true.ndim > 1 else true

When the shape of the label vector true is [node_num, 1] instead of [node_num], lines 22-23 will misinterpret the task as a multi-task binary classification task rather than a multi-class classification task. So the pred vector will be wrongly flattened. The correct code should be

    pred = pred.squeeze(-1) if pred.ndim > 1 else pred
    true = true.squeeze(-1) if true.ndim > 1 else true
    if true.ndim > 1 and cfg.model.loss_fun == 'cross_entropy':
        pred, true = torch.flatten(pred), torch.flatten(true)

This moves lines 24-25 in front of lines 22-23.

Error

I don't know why the configuration files generated by the grid search differ from the initial settings; I did not enumerate those changes in the grid.

Custom PyG dataset usage

Hi! I'm a little confused if it's actually possible to use my own dataset in a PyG format or not.
The load_pyg function kind of suggests that the name of a dataset can only be one of the fixed ones (CiteSeer, PPI, Cora, etc.), while the readme states that loading PyG data should be possible: "GraphGym currently accepts a list of NetworkX graphs or PyG datasets."
If the PyG format is not possible for custom data, then I guess it must be NetworkX. Is there any example of what those graphs should look like? In particular, I'm interested in graph classification, if that matters.

Olga

Test the installation

When I run bash run_single.sh, some errors appear, but when I run bash run_batch.sh, it is OK. I am not sure whether the installation was successful.

Stuck when testing the installation

After following the installation instructions and running bash run_single.sh in the terminal, the program gave no response, and after terminating it manually, here's what I got:
(screenshot of the terminal output attached)
But the shell script run_single_pyg.sh worked. What should I do?
The environment is attached down here.
(screenshots of the environment attached)

An error occurs when importing graphgym.config

Hi,
When I run the script 'run_metalink.sh' (in branch meta_link), an ImportError occurs. The traceback messages are shown as follows.

Traceback (most recent call last):
  File "main.py", line 8, in <module>
    from graphgym.config import cfg, dump_cfg, load_cfg, set_out_dir, set_run_dir
ImportError: cannot import name 'set_out_dir' from 'graphgym.config' (/usr/local/lib/python3.8/dist-packages/graphgym/config.py)

How can I fix the issue?

About Scenario 2

Thanks for the exciting program!

I am suffering from finding the optimal GNN model on the node classification task. This problem is caused by too much freedom of choice within and between layers. In other words, there are too many models to choose from and too many hyperparameters to optimize.

Referring to ogb's leaderboard to find the optimal model is a potential solution, but as the paper showed,

the best GNN designs for different tasks differ drastically.

From my understanding, GraphGym has provided the idea that similar tasks can share the optimal model design.

As mentioned,

GraphGym provides a simple interface to try out thousands of GNNs in parallel and understand the best designs for your specific task.
GraphGym also recommends a "go-to" GNN design space, after investigating 10 million GNN model-task combinations.

I would like to know if there are off-the-shelf model-task combinations that I can use directly, without using the interface to try out GNN designs.

CUDA out-of-memory error

I tried to execute run_batch.sh. There seems to be no mechanism to limit the number of concurrently launched trials, as more and more processes keep appearing on my GPUs, and I observed CUDA out-of-memory errors.

about configs_gen.py

Hello
Thank you for the Machine Learning with Graphs course; I found GraphGym through it.

I have a question about installing GraphGym.
I am trying it on Colab and following the instructions, but I get this error:

(screenshot of the error attached)

What change should I make?

Thank you.

Documentation for configuration options and dataset registration.

Hello!

This project is truly amazing, thank you. That said, I'm finding it difficult to apply it to my own datasets. Naturally, I would like to customize the grid search; however, I'm not sure what the valid options are for each field in the configuration. The valid options I know of come from the example configs and grids in the repo, but a comprehensive list for each field would be greatly appreciated. Is there any existing documentation on this matter?

I'm also unsure about how to register my datasets. At which point in the pipeline should the customized version of graphgym/contrib/loader/example.py be run? I'm guessing before the config generation script as the configs must include the dataset information. Still, I'm unsure about how this piece of code fits in the pipeline.

Thank you in advance.

Is GraphGym okay with windows10?

Hello!
Thank you for producing awesome library.
I'm new to Graph Neural Networks, and I am using Windows 10.
I found that some files in the guideline are not available on Windows (like the .sh files).
Is there any constraint on actually using GraphGym on Windows?

batch training ogb-molhiv

I am running a graph classification model on ogb-molhiv.

My model's forward function is being passed a batch with the following fields/shapes.

Batch(G=[128], batch=[3512], edge_feature=[7498, 3], edge_index=[2, 7498], edge_label_index=[2, 7498], graph_label=[128, 1], node_feature=[3512, 128], node_label_index=[3512], task=[128])

This dataset has ~40,000 graphs with ~25 nodes per graph. The batch info is present in the batch.batch tensor, but how do I do batch training here, processing each graph independently, i.e., predict y0 given nodeset0 and edgeset0?
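For reference, a hedged sketch (not GraphGym's exact pipeline) of how per-graph predictions are typically produced from such a batch: message passing already stays within each graph because edge_index never crosses graphs, and the batch vector is used to pool node features into one row per graph, e.g. with torch_geometric's global_add_pool. The shapes below mirror the batch printed above.

import torch
from torch_geometric.nn import global_add_pool

num_graphs, num_nodes, dim = 128, 3512, 128
node_feature = torch.randn(num_nodes, dim)
batch_vec = torch.randint(0, num_graphs, (num_nodes,))                   # graph id of every node
graph_repr = global_add_pool(node_feature, batch_vec, size=num_graphs)  # [128, 128], one row per graph
pred = torch.nn.Linear(dim, 1)(graph_repr)                               # [128, 1], compare with graph_label
print(graph_repr.shape, pred.shape)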

I forked the repo and include a comparison here: master...jkamalu:gmt

Error using TU_IMDB dataset

Hi,

I'm getting the following error when I try to use the TU_IMDB dataset:

Traceback (most recent call last):
  File "/Users/psanchez/Documents/GitHub/transformer_message_passing/run/main.py", line 42, in <module>
    datasets = create_dataset()
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/graphgym-0.3.1-py3.9.egg/graphgym/loader.py", line 197, in create_dataset
    graphs = load_dataset()
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/graphgym-0.3.1-py3.9.egg/graphgym/loader.py", line 111, in load_dataset
    graphs = load_pyg(name, dataset_dir)
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/graphgym-0.3.1-py3.9.egg/graphgym/loader.py", line 74, in load_pyg
    graphs = GraphDataset.pyg_to_graphs(dataset_raw)
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/deepsnap/dataset.py", line 1276, in pyg_to_graphs
    return [
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/deepsnap/dataset.py", line 1277, in <listcomp>
    Graph.pyg_to_graph(
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/deepsnap/graph.py", line 2027, in pyg_to_graph
    Graph.add_node_attr(G, key, value)
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/deepsnap/graph.py", line 1911, in add_node_attr
    attr_dict = dict(zip(node_list, node_attr))
TypeError: 'int' object is not iterable

I'm running the main.py with the following dataset configuration file:

out_dir: results
dataset:
  format: PyG
  name: TU_IMDB
  task: graph
  task_type: classification
  transductive: False
  split: [0.8, 0.2]
  augment_feature: []
  augment_feature_dims: [10]
  augment_feature_repr: position
  augment_label: ''
  augment_label_dims: 5
  transform: none
train:
  batch_size: 32
  eval_period: 20
  ckpt_period: 100
model:
  type: gnn
  loss_fun: cross_entropy
  edge_decoding: dot
  graph_pooling: add
gnn:
  layers_pre_mp: 1
  layers_mp: 2
  layers_post_mp: 1
...

Any idea why this might be happening?

Thanks a lot in advance.

error in configs/idgnn/graph_enzyme.yaml

Thank you for the nice library. I noticed that under configs/IDGNN/graph_enzyme.yaml, the name of the dataset is ba500. I guess this is a bug? Would you like to update the yaml file? Thank you!
