
wwl's Introduction

Wasserstein Weisfeiler-Lehman Graph Kernels

This repository contains the accompanying code for the NeurIPS 2019 paper Wasserstein Weisfeiler-Lehman Graph Kernels available here. The repository contains both the package that implements the graph kernels (in src) and scripts to reproduce some of the results of the paper (in experiments).

Dependencies

WWL relies on the following dependencies:

  • numpy
  • scikit-learn
  • POT
  • cython

Installation

The easiest way is to install WWL from the Python Package Index (PyPI) via

$ pip install Cython numpy 
$ pip install wwl

Usage

The WWL package contains functions to generate an n × n kernel matrix for a set of n graphs.

The API also allows the user to directly call the different steps described in the paper, namely:

  • generate embeddings for the nodes of both discretely labelled and continuously attributed graphs,
  • compute the pairwise Wasserstein distances between a set of graphs.

Please refer to the src README for detailed documentation.
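The two steps above can be sketched end to end with plain NumPy. This is my own illustration of the pipeline, not the package's API (which lives in src): node features are refined WL-style, the Wasserstein distance is taken between the resulting node distributions, and a Laplacian kernel turns distances into kernel values. The function names, the restriction to 1-D node features, and the exact sorted-sample formula for the 1-D Wasserstein distance are simplifications of mine.

```python
import numpy as np

def propagate(adj, feats, num_iterations=2):
    # WL-style continuous refinement: average each node's feature with the
    # mean of its neighbours' features, repeated num_iterations times.
    deg = np.maximum(adj.sum(axis=1), 1.0)
    for _ in range(num_iterations):
        feats = 0.5 * (feats + (adj @ feats) / deg)
    return feats

def wasserstein_1d(x, y):
    # For equal-size 1-D point sets with uniform weights, the Wasserstein
    # distance is the mean absolute difference of the sorted values.
    return float(np.mean(np.abs(np.sort(x) - np.sort(y))))

# Two triangle graphs with the same structure but different node attributes.
adj = np.array([[0., 1., 1.], [1., 0., 1.], [1., 1., 0.]])
f1 = np.array([0.0, 1.0, 2.0])
f2 = np.array([0.0, 0.0, 3.0])

d = wasserstein_1d(propagate(adj, f1), propagate(adj, f2))
K = np.exp(-1.0 * d)  # Laplacian kernel entry for this pair of graphs
```

In the package these steps are fused: you pass a list of graphs and get the full kernel matrix back.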

Experiments

You can find some experiments in the experiments folder. These will allow you to reproduce results from the paper on two datasets.

Contributors

WWL is developed and maintained by members of the Machine Learning and Computational Biology Lab.

Citation

Please use the following BibTeX citation when using our method or comparing against it:

@InCollection{Togninalli19,
  author    = {Togninalli, Matteo and Ghisu, Elisabetta and Llinares-L{\'o}pez, Felipe and Rieck, Bastian and Borgwardt, Karsten},
  title     = {Wasserstein Weisfeiler--Lehman Graph Kernels},
  booktitle = {Advances in Neural Information Processing Systems~32~(NeurIPS)},
  year      = {2019},
  editor    = {Wallach, H. and Larochelle, H. and Beygelzimer, A. and d'Alch\'{e}{-}Buc, F. and Fox, E. and Garnett, R.},
  publisher = {Curran Associates, Inc.},
  pages     = {6436--6446},
}


wwl's Issues

Different edge weights get the same distance?

We input three fully connected graphs and want to classify them based on edge weights (a continuous attribute). The pairwise distances between them all come out as 0.

import os
import igraph as ig
from wwl import pairwise_wasserstein_distance

# retrieve_graph_filenames comes from the experiments utilities
graphs = [ig.read(g) for g in retrieve_graph_filenames(os.path.join(data_folder))]
# Embed and compute distances
dist = pairwise_wasserstein_distance(graphs, num_iterations=2)
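A likely explanation for the zero distances: the discrete WL embedding only sees the unweighted adjacency structure, and fully connected graphs of the same size are structurally identical, so their label multisets (and hence their Wasserstein distances) coincide. Edge weights would have to enter via node features to be distinguishable. A small sketch of my own (not the package's code) showing why the discrete labels collapse:

```python
from collections import Counter

def wl_label_multiset(neighbors, num_iterations=2):
    # Discrete WL refinement: a node's new label is its old label plus the
    # sorted multiset of its neighbours' labels. Edge weights never appear.
    labels = {v: len(nbrs) for v, nbrs in neighbors.items()}
    for _ in range(num_iterations):
        labels = {v: (labels[v], tuple(sorted(labels[u] for u in nbrs)))
                  for v, nbrs in neighbors.items()}
    return Counter(labels.values())

# A complete graph on three nodes -- the structure that any edge-weighted
# K3 reduces to once weights are dropped.
k3 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
m = wl_label_multiset(k3)
# Every fully connected 3-node graph yields this same multiset, so the
# Wasserstein distance between their discrete embeddings is 0.
```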

Understandings of WWL

After reading the WWL paper, I do not quite understand the WWL kernel. Can you please help me with the following questions?

  1. The WWL kernel is calculated between pairs of graphs. However, graph classification usually takes a single graph's embedding as input. How does the WWL kernel produce or update an embedding for a single graph?
  2. For a set of graphs there are no edges between two different graphs, so how can WWL compute the Laplacian matrix?

Your reply would be a great help.
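On the first question: kernel methods never need an explicit per-graph embedding. The paper's approach maps the pairwise Wasserstein distance matrix D between n graphs to an n × n kernel matrix via the Laplacian kernel, and a kernel classifier (e.g. an SVM with a precomputed kernel) consumes that matrix directly. A sketch of my own with hypothetical distance values:

```python
import numpy as np

# Hypothetical pairwise Wasserstein distances between three graphs.
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.5],
              [2.0, 1.5, 0.0]])

gamma = 0.5
K = np.exp(-gamma * D)  # Laplacian kernel: K[i, j] = exp(-gamma * D[i, j])
# K can now be passed to a classifier that accepts precomputed kernels.
```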

ENZYMES data differs from the source

Hi,

While experimenting with your code I noticed that the .gml files do not exactly match the original TU ENZYMES dataset here:
https://ls11-www.cs.tu-dortmund.de/staff/morris/graphkerneldatasets

The nodes in the original total 19,580, while here they add up to 19,474.

These are the graphs with differing numbers of nodes:
#37, #117, #294, #295, #296, #500, #540, #599

As I have not found a detailed explanation of how the dataset was constructed, I wonder whether some pre-processing removed nodes or whether this is simply a mistake.

Thank you so much for providing a running implementation of your valuable research.

AttributeError

$ python main.py --dataset MUTAG

Generating results for MUTAG...
Generating discrete embeddings for MUTAG.
/home/cma/anaconda3/lib/python3.9/site-packages/numpy/lib/npyio.py:528: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
arr = np.asanyarray(arr)
Embeddings for MUTAG computed, saved to output/MUTAG/MUTAG_wl_discrete_embeddings_h2.npy.

Computing the Wasserstein distances...
Traceback (most recent call last):
  File "/home/cma/Documents/WWL-master/experiments/main.py", line 180, in <module>
    main()
  File "/home/cma/Documents/WWL-master/experiments/main.py", line 65, in main
    wasserstein_distances = compute_wasserstein_distance(label_sequences, h, sinkhorn=sinkhorn,
  File "/home/cma/Documents/WWL-master/experiments/wwl.py", line 192, in compute_wasserstein_distance
    costs = ot.dist(labels_1, labels_2, metric=ground_distance)
  File "/home/cma/anaconda3/lib/python3.9/site-packages/ot/utils.py", line 237, in dist
    return cdist(x1, x2, metric=metric, w=w)
  File "/home/cma/anaconda3/lib/python3.9/site-packages/scipy/spatial/distance.py", line 2929, in cdist
    return cdist_fn(XA, XB, out=out, **kwargs)
  File "/home/cma/anaconda3/lib/python3.9/site-packages/scipy/spatial/distance.py", line 1673, in __call__
    XA, XB, typ, kwargs = _validate_cdist_input(
  File "/home/cma/anaconda3/lib/python3.9/site-packages/scipy/spatial/distance.py", line 205, in _validate_cdist_input
    kwargs = _validate_kwargs((XA, XB), mA + mB, n, **kwargs)
  File "/home/cma/anaconda3/lib/python3.9/site-packages/scipy/spatial/distance.py", line 225, in _validate_hamming_kwargs
    if w.ndim != 1 or w.shape[0] != n:
AttributeError: 'NoneType' object has no attribute 'ndim'
