simgnn's Introduction

SimGNN

Keras implementation of SimGNN: A Neural Network Approach to Fast Graph Similarity Computation*

*This includes only the upper layer of the model (the attention mechanism implementation).

Abstract

Graph similarity search is among the most important graph-based applications, e.g. finding the chemical compounds that are most similar to a query compound. Graph similarity/distance computation, such as Graph Edit Distance (GED) and Maximum Common Subgraph (MCS), is the core operation of graph similarity search and many other applications, but very costly to compute in practice. Inspired by the recent success of neural network approaches to several graph applications, such as node or graph classification, we propose a novel neural network based approach to address this classic yet challenging graph problem, aiming to alleviate the computational burden while preserving a good performance. The proposed approach, called SimGNN, combines two strategies. First, we design a learnable embedding function that maps every graph into an embedding vector, which provides a global summary of a graph. A novel attention mechanism is proposed to emphasize the important nodes with respect to a specific similarity metric. Second, we design a pairwise node comparison method to supplement the graph-level embeddings with fine-grained node-level information. Our model achieves better generalization on unseen graphs, and in the worst case runs in quadratic time with respect to the number of nodes in two graphs. Taking GED computation as an example, experimental results on three real graph datasets demonstrate the effectiveness and efficiency of our approach. Specifically, our model achieves smaller error rate and great time reduction compared against a series of baselines, including several approximation algorithms on GED computation, and many existing graph neural network based models. Our study suggests SimGNN provides a new direction for future research on graph similarity computation and graph similarity search.
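To make the attention step from the abstract more concrete, here is a minimal pure-Python sketch of attention-based pooling as it is commonly described for SimGNN: a global context vector is built from the mean node embedding, each node gets a sigmoid weight from its inner product with that context, and the graph embedding is the weighted sum of node embeddings. This is an illustration under assumptions, not the Keras layer from this repository; the function name `attention_pool` and the plain-list data layout are hypothetical.

```python
import math


def attention_pool(node_embeddings, weight):
    """Pool N node embeddings (each of dimension D) into one graph embedding.

    node_embeddings: list of N lists of D floats.
    weight: D x D matrix as a list of lists (learned in a real model).
    Returns (graph_embedding, attention_scores).
    """
    n = len(node_embeddings)
    d = len(node_embeddings[0])

    # Mean of the node embeddings, one value per dimension.
    mean = [sum(u[j] for u in node_embeddings) / n for j in range(d)]

    # Global context: tanh applied to (weight matrix times mean embedding).
    context = [
        math.tanh(sum(weight[i][j] * mean[j] for j in range(d)))
        for i in range(d)
    ]

    # One attention score per node: sigmoid of its inner product with the context.
    scores = [
        1.0 / (1.0 + math.exp(-sum(ui * ci for ui, ci in zip(u, context))))
        for u in node_embeddings
    ]

    # Graph embedding: attention-weighted sum of the node embeddings.
    graph_embedding = [
        sum(a * u[j] for a, u in zip(scores, node_embeddings))
        for j in range(d)
    ]
    return graph_embedding, scores


# Toy usage with a zero weight matrix, so every node gets the same score.
toy_embedding, toy_scores = attention_pool(
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
)
```

With a zero weight matrix the context vector is zero, so every node receives a score of 0.5; a trained weight matrix is what lets some nodes count more than others.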

Resources

Paper can be found here: Link.
Tensorflow implementation from author: Link
Another Implementation: Link
Medium article: Link

Results

The test error reported below, of order 8 x 10^-3, was obtained by training for 200 epochs on synthetic data (11k graph pairs, i.e. 0.224 times the size of the original data), reaching a training loss of order 10^-4.

Method         Test Error (Synth Data)   Test Error (SimGNN paper)
SimpleMean     1.15 x 10^-2              3.749 x 10^-3
SimGNN (Att)   8.80 x 10^-3              1.455 x 10^-3
Difference     0.0027                    0.002294
Ratio          1.307                     2.5766
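The Difference and Ratio rows can be reproduced from the two error rows. The short sketch below recomputes them, assuming that Difference means SimpleMean minus SimGNN (Att) and Ratio means SimpleMean divided by SimGNN (Att); the variable and function names are illustrative only.

```python
# Test errors copied from the table above.
synth = {"SimpleMean": 1.15e-2, "SimGNN (Att)": 8.80e-3}
paper = {"SimpleMean": 3.749e-3, "SimGNN (Att)": 1.455e-3}


def compare(errors):
    """Return (difference, ratio) of the SimpleMean error vs. the attention error."""
    base = errors["SimpleMean"]
    att = errors["SimGNN (Att)"]
    return base - att, base / att


synth_diff, synth_ratio = compare(synth)
paper_diff, paper_ratio = compare(paper)

print(f"Synthetic data: difference={synth_diff:.4f}, ratio={synth_ratio:.3f}")
print(f"SimGNN paper:   difference={paper_diff:.6f}, ratio={paper_ratio:.4f}")
```

Read this way, attention gives a larger relative improvement over the simple mean in the paper's setting (ratio of about 2.58) than in this synthetic-data run (ratio of about 1.31).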

simgnn's People

Contributors

pulkit1joshi

simgnn's Issues

hi, I got an error

Hi, I got the following error. Do you have any idea what causes it?

Also, I'd like to know which TensorFlow version you used.

Thank you in advance!

==============================================

Traceback (most recent call last):
  File "./src/main.py", line 81, in <module>
    main()
  File "./src/main.py", line 60, in main
    model = simgnn(parser)
  File "D:\ys_wj\SimGNN-keras\SimGNN-main\src\simgnn.py", line 28, in simgnn
    x = shared_attention(x[0])
  File "C:\Users\numa97\Anaconda3\envs\gnn\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 721, in __call__
    base_layer_utils.create_keras_history(inputs)
  File "C:\Users\numa97\Anaconda3\envs\gnn\lib\site-packages\tensorflow_core\python\keras\engine\base_layer_utils.py", line 189, in create_keras_history
    _, created_layers = _create_keras_history_helper(tensors, set(), [])
  File "C:\Users\numa97\Anaconda3\envs\gnn\lib\site-packages\tensorflow_core\python\keras\engine\base_layer_utils.py", line 260, in _create_keras_history_helper
    layer_inputs, op.outputs)
  File "C:\Users\numa97\Anaconda3\envs\gnn\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 2032, in _add_inbound_node
    input_tensors)
  File "C:\Users\numa97\Anaconda3\envs\gnn\lib\site-packages\tensorflow_core\python\util\nest.py", line 569, in map_structure
    structure[0], [func(*x) for x in entries],
  File "C:\Users\numa97\Anaconda3\envs\gnn\lib\site-packages\tensorflow_core\python\util\nest.py", line 569, in <listcomp>
    structure[0], [func(*x) for x in entries],
  File "C:\Users\numa97\Anaconda3\envs\gnn\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py", line 2031, in <lambda>
    inbound_layers = nest.map_structure(lambda t: t._keras_history.layer,
AttributeError: 'tuple' object has no attribute 'layer'

Does this not work for any shape of graph?

Hey,

I tried testing it with only one training data set:

{
"labels_1": ["11", "11", "9"], 
"labels_2": ["8", "11", "5"], 
"graph_2": [[0,1],[1,2]], 
"ged": 11, 
"graph_1": [[0,1],[1,3]]
}

Which results in the following error:

ValueError: Dimensions must be equal, but are 8 and 16 for '{{node functional_1/graph_conv/MatMul_1}} = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false](functional_1/graph_conv/Reshape, functional_1/graph_conv/Reshape_1)' with input shapes: [3,8], [16,64].

So how does this work? I tried to wrap my head around it, but it is not clear to me which graph pairs work and which don't.
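A quick sanity check on a graph-pair record can help narrow down problems like this before the data reaches the model. The sketch below assumes that node ids in the edge lists are zero-based positions into the matching label list; that is an assumption about the data format, not something the repository confirms, and the helper names `check_graph` and `check_pair` are hypothetical.

```python
def check_graph(edges, labels, name):
    """Return a list of problems found in one graph of a pair record."""
    problems = []
    num_nodes = len(labels)
    for u, v in edges:
        for node in (u, v):
            # Assumption: valid node ids are 0 .. len(labels) - 1.
            if not 0 <= node < num_nodes:
                problems.append(
                    f"{name}: edge [{u}, {v}] references node {node}, "
                    f"but only {num_nodes} labels exist"
                )
    return problems


def check_pair(record):
    """Check both graphs of a record that uses the keys shown in the issue."""
    return (
        check_graph(record["graph_1"], record["labels_1"], "graph_1")
        + check_graph(record["graph_2"], record["labels_2"], "graph_2")
    )


# The record from the issue above, copied as-is.
record = {
    "labels_1": ["11", "11", "9"],
    "labels_2": ["8", "11", "5"],
    "graph_2": [[0, 1], [1, 2]],
    "ged": 11,
    "graph_1": [[0, 1], [1, 3]],
}

for problem in check_pair(record):
    print(problem)
```

Under that zero-based assumption, the check flags that graph_1 references node 3 while labels_1 has only three entries. Whether this is related to the MatMul shape mismatch is not established here; the mismatch between 8 and 16 input features may have a separate cause.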

Synthetic Data-set creation

Currently, the model is using the IMDB dataset in the format of JSON files. The next task is to generate a synthetic data-set for better performance checking.

Data format :

  1. Edge list
  2. Node labels
  3. Graph Edit Distance

The graph needs to be connected and exported in JSON format. GED can be calculated using any standard algorithm for 16 nodes.
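One possible way to produce connected graphs in this format is sketched below, using only the standard library: a random spanning tree guarantees connectivity, and a few extra random edges are added on top. The function names, the choice of node degrees as labels, and the JSON keys (taken from the record shown in the issue above) are illustrative assumptions. The sketch does not compute GED; the `ged` field is left for a separate solver to fill in.

```python
import json
import random


def random_connected_graph(num_nodes, extra_edges, rng):
    """Return a sorted edge list of a connected graph on nodes 0..num_nodes-1."""
    order = list(range(num_nodes))
    rng.shuffle(order)
    edges = set()
    # Random spanning tree: attach each node to one that was placed earlier.
    for i in range(1, num_nodes):
        u, v = order[i], order[rng.randrange(i)]
        edges.add((min(u, v), max(u, v)))
    # Add extra edges, capped at the size of the complete graph.
    max_edges = num_nodes * (num_nodes - 1) // 2
    target = min(len(edges) + extra_edges, max_edges)
    while len(edges) < target:
        u, v = rng.sample(range(num_nodes), 2)
        edges.add((min(u, v), max(u, v)))
    return sorted([u, v] for (u, v) in edges)


def is_connected(num_nodes, edges):
    """Check connectivity with a depth-first traversal from node 0."""
    neighbours = {node: set() for node in range(num_nodes)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == num_nodes


def degree_labels(num_nodes, edges):
    """Illustrative labels: each node's degree as a string (an assumption)."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [str(d) for d in degree]


rng = random.Random(0)
graph_1 = random_connected_graph(16, 4, rng)
graph_2 = random_connected_graph(16, 4, rng)
pair = {
    "graph_1": graph_1,
    "labels_1": degree_labels(16, graph_1),
    "graph_2": graph_2,
    "labels_2": degree_labels(16, graph_2),
    "ged": None,  # placeholder: to be computed by a GED algorithm
}
pair_json = json.dumps(pair)
```

Seeding the generator with `random.Random(0)` keeps the output reproducible, which makes it easier to compare runs once GED values are attached.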

Model not getting saved.

Issue: The model's error before saving and its error after saving do not match. The problem might be with the custom layers used. This needs to be checked.

Import Error

ImportError: cannot import name 'parameter_parser' from 'parser' (unknown location)
How can I fix it?

Updating current network.

The network is currently fully built; however, a few output dimensions (NTN layer and Dense layer) do not match those given in the paper linked in the README (see the experimental settings in the SimGNN paper). The network needs to be changed to reproduce the results reported in the paper.
