rusty1s / deep-graph-matching-consensus Goto Github PK

View Code? Open in Web Editor NEW

254.0 10.0 47.0 3.59 MB

Implementation of "Deep Graph Matching Consensus" in PyTorch

Home Page: https://openreview.net/forum?id=HyeJf1HKvS

License: MIT License

Python 100.00%

pytorch geometric-deep-learning graph-neural-networks graph-matching neighborhood-consensus

deep-graph-matching-consensus's Introduction

Founding Engineer 🤓 @ kumo.ai - PhD from TU Dortmund University

👨🏼‍💻 I love coding and learning new things
🤩 I'm interested in

deep-graph-matching-consensus's People

Contributors

Stargazers

Watchers

Forkers

study2014 jingmouren janericlenssen gaopeng5 tongcu tigerneil tarekbouamer ihaeyong mamat-jumana mesax1 lujiayu0807 lifanghe youjibiying cxxszz swpper freegliboracle roozbehsanaei ffzhang1231 keechang-choi anyuanay chenjun815 zhanlgu hjmark2010 ziqichai nmadali97 hiccgoal xublack kento-yang xiaobai0419 pharrellyhy mightycrane frydump cipher982 qmwz518 stjordanis cosmoshua mr852 cryptowealth-technology moguizhizi guoyang-xie leron33 fclairec xun468 flaviagaia fanliaoooo smujiang

deep-graph-matching-consensus's Issues

Hi
Thanks for your great work!
Now I'm experimenting in an environment as close as possible to the one in Section 4.1,
but I'm not getting the high precision results shown in Figure 2.
Could you please tell me how you generated the random graph?
It would also be helpful if you could tell me how you set the hyperparameters such as the number of epochs.

How to locate key point on a unlabeled image?

Thank you for share such great paper and code!

Somehow I am still a little bit confused after reading the code. You generate the key point features of the training and test dataset using a VGG16 network, this could be done because all datasets key points are labeled.

Let's assume there is an unlabeled image, say, an image which has a person in it, is there any way to use your model to infer where those body key points are? Or, Do I need to assume that the key points have already been located but the label name of the points are unknown, thus we can use your model to match the label name to each key point?

some simple questions about the paper and the code

Hi,

Thanks for the great work. I'm just reading the paper and wondering why the initial loss only uses correct mappings?

Besides, could you explain more about the node function space, e.g., L(G_s) and hat(x_s)?

Different results, when testing with different batch sizes

When I use the synthetic dataset not only for training, but also for testing and then change the batch size for testing only, I get different results. I think the test batch size should not have any influence on the final accuracy.

Modified examples/pascal_pf.py to reproduce problem:

import os.path as osp
import random

import argparse
import torch
from torch_geometric.data import Data, DataLoader
import torch_geometric.transforms as T

from dgmc.models import DGMC, SplineCNN

parser = argparse.ArgumentParser()
parser.add_argument('--dim', type=int, default=256)
parser.add_argument('--rnd_dim', type=int, default=64)
parser.add_argument('--num_layers', type=int, default=2)
parser.add_argument('--num_steps', type=int, default=10)
args = parser.parse_args()


class RandomGraphDataset(torch.utils.data.Dataset):
    def __init__(self, min_inliers, max_inliers, min_outliers, max_outliers,
                 min_scale=0.9, max_scale=1.2, noise=0.05, transform=None, len=32):

        self.min_inliers = min_inliers
        self.max_inliers = max_inliers
        self.min_outliers = min_outliers
        self.max_outliers = max_outliers
        self.min_scale = min_scale
        self.max_scale = max_scale
        self.noise = noise
        self.transform = transform
        self.len = len

    def __len__(self):
        return self.len

    def __getitem__(self, idx):

        # get always different but reproducible instance
        random.seed(idx)
        torch.manual_seed(idx)

        num_inliers = random.randint(self.min_inliers, self.max_inliers)
        num_outliers = random.randint(self.min_outliers, self.max_outliers)

        pos_s = 2 * torch.rand((num_inliers, 2)) - 1
        pos_t = pos_s + self.noise * torch.randn_like(pos_s)

        y_s = torch.arange(pos_s.size(0))
        y_t = torch.arange(pos_t.size(0))

        pos_s = torch.cat([pos_s, 3 - torch.rand((num_outliers, 2))], dim=0)
        pos_t = torch.cat([pos_t, 3 - torch.rand((num_outliers, 2))], dim=0)

        data_s = Data(pos=pos_s, y_index=y_s)
        data_t = Data(pos=pos_t, y=y_t)

        if self.transform is not None:
            data_s = self.transform(data_s)
            data_t = self.transform(data_t)

        data = Data(num_nodes=pos_s.size(0))
        for key in data_s.keys:
            data['{}_s'.format(key)] = data_s[key]
        for key in data_t.keys:
            data['{}_t'.format(key)] = data_t[key]

        return data


transform = T.Compose([
    T.Constant(),
    T.KNNGraph(k=8),
    T.Cartesian(),
])

path = osp.join('..', 'data', 'PascalPF')
test_dataset = RandomGraphDataset(30, 60, 0, 20, transform=transform, len=64)

device = 'cuda' if torch.cuda.is_available() else 'cpu'
psi_1 = SplineCNN(1, args.dim, 2, args.num_layers, cat=False, dropout=0.0)
psi_2 = SplineCNN(args.rnd_dim, args.rnd_dim, 2, args.num_layers, cat=True,
                  dropout=0.0)
model = DGMC(psi_1, psi_2, num_steps=args.num_steps).to(device)


@torch.no_grad()
def test(dataset, batch_size=1):
    model.eval()

    test_loader = DataLoader(dataset, batch_size=batch_size, shuffle=False,
                             follow_batch=['x_s', 'x_t'])

    correct = num_examples = 0
    for i, data in enumerate(test_loader):
        data = data.to(device)
        S_0, S_L = model(data.x_s, data.edge_index_s, data.edge_attr_s,
                         data.x_s_batch, data.x_t, data.edge_index_t,
                         data.edge_attr_t, data.x_t_batch)
        y = torch.stack([data.y_index_s, data.y_t], dim=0)
        correct += model.acc(S_L, y, reduction='sum')
        num_examples += y.size(1)

    return correct / num_examples


# no training for minimal example
test_acc1 = 100 * test(test_dataset, 1)
test_acc4 = 100 * test(test_dataset, 4)

print(f'Acc1: {test_acc1:.2f}, Acc4: {test_acc4:.2f}')

Output:

Acc1: 3.92, Acc4: 3.75

But Acc1 and Acc4 should actually be the same. I guess this problem is related to other datasets as well

Regarding semi-supervised training

Hello Matthias @rusty1s,

Thanks for an amazing code base here and with PyTorch Geometric.

I have a small query regarding using the code and not an issue as such: my use case concerns semi-supervised training, where I have two sets of graphs, with each graph containing roughly 300 nodes each but GT correspondence is available for a subset of nodes only (<20). As inspiration, I began to look at how the dbp15k dataset class is written since this concerns semi supervised setting as well.

Am I right in believing that essentially I need to pass the indices of the nodes for which GT correspondence is available, in the 2 x N y_train tensor variable? So something like y_train = [[12, 23, 90, 310, 5, ...], [4, 23, 87, 21, 98, ...]] would work where the first row contains the index of the node in the moving graph and the second row is the index of the node in the fixed graph, or am I misunderstanding how this variable is constructed?

Thank you!

Generating Visualizations

Hi! I was wondering if you have a script to generate visualizations like in the qualitative results section of the paper? I am struggling to recover the original images from the torch-geometric dataset.

'GlobalStorage' object has no attribute 'face'

Context :

(pop) sahmed9@alice:~/reps/deep-graph-matching-consensus/examples$ CUDA_VISIBLE_DEVICES=3 python pascal.py 

Processing...

Done!
Traceback (most recent call last):
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/storage.py", line 48, in __getattr__
    return self[key]
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/storage.py", line 68, in __getitem__
    return self._mapping[key]
KeyError: 'face'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pascal.py", line 37, in <module>
    train_datasets += [ValidPairDataset(dataset, dataset, sample=True)]
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/dgmc-1.0.0-py3.8.egg/dgmc/utils/data.py", line 80, in __init__
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/dgmc-1.0.0-py3.8.egg/dgmc/utils/data.py", line 84, in __compute_pairs__
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/dataset.py", line 199, in __getitem__
    data = data if self.transform is None else self.transform(data)
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/transforms/compose.py", line 20, in __call__
    data = transform(data)
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/transforms/face_to_edge.py", line 19, in __call__
    if data.face is not None:
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/data.py", line 345, in __getattr__
    return getattr(self._store, key)
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/storage.py", line 50, in __getattr__
    raise AttributeError(
AttributeError: 'GlobalStorage' object has no attribute 'face'

env :

# packages in environment at /home/sahmed9/anaconda3/envs/pop:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             4.5                       1_gnu  
blas                      1.0                         mkl  
brotlipy                  0.7.0           py38h497a2fe_1001    conda-forge
bzip2                     1.0.8                h7b6447c_0  
ca-certificates           2021.5.30            ha878542_0    conda-forge
certifi                   2021.5.30        py38h578d9bd_0    conda-forge
cffi                      1.14.6           py38ha65f79e_0    conda-forge
chardet                   4.0.0            py38h578d9bd_1    conda-forge
colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
cryptography              3.4.7            py38ha5dfef3_0    conda-forge
cudatoolkit               11.1.74              h6bb024c_0    nvidia
decorator                 5.1.0              pyhd8ed1ab_0    conda-forge
ffmpeg                    4.2.2                h20bf706_0  
freetype                  2.10.4               h5ab3b9f_0  
gmp                       6.2.1                h2531618_2  
gnutls                    3.6.15               he1e5248_0  
googledrivedownloader     0.4                pyhd3deb0d_1    conda-forge
idna                      2.10               pyh9f0ad1d_0    conda-forge
intel-openmp              2021.3.0          h06a4308_3350  
jinja2                    3.0.1              pyhd8ed1ab_0    conda-forge
joblib                    1.0.1              pyhd8ed1ab_0    conda-forge
jpeg                      9b                   h024ee3a_2  
lame                      3.100                h7b6447c_0  
lcms2                     2.12                 h3be6417_0  
ld_impl_linux-64          2.35.1               h7274673_9  
libffi                    3.3                  he6710b0_2  
libgcc-ng                 9.3.0               h5101ec6_17  
libgfortran-ng            7.5.0               h14aa051_19    conda-forge
libgfortran4              7.5.0               h14aa051_19    conda-forge
libgomp                   9.3.0               h5101ec6_17  
libidn2                   2.3.2                h7f8727e_0  
libopus                   1.3.1                h7b6447c_0  
libpng                    1.6.37               hbc83047_0  
libstdcxx-ng              9.3.0               hd4cf53a_17  
libtasn1                  4.16.0               h27cfd23_0  
libtiff                   4.2.0                h85742a9_0  
libunistring              0.9.10               h27cfd23_0  
libuv                     1.40.0               h7b6447c_0  
libvpx                    1.7.0                h439df22_0  
libwebp-base              1.2.0                h27cfd23_0  
lz4-c                     1.9.3                h295c915_1  
markupsafe                2.0.1            py38h497a2fe_0    conda-forge
mkl                       2021.3.0           h06a4308_520  
mkl-service               2.4.0            py38h7f8727e_0  
mkl_fft                   1.3.0            py38h42c9631_2  
mkl_random                1.2.2            py38h51133e4_0  
ncurses                   6.2                  he6710b0_1  
nettle                    3.7.3                hbbd107a_1  
networkx                  2.5                        py_0    conda-forge
ninja                     1.10.2               hff7bd54_1  
numpy                     1.20.3           py38hf144106_0  
numpy-base                1.20.3           py38h74d4b33_0  
olefile                   0.46               pyhd3eb1b0_0  
openh264                  2.1.0                hd408876_0  
openjpeg                  2.4.0                h3ad879b_0  
openssl                   1.1.1k               h7f98852_0    conda-forge
pandas                    1.3.1            py38h1abd341_0    conda-forge
pillow                    8.3.1            py38h2c7a002_0  
pip                       21.2.2           py38h06a4308_0  
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pyg                       2.0.1           py38_torch_1.8.0_cu111    pyg
pyopenssl                 20.0.1             pyhd8ed1ab_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pysocks                   1.7.1            py38h578d9bd_3    conda-forge
python                    3.8.11          h12debd9_0_cpython  
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-louvain            0.15               pyhd3deb0d_0    conda-forge
python_abi                3.8                      2_cp38    conda-forge
pytorch                   1.8.2           py3.8_cuda11.1_cudnn8.0.5_0    pytorch-lts
pytorch-cluster           1.5.9           py38_torch_1.8.0_cu111    pyg
pytorch-scatter           2.0.8           py38_torch_1.8.0_cu111    pyg
pytorch-sparse            0.6.12          py38_torch_1.8.0_cu111    pyg
pytorch-spline-conv       1.2.1           py38_torch_1.8.0_cu111    pyg
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pyyaml                    5.4.1            py38h497a2fe_0    conda-forge
readline                  8.1                  h27cfd23_0  
requests                  2.25.1             pyhd3deb0d_0    conda-forge
scikit-learn              0.24.2           py38ha9443f7_0  
scipy                     1.7.1            py38h292c36d_2  
setuptools                58.0.4           py38h06a4308_0  
six                       1.16.0             pyhd3eb1b0_0  
sqlite                    3.36.0               hc218d9a_0  
threadpoolctl             2.2.0              pyh8a188c0_0    conda-forge
tk                        8.6.10               hbc83047_0  
torchaudio                0.8.2                      py38    pytorch-lts
torchvision               0.9.2                py38_cu111    pytorch-lts
tqdm                      4.62.3             pyhd8ed1ab_0    conda-forge
typing_extensions         3.10.0.2           pyh06a4308_0  
urllib3                   1.26.6             pyhd8ed1ab_0    conda-forge
wheel                     0.37.0             pyhd3eb1b0_1  
x264                      1!157.20191217       h7b6447c_0  
xz                        5.2.5                h7b6447c_0  
yacs                      0.1.6                      py_0    conda-forge
yaml                      0.2.5                h516909a_0    conda-forge
zlib                      1.2.11               h7b6447c_3  
zstd                      1.4.9                haebb681_0

Reproducing MLP performance on PascalVOC

Hi, I am interested in training a weak feature matching model and want to reproduce the MLP's performance in Table1.

Specifically, I want to reproduce the performance without using DGMC, the row of MLP isotropic L=0.

I have reused the feature loading and training code in pascal.py. However, my performance is only 43.99.

The following is my MLP implementation. Would you give some advice on improving the performance? Thanks!

class MLP(nn.Module):
    def __init__(self, in_dim, out_dim, hid_dim, dropout=0.5, num_layer=3):
        super(MLP, self).__init__()
        assert num_layer >= 1
        if num_layer == 1:
            hid_dim = out_dim

        self.layers = nn.ModuleList([])
        self.layers.append(nn.Dropout(dropout))
        self.layers.append(nn.Linear(in_dim, hid_dim))
        for i in range(num_layer - 1):
            self.layers.append(nn.ReLU())
            self.layers.append(nn.Dropout(dropout))
            if i == num_layer - 2:
                self.layers.append(nn.Linear(hid_dim, out_dim))
            else:
                self.layers.append(nn.Linear(hid_dim, hid_dim))
        self.residual = nn.Linear(in_dim, out_dim)

    def forward(self, x_s, x_t, batch_s, batch_t):
        h_s = x_s
        h_t = x_t
        for layer in self.layers:
            h_s = layer(h_s)
            h_t = layer(h_t)
        h_s = h_s + self.residual(x_s)
        h_t = h_t + self.residual(x_t)

        h_s, s_mask = to_dense_batch(h_s, batch_s, fill_value=0)
        h_t, t_mask = to_dense_batch(h_t, batch_t, fill_value=0)
        (B, N_s, C_out), N_t = h_s.size(), h_t.size(1)
        S_hat = h_s @ h_t.transpose(-1, -2)  # [B, N_s, N_t, C_out]
        S_mask = s_mask.view(B, N_s, 1) & t_mask.view(B, 1, N_t)
        S = masked_softmax(S_hat, S_mask, dim=-1)[s_mask]
        return S

Experiment on node addition or removal

Hi! Congratulations on your paper, I successfully reproduced your experiment in section 4.1, and the result are exactly like mentioned in paper (sometimes even better), the refinement step has a crucial impact on increasing the convergence speed and improving Hits@1. The promising result gave me a huge encouragement, so I try to conduct the experiment "Robustness towards node addition or removal", the parameters remain unchanged like experiment aforementioned, since there are more nodes in the target graph than in the source graph, I assume that the size of both y and y_t should be the number of nodes in the target graph. So how to set y of newly added node rise the issue, no matter how I set y and y_t, I only got Hits@1 around 0.6 (when q = 0.1 and epoch = 1930). Could you please share more details on how you conduct this experiment, I will show you how I set y and y_t in the end, any help from you will be appreciated.
And there is another issue, when V_s = 100, since the hidden dimensional of all MLP is set to 32, the in_channels of psi1 should be 32 too (if I'm not mistaken), and when we have 100 nodes, it seems that with one-hot degree embedding (I used torch to implement this part), the dimension of num_node_features cannot be 32, otherwise I got "RuntimeError: Class values must be smaller than num_classes", what I did is simply increase the dimension of num_node_features, up to like 50, it works perfectly, but still I am curious, have you ever encountered this problem and if so, could you please tell me how did you solve it?

Thank your for reading. I would appreciate any help you could provide. Have a nice day!

# just modify the size of y and y_t
y = torch.arange(num_node_t)
y_t = torch.arange(num_node_t)
y = torch.stack([y, y_t], dim=0)

# set y of extra nodes to -1
y_t = torch.arange(num_node_t)
if num_node_t >  num_node_s:
    list_node = [i for i in range(num_node_s)]
    for i in range(num_node_t - num_node_s):
        list_node.append(-1)
        y = torch.tensor(list_node)
else:
    y = torch.arange(num_node_t)
y = torch.stack([y, y_t], dim=0)

Sorting requirement for __include_gt__

Hello, I notice that if the ground truth y[0] is not sorted, __include_gt__ does not behave properly.
It might be worth mentioning this in the documentation.

Code to reproduce:

def test(self):
    h_s = torch.randn(1, 10, 20)
    h_t = torch.randn(1, 10, 20)
    s_mask = torch.ones(1, 10, dtype=torch.bool)
    y = torch.as_tensor([[2, 0, 1], [3, 4, 5]])
    # make sure top k doesn't include ground truth
    h_s[0, y[0]] = 100
    h_t[0, y[1]] = -100
    self.k = 1
    S_idx = self.__top_k__(h_s, h_t)
    S_rnd_idx = torch.zeros(1, 10, 1, dtype=torch.long)
    S_idx = torch.cat([S_idx, S_rnd_idx], dim=-1)
    S_idx = self.__include_gt__(S_idx, s_mask, y)
    mask = S_idx[0, y[0]] == y[1].view(-1, 1)
    print(mask.any(dim=-1))

CUDA out of memory for DBP15K example

There is a "CUDA out of memory" error arising when we run the DBP15K example. The system hint that the error happens in the "__top_k__" function of the dgmc.py file. We find that this function needs to establish a matrix space with the shape of (1,n_s,n_t,d), where n and d are the node number and dimension respectively. This is too huge to create. When we modify the function to standard matrix multiplication, this error is repaired. Although I think this modification is equivalent to the original function, I don't know whether this violates your original code design.

About the downloading datesets

TypeError: inc() takes 3 positional arguments but 4 were given

Context :


Traceback (most recent call last):
  File "pascal.py", line 116, in <module>
    loss = train()
  File "pascal.py", line 73, in train
    for data in train_loader:
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 557, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/loader/dataloader.py", line 39, in __call__
    return self.collate(batch)
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/loader/dataloader.py", line 19, in collate
    return Batch.from_data_list(batch, self.follow_batch,
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 63, in from_data_list
    batch, slice_dict, inc_dict = collate(
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/collate.py", line 76, in collate
    value, slices, incs = _collate(attr, values, data_list, stores,
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/collate.py", line 142, in _collate
    incs = get_incs(key, values, data_list, stores)
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/collate.py", line 187, in get_incs
    repeats = [
  File "/home/sahmed9/anaconda3/envs/pop/lib/python3.8/site-packages/torch_geometric/data/collate.py", line 188, in <listcomp>
    data.__inc__(key, value, store)
TypeError: __inc__() takes 3 positional arguments but 4 were given

Seeking for suggestions

Dear authors,
Sorry to bother you. This is a wonderful work and it gives me much inspiration!
Recently, I have a question about the details of the paper. I notice a statement in Section 4.4 of your paper that "We retrieve monolingual FASTTEXT embeddings for each language separately, and align those into the same vector space afterwards. We use the sum of word embeddings as the final entity input representation. ". However, recent work [1] has clarified that utilizing (monolingual) name embeddings as side information is tricky for entity alignment. Hence, I try to remove the the initial entity embedding obtained by names in this line of code. Here is my modification:

class MyEmbedding(object):
    def __call__(self, data):
        vocab_size1, vocab_dim1 = data.x1.shape[0], data.x1.shape[2]
        vocab_size2, vocab_dim2 = data.x2.shape[0], data.x2.shape[2]
        data.x1, data.x2 = nn.init.xavier_uniform_(torch.zeros(vocab_size1, vocab_dim1)), nn.init.xavier_uniform_(torch.zeros(vocab_size2, vocab_dim2)) 
        return data


data = DBP15K(path, args.category, transform=MyEmbedding())[0].to(device)

psi_1 = RelCNN(data.x1.size(-1), args.dim, args.num_layers, batch_norm=False,
               cat=True, lin=True, dropout=0.5)
psi_2 = RelCNN(args.rnd_dim, args.rnd_dim, args.num_layers, batch_norm=False,
               cat=True, lin=True, dropout=0.0)
model = DGMC(psi_1, psi_2, num_steps=None, k=args.k).to(device)

optimizer_grouped_parameters = [
    {"params": [data.x1, data.x2], 'lr': 0.001},
    {"params": [p for p in model.parameters()], 'lr': 0.001},
]
optimizer = torch.optim.Adam(optimizer_grouped_parameters, lr=0.001)

We use random initialization in MyEmbedding(). Furthermore, we add the entity embeddings (i.e., data.x1 and data.x2) as parameters into the optimizer. However, the obtained results are really bad(nearly zero in Hits@1 on all datasets, including ZH->EN, JA->EN and FR->EN). I want to know whether the obtained results are reasonable. I also give some guesses on the results.
I think the reason is that the proposed model in this paper adopts a two-stage neural architecture. The first stage will obtain an initial ranking of soft correspondences between entities in two knowledge graphs, while the second stage heavily relies on the output of the first stage. If the initial entity embeddings are randomly initialized, the output of the first stage will be meaningless. And the second stage cannot further refine the structural correspondences between knowledge graphs. I am not sure about this. Could you give me any suggestions? Thanks in advance!

[1] A Benchmarking Study of Embedding-based Entity Alignment for Knowledge Graphs

Reproduce Synthetic Experiment (paper sec. 4.1)

Hi,

Thanks for the amazing paper and code! This is not really an issue but I wonder if the authors can share instructions or code on how to reproduce the synthetic experiments section 4.1 (with Sinkhorn iterations). Any help or pointer are appreciated! Thanks!

experiment on undirected graph

Hi:

Thank you for your great work!

Now I'm trying to run your code on a unfeatured, undirected graph with a feature vector generated from Deepwalk. But the performance is really poor. I'm wondering if you can please provide some ideas on how should I set up a feature vector? Or how could I improve the performace?

Thanks!

Reproducing Paper Scores

Hi guys!

Would you mind providing example files that reproduce the scores from the paper?
The example files are not made for this, are they? The resulting scores are terrible :-P

Many thanks!
Dominik

Doubt about Multiple Graphs

When I run dgmc on batch multi-graphs, the S_L get the wrong size and values. But I look the test_dgmc and believe that the model support it. So I want to know that wether it can run the batch data on multi-graph

FaceToEdge Transform for 3D data

Hello Matthias @rusty1s,

Thanks for a very nice code base. I wanted to use the Delaunay and FaceToEdge Transform for building a graph with edges out of my 3D point cloud data data (size N x 3, each row is the x y z coordinates).

While composing a transform like the following:

transform = T.Compose([
    T.Delaunay(),
    T.FaceToEdge(),
    T.Cartesian(),
])

I see an output message like below:

File "/mnt/Softwares/miniconda3/envs/dgmcEnv/lib/python3.7/site-packages/torch_geometric/transforms/face_to_edge.py", line 21, in __call__
    edge_index = torch.cat([face[:2], face[1:], face[::2]], dim=1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2 but got size 3 for tensor number 1 in the list.

This error message goes away if I were to use a point cloud arising from just the x y coordinates. Would you have an intuition on why this message appears and how I can make my 3D point cloud data work with the Delaunay and FaceToEdge transforms?

Thank you very much!

A minor mistake？

Hi!
Thank you for sharing your code. I notice a minor mistake in your code when you extract features from a pre-trained VGG16.

Source code of pascal.py

vgg16_outputs = []
def hook(module, x, y):
     vgg16_outputs.append(y)

vgg16.features[20].register_forward_hook(hook)  # relu4_2
vgg16.features[25].register_forward_hook(hook)  # relu5_1

out1 = F.interpolate(vgg16_outputs[0], (256, 256), mode='bilinear',align_corners=False)
out2 = F.interpolate(vgg16_outputs[0], (256, 256), mode='bilinear', align_corners=False)

https://github.com/rusty1s/pytorch_geometric/blob/df0612bca76b14464b13c86759eec0c56ebc8154/torch_geometric/datasets/pascal.py#L226

The out2 is edge features from vgg16_outputs[1] ?

Anyway your code helps me a lot. ^_^

Thanks,
Qiafanqie