muhanzhang / igmc
Inductive graph-based matrix completion (IGMC) from "M. Zhang and Y. Chen, Inductive Matrix Completion Based on Graph Neural Networks, ICLR 2020 spotlight".
License: MIT License
Hi there,
Very interesting paper. However, when I train the model, it fails with "'RGCNConv' object has no attribute 'att'". When I look at the PyTorch Geometric documentation, there is no such attribute either. Please check it, cheers.
Regards
'IGMC' is a very good paper on recommender systems!
I'm a 'DGL' user, so I would like to port this code to 'DGL'.
Is a 'DGL' version available, or do you plan to release one?
Hi, I wonder how RGCN works on the user-item heterogeneous graph. The user-item graph has two types of nodes, and they have different features.
May I ask whether this project supports DDP training and inference on multiple GPUs and nodes?
Hi,
I've used 'visualize' option to view the subgraphs and predicted ratings. I wanted to know how to get the user and item ids corresponding to the 'red' and 'blue' nodes in the visualization.pdf so that it can be identified which target user-item pair this enclosing subgraph has been built around.
I am using this code for a new dataset, and I want to annotate the nodes in the visualization with the actual row and column indices so that the rating matrix can be tallied with the corresponding subgraph. Thanks.
Traceback (most recent call last):
File "Main.py", line 338, in <module>
multiply_by=multiply_by)
File "/Users/vasfern/Downloads/IGMC-master/models.py", line 146, in __init__
super(IGMC, self).__init__(dataset, GCNConv, latent_dim, regression, adj_dropout, force_undirected)
File "/Users/vasfern/Downloads/IGMC-master/models.py", line 20, in __init__
self.convs.append(gconv(dataset.num_features, latent_dim[0]))
File "/Users/vasfern/Downloads/IGMC-master/.venv/lib/python3.6/site-packages/torch_geometric/data/dataset.py", line 117, in num_features
return self.num_node_features
File "/Users/vasfern/Downloads/IGMC-master/.venv/lib/python3.6/site-packages/torch_geometric/data/dataset.py", line 112, in num_node_features
return self[0].num_node_features
File "/Users/vasfern/Downloads/IGMC-master/.venv/lib/python3.6/site-packages/torch_geometric/data/dataset.py", line 188, in __getitem__
data = self.get(self.indices()[idx])
TypeError: 'tuple' object is not callable
I hit the same problem as #9 on a large graph. The related code is in preprocessing.py, line 156:
labels = np.full((num_users, num_items), neutral_rating, dtype=np.int32)
Could you please give some suggestions? Thanks.
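To see why the `np.full((num_users, num_items), ...)` line quoted above becomes a problem on large graphs, a back-of-the-envelope estimate may help. This is only a sketch: the function names and the example sizes are made up for illustration, and the triplet estimate ignores any index overhead of a real sparse format.

```python
def dense_label_bytes(num_users, num_items, bytes_per_entry=4):
    """Bytes needed by a dense (num_users, num_items) array.

    np.int32 uses 4 bytes per entry, hence the default.
    """
    return num_users * num_items * bytes_per_entry


def sparse_triplet_bytes(num_ratings, bytes_per_entry=4):
    """Rough bytes for keeping only observed ratings as (row, col, value) triplets."""
    return 3 * num_ratings * bytes_per_entry


# Hypothetical sizes, chosen only for illustration.
num_users, num_items, num_ratings = 1_000_000, 500_000, 100_000_000

dense_gib = dense_label_bytes(num_users, num_items) / 2**30
sparse_gib = sparse_triplet_bytes(num_ratings) / 2**30
print(f"dense labels: {dense_gib:.1f} GiB, sparse triplets: {sparse_gib:.1f} GiB")
```

The dense cost grows with the product of users and items, while the triplet cost grows only with the number of observed ratings, which is why a dense label matrix stops fitting in memory long before the rating data itself does.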
Dear authors,
Recently I have been running subgraph-based GNN experiments on large-scale graphs, but I found it difficult to pre-process the subgraphs around edges on a large-scale bipartite graph.
Could you please give some suggestions? Thanks.
Regards,
Dear Dr. Zhang,
First, I'd like to say a big thank you for this great work and the paper.
I tried to run this program on my Windows PC.
After I ran "python Main.py --data-name yahoo_music --epochs 1 --testing --ensemble" in the PyCharm terminal (and likewise on the douban and flixster datasets), the program always ended up stuck at the same place, raising a runtime error like the one below:
The data folder ended up empty, but files were generated under the results folder.
Part of the terminal output was copied into a Google Doc. The output shows that this runtime error occurs throughout the run, but the program only gets stuck at the last occurrence.
https://docs.google.com/document/d/1EV-qGCuVi0t6q-jh1K5Dvnj9UASOLYq-djh6lD8FM7g/edit?usp=sharing
The Python and torch environment can be seen in the picture below.
I will try to run it again under Python 3.8.1 + PyTorch 1.4.0 + PyTorch_Geometric 1.4.2 to see what happens.
Thanks so much and best regards
Jiyang
Hi, Thanks for constantly maintaining this repo. I understand that this paper mainly focuses on the situation without side information, but in the experiments, IGMC seems to outperform many other models with side features. If we do have some side information (say, we have user features but not item features), can we add those features in a meaningful way to IGMC? Or could you recommend some other recent works in this regard (inductive recommendation w/ some side features)? Thanks in advance!
Hello,
I encountered an error when trying to construct 2-hop subgraphs by running python Main.py --data-name flixster --epochs 40 --testing --ensemble --hop 2
See error message below. Let me know if you need any further information. Thank you!
Namespace(ARR=0.001, adj_dropout=0.2, asin_pop_thres=25, batch_size=50, continue_from=None, data_appendix='', data_name='flixster', data_seed=1234, debug=False, dynamic_test=False, dynamic_train=False, dynamic_val=False, ensemble=True, epochs=40, force_undirected=False, hop='2', keep_old=False, lr=0.001, lr_decay_factor=0.1, lr_decay_step_size=50, max_nodes_per_hop=10000, max_test_num=None, max_train_num=None, max_val_num=None, multiply_by=1, no_train=False, num_relations=5, ratio=1.0, reprocess=False, sample_ratio=1.0, save_appendix='', save_interval=10, seed=1, standard_rating=False, test_freq=1, testing=True, transfer='', use_features=False, user_thres=140, visualize=False)
Command line input: python Main.py --data-name flixster --epochs 40 --testing --ensemble --hop 2
is saved.
number of users = 2341
number of item = 2956
User features shape: (3000, 3000)
Item features shape: (3000, 3000)
All ratings are:
[0.5 1. 1.5 2. 2.5 3. 3.5 4. 4.5 5. ]
#train: 23556, #val: 4712, #test: 2617
Processing...
Enclosing subgraph extraction begins...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [00:03<00:00, 9.79it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/ubuntu/anaconda3/envs/igmc/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/ubuntu/anaconda3/envs/igmc/lib/python3.6/multiprocessing/pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/home/ubuntu/proj/IGMC/util_functions.py", line 217, in subgraph_extraction_labeling
v_fringe, u_fringe = neighbors(u_fringe, Arow), neighbors(v_fringe, Acol)
File "/home/ubuntu/proj/IGMC/util_functions.py", line 302, in neighbors
return set(A[list(fringe)].indices)
File "/home/ubuntu/proj/IGMC/util_functions.py", line 61, in __getitem__
indices = np.concatenate(self.indices[col_selector])
File "<__array_function__ internals>", line 6, in concatenate
ValueError: need at least one array to concatenate
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "Main.py", line 337, in <module>
max_num=args.max_train_num
File "/home/ubuntu/proj/IGMC/util_functions.py", line 91, in __init__
super(MyDataset, self).__init__(root)
File "/home/ubuntu/anaconda3/envs/igmc/lib/python3.6/site-packages/torch_geometric/data/in_memory_dataset.py", line 53, in __init__
pre_filter)
File "/home/ubuntu/anaconda3/envs/igmc/lib/python3.6/site-packages/torch_geometric/data/dataset.py", line 93, in __init__
self._process()
File "/home/ubuntu/anaconda3/envs/igmc/lib/python3.6/site-packages/torch_geometric/data/dataset.py", line 166, in _process
self.process()
File "/home/ubuntu/proj/IGMC/util_functions.py", line 106, in process
self.class_values, self.parallel)
File "/home/ubuntu/proj/IGMC/util_functions.py", line 190, in links2subgraphs
results = results.get()
File "/home/ubuntu/anaconda3/envs/igmc/lib/python3.6/multiprocessing/pool.py", line 644, in get
raise self._value
File "/home/ubuntu/anaconda3/envs/igmc/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/ubuntu/anaconda3/envs/igmc/lib/python3.6/multiprocessing/pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/home/ubuntu/proj/IGMC/util_functions.py", line 217, in subgraph_extraction_labeling
v_fringe, u_fringe = neighbors(u_fringe, Arow), neighbors(v_fringe, Acol)
File "/home/ubuntu/proj/IGMC/util_functions.py", line 302, in neighbors
return set(A[list(fringe)].indices)
File "/home/ubuntu/proj/IGMC/util_functions.py", line 61, in __getitem__
indices = np.concatenate(self.indices[col_selector])
File "<__array_function__ internals>", line 6, in concatenate
ValueError: need at least one array to concatenate
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [00:03<00:00, 9.64it/s]
python Main.py --data-name flixster --epochs 40 --testing --ensemble --hop 2 12.19s user 1.34s system 130% cpu 10.410 total
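The final `ValueError` in the traceback above is what `np.concatenate` raises when it is handed an empty sequence, which would happen if a fringe set becomes empty while expanding to the second hop. A minimal sketch of a guard follows. It is not the repository's code: `concat_or_empty`, the list-of-arrays adjacency, and this simplified `neighbors` are stand-ins invented for illustration.

```python
import numpy as np


def concat_or_empty(arrays, dtype=np.int64):
    """Concatenate arrays, returning an empty array instead of raising on no input."""
    arrays = list(arrays)
    if len(arrays) == 0:
        return np.empty(0, dtype=dtype)
    return np.concatenate(arrays)


def neighbors(fringe, adjacency_lists):
    """Union of neighbor ids over all nodes in `fringe`.

    `adjacency_lists[i]` is assumed to be a 1-D array of neighbor ids of node i
    (a hypothetical stand-in for the sparse-matrix wrapper in the traceback).
    """
    selected = [adjacency_lists[i] for i in fringe]
    return set(concat_or_empty(selected).tolist())


# Toy adjacency: node 0 -> {1, 2}, node 1 -> {2}, node 2 -> {}.
adjacency_lists = [np.array([1, 2]), np.array([2]), np.array([], dtype=np.int64)]

print(neighbors({0, 1}, adjacency_lists))  # non-empty fringe
print(neighbors(set(), adjacency_lists))   # empty fringe no longer raises
```

Whether an empty fringe is really the trigger in the flixster run is an assumption; the traceback only shows that the list passed to `np.concatenate` was empty.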
Hello, I'm a newbie with GNNs and interested in recommender systems using graphs.
Thankfully I found a nice paper, IGMC.
I'm trying to understand the subgraph generation in this code.
What I did was:
import numpy as np
import scipy.sparse as ssp

# (row, col, rating) triplets of the observed entries in the 7x10 rating matrix
row = np.array([0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 6])
col = np.array([0, 1, 2, 4, 8, 4, 6, 7, 1, 3, 4, 6, 0, 6, 7, 8, 2, 3, 5, 7, 9, 1, 3, 5, 8, 0, 2, 5, 7, 9])
rat = np.array([1, 2, 5, 2, 4, 4, 5, 2, 1, 4, 3, 5, 5, 5, 1, 2, 1, 2, 5, 5, 4, 1, 4, 4, 1, 5, 4, 3, 2, 5])
rat = rat - 1  # value to index (a rating of 1 becomes an explicitly stored 0)
ACsr = ssp.csr_matrix((rat, (row, col)))
Note: these values are the same as in the picture of your example rating table.
(<7x10 sparse matrix of type ''
with 30 stored elements in Compressed Sparse Row format>,
array([[0, 1, 4, 0, 1, 0, 0, 0, 3, 0],
[0, 0, 0, 0, 3, 0, 4, 1, 0, 0],
[0, 0, 0, 3, 2, 0, 4, 0, 0, 0],
[4, 0, 0, 0, 0, 0, 4, 0, 1, 0],
[0, 0, 0, 1, 0, 4, 0, 4, 0, 3],
[0, 0, 0, 3, 0, 3, 0, 0, 0, 0],
[4, 0, 3, 0, 0, 2, 0, 1, 0, 4]]))
Batch(x=[120, 4], edge_index=[2, 112], y=[30], edge_type=[112], batch=[120], ptr=[31])
edge_index (2, 116)
y (30,)
edge_type (116,)
edge_index [[ 1 0 2 3 4 7 9 8 10 11 13 12 14 15 17 16 18 19
21 20 22 23 25 24 26 27 29 28 30 31 33 32 34 35 37 36
38 39 41 40 42 43 45 44 46 47 49 49 50 51 53 53 54 55
57 56 58 59 65 64 66 67 69 68 70 71 73 72 74 75 77 76
78 79 81 80 81 82 83 83 85 85 86 87 89 88 89 90 91 91
93 94 97 96 98 99 101 100 102 103 104 105 107 107 109 108 110 111
112 115 117 116 117 118 119 119]
[ 2 3 1 0 7 4 10 11 9 8 14 15 13 12 18 19 17 16
22 23 21 20 26 27 25 24 30 31 29 28 34 35 33 32 38 39
37 36 42 43 41 40 46 47 45 44 50 51 49 49 54 55 53 53
58 59 57 56 66 67 65 64 70 71 69 68 74 75 73 72 78 79
77 76 82 83 83 81 80 81 86 87 85 85 90 91 91 89 88 89
94 93 98 99 97 96 102 103 101 100 107 107 104 105 110 111 109 108
115 112 118 119 119 117 116 117]]
y [1. 2. 5. 2. 4. 4. 5. 2. 1. 4. 3. 5. 5. 5. 1. 2. 1. 2. 5. 5. 4. 1. 4. 4.
1. 5. 4. 3. 2. 5.]
edge_type [3 3 3 3 3 3 2 0 2 0 2 0 2 0 0 0 0 0 1 0 1 0 3 2 3 2 0 3 0 3 0 2 0 2 0 1 0
1 0 2 0 2 3 2 3 2 3 0 3 0 3 0 3 0 3 3 3 3 3 3 3 3 2 2 2 2 1 0 1 0 0 0 0 0
3 3 0 3 3 0 0 2 0 2 0 2 3 0 2 3 1 1 2 2 2 2 3 3 3 3 3 2 3 2 3 3 3 3 1 1 2
0 3 2 0 3]
ADJ (120, 120) [[0 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0]
[0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0]
[0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0]
[4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0]
Can you explain how ADJ is obtained?
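One reading that is consistent with the printed values: the first four edges in `edge_index` are (1, 2), (0, 3), (2, 1), (3, 0), all with `edge_type` 3, and ADJ holds 4 at exactly those positions. That suggests ADJ is a dense matrix with `edge_type + 1` written at each (source, target) pair, so that 0 can mean "no edge". The sketch below rebuilds such a matrix under that assumption; `dense_adjacency` is a hypothetical helper, not a function from the repository.

```python
import numpy as np


def dense_adjacency(edge_index, edge_type, num_nodes):
    """Dense (num_nodes, num_nodes) matrix with edge_type + 1 at each (src, dst).

    The +1 offset is an assumption inferred from the printout: it keeps 0 free
    to mean "no edge" even when an edge has type 0.
    """
    adj = np.zeros((num_nodes, num_nodes), dtype=np.int64)
    src, dst = edge_index
    adj[src, dst] = edge_type + 1
    return adj


# The first four edges from the printout above, restricted to nodes 0..3.
edge_index = np.array([[1, 0, 2, 3],
                       [2, 3, 1, 0]])
edge_type = np.array([3, 3, 3, 3])

adj = dense_adjacency(edge_index, edge_type, num_nodes=4)
print(adj)
```

If the same (source, target) pair appears more than once, the last assignment wins in this sketch.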
Hello,
I tried running
python Main.py --data-name flixster --epochs 40 --testing --ensemble
and got:
File "Main.py", line 11, in <module>
from util_functions import *
File "C:\Users\PCOvice\Downloads\IGMC-master\IGMC-master\util_functions.py", line 13, in <module>
from torch_geometric.data import Data, Dataset, InMemoryDataset
File "C:\Users\PCOvice\anaconda3\lib\site-packages\torch_geometric\__init__.py", line 2, in <module>
import torch_geometric.nn
File "C:\Users\PCOvice\anaconda3\lib\site-packages\torch_geometric\nn\__init__.py", line 2, in <module>
from .data_parallel import DataParallel
File "C:\Users\PCOvice\anaconda3\lib\site-packages\torch_geometric\nn\data_parallel.py", line 5, in <module>
from torch_geometric.data import Batch
File "C:\Users\PCOvice\anaconda3\lib\site-packages\torch_geometric\data\__init__.py", line 1, in <module>
from .data import Data
File "C:\Users\PCOvice\anaconda3\lib\site-packages\torch_geometric\data\data.py", line 8, in <module>
from torch_sparse import coalesce, SparseTensor
File "C:\Users\PCOvice\anaconda3\lib\site-packages\torch_sparse\__init__.py", line 13, in <module>
library, [osp.dirname(__file__)]).origin)
File "C:\Users\PCOvice\anaconda3\lib\site-packages\torch\_ops.py", line 105, in load_library
ctypes.CDLL(path)
File "C:\Users\PCOvice\anaconda3\lib\ctypes\__init__.py", line 364, in __init__
self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found
Does someone know how to solve that?
Thank you in advance
Hello,
Thank you very much for the nice paper and code. I was wondering, when you compare IGMC with other methods, how do you replicate the results from previous work? Is there a good library for trying them all easily, or do you have to test them one by one? Thank you!
Hi, according to the paper's description of node labeling, the target user and item of a subgraph should be labeled 0 and 1. But the code at
Line 213 in 5fa0e3c
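For reference, the labeling rule as described in the paper (target user 0, target item 1, then 2i for users and 2i+1 for items first reached at hop i) can be written as a tiny function. This is a sketch of the paper's rule only; `node_label` is a hypothetical name, and it says nothing about what the referenced line in the repository actually does.

```python
def node_label(is_user, hop):
    """Label of a node in an enclosing subgraph, following the paper's rule.

    hop == 0 denotes the target user/item pair itself; a node first reached
    at hop i gets 2*i if it is a user and 2*i + 1 if it is an item.
    """
    if hop < 0:
        raise ValueError("hop must be non-negative")
    return 2 * hop + (0 if is_user else 1)


for hop in range(3):
    print(hop, node_label(True, hop), node_label(False, hop))
```

Under this rule, even labels always mark users and odd labels always mark items, so the node type can be recovered from the label alone.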
The repository currently lacks a requirements.txt. It'd be nice to have the explicit PyTorch Geometric version in order to run the experiments. Hopefully, this will prevent errors like #7 from taking place.
Having a requirements.txt file would make reproducibility and the onboarding process much easier.
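Until such a file exists, one way to record the environment a run was made in is to print the installed versions in requirements.txt style. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the package list is taken from names mentioned elsewhere in this thread and is an assumption, not an official dependency list.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names as they appear in this thread; adjust to your environment.
packages = ["torch", "torch-geometric", "torch-scatter", "torch-sparse", "torch-cluster"]

for name, version in installed_versions(packages).items():
    if version is None:
        print(f"# {name} is not installed")
    else:
        print(f"{name}=={version}")
```

Pasting this output into an issue makes version mismatches much easier to spot.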
One advantage of a node-embedding-based link predictor is that it's relatively easy to define item-to-item or user-to-user similarity from the embeddings (for recommender systems). Since your approach doesn't use node embeddings, is there any analogous way of defining similarity between nodes within one of the parts?
Have you tried loss functions and evaluations for ranking and implicit feedback instead of rating prediction and explicit feedback?
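When I ran the model with ml_1m, I hit this error:
After adding the --user_features argument and running the ml_1m dataset, testing fails with an error saying that Batch has no u_feature attribute, i.e. the test data has no u_feature attribute.
The Jupyter notebook cannot read the file, and the reproduction gets stuck.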
I encountered a small problem while running this command:
python Main.py --data-name ml_10m --save-appendix _mnhp100 --data-appendix _mnph100 --max-nodes-per-hop 100 --testing --epochs 40 --save-interval 5 --adj-dropout 0 --lr-decay-step-size 20 --ensemble --dynamic-dataset
The training has been stuck here for hours:
I made sure that Python == 3.8.1, PyTorch == 1.4.0, and Torch Geometric == 1.4.2. I'm not sure what is wrong. Thanks in advance for your help!
Such as the implementation of PinSage.
Hi, although the model uses different labels to distinguish user nodes from item nodes, the two types of nodes have different features. How does RGCN aggregate the two types of features? Also, is the label itself a feature that takes part in the convolution operation? Thank you.
Can you add the required software versions? For example, via pip freeze > requirements.txt
I am trying to solve a binary matrix completion problem using the IGMC algorithm. I have kept only two ratings, [0, 1]. When I feed in the data with 0 and 1 labels, create a train/test split, and run the code end to end, it works fine as long as I limit training to 20 epochs: the algorithm runs, the visualize option works, and the graphs are generated. But when the number of epochs is increased to 30 or 40, the training ensemble code still runs fine, yet the edge attributes contain labels other than 0 and 1, such as 2, 3, and 4, and the visualize option fails with an index error:
edge_types = [class_values[edge_types[x]] for x in g.edges()]
IndexError: index 2 is out of bounds for axis 0 with size 2
Please let me know if I am missing something or misunderstanding some code behavior, or whether such a binary matrix completion problem can be handled with the IGMC algorithm in the first place.
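The `IndexError` above means an edge type of 2 was used to index a `class_values` array of length 2. A small diagnostic like the following can list which edge types fall outside the valid range before plotting. It is a sketch with made-up data and a hypothetical helper name, and it does not explain why such labels appear in the first place.

```python
def out_of_range_edge_types(edge_types, class_values):
    """Return the sorted distinct edge-type indices that cannot index class_values."""
    n = len(class_values)
    return sorted({int(t) for t in edge_types if not 0 <= int(t) < n})


class_values = [0, 1]                # the two ratings kept in the binary setting
edge_types = [0, 1, 1, 2, 0, 4, 3]   # made-up example containing stray labels

bad = out_of_range_edge_types(edge_types, class_values)
if bad:
    print(f"edge types outside 0..{len(class_values) - 1}: {bad}")
```

Running such a check on the edge types of a saved subgraph would at least show whether the stray labels are already present in the data or only appear at visualization time.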
Hi,
Thanks for sharing your code and your interesting paper.
I could not install the earlier version of PyTorch Geometric on my machine, but I did install
torch==1.7.0
torch-cluster==1.5.8
torch-geometric==1.6.3
torch-scatter==2.0.5
torch-sparse==0.6.8
and then fixed the code in train_eval.py by updating the attribute names:
- gconv.att,
- gconv.basis.view(gconv.num_bases, -1)
+ gconv.comp,
+ gconv.weight.view(gconv.num_bases, -1)
Then, I launched the command on ml_100k (without dynamic training, to speed things up):
python Main.py --data-name ml_100k --save-appendix _mnph200 --data-appendix _mnph200 --epochs 80 --max-nodes-per-hop 200 --testing --ensemble
Epoch 80, train loss 0.847587, test rmse 0.921295: 99%|██████████████████████████████████████████████████████▎| 79/80 [4:13:22<03:10, 190.49s/it]
Saving model states...
Epoch 80, train loss 0.847587, test rmse 0.921295: 100%|███████████████████████████████████████████████████████| 80/80 [4:13:22<00:00, 190.03s/it]
Test Once RMSE: 0.922065, Duration: 72.088108
However, the performance is much lower than that reported in the paper. Are there any additional hyperparameters to change? Have you tried your code with more recent torch and torch_geometric installations?
Thanks for your help !
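The attribute rename in the diff above (`att`/`basis` replaced by `comp`/`weight`) can also be handled without pinning a version, by looking up whichever pair of names the layer exposes. The following is a minimal sketch, assuming only the two naming schemes reported in this thread; `rgcn_basis_attrs` is a hypothetical helper, and `SimpleNamespace` objects stand in for real `RGCNConv` layers so the example runs without PyTorch Geometric.

```python
from types import SimpleNamespace


def rgcn_basis_attrs(gconv):
    """Return (coefficients, bases) from a layer under either naming scheme."""
    if hasattr(gconv, "att") and hasattr(gconv, "basis"):
        # Names used by the original train_eval.py code.
        return gconv.att, gconv.basis
    if hasattr(gconv, "comp") and hasattr(gconv, "weight"):
        # Names reported in this thread to work with torch-geometric 1.6.3.
        return gconv.comp, gconv.weight
    raise AttributeError("layer exposes neither att/basis nor comp/weight")


# Stand-ins for layers from the two library versions (illustration only).
old_style = SimpleNamespace(att="coefficients", basis="bases")
new_style = SimpleNamespace(comp="coefficients", weight="bases")

print(rgcn_basis_attrs(old_style))
print(rgcn_basis_attrs(new_style))
```

This only resolves the attribute lookup; it does not address the gap between the RMSE reported above and the one in the paper.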
I am trying to apply this code for two different settings:
Has this method been tested for predicting ratings for the above cases? If yes, what are your suggestions?