nnzhan / graph-wavenet

graph wavenet

License: MIT License

Python 100.00%

graph-wavenet's Introduction

Graph WaveNet for Deep Spatial-Temporal Graph Modeling

This is the original PyTorch implementation of Graph WaveNet from the following paper: Graph WaveNet for Deep Spatial-Temporal Graph Modeling, IJCAI 2019 (https://arxiv.org/abs/1906.00121). A nice improvement over Graph WaveNet is presented by Shleifer et al. (paper, code).

Requirements

python 3
see requirements.txt

Data Preparation

Step 1: Download METR-LA and PEMS-BAY data from the Google Drive or Baidu Yun links provided by DCRNN.

Step 2: Process raw data

# Create data directories
mkdir -p data/{METR-LA,PEMS-BAY}

# METR-LA
python generate_training_data.py --output_dir=data/METR-LA --traffic_df_filename=data/metr-la.h5

# PEMS-BAY
python generate_training_data.py --output_dir=data/PEMS-BAY --traffic_df_filename=data/pems-bay.h5

Train Commands

python train.py --gcn_bool --adjtype doubletransition --addaptadj  --randomadj


graph-wavenet's Issues

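In the gated TCN, one convolution uses Conv-2d and the other uses Conv-1d. What is the reasoning?

Hello, I would like to ask about a detail in the code. Thank you.
Why does the gated TCN implementation use Conv-2d for one convolution and Conv-1d for the other? What is the reasoning behind this? Thanks.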

How to modify the feature_dimension of the output from `gwnet` to be 2, such as (64,12,207,2)

Hello! If I want the feature_dimension in the output of gwnet in model.py to be 2 (the default feature_dimension=1 in the source code), how should I modify the gwnet part?
Additionally, I have a question: I understand that the convolution operations in gwnet are performed along the feature dimension, and the output shape of the model is [batch_size, feature_dim, num_nodes, seq_len] where feature_dimension=12 and seq_len=1. However, in line 18 of engine.py, the output is transposed directly, changing the feature dimension to the seq_len dimension. I wonder if this handling is reasonable.

Question

How can I solve this problem?
AttributeError: 'gwnet' object has no attribute 'nodevec1'
Thanks a lot!

Some confusion about the use of conv1d function in the gwnet class

Hello,
Thank you for sharing the code of Graph WaveNet.

I am confused about the conv1d function used in your code.
Why is conv1d used in gate_convs, and how should I understand a kernel_size that is a tuple (1, 2)?
Thanks

Bug in class gwnet(nn.Module)

In self.gconv.append(gcn(dilation_channels,residual_channels,dropout,support_len=len(supports))), len(supports) should be replaced with self.supports_len.
Otherwise, when no supports are passed in, self.supports is an empty list, and the input dimension c_in of the mlp in gcn will not match the data. I think this is why the author maintained the variable self.supports_len earlier and incremented it by 1.
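The mismatch described in this issue can be illustrated without PyTorch. The sketch below assumes that gcn concatenates the input with `order` propagation steps per support, so its mlp expects `(order * support_len + 1) * c_in` channels; that formula and the helper names are assumptions made for illustration, not code taken from model.py.

```python
def gcn_input_channels(c_in, support_len, order=2):
    """Channels the mlp would see under the assumed concatenation layout."""
    return (order * support_len + 1) * c_in


def count_supports(supports, use_adaptive):
    """Number of adjacency matrices the forward pass would actually use."""
    fixed = len(supports) if supports is not None else 0
    return fixed + (1 if use_adaptive else 0)


supports = []  # no predefined adjacency matrices were passed in

# Sizing the layer with len(supports) ignores the adaptive matrix.
declared = gcn_input_channels(32, len(supports))

# Sizing it with the maintained counter includes the adaptive matrix.
actual = gcn_input_channels(32, count_supports(supports, use_adaptive=True))

print(declared, actual)
```

Under these assumptions the layer would be built for fewer channels than the forward pass produces, which matches the symptom the issue reports.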

Question

AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

How can I solve this problem? Any help would be appreciated!

Hi! Great work! I have a question about the code. Have you mentioned adj_mx.pkl? There is no such file.

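Experimental results: why are the results in Table 2 much lower than those in Table 3?

Why are the results in Table 2 much lower than those in Table 3? Table 3 is the ablation study, so shouldn't the best (bold) result in Table 3 be the one reported in Table 2? Why is that not the case? I would be very grateful for any help.

How were the figures in the paper produced? Where is the code?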

Create GWN model in message passing way

Such as pytorch_geometric's MessagePassing. Thx!

FileNotFoundError: [Errno 2] No such file or directory: './garage/metr_epoch_1_3.38.pth'

Thanks for your code!
I ran train.py and got this error.
Can anyone help me?
Thanks

Some questions about the implementation of gcn in model.py

Hi, I have a question: GCN is usually written as AXW, but in your model.py it appears as WXA. Why?
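One way to look at this question is to check what the einsum string 'ncvl,vw->ncwl' (quoted in a traceback elsewhere on this page) does to the node axis. The sketch below drops the batch and time axes and uses plain lists; it suggests that the contraction equals A^T X when nodes are rows, so the apparent reordering may come from the (batch, channel, node, time) tensor layout. The helper names are hypothetical, and the claim that the weight W is applied separately over the channel axis is an assumption.

```python
def propagate(x, a):
    """out[w][c] = sum over v of x[v][c] * a[v][w] (node contraction only)."""
    num_nodes, channels = len(x), len(x[0])
    return [
        [sum(x[v][c] * a[v][w] for v in range(num_nodes)) for c in range(channels)]
        for w in range(num_nodes)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(p, q):
    return [
        [sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
        for i in range(len(p))
    ]


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]                   # 3 nodes, 2 channels
a = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 0.0, 1.0]]   # 3 x 3 adjacency

out = propagate(x, a)
same_as_transposed_product = out == matmul(transpose(a), x)
print(out, same_as_transposed_product)
```

For a symmetric adjacency matrix A^T X and A X coincide; for the directed transition matrices used here they generally do not.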

Can this code yield future predictions?

How to get the road network distance in PeMS ?

Hi, thanks for sharing the code. Could you please tell me how to get the road network distance in PeMS? I have been stuck on this for a week.

AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

Traceback (most recent call last):
File "test.py", line 111, in <module>
main()
File "test.py", line 50, in main
model.load_state_dict(torch.load(args.checkpoint))
File "/public/home/hpc0919170203/anaconda3/envs/GWN2/lib/python3.7/site-packages/torch/serialization.py", line 386, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "/public/home/hpc0919170203/anaconda3/envs/GWN2/lib/python3.7/site-packages/torch/serialization.py", line 548, in _load
_check_seekable(f)
File "/public/home/hpc0919170203/anaconda3/envs/GWN2/lib/python3.7/site-packages/torch/serialization.py", line 194, in _check_seekable
raise_err_msg(["seek", "tell"], e)
File "/public/home/hpc0919170203/anaconda3/envs/GWN2/lib/python3.7/site-packages/torch/serialization.py", line 187, in raise_err_msg
raise type(e)(msg)
AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
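The traceback shows the failure inside torch.load(args.checkpoint), and the message mentions 'NoneType', which suggests that args.checkpoint was None (for example because no checkpoint argument was passed on the command line). A minimal sketch of a guard that could run before loading is shown below; `resolve_checkpoint` is a hypothetical helper, not part of the repository.

```python
import os


def resolve_checkpoint(path):
    """Return a usable checkpoint path or raise a readable error."""
    if path is None:
        raise ValueError("no checkpoint given; pass the path to a saved .pth file")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"checkpoint not found: {path}")
    return path


# A missing argument now fails with a clear message instead of a 'seek' error.
try:
    resolve_checkpoint(None)
    outcome = "accepted"
except ValueError as err:
    outcome = str(err)
print(outcome)
```

With such a check in place, the value returned by the helper would be the one handed to torch.load.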

My loss keeps getting bigger during training

Hi, I have a question. When I train, my loss keeps getting bigger, and I don't know what is causing it.

Namespace(addaptadj=True, adjdata='data/sensor_graph/adj_mx.pkl', adjtype='doubletransition', aptonly=False, batch_size=64, data='data/METR-LA', device='cuda:0', dropout=0.3, epochs=100, expid=1, gcn_bool=True, in_dim=2, learning_rate=0.001, nhid=32, num_nodes=207, print_every=50, randomadj=True, save='./garage/metr', seq_length=12, weight_decay=0.0001)
start training...
Iter: 000, Train Loss: 11.5344, Train MAPE: 0.3075, Train RMSE: 13.9952
Iter: 050, Train Loss: 377.5500, Train MAPE: 7.0916, Train RMSE: 630.7411
Iter: 100, Train Loss: 352.2685, Train MAPE: 6.7022, Train RMSE: 619.5583
Iter: 150, Train Loss: 376.2829, Train MAPE: 7.7274, Train RMSE: 651.6628
Iter: 200, Train Loss: 415.7277, Train MAPE: 8.6364, Train RMSE: 692.0403
Iter: 250, Train Loss: 427.4500, Train MAPE: 8.3187, Train RMSE: 706.2341
Iter: 300, Train Loss: 403.1333, Train MAPE: 7.4553, Train RMSE: 686.9592
Iter: 350, Train Loss: 421.0964, Train MAPE: 8.3744, Train RMSE: 704.3835
Epoch: 001, Inference Time: 4.2194 secs
Epoch: 001, Train Loss: 388.0823, Train MAPE: 7.5629, Train RMSE: 660.2726, Valid Loss: 409.1663, Valid MAPE: 8.0573, Valid RMSE: 683.0002, Training Time: 1063.4692/epoch
Iter: 000, Train Loss: 383.0825, Train MAPE: 7.3515, Train RMSE: 672.6279
Iter: 050, Train Loss: 436.3956, Train MAPE: 7.9620, Train RMSE: 721.0291
Iter: 100, Train Loss: 423.3630, Train MAPE: 7.6969, Train RMSE: 715.8647
Iter: 150, Train Loss: 414.0448, Train MAPE: 9.1630, Train RMSE: 713.6893
Iter: 200, Train Loss: 396.0155, Train MAPE: 7.4647, Train RMSE: 699.6281
Iter: 250, Train Loss: 433.5888, Train MAPE: 8.1940, Train RMSE: 735.1317
Iter: 300, Train Loss: 445.8810, Train MAPE: 8.4676, Train RMSE: 747.1703
Iter: 350, Train Loss: 430.3223, Train MAPE: 8.1599, Train RMSE: 734.9741
Epoch: 002, Inference Time: 3.6677 secs
Epoch: 002, Train Loss: 423.8275, Train MAPE: 8.3188, Train RMSE: 721.4823, Valid Loss: 434.8144, Valid MAPE: 8.5525, Valid RMSE: 726.0127, Training Time: 122.6467/epoch
Iter: 000, Train Loss: 429.0443, Train MAPE: 7.9226, Train RMSE: 733.3237
Iter: 050, Train Loss: 417.4486, Train MAPE: 8.1126, Train RMSE: 723.7980
Iter: 100, Train Loss: 428.7359, Train MAPE: 8.0147, Train RMSE: 734.3259
Iter: 150, Train Loss: 443.9696, Train MAPE: 8.9476, Train RMSE: 747.8458
Iter: 200, Train Loss: 431.4466, Train MAPE: 8.3825, Train RMSE: 737.4367
Iter: 250, Train Loss: 418.8755, Train MAPE: 7.8954, Train RMSE: 726.3869
Iter: 300, Train Loss: 438.4908, Train MAPE: 8.5767, Train RMSE: 744.3538
Iter: 350, Train Loss: 421.3578, Train MAPE: 8.0956, Train RMSE: 728.9158
Epoch: 003, Inference Time: 4.0353 secs
Epoch: 003, Train Loss: 433.9845, Train MAPE: 8.4935, Train RMSE: 739.3757, Valid Loss: 437.7613, Valid MAPE: 8.6092, Valid RMSE: 730.9322, Training Time: 120.3957/epoch

Generate_training_data

Hi,
I notice that generate_training_data.py first uses a sliding window to reshape the training data into (num_samples - seq_len * 2 + 1, seq_len, num_nodes, num_feature) and then splits the data with num_train = round(num_samples * 0.7). This causes a data leakage problem, because windows on either side of the split overlap in time. It may be better to split the dataset first and then apply the sliding window.
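The suggestion in this issue, splitting first and windowing afterwards, can be sketched on a toy series as follows. The 0.7 training fraction comes from the issue; the 0.1 validation fraction and the function names are assumptions for illustration only.

```python
def sliding_windows(series, seq_len):
    """Return (x, y) pairs: seq_len inputs followed by seq_len targets."""
    pairs = []
    for start in range(len(series) - 2 * seq_len + 1):
        x = series[start:start + seq_len]
        y = series[start + seq_len:start + 2 * seq_len]
        pairs.append((x, y))
    return pairs


def split_then_window(series, seq_len, train_frac=0.7, val_frac=0.1):
    """Cut the raw series chronologically, then window each part on its own."""
    n_train = round(len(series) * train_frac)
    n_val = round(len(series) * val_frac)
    parts = (
        series[:n_train],
        series[n_train:n_train + n_val],
        series[n_train + n_val:],
    )
    return tuple(sliding_windows(part, seq_len) for part in parts)


series = list(range(100))  # time stamps stand in for sensor readings
train, val, test = split_then_window(series, seq_len=3)

latest_train_step = max(train[-1][1])
earliest_val_step = min(val[0][0])
print(len(train), len(val), len(test), latest_train_step, earliest_val_step)
```

The trade-off is that a few windows straddling each boundary are lost, in exchange for no time step appearing in more than one split.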

I need some help! Expected 2D (unbatched) or 3D (batched) input to conv1d

Traceback (most recent call last):
File "C:\Users\Administrator\Desktop\Graph-WaveNet-master\train.py", line 177, in <module>
main()
File "C:\Users\Administrator\Desktop\Graph-WaveNet-master\train.py", line 87, in main
metrics = engine.train(trainx, trainy[:,0,:,:])
File "C:\Users\Administrator\Desktop\Graph-WaveNet-master\engine.py", line 17, in train
output = self.model(input)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\Administrator\Desktop\Graph-WaveNet-master\model.py", line 175, in forward
gate = self.gate_convs[i](residual)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\conv.py", line 307, in forward
return self._conv_forward(input, self.weight, self.bias)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\conv.py", line 303, in _conv_forward
return F.conv1d(input, weight, bias, self.stride,

RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [64, 32, 207, 13]

Process finished with exit code 1

RuntimeError: size of dimension does not match previous size, operand 1, dim 0.

I don't know how to solve this problem. Thank you all.
Traceback (most recent call last):
File "train.py", line 177, in <module>
main()
File "train.py", line 88, in main
metrics = engine.train(trainx, trainy[:,0,:,:])
File "/home/mist/Graph-WaveNet-master/engine.py", line 17, in train
output = self.model(input)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/mist/Graph-WaveNet-master/model.py", line 192, in forward
x = self.gconv[i](x, new_supports)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/mist/Graph-WaveNet-master/model.py", line 36, in forward
x1 = self.nconv(x,a)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/mist/Graph-WaveNet-master/model.py", line 13, in forward
x = torch.einsum('ncvl,vw->ncwl',(x,A))
File "/usr/local/lib/python3.6/dist-packages/torch/functional.py", line 342, in einsum
return einsum(equation, *_operands)
File "/usr/local/lib/python3.6/dist-packages/torch/functional.py", line 344, in einsum
return _VF.einsum(equation, operands) # type: ignore
RuntimeError: size of dimension does not match previous size, operand 1, dim 0
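In the einsum string 'ncvl,vw->ncwl', the index v is shared between axis 2 of x and axis 0 of A, and the error names "operand 1, dim 0", which is the first axis of A. One plausible cause, offered here as an assumption, is that the adjacency matrix and the data disagree on the number of nodes. A hypothetical pre-flight check along these lines would make such a mismatch easier to read:

```python
def check_adjacency(adj, num_nodes):
    """Raise a readable error if adj is not a num_nodes x num_nodes matrix."""
    rows = len(adj)
    if rows != num_nodes:
        raise ValueError(
            f"adjacency has {rows} rows but the data has {num_nodes} nodes"
        )
    for i, row in enumerate(adj):
        if len(row) != num_nodes:
            raise ValueError(
                f"adjacency row {i} has {len(row)} entries, expected {num_nodes}"
            )


identity = [[1.0, 0.0], [0.0, 1.0]]
check_adjacency(identity, num_nodes=2)  # consistent sizes pass silently

try:
    check_adjacency(identity, num_nodes=3)
    message = ""
except ValueError as err:
    message = str(err)
print(message)
```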

Replicating Paper Results

I ran the Forward Backward Adaptive Command:

python train.py --device cuda:0 --gcn_bool --adjtype doubletransition --addaptadj  --randomadj  --epoch 100 $ep --expid $expid

and got what I think are slightly worse results than Table 2 and 3 of the paper.

Table 3/METR-LA:

MAE, RMSE, MAPE = 3.04, 6.09, 8.23%
My results: 3.0737, 6.1674, 8.30%

Does that sound like a normal amount of error, wrong command, or bug?


Training finished
The valid loss on best model is 2.7565
Evaluate best model on test data for horizon 1, Test MAE: 2.2372, Test MAPE: 0.0533, Test RMSE: 3.8697
Evaluate best model on test data for horizon 2, Test MAE: 2.5196, Test MAPE: 0.0626, Test RMSE: 4.6753
Evaluate best model on test data for horizon 3, Test MAE: 2.7171, Test MAPE: 0.0695, Test RMSE: 5.2287
Evaluate best model on test data for horizon 4, Test MAE: 2.8760, Test MAPE: 0.0754, Test RMSE: 5.6681
Evaluate best model on test data for horizon 5, Test MAE: 3.0037, Test MAPE: 0.0803, Test RMSE: 6.0149
Evaluate best model on test data for horizon 6, Test MAE: 3.1157, Test MAPE: 0.0844, Test RMSE: 6.3154
Evaluate best model on test data for horizon 7, Test MAE: 3.2154, Test MAPE: 0.0882, Test RMSE: 6.5706
Evaluate best model on test data for horizon 8, Test MAE: 3.3002, Test MAPE: 0.0913, Test RMSE: 6.7903
Evaluate best model on test data for horizon 9, Test MAE: 3.3777, Test MAPE: 0.0941, Test RMSE: 6.9856
Evaluate best model on test data for horizon 10, Test MAE: 3.4449, Test MAPE: 0.0965, Test RMSE: 7.1507
Evaluate best model on test data for horizon 11, Test MAE: 3.5081, Test MAPE: 0.0989, Test RMSE: 7.2993
Evaluate best model on test data for horizon 12, Test MAE: 3.5691, Test MAPE: 0.1011, Test RMSE: 7.4404

On average over 12 horizons, Test MAE: 3.0737, Test MAPE: 0.0830, Test RMSE: 6.1674
Total time spent: 4299.2252

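For the 15-minute, 30-minute and 60-minute predictions in Table 2, can the output length stay at 12?

Hello, I would like to ask: in Table 2, is the whole sequence of 12 values produced at once with output length = 12, and the values at steps 3, 6 and 12 then entered into Table 2?
Or does output length = 12 only correspond to the 60-minute case, so that the output length must be set to 3 and 6 separately to obtain the 15-minute and 30-minute values in the table?
(Originally written in Chinese because I was worried my English would not be clear.) I hope someone can clear up my confusion. Many thanks!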
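One reading that is consistent with the per-horizon lines printed in the test logs on this page is the first option: predict all 12 steps once and read off steps 3, 6 and 12. The sketch below shows that reading on made-up numbers; it is not a statement of how the paper's table was produced, and the function names are hypothetical.

```python
def mae(pred, real):
    """Mean absolute error over two equally long lists."""
    return sum(abs(p - r) for p, r in zip(pred, real)) / len(pred)


def horizon_mae(pred, real, steps=(3, 6, 12)):
    """pred and real are lists of samples, each holding 12 forecast steps.

    Step s (1-based) is stored at index s - 1, so steps 3, 6 and 12
    correspond to 15, 30 and 60 minutes at 5-minute resolution.
    """
    return {
        s: mae([p[s - 1] for p in pred], [r[s - 1] for r in real])
        for s in steps
    }


# Toy data: the error grows by 0.1 with every step ahead.
real = [[float(t) for t in range(12)] for _ in range(4)]
pred = [[v + 0.1 * (t + 1) for t, v in enumerate(row)] for row in real]

result = horizon_mae(pred, real)
print(result)
```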

_pickle.UnpicklingError: the STRING opcode argument must be quoted

Please help me, I don't know how to solve this. The location of the error in the code is shown below:

with open(pickle_file, 'rb') as f:
    pickle_data = pickle.load(f)

The garage directory is not created automatically

The directory given by the args.save parameter is not created automatically. Please open a branch so that I can fix it. Thanks
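A minimal sketch of one possible fix is shown below: create the directory part of the save prefix before any checkpoint is written. The prefix './garage/metr' appears in the training logs on this page; `ensure_parent_dir` is a hypothetical helper, and where it would be called in train.py is left open.

```python
import os
import tempfile


def ensure_parent_dir(save_prefix):
    """Create the directory part of a prefix such as './garage/metr' if missing."""
    parent = os.path.dirname(save_prefix)
    if parent:
        os.makedirs(parent, exist_ok=True)  # no error if it already exists
    return parent


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, "garage", "metr")
    created = ensure_parent_dir(prefix)
    exists = os.path.isdir(created)
print(exists)
```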

Which versions of the dependencies (CUDA, Python, PyTorch, etc.) are required? Thanks!

RuntimeError: Given groups=1, weight of size [32, 2, 1, 1], expected input[64, 3, 207, 13] to have 2 channels, but got 3 channels instead

This RuntimeError may be due to the torch version. Can anyone help?

test: https://colab.research.google.com/drive/1VltsQGVO81P6lc9RKGnlxjXFp6VkY4YK?usp=sharing

question about padding

Thanks for your wonderful code! I am confused about why nn.functional.pad() is used on the training and eval input but not on the test input. Can you explain the reason?
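Some background that may help with this question: a stack of dilated convolutions needs at least as many input steps as its receptive field. The sketch below computes that field assuming 4 blocks of 2 layers, a temporal kernel of size 2, and dilations that double within each block; these defaults are assumptions, although the result agrees with the time length of 13 that appears in error messages elsewhere on this page.

```python
def receptive_field(blocks=4, layers=2, kernel_size=2):
    """Input steps needed to produce one output step (assumed architecture)."""
    field = 1
    for _ in range(blocks):
        dilation = 1
        for _ in range(layers):
            field += dilation * (kernel_size - 1)
            dilation *= 2  # dilation doubles inside a block, then resets
    return field


def left_pad_needed(seq_len, field):
    """Zero steps to prepend so the sequence covers the receptive field."""
    return max(0, field - seq_len)


field = receptive_field()
pad = left_pad_needed(12, field)
print(field, pad)
```

Under these assumptions a 12-step input needs one padded step, whichever part of the pipeline applies it.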

AttributeError: 'gwnet' object has no attribute 'nodevec1'

test.py run result:
model load successfully
Adaptive adjacency matrix generation is skipped as 'nodevec1' or 'nodevec2' is not initialized.
Evaluate best model on test data for horizon 1, Test MAE: 2.9515, Test MAPE: 0.0718, Test RMSE: 5.1257
Evaluate best model on test data for horizon 2, Test MAE: 3.4770, Test MAPE: 0.0869, Test RMSE: 6.4396
Evaluate best model on test data for horizon 3, Test MAE: 3.9241, Test MAPE: 0.1013, Test RMSE: 7.3645
Evaluate best model on test data for horizon 4, Test MAE: 4.3461, Test MAPE: 0.1159, Test RMSE: 8.1222
Evaluate best model on test data for horizon 5, Test MAE: 4.7537, Test MAPE: 0.1299, Test RMSE: 8.7606
Evaluate best model on test data for horizon 6, Test MAE: 5.1344, Test MAPE: 0.1433, Test RMSE: 9.3124
Evaluate best model on test data for horizon 7, Test MAE: 5.5282, Test MAPE: 0.1584, Test RMSE: 9.8222
Evaluate best model on test data for horizon 8, Test MAE: 5.8895, Test MAPE: 0.1739, Test RMSE: 10.2632
Evaluate best model on test data for horizon 9, Test MAE: 6.2030, Test MAPE: 0.1886, Test RMSE: 10.6280
Evaluate best model on test data for horizon 10, Test MAE: 6.5186, Test MAPE: 0.2051, Test RMSE: 11.0065
Evaluate best model on test data for horizon 11, Test MAE: 6.7955, Test MAPE: 0.2194, Test RMSE: 11.3738
Evaluate best model on test data for horizon 12, Test MAE: 7.0273, Test MAPE: 0.2315, Test RMSE: 11.7210
On average over 12 horizons, Test MAE: 5.2124, Test MAPE: 0.1522, Test RMSE: 9.1616
Traceback (most recent call last):
File "D:\lunwen\daima\Graph WaveNet\test.py", line 127, in <module>
main()
File "D:\lunwen\daima\Graph WaveNet\test.py", line 107, in main
adp = F.softmax(F.relu(torch.mm(model.nodevec1, model.nodevec2)), dim=1)
File "D:\Users\haoruochen\anaconda3\envs\py36env\lib\site-packages\torch\nn\modules\module.py", line 1178, in __getattr__
type(self).__name__, name))
AttributeError: 'gwnet' object has no attribute 'nodevec1'

How can i solve this problem??
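The traceback shows test.py reading model.nodevec1 and model.nodevec2 unconditionally, while the log line above it says the adaptive adjacency was skipped because those attributes were not initialized. One possible workaround, sketched here with stand-in classes instead of the real gwnet, is to guard that step; whether the attributes are only created when the adaptive adjacency option is enabled is an assumption.

```python
class ModelWithoutAdaptive:
    """Stand-in for a model built without adaptive node embeddings."""


class ModelWithAdaptive:
    """Stand-in for a model that owns both embedding attributes."""

    def __init__(self):
        self.nodevec1 = [[0.1, 0.2]]
        self.nodevec2 = [[0.3], [0.4]]


def has_adaptive_embeddings(model):
    """True only if both embeddings needed for the adaptive matrix exist."""
    return all(hasattr(model, name) for name in ("nodevec1", "nodevec2"))


plain = has_adaptive_embeddings(ModelWithoutAdaptive())
adaptive = has_adaptive_embeddings(ModelWithAdaptive())
print(plain, adaptive)
```

The adaptive-matrix visualization would then run only when the check returns True.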

About inverse_transform

Hi, I have run your code on some datasets that I constructed, and I found a problem in computing the MAPE metric.

For the ground truth data, you normalize the values to a small scale and apply the inverse normalization when evaluating the model. This changes some values, such as 0, into small nonzero values like 1e-5. The MAPE computation then takes those small values into account and produces a very large MAPE, because the mask only filters values exactly equal to zero and not these small values.
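The effect described in this issue can be reproduced on three numbers. The sketch below compares a mask that drops only exact zeros with one that drops targets whose magnitude is below a tolerance; the tolerance value and the function name are assumptions for illustration, not the repository's metric code.

```python
def masked_mape(pred, real, eps=1e-3):
    """MAPE over targets whose magnitude exceeds eps; eps=0.0 drops only exact zeros."""
    kept = [(p, r) for p, r in zip(pred, real) if abs(r) > eps]
    if not kept:
        return float("nan")
    return sum(abs(p - r) / abs(r) for p, r in kept) / len(kept)


real = [0.00001, 50.0, 60.0]  # first value was 0 before the inverse transform
pred = [1.0, 55.0, 54.0]

strict = masked_mape(pred, real, eps=0.0)  # near-zero target dominates
tolerant = masked_mape(pred, real)         # near-zero target is masked out
print(strict, tolerant)
```

A suitable tolerance depends on the scale of the data, so it would need to be chosen per dataset.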

Why is only the first output dimension selected?

The code metrics = engine.train(trainx, trainy[:, 0, :, :]) on line 84 of train.py seems to predict only one of the D dimensions.
But the paper says the output dimension is D.

adaptive adj problem

Graph WaveNet is a very good paper for spatial-temporal prediction. Recently I ran some experiments on the adaptive adjacency matrix. However, I found that the adaptive matrix learned by the GWN model does not look like the one in the paper.

The adaptive adjacency matrix shown in the paper looks like this

but I get results like this

I loaded the model from a .pth file with good performance.

I cannot find the cause of this problem. Can any expert explain it? Thank you very much!

There is one thing I don't fully understand from the code.

In these lines of code (test.py, lines 100-104):

y12 = realy[:,99,11].cpu().detach().numpy()
yhat12 = scaler.inverse_transform(yhat[:,99,11]).cpu().detach().numpy()

y3 = realy[:,99,2].cpu().detach().numpy()
yhat3 = scaler.inverse_transform(yhat[:,99,2]).cpu().detach().numpy()

Can anyone explain to me what they are doing in this piece of code? And what the difference is between y12 and y3?

I really hope someone can help me! :)

Thanks in advance!

No such file or directory: 'data/sensor_graph/adj_mx.pkl'

Hello,

First, thank you for sharing the code!

Where can we find the adj_mx.pkl for METR-LA?

Thanks.