
da-rnn's Introduction

PyTorch Implementation of DA-RNN


Get hands-on experience implementing an RNN (LSTM) in PyTorch;
Get familiar with financial data in deep learning;

Stargazers over time

Table of Contents

Dataset

Download

NASDAQ 100 stock data

Description

This dataset is a subset of the full NASDAQ 100 stock dataset used in [1]. It covers 105 days of stock data, from July 26, 2016 to December 22, 2016. Each day contains 390 data points, except for 210 data points on November 25 and 180 data points on December 22.

Some of the corporations under NASDAQ 100 are not included in this dataset because they have too much missing data. In total, 81 major corporations remain in this dataset, and missing data are filled in with linear interpolation.

In [1], the first 35,100 data points are used as the training set and the following 2,730 data points are used as the validation set. The last 2,730 data points are used as the test set.
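As a hedged sketch of that split (the CSV path and the 'NDX' target column follow the repository's utils.py read_data(), but are assumptions here; adjust to your local layout):

```python
import pandas as pd

# Assumed path and column names; see the repo's utils.py read_data().
df = pd.read_csv("../nasdaq/data.csv")
X = df.drop(columns=["NDX"]).to_numpy()   # 81 driving series (stock prices)
y = df["NDX"].to_numpy()                  # target: NASDAQ 100 index value

# Split from [1]: 35,100 train / 2,730 validation / 2,730 test points.
train_end = 35100
val_end = train_end + 2730
X_train, y_train = X[:train_end], y[:train_end]
X_val, y_val = X[train_end:val_end], y[train_end:val_end]
X_test, y_test = X[val_end:val_end + 2730], y[val_end:val_end + 2730]
```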

Usage

Train

usage: main.py [-h] [--dataroot DATAROOT] [--batchsize BATCHSIZE]
               [--nhidden_encoder NHIDDEN_ENCODER]
               [--nhidden_decoder NHIDDEN_DECODER] [--ntimestep NTIMESTEP]
               [--epochs EPOCHS] [--lr LR]

PyTorch implementation of paper 'A Dual-Stage Attention-Based Recurrent Neural
Network for Time Series Prediction'

optional arguments:
  -h, --help            show this help message and exit
  --dataroot DATAROOT   path to dataset
  --batchsize BATCHSIZE
                        input batch size [128]
  --nhidden_encoder NHIDDEN_ENCODER
                        size of hidden states for the encoder m [64, 128]
  --nhidden_decoder NHIDDEN_DECODER
                        size of hidden states for the decoder p [64, 128]
  --ntimestep NTIMESTEP
                        the number of time steps in the window T [10]
  --epochs EPOCHS       number of epochs to train [10, 200, 500]
  --lr LR               learning rate [0.001] reduced by 0.1 after each 10000
                        iterations

An example of training process is as follows:

python3 main.py --lr 0.0001 --epochs 50

Result

Training process

Training Loss

Prediction

DA-RNN

In the paper "A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction", the authors propose a novel dual-stage attention-based recurrent neural network (DA-RNN) for time series prediction. In the first stage, an input attention mechanism adaptively extracts relevant driving series (a.k.a. input features) at each time step by referring to the previous encoder hidden state. In the second stage, a temporal attention mechanism selects relevant encoder hidden states across all time steps.

For the objective, a squared loss is used. With these two attention mechanisms, the DA-RNN can adaptively select the most relevant input features and capture the long-term temporal dependencies of a time series. A graphical illustration of the proposed model is shown in Figure 1.

Figure 1: Graphical illustration of the dual-stage attention-based recurrent neural network.

The Dual-Stage Attention-Based RNN (a.k.a. DA-RNN) model belongs to the general class of Nonlinear Autoregressive Exogenous (NARX) models, which predict the current value of a time series based on historical values of this series plus the historical values of multiple exogenous time series.
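Following the paper's setup, a NARX model with $n$ exogenous driving series $\mathbf{x}_t \in \mathbb{R}^n$ and target history $y_1, \ldots, y_{T-1}$ learns a nonlinear mapping $F(\cdot)$ to the current value:

$$\hat{y}_T = F(y_1, \ldots, y_{T-1}, \mathbf{x}_1, \ldots, \mathbf{x}_T)$$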

LSTM

A Recurrent Neural Network (RNN) model is used in this paper. RNNs are powerful at modeling the sophisticated dynamic temporal structure of sequential data. RNN models come in many forms, one of which is the Long Short-Term Memory (LSTM) model, widely applied in language modeling.

Attention Mechanism

As the paper notes, the attention mechanism performs feature selection, so the model keeps only the most useful information at each temporal stage.

Model

The DA-RNN model consists of two LSTM networks with attention mechanisms: an encoder and a decoder.

In the encoder, a novel input attention mechanism adaptively selects the relevant driving series. In the decoder, a temporal attention mechanism automatically selects relevant encoder hidden states across all time steps.
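A minimal sketch of the encoder's input attention, as I read Eq. 8 of the paper (the module and parameter names `InputAttention`, `W_e`, `U_e`, `v_e` are illustrative, not the repository's actual code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InputAttention(nn.Module):
    """Scores each of the n driving series against the previous
    encoder LSTM state [h_{t-1}; s_{t-1}] (cf. Eq. 8 in the paper)."""
    def __init__(self, hidden_size, T):
        super().__init__()
        self.W_e = nn.Linear(2 * hidden_size, T, bias=False)
        self.U_e = nn.Linear(T, T, bias=False)
        self.v_e = nn.Linear(T, 1, bias=False)

    def forward(self, x, h_prev, s_prev):
        # x: (batch, T, n) window of driving series
        # h_prev, s_prev: (batch, hidden_size) previous LSTM state
        n = x.size(2)
        state = torch.cat([h_prev, s_prev], dim=1)      # (batch, 2m)
        state = state.unsqueeze(1).expand(-1, n, -1)    # (batch, n, 2m)
        series = x.permute(0, 2, 1)                     # (batch, n, T)
        e = self.v_e(torch.tanh(self.W_e(state) + self.U_e(series)))
        alpha = F.softmax(e.squeeze(2), dim=1)          # (batch, n)
        return alpha  # weights that rescale x_t before each LSTM step
```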

Experiments and Parameters Settings

NASDAQ 100 Stock dataset

In the NASDAQ 100 Stock dataset, we collected the stock prices of 81 major corporations under NASDAQ 100, which are used as the driving time series. The index value of the NASDAQ 100 is used as the target series. The frequency of the data collection is minute-by-minute. This data covers the period from July 26, 2016 to December 22, 2016, 105 days in total. Each day contains 390 data points from the opening to closing of the market except that there are 210 data points on November 25 and 180 data points on December 22. In our experiments, we use the first 35,100 data points as the training set and the following 2,730 data points as the validation set. The last 2,730 data points are used as the test set. This dataset is publicly available and will be continuously enlarged to aid the research in this direction.

Training procedure & Parameters Settings

| Category | Description |
| --- | --- |
| Optimization method | Minibatch stochastic gradient descent (SGD) with the Adam optimizer |
| Number of time steps in the window $T$ | $T = 10$ |
| Size of hidden states for the encoder $m$ | $m = p = 64$ or $128$ |
| Size of hidden states for the decoder $p$ | $m = p = 64$ or $128$ |
| Objective (squared loss) | $O(y_T, \hat{y}_T) = \frac{1}{N} \sum\limits_{i=1}^{N} \left( y_T^i - \hat{y}_T^i \right)^2$ |
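In PyTorch this objective is simply the mean squared error; a minimal equivalent, assuming predictions and targets share the same shape:

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()          # the squared-loss objective O above
y_pred = torch.randn(128, 1)      # example batch of predictions
y_true = torch.randn(128, 1)      # matching targets, same shape
loss = criterion(y_pred, y_true)  # mean of squared errors over the batch
```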

References

[1] Yao Qin, Dongjin Song, Haifeng Chen, Wei Cheng, Guofei Jiang, Garrison W. Cottrell. "A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction". arXiv preprint arXiv:1704.02971 (2017).
[2] Chandler Zuo. "A PyTorch Example to Use RNN for Financial Prediction". (2017).
[3] YitongCU. "Dual Staged Attention Model for Time Series prediction".
[4] PyTorch Forums. "Why 3d input tensors in LSTM?".

da-rnn's People

Contributors

dan-r95, imgbotapp, zhenye-na


da-rnn's Issues

Please reply

Dear Zhenye,
Could you please send me an email (should be visible on my profile). I compiled a detailed email to share with you, then realized your email wasn't public.
thanks,
Daniel

about Eq 8

I found that Eq. 8 in the paper differs from the encoding function in your program: the tanh function is not used. Your decoding function, however, is consistent with the paper. Thanks!

I encountered an issue where the input and target did not match when I ran your code.

I need some help. This is the error:
File "F:/็ ”็ฉถ็”Ÿ่ต„ๆ–™/่ฎบๆ–‡/ๆณจๆ„ๅŠ›ๆœบๅˆถ/DA-RNN-master-pytorch/DA-RNN-master/src/main.py", line 55, in
model.train()

File "F:\็ ”็ฉถ็”Ÿ่ต„ๆ–™\่ฎบๆ–‡\ๆณจๆ„ๅŠ›ๆœบๅˆถ\DA-RNN-master-pytorch\DA-RNN-master\src\model.py", line 285, in train
loss = self.criterion(y_pred, y_true)

File "G:\Users\ll\Anaconda3\envs\tensorflow_gpu\lib\site-packages\torch\nn\modules\module.py", line 491, in call
result = self.forward(*input, **kwargs)

File "G:\Users\ll\Anaconda3\envs\tensorflow_gpu\lib\site-packages\torch\nn\modules\loss.py", line 372, in forward
return F.mse_loss(input, target, size_average=self.size_average, reduce=self.reduce)

File "G:\Users\ll\Anaconda3\envs\tensorflow_gpu\lib\site-packages\torch\nn\functional.py", line 1569, in mse_loss
input, target, size_average, reduce)

File "G:\Users\ll\Anaconda3\envs\tensorflow_gpu\lib\site-packages\torch\nn\functional.py", line 1537, in _pointwise_loss
return lambd_optimized(input, target, size_average, reduce)

RuntimeError: input and target shapes do not match: input [128 x 1], target [128] at c:\users\administrator\downloads\new-builder\win-wheel\pytorch\aten\src\thnn\generic/MSECriterion.c:13
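This mismatch (input [128 x 1] vs. target [128]) is a shape issue; a hedged sketch of the usual fix (variable names follow the traceback, not verified against the repository's current code) is to give the target an explicit trailing dimension before computing the loss:

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()
y_pred = torch.randn(128, 1)   # model output, shape (128, 1)
y_true = torch.randn(128)      # target built from a 1-D slice, shape (128,)

# Fix: give the target an explicit trailing dimension...
loss = criterion(y_pred, y_true.view(-1, 1))
# ...or, equivalently, squeeze the prediction instead:
loss = criterion(y_pred.squeeze(1), y_true)
```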

one bug when running the code

When I ran the code, there was a bug: "'LSTM' object has no attribute 'flatten_parameters'".
Here is the code:
self.encoder_lstm.flatten_parameters()
_, final_stage = self.encoder_lstm(
x_tilde.unsqueeze(0), (h_n, s_n))
h_n = final_state[0]
s_n = final_stage[1]

I am not sure if this bug happened due to the naming issue (final_stage vs. final_state).

Defining a train method will break the model.eval() call.

PyTorch has a built-in method that sets the model into evaluation mode: torch.nn.Module.eval(). Internally, eval() calls the train method with a parameter. Because your RNN also implements a train method, Python overrides the default nn.Module.train and the code crashes.
How to fix:
I recommend renaming your method to fit(), train_epochs(), or something similar.

Regardless of some minor issues, your model seems nice.
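A minimal, self-contained demonstration of this pitfall (the class names are illustrative, not the repository's code):

```python
import torch.nn as nn

class BadModel(nn.Module):
    def train(self):                  # shadows nn.Module.train(mode=True)
        print("custom training loop")

class GoodModel(nn.Module):
    def fit(self):                    # leaves nn.Module.train() intact
        print("custom training loop")

GoodModel().eval()                    # fine: eval() calls self.train(False)
try:
    BadModel().eval()                 # eval() also calls self.train(False)...
except TypeError as err:              # ...but this train() takes no argument
    print("crash:", err)
```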

Encoder error

Shouldn't the input to the encoder run from 1 to T instead of 1 to T-1?
As per my understanding of the reference paper, the encoder's LSTM stack spans steps 1 to T, while the decoder spans 1 to T-1.
Thank you; I hope we can discuss and resolve this issue.

It seems that the model training failed.

Hi Zhenye, thank you for the excellent code. When I ran it, I seemed to get a wrong prediction result (see attached screenshot). Could you please give some advice on how to overcome this? Thanks a lot.

How can I enable cuda to accelerate training?

I ran the program as follows, but it seems CUDA is not enabled:
python3 main.py --cuda

So I edited part of the source code as follows, adding the opt.cuda argument:
model = DA_rnn(X, y, opt.ntimestep, opt.nhidden_encoder, opt.nhidden_decoder, opt.batchsize, opt.lr, opt.epochs, opt.cuda)

However, there is still a problem:

/home/tatsuhiro/PycharmProjects/untitled/DA-RNN-master/src/model.py:80: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
alpha = F.softmax(x.view(-1, self.input_size))
/home/tatsuhiro/PycharmProjects/untitled/DA-RNN-master/src/model.py:145: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
x.view(-1, 2 * self.decoder_num_hidden + self.encoder_num_hidden)).view(-1, self.T - 1))
Traceback (most recent call last):
File "main.py", line 57, in
model.train()
File "/home/tatsuhiro/PycharmProjects/untitled/DA-RNN-master/src/model.py", line 260, in train
loss = self.train_forward(x, y_prev, y_gt)
File "/home/tatsuhiro/PycharmProjects/untitled/DA-RNN-master/src/model.py", line 316, in train_forward
loss = self.criterion(y_pred, y_true)
File "/home/tatsuhiro/PycharmProjects/untitled/venv/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in call
result = self.forward(*input, **kwargs)
File "/home/tatsuhiro/PycharmProjects/untitled/venv/lib/python3.5/site-packages/torch/nn/modules/loss.py", line 372, in forward
return F.mse_loss(input, target, size_average=self.size_average, reduce=self.reduce)
File "/home/tatsuhiro/PycharmProjects/untitled/venv/lib/python3.5/site-packages/torch/nn/functional.py", line 1569, in mse_loss
input, target, size_average, reduce)
File "/home/tatsuhiro/PycharmProjects/untitled/venv/lib/python3.5/site-packages/torch/nn/functional.py", line 1537, in _pointwise_loss
return lambd_optimized(input, target, size_average, reduce)
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.FloatTensor for argument #2 'target'

How can I solve this problem?
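The error says the prediction lives on the GPU while the target is still a CPU tensor; a hedged sketch of the usual fix (illustrative tensors standing in for the model's output and target, not the repository's exact code) is to move both to the same device before the loss:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
criterion = nn.MSELoss()

y_pred = torch.randn(128, 1, device=device)  # model output, already on GPU
y_true = torch.randn(128, 1).to(device)      # move the CPU target over too
loss = criterion(y_pred, y_true)             # same device: no RuntimeError
```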

why use companies' stock price to predict NASDAQ-100 Index?

Your code in utils.py read_data() gets the data as below. Why do you use companies' stock prices to predict the NASDAQ-100 index?
X = df.loc[:, [x for x in df.columns.tolist() if x != 'NDX']].as_matrix()
y = np.array(df.NDX)

NDX should be calculable from these stock prices, shouldn't it? Why do you have to learn the calculation formula with an RNN?
The DA-RNN paper presents a time series prediction model, right? But where is your time series prediction? I am confused.

That's what I found when I read the code repeatedly. If I got something wrong or missed something, please tell me.
Thank you.


Only Report Training Loss?

It seems the train_val_test_split function in the ops.py file is never called. The model is trained on all data points, and only the training loss is reported.

GPU is not used

Thank you for your contribution, but the GPU is not used when I run this code. Here is my configuration:
Ubuntu 18.04, CUDA 9.0, cuDNN 7.0, PyTorch 0.3.0.

Is DA-RNN useful for multi-step forecasting?

I tried DA-RNN for multi-step forecasting but got very poor performance. I cut the dataset according to the chosen step length, and my multi-step strategy is multiple-output. Is the attention mechanism not well suited to multi-step forecasting?

time step in y_prev y_gt

Thanks for the awesome code. I'm still learning DA-RNN, and I'm a little confused by part of your code.

I've noticed that in the training loop, your y_prev is defined as

for bs in range(len(indices)):
y_prev[bs, :] = self.y[indices[bs]:(indices[bs] + self.T - 1)]

but y_gt is

y_gt = self.y[indices + self.T]

The question is,

If batchsize is 1 and T is 10,

you get y[0:9] (the 1st to the 9th values, say [0, 1, 2, 3, 4, 5, 6, 7, 8]) as input but predict y[10] (the 11th value, say [10]) as the target.

Doesn't that mean the code skips the 10th value (value [9]) as the target?

Or am I just wrong about numpy indexing?
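A quick, self-contained check of the indexing in question (illustrative values only):

```python
import numpy as np

y = np.arange(20)          # stand-in target series: 0, 1, ..., 19
T = 10
i = 0                      # first window index

y_prev = y[i : i + T - 1]  # y[0:9] -> [0 1 2 3 4 5 6 7 8], nine values
y_gt = y[i + T]            # y[10]  -> 10

print(y_prev, y_gt)        # y[9] is neither in the window nor the target
```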

RuntimeError: invalid argument 3: out of range

This is the data with which I replaced ../nasdaq/data.csv.
I need to train the model for price prediction; this is how my data looks (screenshot attached).

On CPU, I ran python3 main.py --lr 0.0001 --epochs 50, but I got this error:

File "/content/drive/My Drive/Spot-Forcasting/paper-methods2019/DA-RNN/DA-RNN/src/model.py", line 77, in forward
x = torch.cat((h_n.repeat(self.input_size, 1, 1).permute(1, 0, 2),
RuntimeError: invalid argument 3: out of range at /pytorch/aten/src/TH/generic/THTensor.cpp:392


Any idea how to solve it?
Thanks

A problem about input data

Hello, author.

The input data includes the driving series and previous values of the target series, according to this paper. I am a little puzzled about the driving series $X$ as I prepare to apply this model to my own data. My data only has previous values of the target series; how can I prepare the driving series?

Please help me crack this quandary at your earliest convenience.

Thanks!
