kulbear / stock-prediction

Stock price prediction with a recurrent neural network. The data is from the Chinese stock market.

License: MIT License

Jupyter Notebook 97.25% Python 2.75%

stock-prediction's Introduction

Stock Prediction with Recurrent Neural Network

Stock price prediction with an RNN. The data we used is from the Chinese stock market.

Requirements

Python 3.5
TuShare 0.7.4
Pandas 0.19.2
Keras 1.2.2
Numpy 1.12.0
scikit-learn 0.18.1
TensorFlow 1.0 (GPU version recommended)

I personally recommend you to use Anaconda to build your virtual environment. The program will probably take a significant amount of time to run if you are not using the GPU version of TensorFlow.

Get Data

You can run fetch_data.py to get a sample of test data. Without changing the script, you will get two separate CSV files named:

000002-from-1995-01-01.csv =====> Contains general data for stock 000002 from 1995-01-01 to today.
000002-3-year.csv =====> Contains candlestick chart data for stock 000002 (万科A) for the most recent 3 years.

You should see results like the following (the first DataFrame contains general data, while the second contains detailed candlestick chart data):

$ python3 fetch_data.py
[Getting data:]#########################################################################################
Saving DataFrame:
     open   high    low      volume        amount  close
0  20.64  20.64  20.37  16362363.0  3.350027e+08  20.56
1  20.92  20.92  20.60  21850597.0  4.520071e+08  20.64
2  21.00  21.15  20.72  26910139.0  5.628396e+08  20.94
3  20.70  21.57  20.70  64585536.0  1.363421e+09  21.02
4  20.60  20.70  20.20  45886018.0  9.382043e+08  20.70

Saving DataFrame:
     open   high    low     volume  price_change  p_change     ma5    ma10  \
0  20.64  20.64  20.37  163623.62         -0.08     -0.39  20.772  20.721
1  20.92  20.92  20.60  218505.95         -0.30     -1.43  20.780  20.718
2  21.00  21.15  20.72  269101.41         -0.08     -0.38  20.812  20.755
3  20.70  21.57  20.70  645855.38          0.32      1.55  20.782  20.788
4  20.60  20.70  20.20  458860.16          0.10      0.48  20.694  20.806

     ma20      v_ma5     v_ma10     v_ma20  close
0  20.954  351189.30  388345.91  394078.37  20.56
1  20.990  373384.46  403747.59  411728.38  20.64
2  21.022  392464.55  405000.55  426124.42  20.94
3  21.054  445386.85  403945.59  473166.37  21.02
4  21.038  486615.13  378825.52  461835.35  20.70

Demo

Reference

stock-prediction's Issues

input_dim, output_dim not working in updated keras tensorflow version

Hi,

I am working on a stock prediction assignment and was trying to get some help from your script, but I am facing some issues with the updated Keras / TensorFlow versions. It seems the input_dim and output_dim arguments have been deprecated. Could you help me update the code for the newer version?

Also, as I can see in your code, you are sending 3-dimensional input data like (None, 20, 6) for model training. When I use the same approach, I get 0.000 accuracy in every epoch. I do not understand what I am doing wrong.

Please have a look at my model code and correct me if anything is wrong:

model = Sequential()
model.add(LSTM(512, input_shape=(X_train.shape[1], X_train.shape[2]), return_sequences=True))
model.add(Dropout(0.4))
model.add(LSTM(256, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(64, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(32))
model.add(Dense(1, activation='linear'))

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.summary() 

history = model.fit(X_train, y_train, epochs=5, batch_size=64, validation_split=0.10)

Train on 3458 samples, validate on 385 samples
Epoch 1/5
3458/3458 [==============================] - 93s 27ms/step - loss: 0.0753 - acc: 0.0000e+00 - val_loss: 0.4183 - val_acc: 0.0000e+00
Epoch 2/5
3458/3458 [==============================] - 78s 22ms/step - loss: 0.0145 - acc: 0.0000e+00 - val_loss: 0.1113 - val_acc: 0.0000e+00
Epoch 3/5
3458/3458 [==============================] - 80s 23ms/step - loss: 0.0114 - acc: 0.0000e+00 - val_loss: 0.1157 - val_acc: 0.0000e+00
Epoch 4/5
3458/3458 [==============================] - 97s 28ms/step - loss: 0.0097 - acc: 0.0000e+00 - val_loss: 0.0560 - val_acc: 0.0000e+00
Epoch 5/5
3458/3458 [==============================] - 79s 23ms/step - loss: 0.0097 - acc: 0.0000e+00 - val_loss: 0.0951 - val_acc: 0.0000e+00

Regards,
Ankit Aggarwal
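
One possible reading of the log in this issue: the loss falls from 0.0753 to 0.0097 while acc stays at 0, which is what one would expect when an accuracy metric is applied to continuous targets. Accuracy counts exact matches between predictions and labels, and scaled prices almost never match exactly. The sketch below illustrates this in plain Python with made-up numbers. `exact_match_accuracy` and `mean_absolute_error` are hypothetical helpers, not Keras functions, and the assumption that the metric rounds predictions before comparing them is mine.

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of predictions that, once rounded, equal the target exactly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if round(p) == t)
    return hits / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between target and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up scaled closing prices and predictions that track them closely.
y_true = [0.41, 0.43, 0.47, 0.52, 0.49]
y_pred = [0.40, 0.44, 0.46, 0.50, 0.50]

print(exact_match_accuracy(y_true, y_pred))  # 0.0 even though the fit is close
print(mean_absolute_error(y_true, y_pred))   # small value reflecting the real error
```

If this reading is right, replacing metrics=['accuracy'] with an error metric such as metrics=['mae'] in model.compile would likely give a more informative number for a regression model.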

Prediction is not done for future data?

Hi,

The prediction covers only the existing data; nothing is predicted for the next few days.

What you have done is: if there are 1000 data points, there are also 1000 predicted values.

What I am asking for is predictions for the next n days.

For example:
if we have 1000 data points and need to predict the next 10 days, the output should cover 1010 data points.
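
A common way to get values beyond the last observed day is recursive forecasting: predict one step, append the prediction to the input window, and repeat. The sketch below shows only that loop. `predict_next` is a hypothetical stand-in (a moving average) for the trained network, and the window length of 3 is an arbitrary choice for illustration.

```python
def predict_next(window):
    """Hypothetical one-step model: here just the mean of the window."""
    return sum(window) / len(window)


def forecast(history, n_steps, window_size, model=predict_next):
    """Roll the window forward n_steps times, feeding predictions back in."""
    series = list(history)
    for _ in range(n_steps):
        window = series[-window_size:]
        series.append(model(window))
    return series[len(history):]


history = [20.56, 20.64, 20.94, 21.02, 20.70]  # closing prices from the README sample
future = forecast(history, n_steps=10, window_size=3)
print(len(history) + len(future))  # 5 observed values plus 10 predicted ones
```

Errors compound with each step because later predictions depend on earlier ones, so long horizons tend to be unreliable.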

How to revert predicted values to the original scale?

Hi,
Can you please help with a function that scales the prediction values back?

I tried this, but the values are a bit different.
Note: I made preprocessor a global variable.

def normalize_data(x, y):
    global preprocessor
    linear_x = x.reshape((x.shape[0], x.shape[1] * x.shape[2]))
    #print("x.shape", x.shape)
    #print("linear_x.shape", linear_x.shape)
    xy = np.c_[ linear_x, y, y, y, y, y, y]
    #print("xy.shape", xy.shape)
    xy_scaled = preprocessor.inverse_transform(xy, copy=True)
    #print("xy_scaled.shape", xy_scaled.shape)
    #print(xy_scaled[:, -1].tolist())

    return xy_scaled[:, -1].tolist()

Will be waiting for your reply
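
If the preprocessor is a min-max scaler (an assumption; the notebook's preprocessing is not shown here), a prediction for a single column can be mapped back using only that column's minimum and maximum from the training data, without rebuilding the full feature matrix. A minimal sketch with hypothetical helper names and the closing prices from the README sample:

```python
def fit_min_max(values):
    """Remember the minimum and maximum of the training column."""
    return min(values), max(values)


def scale(value, lo, hi):
    """Map a raw value into [0, 1]."""
    return (value - lo) / (hi - lo)


def unscale(scaled, lo, hi):
    """Invert scale(): map a value in [0, 1] back to the original units."""
    return scaled * (hi - lo) + lo


close = [20.56, 20.64, 20.94, 21.02, 20.70]
lo, hi = fit_min_max(close)

scaled_pred = 0.5                    # pretend this came from model.predict
print(unscale(scaled_pred, lo, hi))  # halfway between the min and max close
```

With scikit-learn's MinMaxScaler, the per-column minimum and range are available on the fitted object as data_min_ and data_range_, so the same arithmetic can be applied to the target column alone.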

Using alternative stock/currency rates

In your notebook, you use values that are initially around 20 and then scale them down to between 0 and 1.

If I wanted to plug in values that are already between 0 and 1 (like 0.00000690, for example), would they still need scaling?

If so, what do you recommend?
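
Values such as 0.00000690 already lie between 0 and 1, but they occupy only a tiny part of that interval, so day-to-day differences are very small in absolute terms. One option, sketched below under the assumption that min-max scaling is used, is to scale them anyway so that they span the full range. The rates are made up.

```python
rates = [0.00000690, 0.00000702, 0.00000688, 0.00000715]

lo, hi = min(rates), max(rates)
scaled = [(r - lo) / (hi - lo) for r in rates]

print(max(rates) - min(rates))    # raw spread, far below 1e-6
print(max(scaled) - min(scaled))  # spread after scaling covers the whole interval
```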

ValueError: list.remove(x): x not in list (run fetch_data.py)

ValueError Traceback (most recent call last)
in
----> 1 get_all_history('000002', start='1995-01-01')
2
3 get_3_years_history('000002')

in get_all_history(stock_index, start, autype)
10 """
11 df = ts.get_h_data(stock_index, start = start, autype = autype)
---> 12 df = wash(df)
13 print('\nSaving DataFrame: \n', df.head(5))
14 df.to_csv('{}-from-{}.csv'.format(stock_index, start), index=False)

in wash(df, target)
11 df = df.reset_index(drop=True)
12 col_list = df.columns.tolist()
---> 13 col_list.remove(target)
14 col_list.append(target)
15 return df[col_list]

ValueError: list.remove(x): x not in list
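
The traceback shows that `col_list.remove(target)` fails because the target column is absent from the fetched DataFrame. That can happen if the download returned an empty or differently shaped table (an assumption; the actual cause is not visible in the trace). A defensive variant of the reordering step could check first and fail with a clearer message. The sketch works on a plain list of column names; `move_target_last` is a hypothetical helper, not part of fetch_data.py, and the default target of "close" is inferred from the README output.

```python
def move_target_last(columns, target="close"):
    """Return the column names with `target` moved to the end."""
    if target not in columns:
        raise KeyError(
            "column {!r} not found; got {}".format(target, list(columns))
        )
    reordered = [c for c in columns if c != target]
    reordered.append(target)
    return reordered


print(move_target_last(["open", "high", "close", "low", "volume", "amount"]))
# ['open', 'high', 'low', 'volume', 'amount', 'close']
```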

Error 456?

I am getting the following error when running fetch_data.py

[Getting data:]HTTP Error 456:
HTTP Error 456:
HTTP Error 456:
Traceback (most recent call last):
  File "fetch_data.py", line 52, in <module>
    get_all_history('000002', start='1995-01-01')
  File "fetch_data.py", line 47, in get_all_history
    df = ts.get_h_data(stock_index, start=start, autype=autype)
  File "/usr/local/lib/python3.5/dist-packages/tushare/stock/trading.py", line 432, in get_h_data
    retry_count, pause)
  File "/usr/local/lib/python3.5/dist-packages/tushare/stock/trading.py", line 572, in _parse_fq_data
    raise IOError(ct.NETWORK_URL_ERROR_MSG)
OSError: 获取失败，请检查网络. (Fetch failed, please check the network.)

Any advice?
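
HTTP 456 is not a standard status code. Some data providers appear to use it when they throttle or block a client, so the failure may be on the server side rather than in the script (an assumption). If the errors are intermittent, retrying with an increasing pause sometimes helps. The sketch below wraps any fetch function; `fetch_with_retry` and its parameters are hypothetical and not part of TuShare.

```python
import time


def fetch_with_retry(fetch, retries=3, base_pause=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the pause after each failure."""
    for attempt in range(retries):
        try:
            return fetch()
        except IOError as err:  # the traceback above ends in IOError/OSError
            if attempt == retries - 1:
                raise
            pause = base_pause * 2 ** attempt
            print("attempt {} failed ({}); retrying in {}s".format(
                attempt + 1, err, pause))
            sleep(pause)


# Possible use with the call from fetch_data.py (not executed here):
# df = fetch_with_retry(lambda: ts.get_h_data('000002', start='1995-01-01'))
```

The `sleep` parameter exists only so the pause can be replaced in tests. If the provider has blocked the endpoint outright, retrying will not help.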

HTTP Error 456 (run fetch_data.py)

python fetch_data.py
/home/smilewater/anaconda3/envs/tensorflow-gpu/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
[Getting data:]###################################################HTTP Error 456:
#HTTP Error 456:
HTTP Error 456:
HTTP Error 456:
Traceback (most recent call last):
  File "fetch_data.py", line 52, in <module>
    get_all_history('000002', start='1995-01-01')
  File "fetch_data.py", line 47, in get_all_history
    df = ts.get_h_data(stock_index, start=start, autype=autype)
  File "/home/smilewater/anaconda3/envs/tensorflow-gpu/lib/python3.6/site-packages/tushare/stock/trading.py", line 440, in get_h_data
    retry_count, pause)
  File "/home/smilewater/anaconda3/envs/tensorflow-gpu/lib/python3.6/site-packages/tushare/stock/trading.py", line 572, in _parse_fq_data
    raise IOError(ct.NETWORK_URL_ERROR_MSG)
OSError: 获取失败，请检查网络. (Fetch failed, please check the network.)

How to set it up?

In the readme you simply list dependencies and then say:

I personally recommend you to use Anaconda to build your virtual environment.

Can you provide an example config or install instructions for that? Setting up all those dependencies looks rather complicated, and if you already have an Anaconda package definition, config file, or similar from your development, it would be great if you could add it to the repo.
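
No environment file appears in the README, but the version list under Requirements can be turned into pip-style pins. The sketch below does that. The PyPI package names are my assumption, and TensorFlow is pinned as 1.0.* because the README gives no patch version.

```python
# Versions copied from the README's Requirements section.
# The PyPI package names on the left are assumptions.
PINNED = [
    ("tushare", "0.7.4"),
    ("pandas", "0.19.2"),
    ("Keras", "1.2.2"),
    ("numpy", "1.12.0"),
    ("scikit-learn", "0.18.1"),
    ("tensorflow", "1.0.*"),  # or tensorflow-gpu for the GPU build
]


def requirement_lines(pins):
    """Render (name, version) pairs as pip requirement specifiers."""
    return ["{}=={}".format(name, version) for name, version in pins]


print("\n".join(requirement_lines(PINNED)))
```

Saved as requirements.txt, these lines could then be installed with pip inside a Python 3.5 Anaconda environment.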