stocks's People

Contributors

ngyb, yibinng

stocks's Issues

XGBoost: Shouldn't the data source for model.predict come from past data?

Hi @yibinng,

I love reading your articles and the time series prediction code you publish. All of it is wonderful, and I enjoy playing with your projects very much.

While experimenting with XGBoost Stock Prediction, I noticed that in the Final Model section, model.predict takes X_sample_scaled as its input. If you trace the assignment of X_sample_scaled (= test_scaled[features]), it holds the features of the Test dataframe, whose associated labels are exactly what we intend to predict.

Since test_scaled[features] is derived from lag data, it is generated from the Test dataframe, i.e. data from the future relative to the dev/cv dataframe! That explains why the predictions lag the actual data by a few cycles.

We can confirm this by taking X_sample_scaled out of the equation and relying solely on y_sample (= test[target]) as the golden result.

The fix is simple: in the final model, replace X_sample_scaled with X_cv_scaled, where X_cv_scaled is the present-day dataframe that would exist before the Test dataframe in a real scenario.

I'm seeing predictions that are similar to the expected y_sample, but with larger RMSE and MAPE values.

Please let me know if you think otherwise.
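
To make the concern concrete, here is a minimal sketch that contrasts two ways of producing test-period forecasts from lag features. It does not use the notebook's XGBoost model: the "model" is a hypothetical stand-in that averages the last n_lags values, and the function names and toy series are assumptions for illustration only. One-step forecasting reads true values from inside the test period, while recursive forecasting only sees training data plus its own earlier predictions.

```python
from statistics import mean


def one_step_forecasts(series, n_train, n_lags):
    """Predict each test point from the TRUE previous n_lags values.

    For every test index after the first, the lag window contains
    actual values taken from inside the test period.
    """
    preds = []
    for t in range(n_train, len(series)):
        preds.append(mean(series[t - n_lags:t]))
    return preds


def recursive_forecasts(series, n_train, n_lags, horizon):
    """Predict `horizon` steps using only training data and own predictions."""
    history = list(series[:n_train])
    preds = []
    for _ in range(horizon):
        pred = mean(history[-n_lags:])
        preds.append(pred)
        history.append(pred)  # feed the prediction back in as a lag value
    return preds


# Toy series: first 5 points are "train", last 3 are "test".
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
one_step = one_step_forecasts(series, n_train=5, n_lags=2)
recursive = recursive_forecasts(series, n_train=5, n_lags=2, horizon=3)

print(one_step)   # tracks the actual series one step behind
print(recursive)  # drifts, because it never sees test-period actuals
```

Whether the one-step variant counts as leakage depends on the intended use: it is a fair evaluation of next-day forecasts made fresh each day, but not of a forecast made once for the whole test period.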

New complementary tool

My name is Luis. I'm a big-data and machine-learning developer and a fan of your work, and I regularly check your updates.

I was afraid that my savings would be eaten by inflation, so I created a powerful tool based on past technical patterns (volatility, moving averages, statistics, trends, candlesticks, support and resistance, stock index indicators).
It covers all the ones you know (RSI, MACD, STOCH, Bollinger Bands, SMA, DeMark, Japanese candlesticks, Ichimoku, Fibonacci, Williams %R, balance of power, Murrey Math, etc.) and more than 200 others.

The tool creates models that predict correct trading points (buy and sell signals, so that each stock is traded at the right time and in the right direction).
For data collection and calculation I used big-data tools such as pandas and stock market libraries such as tablib, TAcharts, and pandas_ta.
For modeling I used machine-learning libraries such as Sklearn.RandomForest, Sklearn.GradientBoosting, XGBoost, Google TensorFlow, and Google TensorFlow LSTM.

With models trained on a selection of the best technical indicators, the tool can predict trading points (where to buy, where to sell) and send real-time alerts to Telegram or email. The points are calculated by learning the correct trading points of the last 2 years (including the shift to a bear market after the rate hike).

I think it could be useful to you, and I would like to share it with you. If you are interested in improving it and collaborating, I am willing; if not, feel free to file it away.

StockPricePrediction_v2a_prophet.ipynb: in cell [140], my predictions do not fall within val_size

Dear,
I am following your directions.
My data has 567 rows (train_size = 400, val_size = 167, H = 30).
When I reach step 140 in your notebook, I get multiple errors because the predictions fall in the horizon H (after train_val_size) and have no corresponding actual data.
Another error I get is a mismatch in dataframe sizes.

# Compute error metrics

preds_list = forecast['yhat'][train_val_size:train_val_size+H]
print("For forecast horizon %d, predicting on day %d, date %s, the RMSE is %f" % (H, i, df['date'][i-1]+ timedelta(days = 1), get_rmse(df[i:i+H]['y'], preds_list)))
print("For forecast horizon %d, predicting on day %d, date %s, the mean MAPE is %f" % (H, i, df['date'][i-1]+ timedelta(days = 1), get_mape(df[i:i+H]['y'], preds_list)))
print("For forecast horizon %d, predicting on day %d, date %s, the mean MAE is %f" % (H, i, df['date'][i-1]+ timedelta(days = 1), get_mae(df[i:i+H]['y'], preds_list)))

Your graph shows your predictions falling within your actual data, and you are fine-tuning the parameters accordingly.

Should preds_list be forecast['yhat'][train_size:train_val_size]?
Thank you for your time and consideration.
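
The numbers in this question can be checked with a small sketch. It assumes the metrics compare a window of H predictions against H actual rows, as the snippet above suggests; the helper names eval_window and rmse are hypothetical and not taken from the notebook. With 567 rows in total, a window starting at train_val_size = 567 has no actual values to compare against, whereas a window starting at train_size = 400 does.

```python
import math


def eval_window(n_rows, start, horizon):
    """Return (start, end) of a forecast window, or raise if actuals are missing."""
    end = start + horizon
    if start < 0 or end > n_rows:
        raise IndexError(
            f"window [{start}:{end}] needs actuals beyond the {n_rows} available rows"
        )
    return start, end


def rmse(actual, predicted):
    """Root mean squared error; refuses inputs of different length."""
    if len(actual) != len(predicted):
        raise ValueError(f"length mismatch: {len(actual)} vs {len(predicted)}")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


n_rows, train_size, val_size, H = 567, 400, 167, 30
train_val_size = train_size + val_size

# Forecasting from the end of the training set: actuals exist for all H days.
window_ok = eval_window(n_rows, train_size, H)

# Forecasting from the end of train + validation: nothing left to compare with.
try:
    eval_window(n_rows, train_val_size, H)
    window_failed = False
except IndexError:
    window_failed = True

print(window_ok, window_failed)
```

Checking lengths before computing a metric turns a confusing shape error into an explicit message about which rows are missing.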

RuntimeWarning in the "Final Model" section

When running

est_list = get_preds_mov_avg(df, 'adj_close', N_opt, 0, num_train+num_cv)
test['est' + '_N' + str(N_opt)] = est_list
print("RMSE = %0.3f" % math.sqrt(mean_squared_error(est_list, test['adj_close'])))
print("MAPE = %0.3f%%" % get_mape(test['adj_close'], est_list))
test.head()

The following warnings occur:

ipykernel_launcher.py:21: RuntimeWarning: invalid value encountered in less
ipykernel_launcher.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
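
A common way to avoid the SettingWithCopyWarning, sketched here under the assumption that test was created by slicing df (the notebook's actual split code is not shown in this issue), is to take an explicit copy of the slice before adding columns. The data, the split sizes, and the est_N2 column name below are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's dataframe.
df = pd.DataFrame({"adj_close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]})
num_train, num_cv = 3, 1

# .copy() makes `test` an independent DataFrame rather than a slice of df,
# so assigning a new column to it is unambiguous.
test = df[num_train + num_cv:].copy()

est_list = [13.5, 14.5]  # hypothetical predictions, one per test row
test["est_N2"] = est_list

print(test)
```

The RuntimeWarning is a separate matter. It typically appears when a comparison such as < is applied to an array containing NaN, so inspecting the prediction list for NaN values would be a reasonable first check.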

I am having a problem with StockPricePrediction_v6d_xgboost

Hello sir, when I try to run the algorithm with a different dataset, I get this error:
[Errno 2] No such file or directory: './out/v6d_val_rmse_bef_tuning_2016-02-09.pickle'

The code that raises this error is the following:

results = defaultdict(list)
ests = {}  # the predictions
date_list = ['2016-01-04',
'2016-02-09',
'2016-06-07',
'2016-08-22',
'2016-11-07',
'2017-01-23',
'2017-04-10',
'2017-09-07',
'2017-11-29',
'2018-03-05',
'2018-05-07',
'2018-09-04']

for date in date_list:
    results['date'].append(date)
    results['val_rmse_bef_tuning'].append(pickle.load(open( "./out/v6d_val_rmse_bef_tuning_" + date + ".pickle", "rb")))
    results['val_rmse_aft_tuning'].append(pickle.load(open( "./out/v6d_val_rmse_aft_tuning_" + date + ".pickle", "rb")))
    results['test_rmse_bef_tuning'].append(pickle.load(open( "./out/v6d_test_rmse_bef_tuning_" + date + ".pickle", "rb")))
    results['test_rmse_aft_tuning'].append(pickle.load(open( "./out/v6d_test_rmse_aft_tuning_" + date + ".pickle", "rb")))
    results['test_mape_bef_tuning'].append(pickle.load(open( "./out/v6d_test_mape_bef_tuning_" + date + ".pickle", "rb")))
    results['test_mape_aft_tuning'].append(pickle.load(open( "./out/v6d_test_mape_aft_tuning_" + date + ".pickle", "rb")))
    results['test_mae_bef_tuning'].append(pickle.load(open( "./out/v6d_test_mae_bef_tuning_" + date + ".pickle", "rb")))
    results['test_mae_aft_tuning'].append(pickle.load(open( "./out/v6d_test_mae_aft_tuning_" + date + ".pickle", "rb")))
    ests[date] = pickle.load(open( "./out/v6d_test_est_aft_tuning_" + date + ".pickle", "rb"))

results = pd.DataFrame(results)
results_>
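
The loop above only reads pickle files; it assumes they were written to ./out by an earlier run for each date in date_list. With a different dataset, those files will not exist until that step has been run for the new dates. As a rough sketch, the helper below lists which expected files are missing before any loading is attempted. The file naming pattern is copied from the snippet; the function names and the temporary-directory demo are assumptions for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def pickle_path(out_dir, metric, date):
    """Build the expected file path, following the v6d naming pattern above."""
    return Path(out_dir) / f"v6d_{metric}_{date}.pickle"


def find_missing(out_dir, metrics, dates):
    """Return the expected pickle paths that do not exist yet."""
    return [
        pickle_path(out_dir, metric, date)
        for date in dates
        for metric in metrics
        if not pickle_path(out_dir, metric, date).exists()
    ]


def load_pickle(path):
    """Load one pickle, closing the file handle afterwards."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Demo in a temporary directory: only one of the two expected files exists.
with tempfile.TemporaryDirectory() as out_dir:
    with open(pickle_path(out_dir, "val_rmse_bef_tuning", "2016-01-04"), "wb") as f:
        pickle.dump(1.23, f)

    missing = find_missing(
        out_dir, ["val_rmse_bef_tuning"], ["2016-01-04", "2016-02-09"]
    )
    missing_names = [p.name for p in missing]
    loaded = load_pickle(pickle_path(out_dir, "val_rmse_bef_tuning", "2016-01-04"))

print(missing_names)
```

Running a check like this against ./out with the dates of the new dataset would show exactly which results still need to be generated.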
