Time series with tensorflow

Fundamentals of a Time series forecasting problem using Tensorflow.

In this repository we will be building a Bitcoin price predictor (BitPredict 💰📈) using it's historical price over the years.

What is a time series problem ?

Time series problems deal with data over time.

Such as, the number of staff members in a company over 10-years, sales of computers for the past 5-years, electricity usage for the past 50-years.

The timeline can be short (seconds/minutes) or long (years/decades). And the problems you might investigate using can usually be broken down into two categories.

Problem Type	Examples	Output
Classification	Anomaly detection, time series identification (where did this time series come from?)	Discrete (a label)
Forecasting	Predicting stock market prices, forecasting future demand for a product, stocking inventory requirements	Continuous (a number)

In both above cases, a supervised learning approach is often used. Meaning, you'd have some example data and a label associated with that data.

For example, in forecasting the price of Bitcoin, your data could be the historical price of Bitcoin for the past month and the label could be today's price (the label can't be tomorrow's price because that's what we'd want to predict).

Things that I've learned

Get time series data (the historical price of Bitcoin)
- Load in time series data using pandas/Python's CSV module
Format data for a time series problem
- Creating training and test sets (the wrong way)
- Creating training and test sets (the right way)
- Visualizing time series data
- Turning time series data into a supervised learning problem (windowing)
- Preparing univariate and multivariate (more than one variable) data
Evaluating a time series forecasting model
Setting up a series of deep learning modelling experiments
- Dense (fully-connected) networks
- Sequence models (LSTM and 1D CNN)
- Ensembling (combining multiple models together)
- Multivariate models
- Replicating the N-BEATS algorithm using TensorFlow layer subclassing
Creating a modelling checkpoint to save the best performing model during training
Making predictions (forecasts) with a time series model
Creating prediction intervals for time series model forecasts
Discussing two different types of uncertainty in machine learning (data uncertainty and model uncertainty)
Demonstrating why forecasting in an open system is BS (the turkey problem)

Exercises

Does scaling the data help for univariate/multivariate data? (e.g. getting all of the values between 0 & 1)
- Try doing this for a univariate model (e.g. model_1) and a multivariate model (e.g. model_6) and see if it effects model training or evaluation results.
Get the most up to date data on Bitcoin, train a model & see how it goes (our data goes up to May 18 2021).
- You can download the Bitcoin historical data for free from coindesk.com/price/bitcoin and clicking "Export Data" -> "CSV".
For most of our models we used WINDOW_SIZE=7, but is there a better window size?
- Setup a series of experiments to find whether or not there's a better window size.
- For example, you might train 10 different models with HORIZON=1 but with window sizes ranging from 2-12.
Create a windowed dataset just like the ones we used for model_1 using tf.keras.preprocessing.timeseries_dataset_from_array() and retrain model_1 using the recreated dataset.
For our multivariate modelling experiment, we added the Bitcoin block reward size as an extra feature to make our time series multivariate.
- Are there any other features you think you could add?
- If so, try it out, how do these affect the model?
Make prediction intervals for future forecasts. To do so, one way would be to train an ensemble model on all of the data, make future forecasts with it and calculate the prediction intervals of the ensemble just like we did for model_8.
For future predictions, try to make a prediction, retrain a model on the predictions, make a prediction, retrain a model, make a prediction, retrain a model, make a prediction (retrain a model each time a new prediction is made). Plot the results, how do they look compared to the future predictions where a model wasn't retrained for every forecast (model_9)?
Throughout this notebook, we've only tried algorithms we've handcrafted ourselves. But it's worth seeing how a purpose built forecasting algorithm goes.
- Try out one of the extra algorithms listed in the modelling experiments part such as:
  - Facebook's Kats library - there are many models in here, remember the machine learning practioner's motto: experiment, experiment, experiment.
  - LinkedIn's Greykite library

saketmunda / time-series-with-tensorflow Goto Github PK