In this project we aim to develop a python
package that can:
- Download raw cryptocurrency data between given dates using Yahoo Finance API (with
yfinance
package) with the following format. - For each date calculate various technical indicators using the raw data.
- Normalize the technical indicator values into interval [0-255].
- Create a
.RIMG
file for each date using normalized technical indicator values. A.RIMG
file is a special file format that contains a grayscale image and the date. - Train a DQN to map the state (grayscale image) into the optimal action (Long, Short, Neutral).
- Validate the models decision using Top/Bottom K and Market Neutral portfolios.
First, clone the repository with your preferred method. For example, to clone using GitHub CLI:
gh repo clone tunaalaygut/reincrypt && cd reincrypt
Then, create a virtual environment, activate it, and install the requirements
virtualenv env
source env/bin/activate
pip install -r requirements.txt
Now you are ready to move on.
This section covers the necessary steps to prepare your data for training.
To download raw cryptocurrency data ticker_downloader.py
module can be used. However, before you do that you need to create a list of currencies you want to download and write them to a .txt
file. For example;
# currencies.txt
KCS-USD
WGR-USD
MYB-USD
...
cd src/data_prep
python ticker_downloader.py currencies.txt
Running the command above will produce an output similar to the following:
[*********************100%***********************] 1 of 1 completed
KCS-USD downloaded and written to file.
[*********************100%***********************] 1 of 1 completed
WGR-USD downloaded and written to file.
[*********************100%***********************] 1 of 1 completed
MYB-USD downloaded and written to file.
...
Downloaded raw data should have the following format for each crpytocurrency:
Date | Open | High | Low | Close | Adj Close | Volume |
---|---|---|---|---|---|---|
2017-11-09 | 0.000224 | 0.000297 | 0.0002119 | 0.000282 | 0.0002 | 8605 |
2017-11-10 | 0.000284 | 0.000292 | 0.0001340 | 0.000264 | 0.0002 | 4201 |
2017-11-11 | 0.000263 | 0.000233 | 0.0001232 | 0.000194 | 0.0001 | 1266 |
... | ... | ... | ... | ... | ... | ... |
|
Ideally, you would have two files training_currencies.txt
and verification_currencies.txt
to download training and verification currency data separately. Or you can separate them later.
Once the raw data is downloaded you would need to create .RIMG
files. image_creator.py
module does exactly that.
It works by calculating 32 pre-defined and clustered technical indicators for 32 different time intervals, normalizing the calculated values between [0, 255]. Hence, creating a 32x32 grayscale image for each day for each currency.
It also adds the date and a scalar. Scalar is the percentage difference of the currency's closing value from the previous day.
A minimized .rimg
file looks like the following:
000 000 001 005 ...
005 008 000 000 ...
012 009 000 000 ...
020 010 011 005 ...
...
$
-4.694593226213477
$
2018-06-20
To visualize the created .rimg
file(s) one can use utility/rimg_viewer.py
module.
To use it:
python rimg_viewer.py <FILENAME>.rimg
Which will produce an image similar to following image:
Once the data is ready, a training configuration needs to be defined in a .JSON
format with the following fields:
num_actions
: Number of actions the DQN will map states intomax_iterations
: Number of iterations the training will takelearning_rate
: Learning rate for the gradient stepepsilon_init
: Initial epsilon valueepsilon_min
: Minimum epsilon valuememory_size
: Size of the experience replay memoryB
: Number of iterations before updating the online network weightsC
: Number of iterations before updating the target network weightsgamma
: Discount factorbatch_size
: Batch sizepenalty
: Penalty to apply for position changepatch_size
: Transformer patch sizeprojection_dim
: Transformer projection dimensionenable_resizing
: Whether to enable image resizingresized_image_size
: Resized image size ifenable_resizing
num_heads
: Number of heads in ViTmlp_head_units
: Definition of MLP units in ViTtransformer_units
: Definition of transformer units in ViTtransformer_layers
: Number of transformer layers in ViTdescription
: Description of the configuration
A sample training configuration is as follows:
{
"num_actions": 3,
"max_iterations": 50000,
"learning_rate": 0.0001,
"epsilon_init": 1.0,
"epsilon_min": 0.1,
"memory_size": 1000,
"B": 10,
"C": 1000,
"gamma": 0.99,
"batch_size": 32,
"penalty": 0.05,
"patch_size": 6,
"projection_dim": 64,
"enable_resizing": false,
"resized_image_size": 36,
"num_heads": 4,
"mlp_head_units": [
2048,
1024
],
"transformer_units": [
128,
64
],
"transformer_layers": 8,
"description": "Using k-means clustered technical indicators."
}
With the configuration created, simply run
python main.py -i <PATH_TO_TRAINING_DATA_DIR> -c <CONFIG_NAME>.json
-i
or--input-path
argument is used to indicate the input data directory.-c
or--config'
argument is used to specify which config file to use.
Output of the trained model will be placed under the directory $WORKSPACE/reincrypt/src/dqn/output/<CONFIG_NAME>
and it will have a structure similar to this:
<CONFIG_NAME>
├── <CONFIG_NAME>_model/
├── training_<TIMESTAMP>.json
└── training_<TIMESTAMP>.png
Where <CONFIG_NAME>_model/
will be the keras model output, .json is the training logs and the .png file is the loss plot of the training. Keras model output can be used to load the model and make inferences.
To create portfolios and validate the model on the verification data set:
python main.py -v -i <PATH_TO_VERIFICATION_DATA_DIR> -c <CONFIG_NAME> -m <PATH_TO_CONFIG_NAME_model_DIR>
-i
or--input-path
argument is used to indicate the input data directory.-v
or--verification
flag is used to indicate the program will be run in verification mode.-c
or--config'
argument is used to specify which config file to use.-m
or--model
argument is used to specify the keras model output created in training phase. Required if-v
flag is set.
After the verification a verification log in JSON
format and a cumulative asset chart (.png
) will be created under the same output directory of the model. Output directories structure will be as following.
<CONFIG_NAME>
├── <CONFIG_NAME>_model/
├── training_<TIMESTAMP>.json
├── training_<TIMESTAMP>.png
├── verification_<VERIFICATION_TIMESTAMP>.json
└── verification_cumulative_assets_<VERIFICATION_TIMESTAMP>.png
An example verification_<VERIFICATION_TIMESTAMP>.json
file looks like this:
{
"portfolio_method": "Market Neutral", // Portfolio used
"verification_start": "2023-12-26 06:26:33.082213", // Verification start timestamp
"verification_end": "2023-12-26 06:27:18.631933", // Verification end timestamp
"verification_duration (m)": 0.0, // How long did the verification take?
"verification_tickers": [ // Currencies used in the verification
"DGC-USD",
"TRC-USD",
"GLC-USD",
"FTC-USD",
"NMC-USD",
"LTC-USD",
"PPC-USD",
"BTC-USD"
],
"num_tickers": 8, // # of currencies
"config": {
// Model's configuration, as defined in training step,
"experiment_name": "experiment_11",
"height": 32,
"width": 32,
"num_days": 366
},
"results": {
"position_change": 225.28317512, // How many times model changed position
"cumulative_asset": 6.14175614, // Cumulative asset at the end
"sharpe_ratio": 2.359828034700596, // Calculated sharpe ratio
"num_days": 366, // # of verification days
"date_begin": "2021-10-30", // Verification begin date
"date_end": "2022-10-30", // Verification end date
"daily_results": [ // Daily movements made by the model
{
"day_index": 0,
"cumulative_asset": 1,
"avg_daily_return": -6.53714439
},
{
"day_index": 1,
"cumulative_asset": 0.93462856,
"avg_daily_return": 2.23261434
},
{...},
]
}
}
And a cumulative asset chart:
Verification process, by default, uses market neutral portfolio. In order to use Top/Bottom-K portfolio you would need to use the -k
(--topbottomk
) argument and specify the K value (percentage) you want to use.
For example, following example runs a verification using Top/Bottom K porfolio with the K value of 0.3:
python main.py -v \
-i <PATH_TO_VERIFICATION_DATA_DIR> \
-c <CONFIG_NAME> \
-m <PATH_TO_CONFIG_NAME_model_DIR> \
-k 0.3