
rl-portfolio-management's Introduction

Attempting to replicate "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem" by Jiang et al. 2017 [1].

Note 2 (2019-05-25): vermouth1992 improved this environment during their final project; I recommend you start with their repo. Also check out the SageMaker tutorial, which is based on vermouth1992's work.

Note 1 (2018): the paper's authors have released the official code for the paper, and it works well.

tl;dr I managed to get 8% growth on training data, but it disappeared on test data, so I couldn't replicate it. However, RL papers can be very difficult to replicate due to bugs, framework differences, and hyperparameter sensitivity.

About

The paper trains an agent to choose a good portfolio of cryptocurrencies. It reports 4-fold returns in 50 days, and the paper seems to do all the right things, so I wanted to see if I could achieve the same results.

This repo includes an environment for portfolio management (with unit tests). Hopefully others will find this useful, as I am not aware of any other implementations (as of 2017-07-17).

Author: wassname

License: AGPLv3

[1] Jiang, Zhengyao, Dixing Xu, and Jinjun Liang. "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem." arXiv preprint arXiv:1706.10059 (2017).

Results

I managed to overfit the training data with no trading costs, but the model could not generalise to the test data. So far the results have been poor. I have not yet tried hyperparameter optimisation, so it may be that parameter tweaking will let the model fit, or I may have subtle bugs.

  • VPG model:
    • training: 190% portfolio growth in 50 days
    • testing: 100% portfolio growth in 50 days

The test period is directly after the training period, and it looks like the usefulness of the model's learned knowledge may decay as it moves away from its training interval.

There are other experiments stored as notebooks in past commits.

Installing

  • git clone https://github.com/wassname/rl-portfolio-management.git
  • cd rl-portfolio-management
  • pip install -r requirements/requirements.txt
  • jupyter-notebook
    • Then open tensorforce-VPG.ipynb in jupyter
    • Or try an alternative agent with tensorforce-PPO.ipynb and train

Using the environment

These environments are derived from the OpenAI Gym environment class, which you can learn about in their documentation.

These environments come with 47k steps of training data and 8k test steps. Each step represents 30 minutes. Thanks to reddit user ARRRBEEE for sharing the data.
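
If you want to take a quick look at the bundled data yourself, something like the following sketch should work (the HDF path and the 'test' key are taken from the example further down this README; the 'train' key is an assumption):

import pandas as pd

# Rough sanity check of the bundled half-hourly Poloniex data.
# key='test' appears later in this README; key='train' is assumed here.
df_train = pd.read_hdf('./data/poloniex_30m.hf', key='train')
df_test = pd.read_hdf('./data/poloniex_30m.hf', key='test')
print(len(df_train), len(df_test))  # roughly 47k and 8k 30-minute steps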

There are three output options which you can use as follows:

import gym
import rl_portfolio_management.environments  # this registers them

env = gym.envs.spec('CryptoPortfolioEIIE-v0').make()
print("CryptoPortfolioEIIE has an history shape suitable for an EIIE model (see https://arxiv.org/abs/1706.10059)")
observation = env.reset()
print("shape =", observation["history"].shape)
# shape = (5, 50, 3)

env = gym.envs.spec('CryptoPortfolioMLP-v0').make()
print("CryptoPortfolioMLP history has an flat shape for a dense/multi-layer perceptron model")
observation = env.reset()
print("shape =", observation["history"].shape)
# shape = (750,)

env = gym.envs.spec('CryptoPortfolioAtari-v0').make()
print("CryptoPortfolioAtari history has been padded to represent an image so you can reuse models tuned on Atari games")
observation = env.reset()
print("shape =", observation["history"].shape)
# shape = (50, 50, 3)

Or define your own:

import pandas as pd
from rl_portfolio_management.environments import PortfolioEnv
df_test = pd.read_hdf('./data/poloniex_30m.hf', key='test')
env_test = PortfolioEnv(
  df=df_test,
  steps=256,
  scale=True,
  augment=0.00,
  trading_cost=0.0025,
  time_cost=0.00,
  window_length=50,
  output_mode='mlp'
)

Let's try it with a random agent and plot the results:

import numpy as np
import gym
import rl_portfolio_management.environments  # this registers them

env = gym.envs.spec('CryptoPortfolioMLP-v0').make()
steps = 150
state = env.reset()
for _ in range(steps):
    # The observation contains price history and portfolio weights
    old_portfolio_weights = state["weights"]

    # the action is an array with the new portfolio weights
    # for our action, let's change the weights by around a 20th each step
    action = old_portfolio_weights + np.random.normal(loc=0, scale=1/20., size=(4,))

    # clip and normalize since the portfolio weights should sum to one
    action = np.clip(action, 0, 1)
    action /= action.sum()

    observation, reward, done, info = env.step(action)

    if done:
        break

# plot
env.render('notebook')

Unsurprisingly, a random agent doesn't perform well in portfolio management. If it had chosen to bet on blue and then black it could have outperformed any single asset, but hindsight is 20/20.

Plotting

You can run env.render('notebook') or extract a pandas DataFrame and plot it however you like. To use pandas: pd.DataFrame(env.unwrapped.infos).
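
For example, a minimal plotting sketch (assuming the info dicts contain the portfolio_value, market_value, and date keys that show up in the logged output quoted in the issues below):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(env.unwrapped.infos)            # one info dict per step
df.index = pd.to_datetime(df["date"], unit="s")   # 'date' appears to be a unix timestamp
df[["portfolio_value", "market_value"]].plot()    # agent vs. market
plt.show()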

Tests

We have partial test coverage of the environment; just run:

  • python -m pytest

Files

  • environments/portfolio.py - contains an OpenAI Gym environment for portfolio trading
  • tensorforce-PPO-IEET.ipynb - notebook to try a policy gradient agent

Differences in implementation

The main differences from Jiang et al. 2017 are:

  • The first step in a deep learning project should be to make sure the model can overfit; this provides a sanity check. So I am first trying to achieve good results with no trading costs.
  • I have not used portfolio vector memory. For ease of implementation I made the information available by using the last weights (see the sketch after this list).
  • Instead of DPG (deterministic policy gradient) I tried DDPG (deep deterministic policy gradient), VPG (vanilla policy gradient) with generalized advantage estimation, and PPO.
  • I tried to replicate the best-performing CNN model from the paper and have not attempted the LSTM or RNN models.
  • Instead of selecting 12 assets for each window, I chose the 3 assets that have existed for the longest time.
  • My topology had an extra layer (see issue #3, now fixed).
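
As a rough illustration of the second point, replacing the portfolio vector memory with the last weights can be as simple as concatenating them onto the model input. The helper below is hypothetical; the observation keys and shapes are the ones shown in the MLP example above:

import numpy as np

def make_model_input(observation):
    """Concatenate the flattened price history with the previous portfolio
    weights, which stand in for the paper's portfolio vector memory."""
    history = np.asarray(observation["history"]).ravel()  # (750,) for the MLP env
    weights = np.asarray(observation["weights"])          # (4,) = cash + 3 assets
    return np.concatenate([history, weights])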

TODO

See issues #4 and #2 for ideas on where to go from here.

rl-portfolio-management's People

Contributors

schneiderl, wassname


rl-portfolio-management's Issues

Error in importing agents, utils, components

In the PyTorch implementation of the above model, in the 10th cell, you have used the following:

from agent import ProximalPolicyOptimization, DisjointActorCriticNet #, DeterministicActorNet, DeterministicCriticNet
from component import GaussianPolicy, HighDimActionReplay, OrnsteinUhlenbeckProcess
from utils import Config, Logger
import gym
import torch
gym.logger.setLevel(logging.INFO)

It gives me an error that the agent module does not exist, which makes sense because there are no folders named agent or component.
I am sorry, but I am not able to understand what I am missing here. Please help.
Thanks in advance.

GPU usage

GPU usage is very low; is that alright?

Question - Tensorflow-gpu optional?

Hi,

I want to use your program and alter the algorithms behind it (employing reinforcement learning) while largely leaving the base code untouched.

Checking 'tensorforce-PPO-IEET' I don't see tensorflow-gpu being used; is this assumption wrong? (I want to work on this on OSX, which lost tensorflow-gpu support at 1.1.)

EDIT: Okay, I've tried installing different versions of the libraries in your requirements, as I couldn't get them to install (particularly tensorflow-gpu).

I reduced the step count significantly, along with the epsilon requirement.

However, when I get to the Test block, I get the following error:

AttributeError                            Traceback (most recent call last)
<ipython-input-23-a605848dcb2f> in <module>()
      1 # Create an agent
      2 agent = PPOAgent(
----> 3     states_spec=environment.states,
      4     actions_spec=actions_spec,
      5     network_spec=network_spec,

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorforce/contrib/openai_gym.py in states(self)
     77     @property
     78     def states(self):
---> 79         return OpenAIGym.state_from_space(space=self.gym.observation_space)
     80 
     81     @staticmethod

AttributeError: 'NoneType' object has no attribute 'observation_space'

Thank you

Performance Clarification

I'm a bit confused by your notes on performance. In your README, you mention that VPG under-performs the baseline on the test set. But then in the notebook, you mention that you get an annualized 25x and that you are using PPO -- but that doesn't appear to be the case judging by the test performance plot included at the bottom of the notebook.

I'm currently running my own implementation and am waiting for it to finish training, but I'm curious -- were you able to get this working? I know the PPO algo is fairly new so maybe those notes were left over from your initial testing? Any clarification would be helpful :)

Thanks!

Issues running Notebook due to Tensorforce changes

Hi,

Nice work on this project! I much prefer Jupyter to what the paper's authors have used.

I've just installed the dependencies, and have hit some errors:

# sanity check out environment is working
state = environment.reset()
state, reward, done=environment.execute(env.action_space.sample())
state.shape

gives me:


AttributeError Traceback (most recent call last)
in ()
1 # sanity check out environment is working
2 state = environment.reset()
----> 3 state, reward, done=environment.execute(env.action_space.sample())
4 state.shape

~\Anaconda3\lib\site-packages\tensorforce\contrib\openai_gym.py in execute(self, actions)
67
68 def execute(self, actions):
---> 69 if self.visualize:
70 self.gym.render()
71 # if the actions is not unique, that is, if the actions is a dict

AttributeError: 'TFOpenAIGymCust' object has no attribute 'visualize'

Skipping this cell and going further, I hit the next roadblock:
from tensorforce import Configuration

I have read on Tensorforce repo issues that the entire Configuration object has been scrapped (tensorforce/tensorforce#132), so this breaks the current code here.

'DataFrame' object has no attribute 'mean_market_returns'

First of all, thanks for sharing this code.
I'm trying to run the notebook keras-ddpg and I ran into this error:

During the agent.fit() method there is a callback to TrainEpisodeLoggerPortfolio which evaluates the mean_market_return based on the environment infos attribute:

df = pd.DataFrame(self.env.infos)
self.episode_metrics[episode]=dict(
  max_drawdown=MDD(df.portfolio_value), 
  sharpe=sharpe(df.rate_of_return), 
  accumulated_portfolio_value=df.portfolio_value.iloc[-1],
  mean_market_return=df.mean_market_returns.cumprod().iloc[-1],
  cash_bias=df.weights.apply(lambda x:x[0]).mean()
)

The problem here seems to be that the environment's infos don't contain mean_market_returns. The same thing is happening for cash_bias, since self.env.infos doesn't have weights, hence the error. (If I am correct, those two attributes should be set in the PortfolioSim class.)

Am I doing something wrong, or is this really the problem here?
This is the complete stack trace:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-20-982fd62f6724> in <module>()
      6                       TrainIntervalLoggerTQDMNotebook(),
      7                       TrainEpisodeLoggerPortfolio(10),
----> 8                       ModelIntervalCheckpoint(save_path, 10*1440, 1)
      9                     ]
     10                  )

/usr/local/lib/python3.4/dist-packages/rl/core.py in fit(self, env, nb_steps, action_repetition, callbacks, verbose, visualize, nb_max_start_steps, start_step_policy, log_interval, nb_max_episode_steps)
    160                         'nb_steps': self.step,
    161                     }
--> 162                     callbacks.on_episode_end(episode, episode_logs)
    163 
    164                     episode += 1

/usr/local/lib/python3.4/dist-packages/rl/callbacks.py in on_episode_end(self, episode, logs)
     55             # If not, fall back to `on_epoch_end` to be compatible with built-in Keras callbacks.
     56             if callable(getattr(callback, 'on_episode_end', None)):
---> 57                 callback.on_episode_end(episode, logs=logs)
     58             else:
     59                 callback.on_epoch_end(episode, logs=logs)

<ipython-input-19-9b518044e729> in on_episode_end(self, episode, logs)
     16             accumulated_portfolio_value=df.portfolio_value.iloc[-1],
     17             #mean_market_return=df.mean_market_returns.cumprod().iloc[-1],
---> 18             cash_bias=df.weights.apply(lambda x:x[0]).mean()
     19         )
     20 

/usr/local/lib/python3.4/dist-packages/pandas/core/generic.py in __getattr__(self, name)
   3079             if name in self._info_axis:
   3080                 return self[name]
-> 3081             return object.__getattribute__(self, name)
   3082 
   3083     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'weights'

1440/|/reward=-0.0000 info=(return: 0.9938, portfolio_value: 0.7632, cost: 0.0000, weights_std: 0.3301, reward: -0.0000, weights_mean: 0.1667, steps: 1441.0000, market_value: 1.2144, rate_of_return: -0.0026, date: 1445472000.0000, log_return: -0.0026, )  0%|| 1440/2000000.0 [03:30<83:43:20,  6.63it/s]

Thanks again, you've done a great job!!

Question about mean market performance

Hi!

I was looking through your README file and was wondering how you calculated the "mean market performance" that the portfolio agent's performance is compared to in the first plot?

Question about the agent performance

Hi,
I am a PyTorch user, so I studied the PyTorch version of your code. Although the reward on the training data is increasing, it fluctuates between positive and negative on the test dataset. It seems the agent isn't able to get a good return on the test data.
If I misunderstood anything please let me know, thank you!

AttributeError: 'NoneType' object has no attribute 'run'

Hello wassname:
I have a problem with your code: when I run tensorforce-PPO-IEET.ipynb, I get AttributeError: 'NoneType' object has no attribute 'run'. The details are shown below:

episodes = 1
steps = environment_test.gym.env.env.src.steps * episodes
runner_test.run(
    episodes=episodes,
    timesteps=steps,
    deterministic=True,
    episode_finished=episode_finished,
)
--------------------------------error message------------------------------

AttributeError Traceback (most recent call last)
in
5 timesteps=steps,
6 deterministic=True,
----> 7 episode_finished=episode_finished,
8 )

~\Anaconda3\envs\ppo\lib\site-packages\tensorforce\execution\runner.py in run(self, timesteps, episodes, max_episode_timesteps, deterministic, episode_finished)
89 self.start_time = time.time()
90
---> 91 self.agent.reset()
92
93 self.episode = self.agent.episode

~\Anaconda3\envs\ppo\lib\site-packages\tensorforce\agents\agent.py in reset(self)
113 timestep counter, internal states, and resets preprocessors.
114 """
--> 115 self.episode, self.timestep, self.next_internals = self.model.reset()
116 self.current_internals = self.next_internals
117

~\Anaconda3\envs\ppo\lib\site-packages\tensorforce\models\model.py in reset(self)
1232 """
1233 # TODO preprocessing reset call moved from agent
-> 1234 episode, timestep = self.monitored_session.run(fetches=(self.episode, self.timestep))
1235 return episode, timestep, list(self.internals_init)
1236

~\Anaconda3\envs\ppo\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
516 Same as tf.Session.run().
517 """
--> 518 return self._sess.run(fetches,
519 feed_dict=feed_dict,
520 options=options,

AttributeError: 'NoneType' object has no attribute 'run'

I don't know what's wrong with it; the packages I installed are entirely consistent with requirements.txt. Would you please help me?

Env creating error - 'int' object is not iterable

Hi Wassname,

Thanks for open-sourcing your code. However, I ran into a problem when I tried to run the following code to create an env:

import gym
import rl_portfolio_management.environments  # this registers them

env = gym.envs.spec('CryptoPortfolioEIIE-v0').make()
print("CryptoPortfolioEIIE has an history shape suitable for an EIIE model (see https://arxiv.org/abs/1706.10059)")
observation = env.reset()
print("shape =", observation["history"].shape)
# shape = (5, 50, 3)

I encountered the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-f00e24a5d49e> in <module>()
      2 import rl_portfolio_management.environments  # this registers them
      3 
----> 4 env = gym.envs.spec('CryptoPortfolioEIIE-v0').make()
      5 print("CryptoPortfolioEIIE has an history shape suitable for an EIIE model (see https://arxiv.org/abs/1706.10059)")
      6 observation = env.reset()

D:\Users\ThinkPad\Anaconda3\lib\site-packages\gym\envs\registration.py in make(self)
     84         else:
     85             cls = load(self._entry_point)
---> 86             env = cls(**self._kwargs)
     87 
     88         # Make the enviroment aware of which spec it came from.

F:\DSBA\T2\Advanced ML\Final\rl-portfolio-management-master\rl-portfolio-management-master\rl_portfolio_management\environments\portfolio.py in __init__(self, df, steps, trading_cost, time_cost, window_length, augment, output_mode, log_dir, scale, scale_extra_cols, random_reset)
    266         nb_assets = len(self.src.asset_names)
    267         self.action_space = gym.spaces.Box(
--> 268             0.0, 1.0, shape=nb_assets + 1)
    269 
    270 

D:\Users\ThinkPad\Anaconda3\lib\site-packages\gym\spaces\box.py in __init__(self, low, high, shape, dtype)
     32         self.low = low.astype(dtype)
     33         self.high = high.astype(dtype)
---> 34         gym.Space.__init__(self, shape, dtype)
     35 
     36     def sample(self):

D:\Users\ThinkPad\Anaconda3\lib\site-packages\gym\core.py in __init__(self, shape, dtype)
    200     """
    201     def __init__(self, shape=None, dtype=None):
--> 202         self.shape = None if shape is None else tuple(shape)
    203         self.dtype = None if dtype is None else np.dtype(dtype)
    204 

TypeError: 'int' object is not iterable

I also tried gym.spaces.Box(0.0, 1.0, shape=5), which gives the same error, while gym.spaces.Box(0.0, 1.0, shape=(5,)) works well.

However, when I updated your code to self.action_space = gym.spaces.Box(0.0, 1.0, shape=(nb_assets + 1,)), it still did not work well.

Could you please help me on that?

Thx in advance.

Question about random_reset in PortfolioEnv

Firstly wassname, thank you so much for sharing this excellent piece of work! It has been extremely interesting to work on, and particularly opened my eyes to universal portfolios. Secondly, I have a question about what random_reset does in PortfolioEnv? I cannot use my adapted environment from your work without raising loads of errors. I'd be very grateful if you could clarify the differences in random_reset=False and random_reset=True.

Thanks!
Stewart

Error running notebook

Hi,

I'm very interested in your code, but I'm having trouble running the two notebooks.

For the tensorforce-VPG notebook, the error happens at block 7, and the error message is

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-7-445b09d9425b> in <module>()
     22 env_test.seed = 0
     23 
---> 24 from tensorforce.environments.openai_gym import OpenAIGym
     25 environment = OpenAIGym('CartPole-v0')
     26 environment.gym = env

ModuleNotFoundError: No module named 'tensorforce.environments.openai_gym'

Does this have anything to do with your tensorforce version? For now, it's
"use my branch with prioritised ppo, untill mergeed"

https://github.com/wassname/tensorforce/archive/merged_6b.zip

For the tensorforce-PPO notebook, the error happens in the Training session, block 19:

episodes = int(6e6 / 30)
runner.run(
    episodes=episodes,
    max_timesteps=200,
    episode_finished=EpisodeFinishedTQDM(
        #steps=1,
        log_intv=100, 
        episodes=episodes,
        log_dir=log_dir,
        session=runner.agent.model.session
    )
)

The first time, the error message is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-2c7dac94069d> in <module>()
      8         episodes=episodes,
      9         log_dir=log_dir,
---> 10         session=runner.agent.model.session
     11     )
     12 )

TypeError: __init__() got an unexpected keyword argument 'episodes'

so I changed it to episode, which I found in the tensorforce file.

The second time, the error message is:

TypeError                                 Traceback (most recent call last)
<ipython-input-21-02d9c6e080ce> in <module>()
      8         episode=episodes,
      9         log_dir=log_dir,
---> 10         session=runner.agent.model.session
     11     )
     12 )

TypeError: __init__() missing 1 required positional argument: 'steps'

so I added an argument steps and set it to an arbitrary constant, 1.

The third time, the error message is:

If you're reading this message in Jupyter Notebook or JupyterLab, it may mean that the widgets JavaScript is still loading. If this message persists, it likely means that the widgets JavaScript library is either not installed or not enabled. See the Jupyter Widgets Documentation for setup instructions.
If you're reading this message in another notebook frontend (for example, a static rendering on GitHub or NBViewer), it may mean that your frontend doesn't currently support widgets.
TensorBoardLogger started. Run `tensorboard --logdir=/home/jfang16/Dropbox/rl-portfolio-management/logs` to visualize
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-23-e11fc16557ce> in <module>()
      8         episode=episodes,
      9         log_dir=log_dir,
---> 10         session=runner.agent.model.session
     11     )
     12 )

~/anaconda3/envs/rl-portfolio/lib/python3.6/site-packages/tensorforce/execution/runner.py in run(self, episodes, max_timesteps, episode_finished)
    165                 self.agent.save_model(self.save_path)
    166 
--> 167             if episode_finished and not episode_finished(self):
    168                 return
    169             if self.cluster_spec is None:

~/Dropbox/rl-portfolio-management/rl_portfolio_management/callbacks/tensorforce.py in __call__(self, r)
    146         weights = dict(zip(oai_env.src.asset_names, np.round(oai_env.sim.w0, 4).tolist()))
    147         exploration = r.agent.exploration.get('action', lambda x, y: 0)(
--> 148             r.episode, np.sum(r.episode_timesteps))
    149 
    150         desc = "ep reward: {reward: 2.8f} [{rewards_min: 2.8f}, {rewards_max: 2.8f}], portfolio_value: {portfolio_value: 2.4f} mdd={mdd:2.2%} sharpe={sharpe:2.4f}, expl={exploration: 2.2%} eps={episode:} weights={weights:}".format(

AttributeError: 'Runner' object has no attribute 'episode_timesteps'

I'm not sure if I should continue.

Please help,

Thanks,

Network Topology

Hi, I'm quite happy that our work is being replicated!

One problem I found is the topology.
In tensorforce-VPG.ipynb, In [11], it seems that a dense layer is added to the network, which is different from the original work.

x = dense(x, size=env.action_space.shape[0],activation='relu', l2_regularization=1e-8)

The "Ensemble of Identical Independent Evaluators" will not include any dense layer. Outputs of last convolutional layer will be fed into softmax function directly. That's why we say they are "independent".

Calculation of transaction Cost

Hi,

Congrats on this excellent job.

I have a question about the calculation of the transaction cost in your code.

In eq. (16) of the paper by Jiang et al., the transaction cost is calculated based on the changes in portfolio weights (the sum runs from 1 to m, excluding the cash weight). In the code, dw1 has shape m+1 (including the cash weight) and the cost is calculated as:

mu1 = self.cost * (np.abs(dw1 - w1)).sum()

I wonder whether this calculation double counts costs for transactions between cash and assets. Should it not be replaced by:

mu1 = self.cost * (np.abs(dw1[1:] - w1[1:])).sum() ?
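
A small numeric sketch of the concern (hypothetical weight vectors, with index 0 as cash):

import numpy as np

dw1 = np.array([0.5, 0.3, 0.2, 0.0])  # weights after price movement (cash first)
w1  = np.array([0.2, 0.3, 0.2, 0.3])  # new target weights chosen by the agent
cost = 0.0025

mu_with_cash    = cost * np.abs(dw1 - w1).sum()          # 0.0025 * 0.6 = 0.0015
mu_without_cash = cost * np.abs(dw1[1:] - w1[1:]).sum()  # 0.0025 * 0.3 = 0.00075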

Max Draw Down incorrect

Your MDD function is incorrect.

It only finds the maximum return and then the trough after that.

You might be up 10%, lose 50%, and then gain 1000% before losing 20%. I believe that, as written, it would only find that last 20% loss.

Also as per https://www.investopedia.com/terms/m/maximum-drawdown-mdd.asp

you need to divide by the peak value, not the trough:

MDD = (Trough Value – Peak Value) ÷ Peak Value

def MDD(returns):
    """Max drawdown."""
    peak = returns.max()
    i = returns.argmax()
    trough = returns[i:].min()
    return (trough - peak) / trough

I found this

def max_drawdown(X):
    mdd = 0
    peak = X[0]
    for x in X:
        if x > peak: 
            peak = x
        dd = (peak - x) / peak
        if dd > mdd:
            mdd = dd
    return mdd


From https://stackoverflow.com/questions/22607324/start-end-and-duration-of-maximum-drawdown-in-python
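
A vectorised equivalent might look like this (a sketch, assuming X is a 1-D array of portfolio values):

import numpy as np

def max_drawdown_vec(X):
    """Max drawdown using a running peak; matches the loop version above."""
    X = np.asarray(X, dtype=float)
    running_peak = np.maximum.accumulate(X)        # highest value seen so far
    drawdowns = (running_peak - X) / running_peak  # relative drop from that peak
    return drawdowns.max()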

Gym

Hello,

I have an issue running the environment command from Gym. Can you please help me address it?

Thank you


TypeError Traceback (most recent call last)
in ()
7 window_length = window_length,
8 output_mode='EIIE',
----> 9 random_reset=False)
10 # wrap it in a few wrappers
11 env = ConcatStates(env)

~/Downloads/rl-portfolio-management-master/rl_portfolio_management/environments/portfolio.py in __init__(self, df, steps, trading_cost, time_cost, window_length, augment, output_mode, log_dir, scale, scale_extra_cols, random_reset)
266 nb_assets = len(self.src.asset_names)
267 self.action_space = gym.spaces.Box(
--> 268 0.0, 1.0, shape=nb_assets + 1)
269
270 # get the history space from the data min and max

~/gym/gym/spaces/box.py in __init__(self, low, high, shape, dtype)
32 self.low = low.astype(dtype)
33 self.high = high.astype(dtype)
---> 34 gym.Space.__init__(self, shape, dtype)
35
36 def sample(self):

~/gym/gym/core.py in __init__(self, shape, dtype)
200 """
201 def __init__(self, shape=None, dtype=None):
--> 202 self.shape = None if shape is None else tuple(shape)
203 self.dtype = None if dtype is None else np.dtype(dtype)
204

TypeError: 'int' object is not iterable

There's no attribute 'constraints' at object 'Sequential'

The actor network is a Keras Sequential() model, and it goes into the DDPG agent.
At agent.compile(Adam(lr=3e-5), metrics=['mse']) I get an error that 'Sequential' object has no attribute 'constraints'.
It seems that I need to set constraints when using Conv2D, but I'm not sure how to do it.

I also went to keras-rl and ran ddpg_pendulum.py, but the same error occurred.

I think there are three possible ways to solve it:

  1. Change the Python version (3.5.0 --> 2.7)
  2. Add a constraint to the convolutional layer
  3. Change the Keras version (2.0.8 --> 1.x)

Can you help me to solve this problem?

Thanks.

I'm questioning the paper's validity and/or generalizability

I'm currently doing a course project, and the paper has very similar ideas to ours. However, our network is not learning much at all. Of course, this might be because we trained our network on 110 stocks with 10 in each batch and tested it on another 20 stocks. The training and test data are all from the S&P 500 over the past 3 years. We have built a complete evaluation pipeline and we are getting crappy results.

Although our results are quite preliminary, isn't it likely that, since the paper operated on a fixed set of cryptocurrencies, the network sort of remembers the scores (e.g. how well each currency can perform), and they got very good results with this 'memory' stored in the last few fc layers?

I'm not sure, though, since I'm new to deep learning and this is my first time using deep reinforcement learning. But it seems to me that data in the finance world has too much noise, and the output of a neural network can easily be swayed by its complex model and, please allow me to put it this way (although it is a continuous setting instead of a classification/discrete problem), its decision boundaries. If a neural network could solve the portfolio management problem so well, a few students at a less well-known university would probably not be the first to come up with a good application in finance.

I'll keep you posted on any later developments as our project progresses. In the meantime, if you'd like, I am very interested to hear your thoughts on the subject. Thank you!

Do you have any idea why model didn't learn much?

As you know, the portfolio weights appear to be static on the test set.
Why didn't the model learn much?

I am assuming several possible reasons:

  1. there is no ensemble in the DDPG network
  2. the input doesn't include the previous portfolio weights
  3. there is no online stochastic batch learning when applying the network to the test set
  4. a different universe of assets
  5. no PVM
  6. DDPG was used instead of DPG

Because the authors said that EIIE is the core idea of the paper, I think that if I solve 1 and 2 there will be some improvement.

Could you share some of your ideas?

Thank you very much.

Hello World

Hi, I'm interested in your implementation of the paper. If there's any task pending (even tests), I would love to contribute. Please let me know the state of the project. The paper reported 4-fold returns; what did your tests reveal?

AssertionError: action should be within Box(4,) but is array([ nan, nan, nan, nan])

I am trying out the environment you created using Keras and a Q-learning algorithm. I am using this implementation: https://github.com/keon/deep-q-learning/blob/master/dqn.py. I am encountering the following error:

---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
in ()
5 # env.render()
6 action = agent.act(state)
----> 7 next_state, reward, done, _ = env.step(action)
8 reward = reward if not done else -10
9 #next_state = np.reshape(next_state, [1, state_shape])

C:\Program Files (x86)\Python\New folder\lib\site-packages\gym\core.py in step(self, action)
94 info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
95 """
---> 96 return self._step(action)
97
98 def reset(self):

E:\Pirimid\rl-portfolio-management\rl_portfolio_management\environments\portfolio.py in _step(self, action)
313 # Sanity checks
314 assert self.action_space.contains(
--> 315 action), 'action should be within %r but is %r' % (self.action_space, action)
316 np.testing.assert_almost_equal(
317 np.sum(weights), 1.0, 3, err_msg='weights should sum to 1. action="%s"' % weights)

AssertionError: action should be within Box(4,) but is array([ nan, nan, nan, nan])

I am not very familiar with gym environments and deep RL.

Some problem of the torch

When I run "pip install -r requirements/requirements.txt",
I get the following error message:

Could not find a version that satisfies the requirement torch==0.3.0.post4 (from -r requirements/requirements.txt (line 19)) (from versions: 0.1.2, 0.1.2.post1, 0.3.1, 0.4.0)
No matching distribution found for torch==0.3.0.post4 (from -r requirements/requirements.txt (line 19))

What's wrong with it?
How can I solve it?
