sunnyswag / stockrl

Train reinforcement learning trading agents on the A-share (stock) market

License: GNU General Public License v3.0

Jupyter Notebook 98.95% Python 1.04% Shell 0.01%

reinforcement-learning stock-market stock finrl reward-functions

stockrl's Introduction

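Stock Trading with Reinforcement Learning

Five deep reinforcement learning algorithms are tested on the A-share market to find a suitable deep reinforcement learning agent. It can be read alongside the paper.

Backtest Results

Backtest results (period: January 1, 2019 to January 1, 2021)

Backtest analysis table (period: January 1, 2019 to January 1, 2021; the baseline is the SSE 50 Index)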

Performance metric	SSE 50 Index	A2C	DDPG	PPO	SAC	TD3
Cumulative return	58.98%	108.49%	121.26%	110.85%	120.61%	120.14%
Maximum drawdown	-18.22%	-35.83%	-31.45%	-16.75%	-29.24%	-30.69%
Omega ratio	1.29	1.31	1.34	1.36	1.35	1.35
Sharpe ratio	1.37	1.23	1.50	1.72	1.54	1.52
Annualized return	27.11%	46.37%	50.95%	47.23%	50.72%	50.55%
Annualized volatility	18.90%	36.25%	30.78%	24.28%	29.48%	29.89%
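
For orientation, metrics like those in the table can be derived from a series of daily account values. The following is a minimal pure-Python sketch, assuming 252 trading days per year and a zero risk-free rate. The function names are hypothetical and not taken from the repository, which may compute these figures differently. The Omega ratio is left out.

```python
import math

TRADING_DAYS = 252  # assumed annualization factor


def daily_returns(values):
    """Simple day-over-day returns of an account value series."""
    return [values[i] / values[i - 1] - 1 for i in range(1, len(values))]


def cumulative_return(values):
    return values[-1] / values[0] - 1


def annualized_return(values):
    n_days = len(values) - 1
    return (values[-1] / values[0]) ** (TRADING_DAYS / n_days) - 1


def max_drawdown(values):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    peak, worst = values[0], 0.0
    for v in values:
        peak = max(peak, v)
        worst = min(worst, v / peak - 1)
    return worst


def annualized_volatility(values):
    r = daily_returns(values)
    mean = sum(r) / len(r)
    var = sum((x - mean) ** 2 for x in r) / (len(r) - 1)
    return math.sqrt(var) * math.sqrt(TRADING_DAYS)


def sharpe_ratio(values):
    """Annualized mean return over annualized volatility (risk-free rate assumed 0)."""
    r = daily_returns(values)
    mean = sum(r) / len(r)
    vol = annualized_volatility(values)
    return mean * TRADING_DAYS / vol if vol > 0 else 0.0
```

Each function takes a plain list of account values ordered by date, one entry per trading day.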

More detailed backtest results are available in ./plot_traded_result.ipynb

Quick Start

Run the following in a terminal:

git clone https://github.com/sunnyswag/RL_in_Stock.git
cd RL_in_Stock
pip install -r requirements.txt

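Go to the ./learn folder for the detailed steps

Edit the relevant parameters in config.py, such as Tushare_Tocken and the start and end dates of the data

Environment Design

state_space consists of three parts:

Current cash
Holdings of each stock
Number of stocks * environmental factors (15 in total)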
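
The three parts above can be flattened into a single vector. The sketch below is a minimal illustration that assumes one list of 15 factor values per stock. `build_state` and `state_dimension` are hypothetical names, and the repository's actual state layout may differ.

```python
def build_state(cash, holdings, factors_per_stock):
    """Flatten [cash] + holdings + per-stock factor values into one state vector."""
    if len(holdings) != len(factors_per_stock):
        raise ValueError("one factor list is expected per stock")
    state = [float(cash)]
    state.extend(float(h) for h in holdings)
    for factors in factors_per_stock:
        state.extend(float(f) for f in factors)
    return state


def state_dimension(n_stocks, n_factors=15):
    # 1 (cash) + n_stocks (holdings) + n_stocks * n_factors (factors)
    return 1 + n_stocks + n_stocks * n_factors
```

Under these assumptions, 50 stocks give a state of 1 + 50 + 50 * 15 = 801 values.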

reward is computed as:

Reward = cumulative return - current drawdown
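
A minimal sketch of this formula, assuming the drawdown is expressed as a non-negative fraction of the running peak account value. `compute_reward` is a hypothetical name and the repository's implementation may differ in detail.

```python
def compute_reward(initial_value, peak_value, current_value):
    """Reward = cumulative return - current drawdown.

    Both terms are fractions of account value; the drawdown is measured
    from the highest account value seen so far and is never negative.
    """
    cumulative_return = current_value / initial_value - 1
    peak = max(peak_value, current_value)
    current_drawdown = (peak - current_value) / peak
    return cumulative_return - current_drawdown
```

For example, an account that started at 100, peaked at 150 and now stands at 120 has a cumulative return of 0.2 and a drawdown of 0.2, so the reward is roughly zero.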

action_space range:

actions ∈ [-x, x]
Positive values mean buy, negative values mean sell, and 0 means no trade
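
The sign convention can be sketched for a single stock as follows. This is an illustration only: it assumes the action is a share count, ignores transaction costs and lot sizes, and `apply_action` is a hypothetical helper rather than the repository's code.

```python
def apply_action(action, x, holding, cash, price):
    """Return (new_holding, new_cash) after trading one stock.

    The action is clipped to [-x, x] and interpreted as a share count.
    """
    action = max(-x, min(x, action))
    if action > 0:  # buy, limited by available cash
        shares = min(int(action), int(cash // price))
        return holding + shares, cash - shares * price
    if action < 0:  # sell, limited by the current holding
        shares = min(int(-action), holding)
        return holding - shares, cash + shares * price
    return holding, cash  # 0: no trade
```

Capping buys by cash and sells by the current holding keeps the account from going negative or short.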

Reference

FinRL


stockrl's Issues

Is the line chart the result of trading all 50 stocks in the SSE 50?

I see that utils/models.py sets the hyperparameters as follows:
df = Pull_data(config.SSE_50[:2], save_data=False).pull_data()
stock_dimension = len(df.tic.unique()) # 2
state_space = 1 + 2*stock_dimension + len(config.TECHNICAL_INDICATORS_LIST)*stock_dimension # 23

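It looks like only the data for the first two stocks is loaded. So is the final line chart the result of trading the 50 SSE 50 stocks, or of trading 2 stocks?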

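The backtest data suffers from "time travel"

I looked at your training set (2009-2019) and validation set (2019-2021). Reinforcement learning partly uses the validation set during training, which follows from the nature of reinforcement learning itself. The 100%+ annualized return is in fact the result of "time travel". When I ran a true test set covering 20210101-20211031, the returns were negative. So running the backtest on the validation set is meaningless.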
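
One way to keep the three periods apart is a strictly chronological split, with results reported only on the last segment. The sketch below assumes rows are tuples whose first element is a `datetime.date`. `split_by_date` is a hypothetical helper, not part of the repository.

```python
from datetime import date


def split_by_date(rows, train_end, valid_end):
    """Split rows into train / validation / test segments in time order.

    train: date < train_end
    validation: train_end <= date < valid_end
    test: date >= valid_end
    """
    train, valid, test = [], [], []
    for row in sorted(rows, key=lambda r: r[0]):
        d = row[0]
        if d < train_end:
            train.append(row)
        elif d < valid_end:
            valid.append(row)
        else:
            test.append(row)
    return train, valid, test


# Boundaries matching the periods mentioned in this issue.
TRAIN_END = date(2019, 1, 1)
VALID_END = date(2021, 1, 1)
```

With these boundaries, data from 2021 onward is never touched during training or model selection.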

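On the agents' similar drawdown periods

You point out here that the drawdown periods of the different agents are all about the same. Could ma-ppo be used to add a penalty term across the agents, so that whenever several agents output similar actions in the same state they receive a negative reward? I have read many research reports, and they basically all say to train multiple dissimilar agents and then aggregate their decisions as the final output (similar to boosting).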
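
The penalty idea in this question could be sketched roughly as below. This is not MA-PPO itself, only a reward-shaping term: it assumes each agent's action is a numeric vector and uses cosine similarity as the similarity measure. The threshold, the coefficient and all names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def diversity_penalty(actions, threshold=0.9, coef=0.1):
    """Per-agent penalty for acting like other agents in the same state.

    Each agent is charged coef for every other agent whose action has a
    cosine similarity above threshold with its own action.
    """
    penalties = []
    for i, a in enumerate(actions):
        similar = sum(
            1
            for j, b in enumerate(actions)
            if j != i and cosine_similarity(a, b) > threshold
        )
        penalties.append(-coef * similar)
    return penalties
```

The returned values would be added to each agent's environment reward at the same step.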

On multiple agents' similar drawdown periods

module 'stable_baselines3.common.logger' has no attribute 'record'

File "E:\software\nutshell\stock_research\强化学习\code\RL_in_Stock\utils\env.py", line 281, in step
return self.return_terminal(reward=self.get_reward())
File "E:\software\nutshell\stock_research\强化学习\code\RL_in_Stock\utils\env.py", line 190, in return_terminal
logger.record("environment/GainLoss_pct", (gl_pct - 1) * 100)
AttributeError: module 'stable_baselines3.common.logger' has no attribute 'record'

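Could you advise how to fix this bug?

I looked it up and it is caused by an update to stable_baselines3 (source: https://github.com/AI4Finance-LLC/FinRL/issues/239), but I could not find where to make the change. Thanks in advance!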
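
In newer stable_baselines3 releases, metrics are recorded through a logger object rather than a module-level `logger.record()` function. One possible workaround, sketched below, is to have the environment hold its own logger reference. `MetricRecorder` and `EnvLoggingMixin` are hypothetical names, and the sketch avoids importing stable_baselines3 so that it runs on its own.

```python
class MetricRecorder:
    """Fallback recorder used when no logger object is supplied."""

    def __init__(self):
        self.values = {}

    def record(self, key, value):
        self.values[key] = value


class EnvLoggingMixin:
    """Keeps a logger reference instead of calling a module-level record()."""

    def __init__(self, logger=None):
        # Any object with a record(key, value) method works here, for example
        # the Logger returned by stable_baselines3.common.logger.configure().
        self.logger = logger if logger is not None else MetricRecorder()

    def log_gain_loss(self, gl_pct):
        self.logger.record("environment/GainLoss_pct", (gl_pct - 1) * 100)
```

In utils/env.py, the failing `logger.record(...)` call would then become `self.logger.record(...)`, assuming the environment is given such an object when it is created.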

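The stock environment defined in utils/env.py has obvious look-ahead bias

All of the FinRL examples have a look-ahead problem, and your code copied this faulty logic over as well. See: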

            self.date_index += 1
            state = (
                [coh] + list(holdings_updated) + self.get_date_vector(self.date_index)
            )

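Note that date_index should only be incremented after state has been updated, because on day T only the prices of day T-1 are visible. Once the look-ahead bias is removed, check whether your algorithm still works :)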
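
One way to read this point is that the decision for day T may only use data up to day T-1, while the trade is filled at day T's price. The sketch below is a toy single-stock loop that illustrates this ordering. It is not the repository's utils/env.py, and `run_backtest` and `policy` are hypothetical names.

```python
def run_backtest(prices, policy, cash=1000.0):
    """Decide on day t from day t-1's price, then fill at day t's price."""
    holding = 0
    for t in range(1, len(prices)):
        observed = prices[t - 1]  # latest price visible when deciding
        shares = policy(cash, holding, observed)
        fill_price = prices[t]  # never shown to the policy before the trade
        # Cap sells by the holding and buys by the available cash.
        shares = max(-holding, min(shares, int(cash // fill_price)))
        cash -= shares * fill_price
        holding += shares
    return cash + holding * prices[-1]
```

Because the policy only ever receives `prices[t - 1]`, it cannot condition on the price at which its own order is filled.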

How can this be applied to live trading?

I would like to predict which asset (i.e., stock) to buy the next day, using all the values up to today as training data.
How can the model generate a buy or sell action for the next day?
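
A rough sketch of the inference step, assuming the trained model exposes `predict(obs, deterministic=...)` returning `(action, state)`, as stable_baselines3 algorithms do. `next_day_action` and `describe` are hypothetical helpers, and building the latest observation from today's data is left to the environment code.

```python
def next_day_action(model, latest_observation):
    """Ask a trained model for the next action given today's observation."""
    action, _state = model.predict(latest_observation, deterministic=True)
    return action


def describe(action_value):
    """Map one action component to the sign convention used by the environment."""
    if action_value > 0:
        return "buy"
    if action_value < 0:
        return "sell"
    return "hold"
```

Each component of the returned action corresponds to one stock, so `describe` can be applied to every component in turn.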

UserWarning: Could not deserialize object lr_schedule. Consider using `custom_objects` argument to replace this object.

/home/xxx/.local/lib/python3.8/site-packages/stable_baselines3/common/save_util.py:166: UserWarning: Could not deserialize object lr_schedule. Consider using custom_objects argument to replace this object.
warnings.warn(
/home/xxx/.local/lib/python3.8/site-packages/stable_baselines3/common/save_util.py:166: UserWarning: Could not deserialize object clip_range. Consider using custom_objects argument to replace this object.
warnings.warn(
Traceback (most recent call last):
File "./trader.py", line 129, in <module>
start_trade()
File "./trader.py", line 126, in start_trade
Trader(model_name = options.model).trade()
File "./trader.py", line 45, in trade
model = self.get_model(agent)
File "./trader.py", line 83, in get_model
model.load(model_dir)
File "/home/xxx/.local/lib/python3.8/site-packages/stable_baselines3/common/base_class.py", line 687, in load
model._setup_model()
File "/home/xxx/.local/lib/python3.8/site-packages/stable_baselines3/ppo/ppo.py", line 158, in _setup_model
self.clip_range = get_schedule_fn(self.clip_range)
File "/home/xxx/.local/lib/python3.8/site-packages/stable_baselines3/common/utils.py", line 88, in get_schedule_fn
assert callable(value_schedule)
AssertionError

How can this bug be fixed?
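
The warning itself points at the `custom_objects` argument of `load()`. A minimal sketch of that approach follows, assuming `algo_cls` is a stable_baselines3 algorithm class such as PPO. `load_with_overrides` is a hypothetical helper, and the default values are placeholders rather than the values used when the model was trained.

```python
def load_with_overrides(algo_cls, path, learning_rate=3e-4, clip_range=0.2):
    """Load a saved model, replacing objects that could not be deserialized."""
    custom_objects = {
        "learning_rate": learning_rate,
        # Schedules are callables of the remaining training progress.
        "lr_schedule": lambda _progress: learning_rate,
        "clip_range": lambda _progress: clip_range,
    }
    # load() returns a new model, so the return value must be kept.
    return algo_cls.load(path, custom_objects=custom_objects)
```

Note that the traceback shows `model.load(model_dir)` being called without using its result; with a helper like this the call would instead read `model = load_with_overrides(PPO, model_dir)`.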

TypeError: learn() got an unexpected keyword argument 'eval_env'

Hello, I have tried several computers and different Python versions, and they all report this error. What is the cause?

[root@TD-SERVER-1 nohup]# tail -100f A2C.log
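nohup: ignoring input
train_file folder already exists!
Data read successfully!
Loading data cache
Data cached successfully!
Loading data cache
Data cached successfully!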
{'n_steps': 5, 'ent_coef': 0.01, 'learning_rate': 0.0007}
Traceback (most recent call last):
File "/data_1/bak/StockRL-main/learn/./trainer.py", line 124, in <module>
start_train()
File "/data_1/bak/StockRL-main/learn/./trainer.py", line 119, in start_train
Trainer(model_name=options.model,
File "/data_1/bak/StockRL-main/learn/./trainer.py", line 52, in train
model.learn(total_timesteps=self.total_timesteps, tb_log_name=self.model_name, log_interval=10, reset_num_timesteps=False, eval_env=env_trade, eval_freq=1000, n_eval_episodes=10)
TypeError: learn() got an unexpected keyword argument 'eval_env'
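
Recent stable_baselines3 releases no longer accept `eval_env`, `eval_freq` or `n_eval_episodes` in `learn()`. As far as we can tell, periodic evaluation is instead done with an `EvalCallback` passed through the `callback` argument. The sketch below assumes such a release is installed. `learn_with_eval` is a hypothetical wrapper, and the import is done lazily so the function can be defined without the library.

```python
def learn_with_eval(model, eval_env, total_timesteps,
                    eval_freq=1000, n_eval_episodes=10, **learn_kwargs):
    """Train a model while evaluating it periodically through a callback."""
    # Imported here so this module loads even without stable_baselines3.
    from stable_baselines3.common.callbacks import EvalCallback

    callback = EvalCallback(
        eval_env,
        eval_freq=eval_freq,
        n_eval_episodes=n_eval_episodes,
    )
    return model.learn(
        total_timesteps=total_timesteps,
        callback=callback,
        **learn_kwargs,
    )
```

The remaining arguments from the failing call, such as `tb_log_name`, `log_interval` and `reset_num_timesteps`, can be passed through `learn_kwargs` unchanged. Pinning an older stable_baselines3 version that still accepts `eval_env` is another option.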

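Thoughts on live trading.

I have just downloaded the code and just got it running successfully.
I came back to say thanks, sunnyswag!!!
Shouldn't the environment be used to generate the prediction data, and then the assets be predicted from it?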

New complementary Tool

My name is Luis. I'm a big-data and machine-learning developer and a fan of your work, and I regularly check your updates.

I am developing something complementary and more complete:

testing with around 30 models
threshold evaluation
use of 1k technical indicators
a method for selecting the best features (8, 16, 32, ...)
a categorical target instead of a continuous target variable
powerful machine-learning libraries such as Sklearn.RandomForest, Sklearn.GradientBoosting, XGBoost, Google TensorFlow and Google TensorFlow LSTM

With the models trained on the selected best technical indicators, the tool can predict trading points (where to buy, where to sell) and send real-time alerts to Telegram or Mail. The points are calculated by learning the correct trading points of the last 2 years (including the shift to a bear market after the rate hike).

I think it could be useful to you. I would like to share it with you, and if you are interested in improving it and collaborating, I am willing to as well; if not, feel free to file it away.

https://github.com/Leci37/LecTrade/tree/develop