Comments (8)
Hello @hlsafin, first of all, make sure you train your agent for enough steps to see meaningful results. For example, with our default configuration, R2D2 takes about 4M steps to converge on Pong and 10M steps to reach good performance on SpaceInvaders. There are also two important hyperparameters you may want to tune for different environments:
- burnin_step: how many steps are used to initialize the RNN hidden state. Since those steps are not used for training, you could reduce this value in environments where the initial steps are the most important.
- learn_unroll_len: the length of a trajectory. You could increase this value when having a longer memory is critical.
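To make the roles of the two values concrete, here is a minimal sketch in plain Python. It is not DI-engine code: the helper `split_sequence` is hypothetical, and it assumes that each sampled sequence consists of `burnin_step` steps followed by `learn_unroll_len` steps. The numbers in the usage lines are illustrative only, not the defaults from the config file.

```python
from typing import List, Sequence, Tuple


def split_sequence(
    steps: Sequence[int], burnin_step: int, learn_unroll_len: int
) -> Tuple[List[int], List[int]]:
    """Split one sampled sequence into a burn-in part and a training part.

    Hypothetical helper for illustration. It assumes the sequence holds
    exactly burnin_step + learn_unroll_len steps.
    """
    expected = burnin_step + learn_unroll_len
    if len(steps) != expected:
        raise ValueError(f"expected {expected} steps, got {len(steps)}")
    # Burn-in steps only warm up the RNN hidden state; no loss is computed on them.
    burnin = list(steps[:burnin_step])
    # The remaining steps are the ones that contribute to the training loss.
    learn = list(steps[burnin_step:])
    return burnin, learn


# Illustrative values only: a 10-step sequence with 2 burn-in steps.
sequence = list(range(10))
burnin, learn = split_sequence(sequence, burnin_step=2, learn_unroll_len=8)
```

Under this assumption, lowering `burnin_step` means fewer leading steps are spent only on warming up the hidden state, while raising `learn_unroll_len` gives the network a longer window to learn from.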
You can refer to this file to see all the hyperparameters that can be tuned: https://github.com/opendilab/DI-engine/blob/main/dizoo/atari/config/serial/pong/pong_r2d2_config.py
Below, I am attaching our training log for the Pong environment so that you can compare.
from di-engine.
Yeah, that's so weird, I am running the same file as you, but getting different results. Hmm
At 2.4 million steps, the mean reward is still around -20.
Can you upload your TensorBoard event file?
yeah, I don't think I can upload files here. Maybe I can email you? I tried oppo_config, and it converged. Just no luck with r2d2 config.py
You can email us or contact us on our Slack channel.
yeah, I can't seem to find it. When you ran your config, did you make any changes to the config file? Also, which version are you using?
Slack channel link: https://join.slack.com/t/opendilab/shared_invite/zt-v9tmv4fp-nUBAQEH1_Kuyu_q4plBssQ
Related Issues (20)
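- what algorithm do you use to solve the overcooked problem? MADDPG? HOT 3
- Code error: after setting up the conda environment and forking the project locally, running any of the py files under DI-engine/dizoo/petting_zoo/config/ (such as ptz_simple_spread_madqn_config.py, ptz_simple_spread_mappo_config.py, etc.) raises an error HOT 3
- H-PPO algorithm fails to run HOT 7
- Problem when trying to use a custom environment HOT 2
- Is there documentation for gym soccer? How should its parameters and action types be written? HOT 3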
- record a video HOT 2
- Implementation of Mean-Field MARL algorithm HOT 3
- FQF logit computation HOT 3
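- Hybrid action space environment: PPO raises an error when using gae_estimator HOT 3
- How to get the reward value of each episode HOT 1
- TD3 raises an AssertionError when applied to a hybrid action space HOT 1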
- how to get the ckpt file? HOT 2
- get "TypeError: __init__() got an unexpected keyword argument 'agent_obs_shape'" when running " python3 -u smac_5m6m_masac_config.py" HOT 2
- question for SMAC HOT 3
- Running lunarlander_dqn_deploy inside docker fails HOT 6
- cannot run GTrXL demo since v0.5.0 HOT 1
- BrokenPipeError: [WinError 232] 管道正在被关闭 (the pipe is being closed) has occurred, when running MARL algorithm QMIX in pettingzoo
- bug when running MARL algorithm Qmix in pettingzoo HOT 3
- gym_anytrading : could not broadcast input array from shape (62,) into shape (20,3) Please help!! HOT 3
- Question about the Mario code HOT 1