Comments (5)
@Achronus thank you so much! It runs and I am training now on the default parameters (with create_model set to 'rainbow', of course).
Fingers crossed :)
from rl_atari_games.
Hey @sunchipsster1 ,
Thanks for your interest in this repository.
It appears the issue relates to OpenAI's recent update of the Gym package to Gymnasium. I've added a quick hotfix to accommodate the compatibility changes. Unfortunately, fixing this issue caused me to encounter a new one with the PyTorch package: RuntimeError: Numpy is not available.
It's likely this is only a local issue (I hope), but I am currently updating all my packages (especially PyTorch), which should resolve the new issue. The hotfix only applies to one file: core/env_details.py.
Additionally, I've made minor changes to the main.py file to simplify it.
While I'm updating the packages, can you please apply the new hotfix and see if things work on your end? The best option is to copy the new core/env_details.py over the old one and go from there. Let me know if you encounter any more issues :)
Ryan
Hey @sunchipsster1,
As a follow up on my previous comment, I've updated the repository to fix the broken compatibility issues. Everything should now be working as intended. Please update your packages to the latest ones as detailed in the dependencies section of the README and the requirements.txt file.
Let me know if you have any further issues.
Many thanks.
Ryan
Hi @Achronus, just wanted to confirm that I am doing as you originally intended. I have only modified main.py by one line:
model = create_model('ppo', env='primary', device=device) --> model = create_model('rainbow', env=env3, device=device)
I've left the other hyperparameters as they were.
For 3 different seeds, currently running at around 40.0K/100K episodes, the Episode Score is still at 0. I'd love to get the beautiful results that you obtained in your baseline. Am I making an unforeseen error?
Thanks so much again! :)
Hey @sunchipsster1,
That's normal behaviour for agents in complex environments. It's known as a 'burn-in' period, where the agent is still learning the environment dynamics. Typically, its length varies depending on the model and the environment's complexity: the more complex the environment, the longer the burn-in period. For example, with an A2C algorithm I created for Super Mario Bros, the burn-in period lasted around 110k-120k episodes before the agent finally figured out how to jump longer and get over a large pipe.
I haven't had the opportunity (yet) to explore the full extent of RDQNs, so the parameters are not optimised. The best advice I can offer:
- Refer to the original paper's hyperparameters in the appendix of Rainbow: Combining Improvements in Deep Reinforcement Learning. Their approach is extremely computationally expensive, requiring a replay buffer size of 1 million, which is typically infeasible on standard computer hardware.
- Experiment with the parameters yourself to find ones that work for your hypothesis. For example, how many episodes will it take to reach the first key (reward signal)?
- Enable curiosity to improve agent exploration: model = create_model('rainbow', env=env3, device=device, im_type='curiosity')
What makes Montezuma's Revenge so challenging is its sparse rewards and vast state space. The agent needs to identify a specific movement pattern to reach the first reward and then continue from there. Solving the environment as a whole is a fascinating problem, requiring a lot of small steps.
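As a rough illustration of why the episode score can sit at zero for so long, here is a minimal, self-contained toy sketch (it does not use this repository's API; the corridor environment and random agent are hypothetical): reward arrives only when the agent stumbles all the way to the far end, so most episodes score nothing at all.

```python
import random

def run_episode(corridor_length=20, max_steps=100, rng=random):
    """Random agent in a 1-D corridor: reward 1 only at the far end."""
    pos = 0
    for _ in range(max_steps):
        pos += rng.choice([-1, 1])  # random step left or right
        pos = max(pos, 0)           # wall at the starting position
        if pos == corridor_length:
            return 1.0              # sparse reward: goal reached
    return 0.0                      # no intermediate reward signal

rng = random.Random(0)
scores = [run_episode(rng=rng) for _ in range(500)]
print(f"episodes with any reward: {sum(scores):.0f} / 500")
```

Under random exploration, only a small fraction of episodes ever see a reward, which is exactly why curiosity-style intrinsic motivation helps in sparse-reward settings.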
Sorry, I can't be of more help.
Ryan