
Comments (5)

sunchipsster1 avatar sunchipsster1 commented on June 4, 2024

@Achronus thank you so much! It runs and I am training now on the default parameters (with create_model = rainbow, of course).

Fingers crossed :)

from rl_atari_games.

Achronus avatar Achronus commented on June 4, 2024

Hey @sunchipsster1 ,

Thanks for your interest in this repository.

It appears the issue relates to the recent move from OpenAI's Gym package to Gymnasium. I've added a quick hotfix to accommodate the compatibility changes. Unfortunately, fixing this issue has caused me to encounter a new one with the PyTorch package: RuntimeError: Numpy is not available.

It's likely this is only a local issue (I hope), but I am currently updating all my packages (especially PyTorch), which should resolve the new issue. The hotfix only applies to one file: core/env_details.py.

Additionally, I've made minor changes to the main.py file to simplify it.

While I'm updating the packages, can you please apply the new hotfix and see if things work on your end? The best option would be to copy the new core/env_details.py into the old one and go from there. Let me know if you encounter any more issues :)

Ryan


Achronus avatar Achronus commented on June 4, 2024

Hey @sunchipsster1,

As a follow-up to my previous comment, I've updated the repository to fix the compatibility issues. Everything should now be working as intended. Please update your packages to the latest versions, as detailed in the dependencies section of the README and the requirements.txt file.

Let me know if you have any further issues.

Many thanks.

Ryan


sunchipsster1 avatar sunchipsster1 commented on June 4, 2024

Hi @Achronus, I just wanted to confirm that I am doing as you had originally intended. I have only modified main.py by one line, changing
model = create_model('ppo', env='primary', device=device)
into
model = create_model('rainbow', env=env3, device=device)

I've left the other hyperparameters as they were.

For 3 different seeds, currently at around 40.0K/100K episodes, the Episode Score is still at 0. I would love to get the beautiful results that you obtained in your baseline. Am I making an unforeseen error?

Thanks so much again! :)


Achronus avatar Achronus commented on June 4, 2024

Hey @sunchipsster1,

That's normal behaviour for agents in complex environments. It's known as a 'burn-in' period, where the agent is still learning the environment dynamics. Its length varies depending on the model and the environment's complexity: the more complex the environment, the longer the burn-in period. For example, with an A2C algorithm I created for Super Mario Bros, the burn-in period lasted around 110k-120k episodes before the agent finally figured out how to jump longer and get over a large pipe.
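One rough way to compare burn-in length across seeds is to record the first episode in which each run scores anything at all. The sketch below is only an illustration: `first_rewarded_episode` and the score lists are hypothetical and not part of this repository, and it assumes you have collected the per-episode scores yourself (for example from the training logs).

```python
from typing import Optional, Sequence


def first_rewarded_episode(scores: Sequence[float]) -> Optional[int]:
    """Return the 1-based index of the first episode with a non-zero score.

    Returns None if every episode scored 0, i.e. the run is still in burn-in.
    """
    for episode, score in enumerate(scores, start=1):
        if score != 0:
            return episode
    return None


# Hypothetical per-seed episode scores, made up for illustration only.
scores_by_seed = {
    0: [0, 0, 0, 0, 0],      # never leaves burn-in
    1: [0, 0, 100, 0, 100],  # first reward in episode 3
}

burn_in = {
    seed: first_rewarded_episode(scores)
    for seed, scores in scores_by_seed.items()
}
```

A seed that maps to `None` has not yet seen a reward, so a score of 0 there says nothing about whether the run is broken.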

I haven't had the opportunity (yet) to explore the full extent of RDQNs, so the parameters are not optimised. The best advice I can offer:

  1. Refer to the hyperparameters in the appendix of the original paper - Rainbow: Combining Improvements in Deep Reinforcement Learning. Their approach is extremely computationally expensive, requiring a buffer size of 1 million. Typically, this is infeasible on standard computer hardware.
  2. Experiment with the parameters yourself to find ones that work for your hypothesis. For example, how many episodes will it take to reach the first key (reward signal)?
  3. Enable curiosity to improve agent exploration -
    model = create_model('rainbow', env=env3, device=device, im_type='curiosity')
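To give a feel for why a buffer of 1 million transitions is hard to fit on ordinary hardware, here is a back-of-the-envelope memory estimate. It is a sketch under stated assumptions, not a description of how this repository stores transitions: `replay_buffer_gib` is a hypothetical helper, and the defaults (84x84 greyscale uint8 frames, 4 stacked frames per state, both state and next state stored in full) are the common Atari preprocessing choices rather than values taken from this codebase.

```python
def replay_buffer_gib(
    capacity: int,
    frame_shape: tuple = (84, 84),   # assumed frame size
    frames_per_state: int = 4,       # assumed frame stack
    bytes_per_pixel: int = 1,        # uint8 pixels
    store_next_state: bool = True,   # naive layout: state and next state both kept
) -> float:
    """Estimate the observation memory of a replay buffer in GiB.

    Ignores actions, rewards, done flags, and priorities, which are small
    compared with the image data.
    """
    pixels_per_frame = frame_shape[0] * frame_shape[1]
    states_per_transition = 2 if store_next_state else 1
    bytes_per_transition = (
        pixels_per_frame * frames_per_state * bytes_per_pixel * states_per_transition
    )
    return capacity * bytes_per_transition / 1024**3


full_size = replay_buffer_gib(1_000_000)   # paper-sized buffer
reduced = replay_buffer_gib(100_000)       # a tenth of the capacity
print(f"1M transitions:   {full_size:.1f} GiB")
print(f"100K transitions: {reduced:.1f} GiB")
```

Under these assumptions the paper-sized buffer needs on the order of 50 GiB for observations alone, which is why a smaller capacity is usually the first thing to try on a desktop machine. Implementations that store each frame only once need considerably less.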

What makes Montezuma's so challenging is the sparse rewards and the vast state space. The agent needs to identify a specific movement pattern to reach the first reward and then continue from there. Solving the environment as a whole is a fascinating problem, requiring a lot of small steps.

Sorry, I can't be of more help.

Ryan

