Hello! Wonderful repository for playing with montezuma's revenge with an algorithm tha

Error: Cannot batch space with type `<class 'gym.spaces.box.Box'>`. The space must be a valid `gymnasium.Space` instance. (issue in achronus/rl_atari_games)

Comments (5)

sunchipsster1 commented on June 4, 2024

@Achronus thank you so much! It runs and I am training now on the default parameters (with create_model = rainbow, of course).

Fingers crossed :)

from rl_atari_games.

Achronus commented on June 4, 2024

Hey @sunchipsster1 ,

Thanks for your interest in this repository.

It appears the issue relates to the recent move from OpenAI's Gym package to Gymnasium. I've added a quick hotfix to restore compatibility. Unfortunately, fixing this issue has caused me to encounter a new one with the PyTorch package: `RuntimeError: Numpy is not available`.

It's likely this is only a local issue (I hope), but I am currently updating all my packages (especially PyTorch), which should resolve the new issue. The hotfix only applies to one file: core/env_details.py.

Additionally, I've made minor changes to the main.py file to simplify it.

While I'm updating the packages, can you please apply the new hotfix and see if things work on your end? The best option would be to copy the new core/env_details.py into the old one and go from there. Let me know if you encounter any more issues :)
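If it helps with debugging: the error message suggests that a space object from the old `gym` package is reaching code that expects `gymnasium` spaces. Below is a minimal sketch for checking which of the two packages is installed, using only the standard library. The helper names `installed_version` and `diagnose` are hypothetical and not part of this repository.

```python
import importlib.metadata
import importlib.util
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it cannot be imported."""
    if importlib.util.find_spec(package) is None:
        return None
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found under this name.
        return "unknown"


def diagnose(gym_version: Optional[str], gymnasium_version: Optional[str]) -> str:
    """Summarise the Gym/Gymnasium setup (hypothetical helper, for illustration only)."""
    if gym_version is None and gymnasium_version is None:
        return "Neither gym nor gymnasium is installed."
    if gymnasium_version is None:
        return f"Only gym {gym_version} is installed; code expecting gymnasium spaces will fail."
    if gym_version is None:
        return f"Only gymnasium {gymnasium_version} is installed."
    return (
        f"Both gym {gym_version} and gymnasium {gymnasium_version} are installed; "
        "check that every environment and space is imported from gymnasium."
    )


if __name__ == "__main__":
    print(diagnose(installed_version("gym"), installed_version("gymnasium")))
```

If both packages show up, it may be worth searching the code for leftover `import gym` lines, since a space created by one package is not accepted as an instance of the other's `Space` class.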

Ryan

Achronus commented on June 4, 2024

Hey @sunchipsster1,

As a follow-up to my previous comment, I've updated the repository to fix the compatibility issues. Everything should now be working as intended. Please update your packages to the latest versions, as detailed in the dependencies section of the README and the requirements.txt file.

Let me know if you have any further issues.

Many thanks.

Ryan

sunchipsster1 commented on June 4, 2024

Hi @Achronus, just wanted to confirm that I am doing as you had originally intended. I have only modified main.py by one line, changing
model = create_model('ppo', env='primary', device=device)
into
model = create_model('rainbow', env=env3, device=device)

I've left the other hyperparameters as they were.

For 3 different seeds, each currently at around 40.0K of 100K episodes, the Episode Score is still at 0. Would love to get the beautiful results that you obtained in your baseline. Am I making an unforeseen error?

Thanks so much again! :)

Achronus commented on June 4, 2024

Hey @sunchipsster1,

That's normal behaviour for agents in complex environments. It's known as a 'burn-in' period, where the agent is still learning the environment dynamics. Its length varies depending on the model and the environment's complexity: the more complex the environment, the longer the burn-in period. For example, with an A2C algorithm I created for Super Mario Bros, the burn-in period lasted around 110k-120k episodes before the agent finally figured out how to jump longer and get over a large pipe.

I haven't had the opportunity (yet) to explore the full extent of RDQNs, so the parameters are not optimised. The best advice I can offer:

- Refer to the original paper's hyperparameters in the appendix (Rainbow: Combining Improvements in Deep Reinforcement Learning). Their approach is extremely computationally expensive, requiring a buffer size of 1 million. Typically, this is unfeasible on standard computer hardware.
- Experiment with the parameters yourself to find ones that work for your hypothesis. For example, how many episodes will it take to reach the first key (reward signal)?
- Enable curiosity to improve agent exploration:
  model = create_model('rainbow', env=env3, device=device, im_type='curiosity')
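To give a rough sense of why a buffer of 1 million transitions is heavy, here is a back-of-the-envelope sketch. It assumes the common Atari preprocessing of 84x84 grayscale `uint8` frames stacked 4 at a time; those numbers are an assumption and may not match this repository's settings. It also ignores actions, rewards, and priority bookkeeping, so real usage would be somewhat higher.

```python
# Assumed preprocessing (not taken from this repository): 84x84 grayscale
# uint8 frames, 4 frames stacked per state.
FRAME_HEIGHT = 84
FRAME_WIDTH = 84
FRAME_STACK = 4
BUFFER_SIZE = 1_000_000  # buffer size mentioned above


def naive_buffer_bytes(capacity: int = BUFFER_SIZE) -> int:
    """Bytes needed if every transition stores full state and next_state stacks."""
    bytes_per_state = FRAME_HEIGHT * FRAME_WIDTH * FRAME_STACK  # 1 byte per pixel
    return capacity * 2 * bytes_per_state


def single_frame_buffer_bytes(capacity: int = BUFFER_SIZE) -> int:
    """Bytes needed if only one new frame is stored per transition
    and the stacks are rebuilt when sampling."""
    return capacity * FRAME_HEIGHT * FRAME_WIDTH


if __name__ == "__main__":
    print(f"naive storage:        {naive_buffer_bytes() / 1e9:.1f} GB")
    print(f"single-frame storage: {single_frame_buffer_bytes() / 1e9:.1f} GB")
```

Under these assumptions, storing each frame once instead of full stacks cuts the frame memory by a factor of 8, which is one reason a smaller buffer or a more compact layout is often used on desktop hardware.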

What makes Montezuma's so challenging is the sparse rewards and the vast state space. The agent needs to identify a specific movement pattern to reach the first reward and then continue from there. Solving the environment as a whole is a fascinating problem, requiring a lot of small steps.
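To illustrate why an exploration bonus can help when the environment's own reward stays at 0, here is a toy sketch of a count-based novelty bonus. This is a simpler stand-in chosen for illustration, not the implementation behind `im_type='curiosity'` in this repository, and the names `CountBonus` and `shaped_reward` are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable


class CountBonus:
    """Toy novelty bonus: rarely visited states earn a larger bonus."""

    def __init__(self, scale: float = 0.1) -> None:
        self.scale = scale
        self.visits: Counter = Counter()

    def __call__(self, state_key: Hashable) -> float:
        # Record the visit, then decay the bonus with the square root of the count.
        self.visits[state_key] += 1
        return self.scale / math.sqrt(self.visits[state_key])


def shaped_reward(extrinsic: float, state_key: Hashable, bonus: CountBonus) -> float:
    """Environment reward plus the novelty bonus for the state just reached."""
    return extrinsic + bonus(state_key)


if __name__ == "__main__":
    bonus = CountBonus(scale=1.0)
    # The extrinsic reward is 0 everywhere, as in the early rooms of a sparse game.
    for state in ["start", "start", "ladder", "start", "rope"]:
        print(state, round(shaped_reward(0.0, state, bonus), 3))
```

Even with an extrinsic reward of 0 on every step, the agent still receives a learning signal that favours states it has seen less often, which is the general idea behind intrinsic motivation in sparse-reward games.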

Sorry, I can't be of more help.

Ryan
