
Comments (18)

zxzzz0 commented on August 22, 2024

Try this one https://github.com/opendilab/DI-engine/blob/main/dizoo/bsuite/config/serial/memory_len/memory_len_15_r2d2_gtrxl_config.py

from di-engine.

PaParaZz1 commented on August 22, 2024

You can try GTrXL with R2D2 via the above-mentioned link.

As for GTrXL with PPO, we will add a corresponding implementation if necessary. Which RL environment do you want to use GTrXL with PPO on? Some discrete-action env like LunarLander?


hlsafin commented on August 22, 2024

Hmm, oddly enough, I tried that link and it gave me an error. I will run it again and see what the error was. I am just trying to test it out on an Atari env at the moment.


hlsafin commented on August 22, 2024

And I get this error:

    File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 778, in __getattr__
        raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
    torch.nn.modules.module.ModuleAttributeError: 'GTrXLDiscreteHead' object has no attribute 'dropout'


PaParaZz1 commented on August 22, 2024

> And I get this error:
>
>     File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 778, in __getattr__
>         raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
>     torch.nn.modules.module.ModuleAttributeError: 'GTrXLDiscreteHead' object has no attribute 'dropout'

Maybe your config is wrong; you can check it by comparing it with our Atari config. For an Atari env, you need to specify a 3D obs_shape.
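For illustration, here is a hypothetical sketch of the difference (the field names mimic typical DI-engine configs but are not copied from the repository; the Atari config linked in the thread is the source of truth):

```python
# Illustrative only: these dicts mimic the shape of DI-engine model configs.
atari_model = dict(
    obs_shape=[4, 84, 84],  # 3D: stacked frames x height x width
    action_shape=6,         # hypothetical discrete action count
)
vector_model = dict(
    obs_shape=8,            # 1D observation vector, e.g. LunarLander-style
    action_shape=4,
)

# An Atari-style config must carry a 3D observation shape.
assert len(atari_model["obs_shape"]) == 3
print("obs_shape dims:", len(atari_model["obs_shape"]))
```

A scalar obs_shape like the second dict is what a vector-observation env would use, which is why copying a non-Atari config onto Atari can fail inside the model head.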


hlsafin commented on August 22, 2024

oh, I see! thank you!


hlsafin commented on August 22, 2024

Okay, so that code works now. I didn't change the config, but it crashes after a bit. I have an RTX 3080 and 32 GB of RAM.


PaParaZz1 commented on August 22, 2024

What kind of crash error did you get? Can you offer more details, like the error traceback?


hlsafin commented on August 22, 2024

I can't really see; I have everything in that Docker container. It trains for a while, then after several hours it exits, so I am assuming there may have been a memory leak and then SIGKILL was called. Were you able to run training in that environment with no issues over a very long period of time?


PaParaZz1 commented on August 22, 2024

Maybe your problem is OOM due to the huge replay buffer of R2D2-GTrXL; you can try rerunning your experiment with a smaller replay_buffer_size. In our experiment with R2D2-GTrXL on Atari (the training curve is here), it usually needs more than 50-60 GB of RAM, so you should monitor your RAM usage.
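A rough back-of-envelope check makes the scale plausible (the numbers below are assumptions for illustration, not measured from DI-engine internals; real per-sample overhead is larger because each transition also stores next_obs, rewards, hidden states, etc.):

```python
# Illustrative arithmetic only: observation storage of an R2D2-style buffer.
unroll_len = 95              # transitions per stored train_sample (assumed)
obs_bytes = 4 * 84 * 84      # one uint8 stack of four 84x84 Atari frames
per_sample_mb = unroll_len * obs_bytes / 1024 ** 2

for buffer_size in (1_000, 10_000):
    gb = buffer_size * per_sample_mb / 1024
    print(f"buffer_size={buffer_size:>6}: ~{gb:.1f} GB of observations")
```

Even counting only raw observations, a 10,000-sample buffer sits in the tens of gigabytes, so the reported RAM figures are consistent with the buffer dominating memory.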


hlsafin commented on August 22, 2024

I had my replay buffer at 1000. Also, doesn't the replay buffer fill up fairly quickly in the beginning? Or does it keep filling up for several hours, which would cause this OOM? My understanding is that if it can work for the first 3 hours, it shouldn't suddenly break after that; it makes no sense to me.


PaParaZz1 commented on August 22, 2024

For R2D2-GTrXL, each element in the replay buffer is a train_sample, i.e., a list of transitions of length unroll_len, so the buffer will not be full at once. I think you should first monitor the RAM usage in your experiment; you can use a tool like this one.
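Monitoring RAM can also be done with the standard library alone; a minimal sketch (Unix-only, since it relies on `resource.getrusage`):

```python
import resource
import sys

def peak_rss_mb() -> float:
    """Peak resident set size of this process, in MB."""
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        rss /= 1024  # macOS reports bytes; Linux reports kilobytes
    return rss / 1024

# Call this periodically (e.g. once per training iteration) and log the value;
# a steadily growing curve points at the buffer (or a leak) long before SIGKILL.
print(f"peak RSS: {peak_rss_mb():.1f} MB")
```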


hlsafin commented on August 22, 2024

[screenshot of training log]

This is what I get after 1.6 million env time steps, and the mean reward is still around -20. I use the same config as mentioned before, except I reduced the experience replay buffer from 10,000 to 1,000. Does this look right to you?


PaParaZz1 commented on August 22, 2024

I have rerun pong_r2d2_gtrxl.py with buffer_size 10000 and 1000; here is the raw result:

[screenshot: training curves for buffer_size 10000 vs. 1000]

You can see that the experiment with buffer_size 1000 (blue curve) indeed shows poorer performance and does not begin to rise until 2M env steps. But the experiment with buffer_size 10000 exhibits a result similar to our previous experiment, so there is no bug in our implementation; you just need a larger replay buffer.

BTW, the memory utilization of these two experiments is shown below:

  • buffer_size 10000, max usage 40.0 GB

  • buffer_size 1000, max usage 9.8 GB


hlsafin commented on August 22, 2024

Okay, wow, thank you for the demonstration; I didn't realize that the buffer size played such a huge role. I'm currently trying to tackle Montezuma's Revenge with 8 GPUs. How do I run R2D2-GTrXL on multiple GPUs with data parallelism (I'm currently doing DDP), and what sort of config do you recommend? I'm currently running it on 1 GPU with memory_len 256, unroll_len 95, and seq_len 90, and not much learning is happening. I'm wondering if it's my setup, or if R2D2-GTrXL just isn't able to solve this problem.


PaParaZz1 commented on August 22, 2024

Buffer size is of great importance to off-policy value-based methods; if interested, you can have a look at this paper.

For data parallelism, you can refer to this doc.

To solve Montezuma's Revenge, exploration matters more than a better time-series model like GTrXL; I think you could consider combining RND or Go-Explore with your current code.
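To sketch the RND idea mentioned above (a toy NumPy version for intuition, not DI-engine's implementation): a frozen random "target" network embeds each observation, a trainable "predictor" is regressed onto it, and the prediction error serves as an intrinsic reward that decays for frequently visited states:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "networks"; real RND uses small conv/MLP networks.
obs_dim, feat_dim = 8, 16
W_target = rng.normal(size=(obs_dim, feat_dim))  # frozen random target
W_pred = np.zeros((obs_dim, feat_dim))           # predictor, trained online

def intrinsic_reward(obs):
    # Prediction error: high for novel states, shrinks as the predictor fits.
    err = obs @ W_pred - obs @ W_target
    return float((err ** 2).mean())

def update_predictor(obs, lr=0.01):
    global W_pred
    err = obs @ W_pred - obs @ W_target
    W_pred -= lr * np.outer(obs, err)  # gradient step on the squared error

obs = rng.normal(size=obs_dim)
before = intrinsic_reward(obs)
for _ in range(200):                   # visit the same state repeatedly
    update_predictor(obs)
after = intrinsic_reward(obs)
print(f"intrinsic reward: before={before:.4f}, after={after:.6f}")
```

The intrinsic reward for the repeatedly visited state decays toward zero, which is the exploration bonus RND adds on top of the environment reward.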


hlsafin commented on August 22, 2024

Okay. I was under the impression that games like MiniGrid, which this approach should have no issue solving, and Montezuma's Revenge are similar in that they both have sparse rewards. Maybe I am incorrect here.

Also, I tried running Pong with this approach, and I still wasn't able to match your reward. It stays around -19 for 5 million time steps.


Sino-Huang commented on August 22, 2024

@PaParaZz1 Greetings, a quick question: is there an easy way for me to prepend a Conv2D network to the GTrXL network? It looks like I cannot directly use GTrXL for an Atari environment due to the obs_shape, and there is no obs_shape parameter in GTrXL.

