Dreamer

Dreamer is a visual model-based reinforcement learning algorithm. It learns a world model that captures latent dynamics from high-dimensional pixel images, and it trains a control agent entirely on imagined rollouts from the learned world model.
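The idea of training on imagined rollouts can be illustrated with a minimal sketch. It is not the code from this repository: `toy_policy`, `toy_dynamics` and `toy_reward` are hypothetical hand-written stand-ins for the learned actor, latent dynamics model and reward model, and the latent state is a one-element list rather than a tensor.

```python
from typing import Callable, List, Tuple

Latent = List[float]  # toy stand-in for a latent state vector
Action = float        # toy stand-in for a continuous action


def imagine_rollout(
    start: Latent,
    policy: Callable[[Latent], Action],
    dynamics: Callable[[Latent, Action], Latent],
    reward_fn: Callable[[Latent], float],
    horizon: int,
) -> Tuple[List[Latent], List[Action], List[float]]:
    """Roll the policy forward inside the model; the real environment is never called."""
    latents, actions, rewards = [], [], []
    latent = start
    for _ in range(horizon):
        action = policy(latent)
        latent = dynamics(latent, action)  # predicted next latent state
        latents.append(latent)
        actions.append(action)
        rewards.append(reward_fn(latent))  # predicted reward for that state
    return latents, actions, rewards


# Hypothetical stand-ins for the learned networks (assumptions, not repo code).
def toy_policy(latent: Latent) -> Action:
    return -0.5 * latent[0]


def toy_dynamics(latent: Latent, action: Action) -> Latent:
    return [latent[0] + action]


def toy_reward(latent: Latent) -> float:
    return -abs(latent[0])


latents, actions, rewards = imagine_rollout(
    [1.0], toy_policy, toy_dynamics, toy_reward, horizon=3
)
```

In an actual agent the three callables would be neural networks, and the imagined rewards would feed the actor and value losses.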

This work is my attempt at reproducing the Dreamerv1 and Dreamerv2 papers in PyTorch, specifically for continuous control tasks in the DeepMind Control Suite.

Noteworthy differences from original and prior works:

  1. This work compares Dreamer and Dreamerv2 agents on continuous control tasks only. Of the Dreamerv2 changes, only KL balancing is used; the policy type remains the same as in Dreamerv1, i.e. a Tanh-transformed MultivariateNormalDiag distribution.
  2. This work doesn't train Dreamerv1 and v2 for 2M timesteps as was done in the papers; instead, both agents are trained for 100K timesteps.
  3. All experiments are carried out on a single free GPU (Tesla T4) on Google Colab. Training for 100K timesteps on a Tesla T4 takes about 3 hours.
  4. Due to limited computational resources (strict Colab timeouts), the results produced here cover five control tasks and are run for a single seed only.
  5. Hence the results under Plot Results are produced by running the agents for 10 eval episodes with a single seed. A fair evaluation would require running experiments for multiple seeds; this repo serves as a working implementation of both agents.
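The Tanh-transformed diagonal Gaussian policy named in item 1 can be sketched in plain Python as follows. This is an illustration under assumptions, not the repository's PyTorch code: the function names are hypothetical, and the `1e-6` term is an assumed numerical-stability constant.

```python
import math
import random
from typing import List, Sequence, Tuple


def tanh_normal_log_prob(
    pre_tanh: Sequence[float], mean: Sequence[float], std: Sequence[float]
) -> float:
    """Log-density of a = tanh(u) where u ~ N(mean, diag(std^2))."""
    log_prob = 0.0
    for u, m, s in zip(pre_tanh, mean, std):
        # Log-density of the diagonal Gaussian at u.
        log_prob += -0.5 * ((u - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
        # Change of variables: da/du = 1 - tanh(u)^2 (epsilon avoids log(0)).
        log_prob -= math.log(1.0 - math.tanh(u) ** 2 + 1e-6)
    return log_prob


def sample_tanh_normal(
    mean: Sequence[float], std: Sequence[float], rng: random.Random
) -> Tuple[List[float], float]:
    """Draw one action in [-1, 1] per dimension and return it with its log-probability."""
    pre_tanh = [rng.gauss(m, s) for m, s in zip(mean, std)]
    action = [math.tanh(u) for u in pre_tanh]
    return action, tanh_normal_log_prob(pre_tanh, mean, std)


action, log_prob = sample_tanh_normal([0.0, 0.0], [1.0, 1.0], random.Random(0))
```

The squashing keeps every action component within the bounded range that the continuous control tasks expect, while the correction term keeps the log-probability consistent with the squashed sample.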

Evaluated agents after training for 100K timesteps. Left: Dreamerv1, Right: Dreamerv2.

For further information regarding the methodology and experiments, refer to these papers:

  1. Dreamerv1 - Dream to Control: Learning Behaviors by Latent Imagination
  2. Dreamerv2 - Mastering Atari with Discrete World Models
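For reference, the KL balancing that this work adopts from Dreamerv2 can be written, as described in the second paper, roughly as below. Here q is the posterior, p is the prior, and sg(·) denotes stop-gradient. The paper uses α = 0.8; the value used in this repository is not stated here, so treat it as an assumption.

```latex
\mathcal{L}_{\mathrm{KL}}
  = \alpha \, \mathrm{KL}\big[\, \mathrm{sg}\big(q(z_t \mid h_t, x_t)\big) \,\big\|\, p(z_t \mid h_t) \,\big]
  + (1 - \alpha) \, \mathrm{KL}\big[\, q(z_t \mid h_t, x_t) \,\big\|\, \mathrm{sg}\big(p(z_t \mid h_t)\big) \,\big]
```

With α > 0.5 the prior is pulled toward the posterior faster than the posterior is regularized toward the prior.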

Code Structure

The code structure is similar to the original TensorFlow implementation by Danijar Hafner.

dreamer.py - main function for training and evaluating dreamer agent

utils.py - Logger, miscellaneous utility functions

models.py - All the networks for world model and actor are implemented here

replay_buffer.py - Experience buffer for training world model

env_wrapper.py - Gym wrapper for the dm_control suite
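As an illustration of the kind of experience buffer listed above, here is a minimal sketch that stores transitions and samples fixed-length sequences for world-model training. It is not the contents of replay_buffer.py: the class name, method names and transition layout are hypothetical, and for brevity it ignores episode boundaries when slicing.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (observation, action, reward, done); the layout is an assumption.
Transition = Tuple[list, float, float, bool]


class SequenceReplayBuffer:
    """Fixed-capacity buffer that returns contiguous chunks of experience."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.storage: Deque[Transition] = deque(maxlen=capacity)  # oldest items drop out
        self.rng = random.Random(seed)

    def add(self, obs: list, action: float, reward: float, done: bool) -> None:
        self.storage.append((obs, action, reward, done))

    def sample(self, batch_size: int, seq_len: int) -> List[List[Transition]]:
        if len(self.storage) < seq_len:
            raise ValueError("not enough transitions stored")
        data = list(self.storage)
        batch = []
        for _ in range(batch_size):
            # randint is inclusive on both ends, so the slice always fits.
            start = self.rng.randint(0, len(data) - seq_len)
            batch.append(data[start:start + seq_len])
        return batch


buffer = SequenceReplayBuffer(capacity=100)
for step in range(20):
    buffer.add([float(step)], 0.0, 1.0, False)
batch = buffer.sample(batch_size=4, seq_len=5)
```

Sampling sequences rather than single transitions matters because a recurrent world model is trained on consecutive steps.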

All the hyperparameters are listed in main.py and are available as command line args.

For training

python dreamer.py --env 'walker-walk' --algo 'Dreamerv1' --exp 'default_hp' --train

For Evaluation

python dreamer.py --env 'walker-walk' --algo 'Dreamerv1' --exp 'eval' --evaluate --restore --checkpoint_path '<your_ckpt_path>'
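The flags used in the two commands above could be declared with `argparse` roughly as follows. This is a sketch, not the repository's parser: the defaults and help strings are assumptions, and `Dreamerv2` as a value for `--algo` is inferred from the rest of this README rather than shown in the commands.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare only the flags that appear in the README commands."""
    parser = argparse.ArgumentParser(description="Train or evaluate a Dreamer agent")
    parser.add_argument("--env", type=str, default="walker-walk", help="control task name")
    # 'Dreamerv2' as a choice is an assumption; only 'Dreamerv1' is shown above.
    parser.add_argument("--algo", type=str, default="Dreamerv1",
                        choices=["Dreamerv1", "Dreamerv2"])
    parser.add_argument("--exp", type=str, default="default_hp", help="experiment name")
    parser.add_argument("--train", action="store_true", help="run training")
    parser.add_argument("--evaluate", action="store_true", help="run evaluation")
    parser.add_argument("--restore", action="store_true", help="load a saved checkpoint")
    parser.add_argument("--checkpoint_path", type=str, default="", help="checkpoint to load")
    return parser


# Parse an explicit list so the example runs without real command line input.
args = build_parser().parse_args(
    ["--env", "cheetah-run", "--algo", "Dreamerv2", "--exp", "demo", "--train"]
)
```

Boolean switches such as `--train` and `--evaluate` use `store_true`, so they are off unless passed, which matches how they appear in the commands.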

Google Colab

I have added a Colab notebook (Open Dreamer in Colab) to train and evaluate on freely available GPUs on Google Colab for quick reproducibility.

Plot Results

Training and Evaluation results for dreamerv1 and v2 agents

Evaluation results after training for 100K steps. Average returns and standard deviations are reported over 10 eval episodes.

| Control Task     | Dreamer       | Dreamerv2     |
|------------------|---------------|---------------|
| cartpole-balance | 450.6 ± 19.54 | 976.8 ± 1.11  |
| cartpole-swingup | 235.8 ± 42.74 | 684.1 ± 36.97 |
| walker-stand     | 836.1 ± 165.2 | 937.5 ± 25.91 |
| walker-walk      | 487.9 ± 84.81 | 228.7 ± 45.95 |
| cheetah-run      | 234.6 ± 92.67 | 321.6 ± 8.91  |
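Each "mean ± std" entry summarizes the returns of 10 eval episodes. A minimal sketch of that summary is shown below. The episode returns in the example are made-up illustrative numbers, not results from this repository, and the population standard deviation is an assumption, since the README does not say which estimator was used.

```python
import statistics
from typing import Sequence, Tuple


def summarize_returns(returns: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) of episode returns."""
    return statistics.fmean(returns), statistics.pstdev(returns)


def format_result(returns: Sequence[float]) -> str:
    """Format returns in the same 'mean ± std' style as the table."""
    mean, std = summarize_returns(returns)
    return f"{mean:.1f} ± {std:.2f}"


# Hypothetical returns for 10 eval episodes (illustrative values only).
example_returns = [900.0] * 5 + [1000.0] * 5
summary = format_result(example_returns)
```

With a single seed, this standard deviation reflects episode-to-episode variation only, not variation across training runs.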

Training Plots: Training returns plotted against environment timesteps.

[Figure: dreamer_train]

Evaluation Plots: Evaluation is done every 10K steps during training, for 10 eval episodes each time. Plots show average returns as solid lines and standard deviations as shaded areas.

[Figure: dreamer_eval]

Acknowledgements

This code is heavily inspired by the following open-source works:

dreamer by Danijar Hafner (lead author of both papers): https://github.com/danijar/dreamer/blob/master/dreamer.py

dreamer-pytorch by yusukeurakami: https://github.com/yusukeurakami/dreamer-pytorch

dreamerv2 by RajGhugare19: https://github.com/RajGhugare19/dreamerv2

Contributors

adityabingi
