DeepRL

If you have any questions or want to report a bug, please open an issue instead of emailing me directly.

Modularized implementation of popular deep RL algorithms in PyTorch. Easy switching between toy tasks and challenging games.

Implemented algorithms:

  • (Double/Dueling) Deep Q-Learning (DQN)
  • Categorical DQN (C51, Distributional DQN with KL Distance)
  • Quantile Regression DQN
  • (Continuous/Discrete) Synchronous Advantage Actor Critic (A2C)
  • Synchronous N-Step Q-Learning
  • Deep Deterministic Policy Gradient (DDPG, low-dim-state)
  • (Continuous/Discrete) Synchronous Proximal Policy Optimization (PPO, pixel & low-dim-state)
  • The Option-Critic Architecture (OC)

Asynchronous algorithms (e.g., A3C) can be found in v0.1. Action Conditional Video Prediction can be found in v0.4.

Dependency

  • macOS 10.12 or Ubuntu 16.04
  • PyTorch v1.1.0
  • Python 3.6, 3.5
  • OpenAI Baselines (commit 8e56dd)
  • Core dependencies: pip install -e .

Remarks

  • PyTorch v0.4.0 should also work in principle, at least for commit 80939f.
  • There is a very fast DQN implementation that uses an asynchronous actor for data generation and an asynchronous replay buffer for transferring data to the GPU. Enable it by setting config.async_actor = True and using AsyncReplay. With Atari games, however, this implementation may not work on macOS; use Ubuntu or Docker instead.
  • Although there is a setup.py, which means you can install the repo as a library, this repo was never designed to be a high-level library like Keras. Use it as your codebase instead.
  • Code for my papers can be found in the corresponding branches, which may be good examples of how to extend this codebase.
  • TensorFlow is used only for logging, and OpenAI Baselines is used only lightly. If you read the code carefully, you should be able to remove or replace them.
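
The async-actor remark above can be pictured with a small, self-contained sketch. This is not the repository's implementation: ToyEnv, actor_loop, and train are hypothetical names, the sketch uses a standard-library thread and queue.Queue rather than the repo's actor or AsyncReplay, and the "update" step is only a placeholder. It shows the general pattern of one thread generating transitions while the learner consumes them and samples minibatches.

```python
import queue
import random
import threading
from collections import deque


class ToyEnv:
    """Hypothetical stand-in environment: an episode ends every 10 steps."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t % 10 == 0
        reward = 1.0 if action == 1 else 0.0
        return self.t, reward, done


def actor_loop(env, out_queue, stop_event, rng):
    """Generate transitions in the background and hand them to the learner."""
    state = env.reset()
    while not stop_event.is_set():
        action = rng.randint(0, 1)  # random policy, for illustration only
        next_state, reward, done = env.step(action)
        transition = (state, action, reward, next_state, done)
        # Retry the put so a full queue never blocks shutdown indefinitely.
        while not stop_event.is_set():
            try:
                out_queue.put(transition, timeout=0.1)
                break
            except queue.Full:
                pass
        state = env.reset() if done else next_state


def train(num_updates=50, batch_size=8, seed=0):
    """Consume transitions from the actor thread and sample minibatches."""
    transitions = queue.Queue(maxsize=64)
    stop = threading.Event()
    actor = threading.Thread(
        target=actor_loop,
        args=(ToyEnv(), transitions, stop, random.Random(seed)),
        daemon=True,
    )
    actor.start()

    replay = deque(maxlen=1000)
    sampler = random.Random(seed + 1)
    updates = 0
    try:
        while updates < num_updates:
            replay.append(transitions.get(timeout=5.0))
            if len(replay) >= batch_size:
                batch = sampler.sample(list(replay), batch_size)
                # A real agent would compute a loss on `batch` here.
                updates += 1
    finally:
        stop.set()
        actor.join(timeout=5.0)
    return updates, len(replay)


if __name__ == "__main__":
    print(train())
```

The bounded queue applies back-pressure: if the learner falls behind, the actor waits instead of filling memory. A real implementation would likely use separate processes, and would move sampled batches to the GPU as the remark describes.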

Usage

examples.py contains examples for all the implemented algorithms.

Dockerfile contains the environment for generating the curves below.

Please use this BibTeX entry if you want to cite this repo:

@misc{deeprl,
  author = {Shangtong, Zhang},
  title = {Modularized Implementation of Deep RL Algorithms in PyTorch},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub Repository},
  howpublished = {\url{https://github.com/ShangtongZhang/DeepRL}},
}

Curves (commit 80939f)

BreakoutNoFrameskip-v4

  • This is my synchronous option-critic implementation, not the original one.
  • The curves are not directly comparable, as many hyper-parameters are different.

Mujoco

  • DDPG evaluation performance.

  • PPO online performance.
