
tensorflow_rl's Introduction

This Repository Is a Reinforcement Learning Agent Framework

This repository is designed to provide an easy demo reinforcement learning framework for those studying deep reinforcement learning.

This framework is based on TensorFlow, and the basic models are implemented in the example_model directory. If you want to use your own model, please refer to the models provided in the example_model directory.

We provide tutorials for training an agent in each environment. The tutorials are organized by action type and input shape as follows.

Environment

Continuous Action MLP - bipedalwalker, pendulum
Discrete Action MLP - LunarLander
Discrete Action CNN - Breakout

Algorithms

Continuous Action MLP - DDPG, TD3, PPO, PPO2
Discrete Action MLP - Vanilla PG, A2C, PPO, DQN, QRDQN, IQN
Discrete Action CNN - Vanilla PG, A2C, PPO, DQN, QRDQN, IQN
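
The Environment and Algorithms lists above can be read as one lookup table keyed by action type and network type. The following is a minimal sketch of that table. The `TUTORIALS` dictionary and the `tutorials_for` helper are hypothetical names used only for illustration and are not part of the tensorflow_rl package.

```python
# Hypothetical lookup table built from the Environment and Algorithms lists.
# It is an illustration only and is not part of the tensorflow_rl package.
TUTORIALS = {
    ("continuous", "mlp"): {
        "environments": ["bipedalwalker", "pendulum"],
        "algorithms": ["DDPG", "TD3", "PPO", "PPO2"],
    },
    ("discrete", "mlp"): {
        "environments": ["LunarLander"],
        "algorithms": ["Vanilla PG", "A2C", "PPO", "DQN", "QRDQN", "IQN"],
    },
    ("discrete", "cnn"): {
        "environments": ["Breakout"],
        "algorithms": ["Vanilla PG", "A2C", "PPO", "DQN", "QRDQN", "IQN"],
    },
}


def tutorials_for(action_type, network_type):
    """Return the environments and algorithms listed for one combination."""
    key = (action_type.lower(), network_type.lower())
    if key not in TUTORIALS:
        raise KeyError(f"no tutorial listed for {key}")
    return TUTORIALS[key]


# Look up what is listed for continuous-action MLP agents.
print(tutorials_for("Continuous", "MLP")["algorithms"])
```

Note that no continuous-action CNN combination is listed, so `tutorials_for("continuous", "cnn")` raises a `KeyError` in this sketch.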

Our tutorials run in the gym environments provided by OpenAI, so you need to install OpenAI gym and box2d to run the tutorial code.
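
Before running a tutorial script, it can help to confirm that an environment steps correctly with random actions. The sketch below is not taken from the repository. It assumes the classic gym interface, in which `reset()` returns an observation and `step()` returns `(observation, reward, done, info)`. The helper name `run_random_episode` is hypothetical.

```python
def run_random_episode(env, max_steps=1000):
    """Run one episode with random actions and return the total reward.

    Assumes the classic gym interface: reset() returns an observation and
    step() returns (observation, reward, done, info).
    """
    env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        # Sample a random action from the environment's action space.
        action = env.action_space.sample()
        _, reward, done, _ = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward
```

With gym installed, the function can be called on an environment created with `gym.make`, for example one of the environment ids named in the Demonstration section. Newer gym releases changed the return values of `reset()` and `step()`, so the unpacking above would need adjusting there.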

Installation

from git repository

https://github.com/RLOpensource/tensorflow_RL
pip install .

cpu version

pip install tensorflow-rl[tf-cpu]

gpu version

pip install tensorflow-rl[tf-gpu]

If you install this repository with only

pip install tensorflow-rl

tensorflow is not installed.

Requirements

tensorflow
box2d
gym
numpy
tensorboardX
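
Because a plain install does not pull in tensorflow, a quick check of which requirements are importable can save a failed run. The following is a minimal sketch using only the standard library. The import names are assumptions: in particular, box2d is assumed to be importable as `Box2D` (as with the box2d-py distribution), and the helper name `missing_modules` is hypothetical.

```python
import importlib.util

# Import names assumed for the packages listed above; box2d is assumed to be
# importable as "Box2D" (as with the box2d-py distribution).
REQUIRED_MODULES = ["tensorflow", "Box2D", "gym", "numpy", "tensorboardX"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed requirements are importable.")
```

This only checks that each module can be located. It does not verify versions or that the modules import without errors.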

Implemented

  • Vanilla Policy Gradient
  • Advantage Actor Critic
  • Proximal Policy Optimization
  • Deep Deterministic Policy Gradient
  • Value based Reinforcement Learning
  • Soft Actor Critic
  • LSTM train Algorithm

Demonstration

1. Continuous Action BipedalWalker

  • Script : bipedalwalker_td3.py, bipedalwalker_ddpg.py, bipedalwalker_ppo.py, bipedalwalker_ppo2.py
  • Environment : BipedalWalker-v2
  • Orange : td3, Blue: ddpg, SkyBlue: ppo, Pink: ppo2
  • Episode : 600
  • Image : td3
BipedalWalker

2. Continuous Action Pendulum

  • Script : pendulum_td3.py, pendulum_ddpg.py
  • Environment : Pendulum-v0
  • Orange : ddpg, Blue: td3
  • Episode : 300
  • Image : td3
Pendulum

3. Discrete Action CNN Breakout

  • Script : breakout_rollout_a2c.py, breakout_rollout_ppo.py, breakout_rollout_vpg.py
  • Environment : BreakoutDeterministic-v4 with Multi-processing
  • Blue : ppo, Orange : a2c, Red : vpg
  • Episode : 600
  • Image : PPO
Breakout

4. Discrete Action MLP LunarLander

  • Script : lunarLander_rollout_a2c.py, lunarLander_rollout_ppo.py, lunarLander_rollout_vpg.py
  • Environment : LunarLander-v2 with Multi-processing
  • Blue : ppo, Orange : a2c, Red : vpg
  • Episode : 350
  • Image : PPO
LunarLander

5. Value Based Reinforcement Learning with CNN

  • Script : breakout_value_dqn.py, breakout_value_qrdqn.py, breakout_value_iqn.py
  • Environment : BreakoutDeterministic-v4 with Multi-processing
  • Green : IQN, Blue : QRDQN, Pink : DQN
  • Episode : 280
  • Image : IQN
Breakout

6. Value Based Reinforcement Learning with MLP

  • Script : lunarLander_value_dqn.py, lunarLander_value_qrdqn.py, lunarLander_value_iqn.py
  • Environment : LunarLander-v2 with Multi-processing
  • Orange : IQN, Blue : QRDQN, Red : DQN
  • Episode : 250
  • Image : IQN
LunarLander

7. Discrete Action CNN LSTM Breakout, Inspired by DRQN

  • Script : breakout_rollout_ppo_1stack_lstm.py, breakout_rollout_ppo_1stack.py
  • Environment : BreakoutDeterministic-v4 with Multi-processing
  • Orange : PPOLSTM, Blue : PPO-1stack
  • Episode : 1000
  • Image : PPOLSTM
Breakout

Member

License

We do not hold the copyright to this repository.

Feel free to use this code; we only ask that you reference the repository URL in some form.

MIT License

Reference

[1] mario_rl

[2] Proximal Policy Optimization

[3] Efficient Parallel Methods for Deep Reinforcement Learning

[4] High-Dimensional Continuous Control Using Generalized Advantage Estimation

[5] Asynchronous Methods for Deep Reinforcement Learning

[6] Continuous Control With Deep Reinforcement Learning

[7] Vanilla Policy Gradient

[8] Deep Recurrent Q-Learning for Partially Observable MDPs

[9] Playing Atari with Deep Reinforcement Learning

[10] Distributional Reinforcement Learning with Quantile Regression

[11] Implicit Quantile Networks for Distributional Reinforcement Learning

[12] OpenAI Spinningup

[13] Reinforcement Learning Korea PG Travel

[14] Medipixel Reinforcement Learning Repository

Please fork this repository and contribute to strengthen the tensorflow reinforcement learning ecosystem

Support us in any form. Thank you

Contact us at [email protected]
