
pytorch_ppo_rl's Introduction

Reinforcement Learning with PPO

This repository contains reinforcement learning implementations based on PPO (Proximal Policy Optimization). The framework used is PyTorch. Multi-processing is built in, and the agents are trained with the PAAC (Parallel Advantage Actor-Critic) strategy.
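For readers new to PPO, the clipped surrogate objective at its core can be sketched in a few lines. This is a minimal, framework-free illustration and not the code from this repository: the function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions made for the example, and the repository's PyTorch scripts may organize the computation differently.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Return the negative clipped surrogate objective, averaged over a batch.

    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimizers minimize, while the objective is maximized.
    return -total / len(advantages)
```

With a positive advantage, the clipped term caps how much a large ratio can contribute; with a negative advantage, the unclipped term is kept when it is the more pessimistic one.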

1. Multi-processing MLP Proximal Policy Optimization

  • Script : LunarLander_ppo.py
  • Environment : LunarLander-v2
  • Orange : 8 processes, Blue : 4 processes, Red : 1 process
[Figure: LunarLander-v2]
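The data flow of a PAAC-style rollout (several environments stepped in lockstep, with one batched policy call per step) can be sketched as below. This is a simplified illustration under stated assumptions: `ToyEnv`, `collect_rollout`, and the `policy` callable are hypothetical names, and the environments are stepped in a plain loop inside one process, whereas the scripts in this repository use multiple processes.

```python
class ToyEnv:
    """Hypothetical stand-in environment: the observation is a step counter."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.episode_length
        # Reset automatically so every worker always has a valid observation.
        obs = self.reset() if done else self.t
        return obs, reward, done


def collect_rollout(envs, policy, num_steps):
    """Step all environments in lockstep and record one batch per step."""
    obs = [env.reset() for env in envs]
    rollout = []
    for _ in range(num_steps):
        # One batched policy call covers every environment.
        actions = policy(obs)
        results = [env.step(a) for env, a in zip(envs, actions)]
        next_obs = [r[0] for r in results]
        rewards = [r[1] for r in results]
        dones = [r[2] for r in results]
        rollout.append((obs, actions, rewards, dones))
        obs = next_obs
    return rollout
```

A usage example would be `collect_rollout([ToyEnv() for _ in range(8)], lambda obs: [1] * len(obs), 16)`, which yields 16 batches of 8 transitions each. In a multi-process version, the list comprehension over `env.step` would be replaced by sending actions to worker processes and gathering their results.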

2. Multi-processing CNN Proximal Policy Optimization

  • Script : Breakout_ppo.py
  • Environment : BreakoutDeterministic-v4
  • Red : 8 processes, Blue : 4 processes, Orange : 1 process
[Figure: BreakoutDeterministic-v4]

3. Multi-processing CNN Proximal Policy Optimization with Intrinsic Curiosity Module

  • Script : Breakout_ppo_icm.py
  • Environment : BreakoutNoFrameskip-v4 (handled by a custom environment)
  • With no environment reward
  • Because the agent is not made to select the key that starts the game, the curve shows a peak followed by a drop in performance.
  • Left : comparison between extrinsic + intrinsic reward (orange) and intrinsic reward only (gray), averaged over three runs
  • Right : intrinsic reward only
  • 32 processes
[Figure: BreakoutNoFrameskip-v4 (handled by a custom environment)]
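The intrinsic reward used when training with no environment reward can be illustrated with the formulation from the Intrinsic Curiosity Module paper listed in the references: the reward is the scaled squared error between the forward model's predicted next-state features and the actual next-state features. The sketch below is an assumption-laden illustration, not this repository's code: `intrinsic_reward` and the default `eta=0.01` are hypothetical, and feature vectors are plain Python lists rather than tensors.

```python
def intrinsic_reward(predicted_features, next_features, eta=0.01):
    """Curiosity reward: eta / 2 times the squared forward-model error.

    predicted_features: the forward model's prediction of the next-state features.
    next_features: the encoded features of the state actually reached.
    eta: scaling factor (the default value here is an illustrative assumption).
    """
    squared_error = sum(
        (p - f) ** 2 for p, f in zip(predicted_features, next_features)
    )
    return 0.5 * eta * squared_error
```

Transitions the forward model predicts poorly produce a larger reward, which pushes the agent toward states it has not yet learned to predict.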

4. Multi-processing MLP Proximal Policy Optimization with Intrinsic Curiosity Module

  • Script : MountainCar_ppo_icm.py
  • Environment : MountainCar-v0
  • With no environment reward
  • 32 processes
[Figure: MountainCar-v0]

5. Unity MLAgents MLP Proximal Policy Optimization with Intrinsic Curiosity Module

  • Script : PushBlock_ppo_icm.py
  • Environment : PushBlock
  • 32 environments, PAAC
  • Orange : 0.5 int + 0.5 ext, Blue : int only, Red : ext only
  • Reward shaping for a sparse-reward environment : success : 1, others : 0
  • The environment does not behave like a sparse-reward environment, even though the reward is engineered into two categories (0, 1)
[Figure: PushBlock]
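The reward settings listed above (a binary shaped extrinsic reward, and a weighted sum of intrinsic and extrinsic reward) can be written out as a short sketch. The function names and the idea of passing the weights as arguments are assumptions made for illustration; the repository's scripts may combine the two signals differently, for example at the advantage level rather than the reward level.

```python
def shaped_extrinsic_reward(success):
    """Binary shaping described above: 1 on success, 0 otherwise."""
    return 1.0 if success else 0.0


def combined_reward(intrinsic, extrinsic, int_coef=0.5, ext_coef=0.5):
    """Weighted sum of intrinsic and extrinsic reward.

    The defaults mirror the 0.5 int + 0.5 ext setting; (1.0, 0.0) gives
    intrinsic only and (0.0, 1.0) gives extrinsic only.
    """
    return int_coef * intrinsic + ext_coef * extrinsic
```

For example, `combined_reward(0.2, shaped_extrinsic_reward(True))` mixes an intrinsic reward of 0.2 with a success signal using equal weights.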

6. Unity MLAgents MLP Proximal Policy Optimization with Intrinsic Curiosity Module

  • Script : Pyramid_ppo_icm.py
  • Environment : Pyramid
  • 16 environments, PAAC
  • Orange : ext only, Blue : 0.01 int + 0.99 ext
[Figure: Pyramid]

Reference

[1] mario_rl

[2] Proximal Policy Optimization

[3] Efficient Parallel Methods for Deep Reinforcement Learning

[4] High-Dimensional Continuous Control Using Generalized Advantage Estimation

[5] Curiosity-driven Exploration by Self-supervised Prediction

[6] Large-Scale Study of Curiosity-Driven Learning

[7] curiosity-driven-exploration-pytorch

[8] ml-agents

[9] Unity: A General Platform for Intelligent Agents

[10] Solving sparse-reward tasks with Curiosity
