rllab++ is a framework for developing and evaluating reinforcement learning algorithms, built on rllab. Beyond the algorithms already implemented in rllab, it adds further implementations, such as Q-Prop and DDPG (see the algorithm flags below).
Please follow the basic installation instructions in the rllab documentation, with the following minor change:
- Install TensorFlow 0.11.0rc0 (e.g. `pip install tensorflow==0.11.0rc0`)
From the `launchers` directory, run the following command, with optional additional flags as defined in `launcher_utils.py`:

`python algo_gym_stub.py --exp=<exp_name>`
Flags include:
- `algo_name`: `trpo` (TRPO), `vpg` (vanilla policy gradient), `ddpg` (DDPG), `qprop` (Q-Prop with TRPO), or `qvpg` (Q-Prop with VPG).
- `env_name`: an OpenAI Gym environment name, e.g. `HalfCheetah-v1`.
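For example, a full invocation training Q-Prop on HalfCheetah might look like the following (the experiment name `qprop_halfcheetah` is a made-up placeholder):

```shell
# Train Q-Prop (the TRPO variant) on HalfCheetah-v1; results are saved
# under the experiment name passed via --exp. Run from the launchers directory.
python algo_gym_stub.py --exp=qprop_halfcheetah \
    --algo_name=qprop \
    --env_name=HalfCheetah-v1
```

Other flags defined in `launcher_utils.py` can be appended in the same `--flag=value` form.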
The experiment will be saved in `/data/local/<exp_name>`. To view the results, run viskit or viskit2 with the paths to the experiment folders as arguments.
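For instance, assuming a standard rllab checkout where the viskit frontend is launched via `rllab/viskit/frontend.py` (the exact path depends on your rllab installation), plotting the results of the hypothetical experiment above could look like:

```shell
# Launch viskit's plotting frontend on one or more experiment folders;
# it serves an interactive plot of the logged training curves.
python rllab/viskit/frontend.py /data/local/qprop_halfcheetah
```

Multiple experiment folders can be passed to compare runs in one plot.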
If you use rllab++ for academic research, you are highly encouraged to cite the following papers:
- Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine. "Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic". arXiv:1611.02247 [cs.LG], 2016.
- Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel. "Benchmarking Deep Reinforcement Learning for Continuous Control". Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.