This repository contains code to explore the policy optimization landscape.
To run CartPole, simply do:
python3 run_eager_policy_optimization.py --env CartPole-v0 --policy_type discrete
To run a MuJoCo environment, you must have MuJoCo installed along with its license. To run Hopper-v1, use:
python3 run_eager_policy_optimization.py --env Hopper-v1 --policy_type normal --std 0.5
Parameters will be saved into ./parameters as numpy files. After obtaining some parameters from different runs, use the following commands to analyze the landscape.
- First install eager_pg:
pip install -e .
- Random Perturbations Experiment:
cd interpolation_experiments
python paired_random_directions_experiment.py --p1 ./path/to/parameter/1/npy \
--save_dir ./path/to/save/in/ \
--alpha 0.5 --std 0.5 --n_directions 500
- Linear Interpolation Experiment:
cd interpolation_experiments
python simple_1d_interpolation_experiment.py --p1 ./path/to/parameter/1/npy \
--p2 ./path/to/parameter/2/npy --save_dir ./path/to/save/in/ \
--stds 5.0 --alpha_start -0.5 --alpha_end 1.5 --n_alphas 2
Note that interpolation tools only work with continuous policies.
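For intuition, the two experiments above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in interpolation_experiments: it assumes each saved parameter file holds a single flat array, and the function names `interpolate_parameters` and `paired_random_perturbations` are hypothetical. The perturbation sketch also assumes unit-norm Gaussian directions scaled by alpha, which may differ from what the scripts actually do.

```python
import numpy as np


def interpolate_parameters(theta_1, theta_2, alpha_start=-0.5, alpha_end=1.5, n_alphas=5):
    """Return evenly spaced alphas and the parameters on the line through theta_1 and theta_2.

    alpha = 0 recovers theta_1 and alpha = 1 recovers theta_2; values outside
    [0, 1] extrapolate beyond the two endpoints.
    """
    alphas = np.linspace(alpha_start, alpha_end, n_alphas)
    thetas = [(1.0 - a) * theta_1 + a * theta_2 for a in alphas]
    return alphas, thetas


def paired_random_perturbations(theta, alpha=0.5, n_directions=3, seed=0):
    """Return (theta + alpha * d, theta - alpha * d) pairs for random unit directions d.

    Assumption: directions are Gaussian samples normalized to unit length.
    """
    rng = np.random.default_rng(seed)
    pairs = []
    for _ in range(n_directions):
        d = rng.standard_normal(theta.shape)
        d /= np.linalg.norm(d)  # normalize so that alpha sets the step size
        pairs.append((theta + alpha * d, theta - alpha * d))
    return pairs
```

With arrays loaded via `np.load`, each returned parameter vector would then be evaluated by rolling out the corresponding policy; that evaluation step depends on the repository's own tooling and is not shown here.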
- eager_pg: contains a small library to enable quick research in policy gradient reinforcement learning.
- analysis_tools: contains tooling to make nice figures in papers.
- interpolation_experiments: experiments to explore the landscape in policy optimization.
If you use the proposed method or code, we'd appreciate it if you could cite this work!
@article{ahmed2018understanding,
title={Understanding the impact of entropy in policy learning},
author={Ahmed, Zafarali and Roux, Nicolas Le and Norouzi, Mohammad and Schuurmans, Dale},
journal={arXiv preprint arXiv:1811.11214},
year={2018}
}
This is not an official Google product.