eppo's Introduction

Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble

This is the experiment code for our IJCAI 2022 paper "Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble".

Abstract

It is challenging for reinforcement learning (RL) algorithms to succeed in real-world applications like financial trading and logistics systems due to noisy observations and environment shifts between training and evaluation. Thus, it requires both high sample efficiency and generalization for resolving real-world tasks. However, directly applying typical RL algorithms can lead to poor performance in such scenarios. Considering the great performance of ensemble methods on both accuracy and generalization in supervised learning (SL), we design a robust and applicable method named Ensemble Proximal Policy Optimization (EPPO), which learns ensemble policies in an end-to-end manner. Notably, EPPO combines each policy and the policy ensemble organically and optimizes both simultaneously. In addition, EPPO adopts a diversity enhancement regularization over the policy space which helps to generalize to unseen states and promotes exploration. We theoretically prove EPPO increases exploration efficacy, and through comprehensive experimental evaluations on various tasks, we demonstrate that EPPO achieves higher efficiency and is robust for real-world applications compared with vanilla policy optimization algorithms and other ensemble methods. Code and supplemental materials are available at https://seqml.github.io/eppo.

Environment Dependencies

Dependencies

pip install -r requirements.txt

Running

Taking the Pong environment from the Atari benchmark as an example, run EPPO with the following command:

python code/tools/train_on_atari.py exp/atari_local.yml

To run EPPO-Ens, please set the center_policy_coef in exp/atari_local.yml to 0.

To run EPPO-Div, please set the diverse_coef in exp/atari_local.yml to 0.
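The two variants above differ from the default run only in one coefficient each. The layout of exp/atari_local.yml is not shown here, so the following is a minimal sketch under assumptions: the helper name set_coef and the sample YAML text are hypothetical, and the helper simply rewrites the first line of the form `key: value` wherever it is nested, leaving any trailing comment in place.

```python
import re


def set_coef(yaml_text, key, value):
    """Return yaml_text with the first `key: <old>` line rewritten to `key: value`.

    Plain text substitution, not a YAML parser: indentation and trailing
    comments on the matched line are preserved as they are.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*:[ \t]*)\S+(.*)$", re.MULTILINE
    )
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(2)}", yaml_text, count=1
    )
    if count == 0:
        raise KeyError(key)
    return new_text


# Hypothetical config text; the real file may be structured differently.
sample = (
    "policy:\n"
    "  center_policy_coef: 1.0  # hypothetical value\n"
    "  diverse_coef: 0.01\n"
)

eppo_ens = set_coef(sample, "center_policy_coef", 0)  # EPPO-Ens variant
eppo_div = set_coef(sample, "diverse_coef", 0)        # EPPO-Div variant
print(eppo_ens)
print(eppo_div)
```

Writing each result to its own file (for example a copy of the config per variant) keeps the default configuration untouched; the training script would then be pointed at the copy.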

Reference

You are more than welcome to cite our paper:

@article{yang2022towards,
  title={Towards Applicable Reinforcement Learning: Improving the Generalization and Sample Efficiency with Policy Ensemble},
  author={Yang, Zhengyu and Ren, Kan and Luo, Xufang and Liu, Minghuan and Liu, Weiqing and Bian, Jiang and Zhang, Weinan and Li, Dongsheng},
  journal={arXiv preprint arXiv:2205.09284},
  year={2022}
}

eppo's Issues

finite_env was recently removed but is still referenced/required by logging.py.

The finite_env module was recently removed, but logging.py still imports it, so importing code.env fails:

   from code.env import EnvConfig
 File "/Eppo/code/env/__init__.py", line 10, in <module>
   from .logging import Logger
 File "/Eppo/code/env/logging.py", line 15, in <module>
   from .finite_env import BaseLogger
ModuleNotFoundError: No module named 'code.env.finite_env'

No module named 'code.env'

When we run python code/tools/train_on_atari.py exp/atari_local.yml, we get the following error:

Traceback (most recent call last):
  File "code/tools/train_on_atari.py", line 25, in <module>
    from code.env import EnvConfig
ModuleNotFoundError: No module named 'code.env'

We searched the code folder and did not find env. Perhaps the env package was not added to the code folder.
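One possible contributing factor, not confirmed by the report: Python's standard library also ships a top-level module named code, and running a script by path puts the script's directory (here code/tools) at the front of sys.path rather than the repository root. If the repository root is not on the import path, `from code.env import ...` may resolve code to the standard-library module, which has no env submodule. The sketch below is a small diagnostic for checking which code Python actually finds; the helper name describe_module is hypothetical.

```python
import importlib.util


def describe_module(name):
    """Report where a top-level module name resolves and whether it is a package."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "is_package": False}
    return {
        "found": True,
        "origin": spec.origin,
        # Packages expose search locations for submodules; plain modules do not.
        "is_package": spec.submodule_search_locations is not None,
    }


info = describe_module("code")
print(info)
if info["found"] and not info["is_package"]:
    print("'code' resolves to a plain module, so 'code.env' cannot be imported from it.")
```

If the origin points into the Python installation rather than the repository, running from the repository root with the root on PYTHONPATH is one thing to try. This does not help if the env package is genuinely absent from the checkout.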
