Code Monkey home page Code Monkey logo

gail_atari's Introduction

Atari-GAIL

This repository contains the PyTorch code for Generative Adversarial Imitation Learning (GAIL) with visual inputs, i.e. Atari games and visual dm-control.

Requirements

Experiments were run with Python 3.6 and these packages:

  • torch == 1.10.2
  • gym == 0.19.0
  • atari-py == 0.2.9

Collect Expert Demonstrations

  • Train an Expert Policy with PPO
 python train_ppo.py --env-name "PongNoFrameskip-v4" --algo ppo --use-gae --lr 2.5e-4  --clip-param 0.1 --value-loss-coef 0.5 --num-processes 8 --num-steps 128 --num-mini-batch 4 --log-interval 1 --use-linear-lr-decay --entropy-coef 0.01
  • Collect Expert Demonstrations
 python collect.py --env-name "PongNoFrameskip-v4"

We provide collected expert demonstrations in the following link. 'Level 2' demonstrations are optimal demonstrations and 'Level 1' demonstrations are sub-optimal demonstrations. [Google Drive]

Train GAIL

  • Train GAIL with optimal demonstrations (without BC pre-training)
 python gail.py --gail --env-name "PongNoFrameskip-v4" --name pong --algo ppo --use-gae --lr 2.5e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 8 --num-steps 128 --num-mini-batch 4 --log-interval 1 --use-linear-lr-decay --entropy-coef 0.01
  • Train GAIL with optimal demonstrations (with BC pre-training)
 python gail.py --bc --gail --env-name "PongNoFrameskip-v4" --name pong --algo ppo --use-gae --lr 2.5e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 8 --num-steps 128 --num-mini-batch 4 --log-interval 1 --use-linear-lr-decay --entropy-coef 0.01
  • Train GAIL with imperfect demonstrations
 python gail.py --imperfect --bc --gail --env-name "PongNoFrameskip-v4" --name pong --algo ppo --use-gae --lr 2.5e-4 --clip-param 0.1 --value-loss-coef 0.5 --num-processes 8 --num-steps 128 --num-mini-batch 4 --log-interval 1 --use-linear-lr-decay --entropy-coef 0.01

Results

We train GAIL with 2000 optimal demonstrations. The results are as follow.

Method Pong Seaquest BeamRider Hero Qbert
BC -20.7(0.46) 200.0(83.43) 1028.4(396.37) 7782.5(50.56) 11420.0(3420.0)
GAIL -1.73(18.1) 1474.0(201.6) 1087.6(559.09) 13942.5(67.13) 8027.27(24.9)
GAIL+BC 21.0(0.0) 1662.0(161.85) 2306.4(1527.23) 20020(22.91) 13225.0(1347.22)
PPO(Best) 21.0(0.0) 1840(0.0) 2637.45(1378.23) 27814.09(46.01) 15268.18(127.07)

In our experiments, we find that using BC as a pre-training step can significantly improve the performance of GAIL in some Atari games.

Citations

If you are using the code/data in this repo, please consider citing:

   @inproceedings{wang2021learning,
     title={Learning to Weight Imperfect Demonstrations},
     author={Wang, Yunke and Xu, Chang and Du, Bo and Lee, Honglak},
     booktitle={International Conference on Machine Learning},
     pages={10961--10970},
     year={2021},
     organization={PMLR}
   }

Acknowledegement

Our code structure is largely based on Kostrikov's implementation.

gail_atari's People

Contributors

yunke-wang avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

linwei94

gail_atari's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.