PatchAIL: Visual Imitation with Patch Rewards

This is a repository containing the code for the paper "Visual Imitation with Patch Rewards".

[Figure: PatchDisc]

[Figure: PatchAIL]

Download DMC expert demonstrations, weights and environment libraries [link]

The link contains the following:

  • The expert demonstrations for all tasks in the paper.
  • The weight files for the expert (DrQ-v2) and behavior cloning (BC).
  • The supporting libraries for the environments used in the paper (Gym-Robotics, metaworld).

After downloading, extract the files provided in the link, then:

  • Set the path/to/dir portion of the root_dir path variable in cfgs/config.yaml to the path of the PatchAIL repository.
  • Place the expert_demos and weights folders in ${root_dir}/PatchAIL.
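
Before training, it can help to confirm that the extracted folders ended up where the configuration expects them. The following is a minimal sketch, not part of the repository: the function name `missing_dirs` is hypothetical, and it only assumes the two folder names and the `${root_dir}/PatchAIL` location stated above.

```python
from pathlib import Path

# Folder names taken from the setup steps above.
REQUIRED_DIRS = ("expert_demos", "weights")


def missing_dirs(root_dir):
    """Return the required folders that are absent from <root_dir>/PatchAIL.

    `root_dir` should be the same value as root_dir in cfgs/config.yaml.
    This helper is hypothetical and is not shipped with the repository.
    """
    repo = Path(root_dir) / "PatchAIL"
    return [name for name in REQUIRED_DIRS if not (repo / name).is_dir()]
```

An empty list from `missing_dirs("path/to/dir")` would indicate that both folders are in place; any returned names still need to be copied.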

Obtain Atari games demonstrations:

  • Download the pkl files from [link], or generate them by running python generate_atari_rlunplugged.py (change the environment name in the script before running).
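
The layout of the demonstration pkl files is not described here, so a quick look at the top-level structure can be useful before training. The sketch below assumes nothing about the contents beyond the file being a standard Python pickle; `summarize_pickle` is a hypothetical helper, and pickles should only be loaded from sources you trust. If the file stores NumPy arrays, NumPy must be installed for loading to succeed.

```python
import pickle


def summarize_pickle(path):
    """Load a pickle file and describe its top-level structure.

    Hypothetical helper: it reports the top-level type, plus the keys of a
    dict or the length of a list/tuple, without assuming any specific layout.
    """
    with open(path, "rb") as f:
        data = pickle.load(f)
    summary = {"type": type(data).__name__}
    if isinstance(data, dict):
        summary["keys"] = sorted(str(key) for key in data)
    elif isinstance(data, (list, tuple)):
        summary["length"] = len(data)
    return summary
```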

Instructions

  • Install MuJoCo based on the instructions given here.

  • Install the following libraries:

    sudo apt update
    sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3
    
  • Install dependencies

    • Set up Environment (Conda)
    conda env create -f conda_env.yml
    conda activate vil
    
    • Set up Environment (Pip)
    pip install -r requirement.txt
    
  • (Only needed for Atari games) Install the Atari ROMs:

    pip install ale-py
    ale-import-roms path_to_ROMS
    
  • Main imitation experiments (observations only, 10 expert trajectories): commands for running the code on the DeepMind Control Suite with pixel-based input

    • Train the PatchAIL (w/o regularization) agent on DMC

      python train.py agent=patchirl suite=dmc obs_type=pixels suite/dmc_task=finger_spin algo_name=patchairl_ss num_demos=10 seed=1 replay_buffer_size=150000
      
    • Train the PatchAIL (w/o regularization) agent on Atari

      python train.py agent=patchirl suite=atari obs_type=pixels suite/atari_task=pong algo_name=patchairl num_demos=20 seed=1 replay_buffer_size=1000000
      
    • Train PatchAIL-W agent

      python train.py agent=patchirl_simreg suite=dmc obs_type=pixels suite/dmc_task=finger_spin algo_name=patchairl_ss_weight num_demos=10 seed=1
      
    • Train PatchAIL-B agent

      python train.py agent=patchirl_simreg suite=dmc obs_type=pixels suite/dmc_task=finger_spin algo_name=patchairl_ss_bonus num_demos=10 seed=1 reward_scale=0.5 agent.sim_rate=auto-0.5 +agent.sim_type="bonus"
      
    • Train Shared-Encoder AIL agent

      python train.py agent=encirl_ss suite=dmc obs_type=pixels suite/dmc_task=finger_spin num_demos=10 seed=1 algo_name=encairl_ss reward_type=airl replay_buffer_size=150000 
      
    • Train Independent-Encoder AIL agent

      python train.py agent=ind_encirl_ss suite=dmc obs_type=pixels suite/dmc_task=finger_spin num_demos=10 seed=1 algo_name=ind_encairl_ss reward_type=airl replay_buffer_size=150000 
      
    • Train BC agent

      python train.py agent=bc suite=dmc obs_type=pixels suite/dmc_task=walker_run num_demos=10
      
  • Visual imitation with actions (1 expert trajectory)

    • Train the PatchAIL (w/o regularization) agent

      python train.py agent=patchirl suite=dmc obs_type=pixels suite/dmc_task=finger_spin algo_name=patchairl_ss_bc num_demos=1 seed=1 replay_buffer_size=150000 bc_regularize=true suite.num_train_frames=1101000
      
    • Train PatchAIL-W agent

      python train.py agent=patchirl_simreg suite=dmc obs_type=pixels suite/dmc_task=finger_spin algo_name=patchairl_ss_weight_bc num_demos=1 seed=1 bc_regularize=true suite.num_train_frames=1101000
      
    • Train PatchAIL-B agent

      python train.py agent=patchirl_simreg suite=dmc obs_type=pixels suite/dmc_task=finger_spin algo_name=patchairl_ss_bonus_bc num_demos=1 seed=1 reward_scale=0.5 agent.sim_rate=auto-0.5 +agent.sim_type="bonus" bc_regularize=true suite.num_train_frames=1101000
      
    • Train Shared-Encoder AIL agent

      python train.py agent=encirl_ss suite=dmc obs_type=pixels suite/dmc_task=finger_spin num_demos=1 seed=1 algo_name=encairl_ss_bc reward_type=airl replay_buffer_size=150000  bc_regularize=true suite.num_train_frames=1101000
      
    • Train Independent-Encoder AIL agent

      python train.py agent=ind_encirl_ss suite=dmc obs_type=pixels suite/dmc_task=finger_spin num_demos=1 seed=1 algo_name=ind_encairl_ss_bc reward_type=airl replay_buffer_size=150000 bc_regularize=true suite.num_train_frames=1101000
      
    • Train ROT

      python train.py agent=potil suite=dmc obs_type=pixels suite/dmc_task=walker_run bc_regularize=true num_demos=1 replay_buffer_size=150000 suite.num_train_frames=1101000 algo_name=rot
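
The commands above differ mainly in a handful of overrides (agent, task, algorithm name, number of demonstrations, seed), so a sweep over tasks and seeds can be generated rather than typed by hand. The following is a rough sketch under stated assumptions: `build_command` and `sweep` are hypothetical helpers that are not part of the repository, the override names are copied from the commands above, and the second seed value is only an illustration.

```python
import itertools
import shlex


def build_command(task, seed, agent="patchirl", algo_name="patchairl_ss",
                  num_demos=10, extra=()):
    """Assemble one DMC training command as an argument list.

    Defaults mirror the first PatchAIL command above; `extra` carries any
    further overrides such as "replay_buffer_size=150000".
    """
    return [
        "python", "train.py",
        f"agent={agent}",
        "suite=dmc",
        "obs_type=pixels",
        f"suite/dmc_task={task}",
        f"algo_name={algo_name}",
        f"num_demos={num_demos}",
        f"seed={seed}",
        *extra,
    ]


def sweep(tasks, seeds, **kwargs):
    """Return one command per (task, seed) pair."""
    return [build_command(task, seed, **kwargs)
            for task, seed in itertools.product(tasks, seeds)]


# Print the commands so they can be reviewed or pasted into a job script.
for cmd in sweep(["finger_spin", "walker_run"], [1, 2],
                 extra=("replay_buffer_size=150000",)):
    print(shlex.join(cmd))
```

Each returned list can also be passed directly to `subprocess.run` from the repository directory; printing first keeps the sweep easy to inspect.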
      
  • To resume a previous experiment:

    python train.py ...(the same parameters as the run you want to resume) +resume_exp=true
    

    This loads the models from the snapshot in the previous log directory.
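
Since resuming requires repeating the original parameters exactly, one option is to derive the resume command from the stored original command instead of retyping it. This is a small sketch; `with_resume` is a hypothetical helper, and the only repository-specific detail it relies on is the `+resume_exp=true` override shown above.

```python
import shlex

RESUME_FLAG = "+resume_exp=true"


def with_resume(command):
    """Return `command` with the resume override appended exactly once.

    The command is split the way a POSIX shell would split it, so quotes
    around values (e.g. +agent.sim_type="bonus") are removed, just as the
    shell removes them before train.py sees the arguments.
    """
    args = shlex.split(command)
    if RESUME_FLAG not in args:
        args.append(RESUME_FLAG)
    return shlex.join(args)
```

For example, `with_resume("python train.py agent=bc suite=dmc obs_type=pixels suite/dmc_task=walker_run num_demos=10")` returns the same command followed by `+resume_exp=true`.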

  • Monitor results

    tensorboard --logdir exp_local

  • Visualize rewards: see the guidance in PatchAIL/visualization.

Acknowledgement: This repo is based on the ROT repo.
