DePO codes

Official PyTorch implementation of the ICML 2022 paper DePO (Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization).

Important Notes

This repository is based on ILSwiss. The code is for the Mujoco experiments; if you are looking for the NGSIM experiments, check here.

Algorithms Contained

Implemented RL algorithms:

  • Soft-Actor-Critic (SAC)

Implemented LfD algorithms:

  • Adversarial methods for Inverse Reinforcement Learning
    • AIRL / GAIL / FAIRL / DAC
  • BC

Implemented LfO algorithms:

  • BCO
  • GAIfO
  • DPO

Running Notes:

Before running, assign important log and output paths in \rlkit\launchers\config.py.

A simple multiple-process scheduler is included for basic hyperparameter grid search. We call it "multiple processing" to distinguish it from multi-processing, since it only starts many independent sub-processes that do not communicate with each other.
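As a rough illustration of this style of scheduling, the sketch below launches independent sub-processes with a cap on how many run at once. It is not the repository's scheduler: the function name `run_independent` and its parameters `max_parallel` and `poll_interval` are hypothetical, and the stand-in jobs only print a line.

```python
import subprocess
import sys
import time


def run_independent(commands, max_parallel=2, poll_interval=0.05):
    """Run each command as its own sub-process, at most max_parallel at once.

    The sub-processes never communicate; we only wait for them to exit.
    Returns the exit codes in the order the commands were given.
    (Hypothetical helper, not part of the repository.)
    """
    pending = list(enumerate(commands))
    running = {}  # job index -> Popen handle
    exit_codes = [None] * len(commands)
    while pending or running:
        # Start new workers while there is a free slot.
        while pending and len(running) < max_parallel:
            idx, cmd = pending.pop(0)
            running[idx] = subprocess.Popen(cmd)
        # Collect any workers that have finished.
        for idx, proc in list(running.items()):
            code = proc.poll()
            if code is not None:
                exit_codes[idx] = code
                del running[idx]
        time.sleep(poll_interval)
    return exit_codes


if __name__ == "__main__":
    # Three stand-in jobs; a real run would launch training scripts instead.
    jobs = [[sys.executable, "-c", f"print('job {i}')"] for i in range(3)]
    print(run_independent(jobs))
```

Because the workers share nothing, a failure in one setting does not affect the others, which is usually what a grid search needs.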

The main entry point is run_experiment.py, which takes an experiment yaml file from \exp_specs: python run_experiment.py -g 0 -e your_yaml_path or python run_experiment.py -e your_yaml_path.
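The two command forms above suggest that -e is required and -g is optional. A minimal sketch of a parser with that behavior follows. The long option names, the help strings, and the `None` default for the GPU id are assumptions made for illustration, not taken from run_experiment.py.

```python
import argparse


def parse_args(argv=None):
    """Parse -e (yaml path, required) and -g (GPU id, optional).

    Hypothetical sketch: the long option names and defaults are assumptions.
    """
    parser = argparse.ArgumentParser(description="Launch experiments from a yaml spec")
    parser.add_argument("-e", "--experiment", required=True,
                        help="path to the experiment yaml file")
    parser.add_argument("-g", "--gpu", type=int, default=None,
                        help="GPU id (optional)")
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Equivalent to: python run_experiment.py -g 0 -e your_yaml_path
    args = parse_args(["-g", "0", "-e", "your_yaml_path"])
    print(args.experiment, args.gpu)
```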

When you run run_experiment.py, it reads the yaml file and generates a set of small yaml files, each containing a single hyperparameter setting. The yaml file also specifies the path of a script (see \run_scripts\), which is run once for each small yaml file. See \exp_specs\sac\bc.yaml for an explanation of each parameter.
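The expansion of one spec into many single-setting specs can be sketched as a Cartesian product over the listed values. The key names `variables` and `constants`, the helper `expand_grid`, and the example values are hypothetical. The real yaml layout is documented in the spec files themselves, and this sketch handles only flat (non-nested) parameters.

```python
import itertools
import json


def expand_grid(spec):
    """Yield one flat setting per combination of the listed hyperparameter values.

    Assumes a hypothetical layout: spec["variables"] maps each name to a list
    of candidate values, spec["constants"] holds values shared by all runs.
    """
    variables = spec.get("variables", {})
    constants = spec.get("constants", {})
    names = sorted(variables)  # fixed order so the output is deterministic
    for values in itertools.product(*(variables[n] for n in names)):
        setting = dict(constants)
        setting.update(zip(names, values))
        yield setting


if __name__ == "__main__":
    spec = {
        "constants": {"env": "hopper"},
        "variables": {"seed": [0, 1], "lr": [0.001, 0.0003]},
    }
    # Each printed line stands in for one small generated yaml file.
    for i, setting in enumerate(expand_grid(spec)):
        print(i, json.dumps(setting, sort_keys=True))
```

With two seeds and two learning rates this produces four settings, each of which would be handed to the script as its own small spec.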

NOTE: all experiments, including the evaluation tasks (see \run_scripts\evaluate_policy.py and \exp_specs\evaluate_policy) and the render tasks, can be run under this framework in the same multiple-process style by specifying the yaml file.

Reproducing Results

Training Expert Policies

Train an SAC agent and collect expert demos, or use the demo here. Then write the demo path in \demos_listing.yaml.

Example scripts

-e is the path to the yaml file; -g is the GPU id. The existing specs are the ones used to produce the final results.

Training LfO Agents

Baseline results are available here. Config files are in exp_specs/dpo_exps. Example commands:

BCO

python run_experiment.py -e exp_specs/dpo_exps/bco_hopper_4.yaml

GAIfO

python run_experiment.py -e exp_specs/dpo_exps/gailfo_hopper_4.yaml

GAIfO-DP

python run_experiment.py -e exp_specs/dpo_exps/gailfo_dp_hopper_4.yaml

DPO (Supervised)

python run_experiment.py -e exp_specs/dpo_exps/sl_lfo_hopper_4.yaml

DPO

python run_experiment.py -e exp_specs/dpo_exps/dpo_hopper_4_weightedmle_qsa_weight.yaml

Ablation Study

Config files are in exp_specs/ablation. Example commands:

python run_experiment.py -e exp_specs/ablation/dpo_hopper_4_weightedmle_qsa_static_lambdah.yaml

Transfer Experiments

Config files are in exp_specs/transfer_exps and exp_specs/complex_transfer. Example commands (remember to change the loaded policy ckpt path in the yaml file):

python run_experiment.py -e exp_specs/transfer_exps/dpo_hopper_4_weightedmle_qsa_weight.yaml

RL Experiments

Config files are in exp_specs/rl. Example commands:

python run_experiment.py -e exp_specs/rl/dpo_hopper.yaml

RL Transfer Experiments

Config files are in exp_specs/rl_transfer. Example commands (remember to change the loaded policy ckpt path in the yaml file):

python run_experiment.py -e exp_specs/rl_transfer/dpo_hopper.yaml

Evaluate State Planner

Config files are in exp_specs/evaluation. Example commands (remember to change the loaded policy ckpt in evaluate_state_predictor.py):

python run_experiment.py -e exp_specs/evaluation/eval_sp.yaml

Contributors

ericonaldo, zbzhu99
