Policy Dissection

[NeurIPS 2022] Official implementation of the paper: Human-AI Shared Control via Policy Dissection

Webpage | Code | Video | Paper |

In this repo, we provide the implementation of Policy Dissection and some interactive neural controllers enabled by this method.

Supported Environments:

MetaDrive
Pybullet-Quadrupedal Robot (Forked from: https://github.com/Mehooz/vision4leg.git)
Isaacgym-Cassie (Forked from: https://github.com/leggedrobotics/legged_gym)
Isaacgym-ANYmal (Forked from: https://github.com/leggedrobotics/legged_gym)
Gym-Walker (Mujoco-200)
Gym-BipedalWalker (Box2D)
Gym-Ant (Mujoco-200)

Installation

Basic Installation

# Clone the code to local
git clone https://github.com/metadriverse/policydissect.git
cd policydissect

# Create virtual environment
conda create -n policydissect python=3.7
conda activate policydissect

# install torch
pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html

# Install basic dependency
pip install -e .

IsaacGym Installation (Optional)

For playing with agents trained in IsaacGym, follow the instructions below to install IsaacGym

Download and install Isaac Gym Preview 3 from https://developer.nvidia.com/isaac-gym
cd isaacgym/python && pip install -e .

Please review the file isaacgym/docs/install.html for more information on installation. See the Troubleshooting section for debugging.

Mujoco Installation (Optional)

For playing with the Mujoco-Ant and Mujoco-Walker, please

install mujoco200 according to https://www.roboti.us/download.html (Mujoco licence can be found at https://www.roboti.us/license.html)
copy contents in the folder to ~/.mujoco/mujoco200
copy licence from https://www.roboti.us/license.html to ~/.mujoco/
add this line to .bashrc: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/USERNAME/.mujoco/mujoco200/bin
run pip install mujoco-py==2.0.2.7 (Solutions of compiling error can be easily found at https://github.com/openai/mujoco-py/issues)

Play with AI

MetaDrive

To collaborate with the AI driver in MetaDrive environment, run:

# MetaDrive
# Keymap:
# - KEY_W: lane following
# - KEY_A: left lane changing
# - KEY_S: braking
# - KEY_D: right lane changing
# - KEY_R:Reset
python play/play_metadrive.py

Pybullet Quadrupedal Robot

The quadrupedal robot is trained with the code provided by https://github.com/Mehooz/vision4leg.git. For playing with legged robot, run:

# Pybullet Quadrupedal Robot
# Keymap:
# - KEY_W: forward
# - KEY_A: moving left
# - KEY_S: stop
# - KEY_D: moving right
# - KEY_R: reset
python play/play_pybullet_a1.py
python play/play_pybullet_a1.py --hard
python play/play_pybullet_a1.py --hard --seed 1001

Also, you can collaborate with AI and challenge the hard environment consisting of obstacles and challenging terrains by adding --hard flag. You can change to a different environment by adding --seed your_seed_int_type.

tips: Avoid running fast!

IsaacGym Cassie

The Cassie robot is trained with the code provided by https://github.com/leggedrobotics/legged_gym with a fixed forward command [1, 0, 0], and thus can only move forward. By applying Policy Dissection, primitives related to yaw rate, forward speed, height control and torque force can be identified. Activating these primitives enable various skills like crouching, forward jumping, back-flipping and so on. Run the following command to play with the robot. Add flag--parkourto launch a challenging parkour environment.

# Keymap:
# - KEY_W:Forward
# - KEY_A:Left
# - KEY_S:Stop
# - KEY_C:Crouch
# - KEY_X:Tiptoe
# - KEY_Q:Jump
# - KEY_D:Right
# - KEY_SPACE:Back Flip
# - KEY_R:Reset
python play/play_cassie.py
python play/play_cassie.py --parkour

tips: Switch to Tiptoe state before pressing Key_Q to increase the distance of jump.

Note Do not draw the windows or close the pygame window during running.

Gym Environments

We also discover motor primitives in three gym environments: Box2d-BipedalWalker, Mujoco-Ant and Mujoco-Walker. You can try them via:

# BipedalWalker
# Keymap:
# - KEY_W: jump
# - KEY_A: front-flip
# - KEY_S: restore running after jumping
# - KEY_R: reset
python play/play_gym_bipedalwalker.py

# Mujoco-Ant
# Keymap:
# - KEY_W: move up
# - KEY_A: move left
# - KEY_S: move down
# - KEY_D: move right
# - KEY_Q: rotation
# - KEY_R: reset
python play/play_mujoco_ant.py
    
# Mujoco-Walker
# Keymap:
# - KEY_R: reset
# - KEY_A: stop
# - KEY_W: freeze red knee
# - KEY_D: restore running
python play/play_mujoco_walker.py

Comparison with explicit goal-conditioned control

To measure the coarseness of the control approach enabled by Policy Dissection, we train a goal-conditioned quadrupedal ANYmal robot controller with code provided by https://github.com/leggedrobotics/legged_gym. We build primitive-activation conditional control system on this controller with a PID controller determining the unit output according to the tracking error. As a result, it can track the target yaw command and can achieve the similar control precision, compared to explicitly indicating the goal in the network input. Video is available here.

The experiment script can be found at play/run_tracking_experiment.py. The default yaw tracking is achieved by explicit goal-conditioned control, while running python play/run_tracking_experiment.py --primitive_activation will change to primitive-activation conditional control.

Policy Dissection Examples

In example folder, we provide two examples showing how to dissect policy. The results can be read by opening read_result.ipynb with jupyter notebook. Also, the identified units are chosen as motor primitives for evoking behaviors of Anymal and the MetaDrive agents. Check previous section about how to play with them.

Troubleshooting

Installing IsaacGym

If you encounter ImportError: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory, run this:

export LD_LIBRARY_PATH=/path/to/libpython/directory
# If you are using Conda, the path should be /path/to/conda/envs/your_env/lib.
# For example:
export LD_LIBRARY_PATH=/home/USERNAME/anaconda3/envs/policydissect/lib

If you encounter CalledProcessError: Command '['which', 'c++']' returned non-zero exit status 1., try this:

sudo apt-get install build-essential

If you encounter AttributeError: module 'distutils' has no attribute 'version' from tensorboard, try this:

pip install -U setuptools==50.0.0

Installing Mujoco

If you encounter: fatal error: GL/osmesa.h: No such file or directory:

sudo apt-get install libosmesa6-dev

If you encounter: error: [Errno 2] No such file or directory: 'patchelf': 'patchelf':

sudo apt-get install patchelf

If you encounter: ERROR: GLEW initalization error: Missing GL version:

sudo apt-get install -y libglew-dev

Reference

@inproceedings{
    li2022humanai,
    title={Human-{AI} Shared Control via Policy Dissection},
    author={Quanyi Li and Zhenghao Peng and Haibin Wu and Lan Feng and Bolei Zhou},
    booktitle={Thirty-Sixth Conference on Neural Information Processing Systems},
    year={2022},
    url={https://openreview.net/forum?id=LCOv-GVVDkp}
}

metadata-generation-failed by installation

Hi, I was trying to install the package on my local machine following the basic installation guide in the readme. After executing pip install -e . I encounter the following error. I want to ask if you have any insight on how to fix this?

Obtaining file:///home/hongyi/workspace/play_ground/policydissect
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [19 lines of output]
error: Multiple top-level packages discovered in a flat-layout: ['play', 'policydissect'].

  To avoid accidental inclusion of unwanted files or directories,
  setuptools will not proceed with this build.
  
  If you are trying to create a single distribution with multiple packages
  on purpose, you should not rely on automatic discovery.
  Instead, consider the following options:
  
  1. set up custom discovery (`find` directive with `include` or `exclude`)
  2. use a `src-layout`
  3. explicitly set `py_modules` or `packages` with a list of names
  
  To find more information, look for "package discovery" on setuptools docs.
  linux-x86_64
  numpy is enabled.
  numpy_include_dirs = /home/hongyi/anaconda3/envs/policydissect/lib/python3.7/site-packages/numpy/core/include
  linux
  ['policydissect', 'policydissect.metadrive', 'policydissect.weights', 'policydissect.quadrupedal', 'policydissect.utils', 'policydissect.gym', 'policydissect.legged_gym', 'policydissect.quadrupedal.torchrl', 'policydissect.quadrupedal.vision4leg', 'policydissect.quadrupedal.starter', 'policydissect.quadrupedal.torchrl.collector', 'policydissect.quadrupedal.torchrl.replay_buffers', 'policydissect.quadrupedal.torchrl.utils', 'policydissect.quadrupedal.torchrl.policies', 'policydissect.quadrupedal.torchrl.algo', 'policydissect.quadrupedal.torchrl.env', 'policydissect.quadrupedal.torchrl.networks', 'policydissect.quadrupedal.torchrl.collector.para', 'policydissect.quadrupedal.torchrl.replay_buffers.shared', 'policydissect.quadrupedal.torchrl.algo.on_policy', 'policydissect.quadrupedal.torchrl.algo.off_policy', 'policydissect.quadrupedal.vision4leg.envs', 'policydissect.quadrupedal.vision4leg.utilities', 'policydissect.quadrupedal.vision4leg.robots', 'policydissect.quadrupedal.vision4leg.assets', 'policydissect.quadrupedal.vision4leg.envs.utilities', 'policydissect.quadrupedal.vision4leg.envs.env_wrappers', 'policydissect.quadrupedal.vision4leg.envs.gym_envs', 'policydissect.quadrupedal.vision4leg.envs.sensors', 'policydissect.quadrupedal.vision4leg.assets.a1', 'policydissect.legged_gym.envs', 'policydissect.legged_gym.utils', 'policydissect.legged_gym.training_script', 'policydissect.legged_gym.rsl_rl', 'policydissect.legged_gym.rsl_rl.algorithms', 'policydissect.legged_gym.rsl_rl.modules', 'policydissect.legged_gym.rsl_rl.utils', 'policydissect.legged_gym.rsl_rl.env', 'policydissect.legged_gym.rsl_rl.storage', 'policydissect.legged_gym.rsl_rl.runners']
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

metadriverse / policydissect Goto Github PK

policydissect's Introduction

Policy Dissection

Installation

Basic Installation

IsaacGym Installation (Optional)

Mujoco Installation (Optional)

Play with AI

MetaDrive

Pybullet Quadrupedal Robot

IsaacGym Cassie

Gym Environments

Comparison with explicit goal-conditioned control

Policy Dissection Examples

Troubleshooting

Installing IsaacGym

Installing Mujoco

Reference

policydissect's People

Contributors

Stargazers

Watchers

Forkers

policydissect's Issues

Recommend Projects

Recommend Topics

Recommend Org