For Pre-Trained Vision Models in Motor Control, Not All Policy Learning Methods are Created Equal

This repository contains the code for the paper:

For Pre-Trained Vision Models in Motor Control, Not All Policy Learning Methods are Created Equal. Yingdong Hu, Renhao Wang, Li Erran Li, and Yang Gao. ICML 2023.

Installation

Dependency Setup

  • Install the following libraries
sudo apt update
sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3
  • Set up Environment
conda env create -f conda_env.yml
conda activate pvm
  • Install PyTorch, torchvision and timm following official instructions. For example:
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.6 -c pytorch -c conda-forge
pip install timm==0.4.5
  • Install MuJoCo version 2.1 and mujoco-py
  1. Please follow the instructions in the mujoco-py package.
  2. You should make sure that the GPU version of mujoco-py gets built, so that image rendering is fast. An easy way to ensure this is to clone the mujoco-py repository, change this line to Builder = LinuxGPUExtensionBuilder, and install from source by running pip install -e . in the mujoco-py root directory. You can also download our changed mujoco-py package and install from source.
  • Install Meta-World

Download the package from here.

pip install -e /path/to/dir/metaworld
  • Install Robosuite

We use the offline_study branch of Robosuite; download it from here.

pip install -e /path/to/dir/robosuite-offline_study
  • Install Franka-Kitchen

Please follow the instructions in the R3M repository. Unlike R3M, we randomize only the pose of the robot arm between episodes, not the kitchen. So be sure to add the line

FIXED_ENTRY_POINT = RANDOM_ENTRY_POINT

at https://github.com/vikashplus/mj_envs/blob/stable/mj_envs/envs/relay_kitchen/__init__.py#L160. Note that we use RANDOM_ENTRY_POINT rather than RANDOM_DESK_ENTRY_POINT.
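
Once the dependency setup is finished, a quick version report can catch a mismatched install early. The sketch below is not part of the repository; it assumes the pip distribution names torch, torchvision, and timm, and compares them against the versions used in the example commands above (other versions may work as well).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the example install commands; treat them as a
# reference point, not a hard requirement.
EXAMPLE_PINS = {"torch": "1.12.1", "torchvision": "0.13.1", "timm": "0.4.5"}


def compare_with_pins(pins):
    """Return {package: (installed_version_or_None, example_version)}."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, pinned)
    return report


if __name__ == "__main__":
    for name, (installed, pinned) in compare_with_pins(EXAMPLE_PINS).items():
        # startswith tolerates local build suffixes such as "+cu116"
        ok = installed is not None and installed.startswith(pinned)
        print(f"{name}: installed={installed} example={pinned} [{'ok' if ok else 'check'}]")
```

Run it inside the activated pvm environment; a "check" entry only means the installed version differs from the example, not that it is unsupported.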

Download Pre-Trained Vision Models

| Model | Architecture | Highlights | Link |
|---|---|---|---|
| MoCo v2 | ResNet-50 | Contrastive learning, momentum encoder | download |
| SwAV | ResNet-50 | Contrast online cluster assignments | download |
| SimSiam | ResNet-50 | Without negative pairs | download |
| DenseCL | ResNet-50 | Dense contrastive learning, learn local features | download |
| PixPro | ResNet-50 | Pixel-level pretext task, learn local features | download |
| VICRegL | ResNet-50 | Learn global and local features | download |
| VFS | ResNet-50 | Encode temporal dynamics | download |
| R3M | ResNet-50 | Learn visual representations for robotics | download |
| VIP | ResNet-50 | Learn representations and reward for robotics | download |
| MoCo v3 | ViT-B/16 | Contrastive learning for ViT | download |
| DINO | ViT-B/16 | Self-distillation with no labels | download |
| MAE | ViT-B/16 | Masked image modeling (MIM) | download |
| iBOT | ViT-B/16 | Combine self-distillation with MIM | download |
| CLIP | ViT-B/16 | Language-supervised pre-training | download |

After downloading a pre-trained vision model, place it in the PVM-Robotics/pretrained/ folder. Do not rename the checkpoint files.

Download Expert Demonstrations

  • Download the expert demonstrations for all tasks from here.
  • Unzip expert_demos.zip and place the expert_demos directory into PVM-Robotics/expert_demos.
  • Set the path/to/dir portion of the root_dir path variable in cfgs/config.yaml to the path of the PVM-Robotics repository.
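
Before launching training, it can help to confirm that the downloads ended up where the instructions place them. The following is a minimal sketch, not a script shipped with the repository; it checks only the three paths named above (pretrained, expert_demos, cfgs/config.yaml) and makes no assumption about individual checkpoint or demo file names.

```python
from pathlib import Path

# Paths named in the setup instructions, relative to the PVM-Robotics root.
EXPECTED = ["pretrained", "expert_demos", "cfgs/config.yaml"]


def missing_paths(repo_root):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    missing = missing_paths(".")
    print("layout ok" if not missing else f"missing: {missing}")
```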

Train Agents

Reinforcement learning

Meta-World

python train_rl.py \
agent=drqv2 \
suite=metaworld \
suite/metaworld_task=hammer \
agent.backbone=resnet \
agent.embedding_name=mocov2-resnet50 \
replay_buffer_size=500000 suite.num_seed_frames=4000 batch_size=512 \
use_wandb=true seed=1 exp_prefix=RL
  • suite/metaworld_task can be set to hammer, drawer_close, door_open, bin_picking, button_press_topdown, window_close, lever_pull, and coffee_pull.
  • When agent.backbone is set to resnet, agent.embedding_name can be set to mocov2-resnet50, simsiam-resnet50, swav-resnet50, densecl-resnet50, pixpro-resnet50, vicregl-resnet50, vfs-resnet50, r3m-resnet50, and vip-resnet50_VIPfc.
  • When agent.backbone is set to vit, agent.embedding_name can be set to mocov3-vit-b16, dino-vit-b16, ibot-vit-b16, clip-vit-b16, and mae-vit-b16.
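
Because agent.embedding_name must match agent.backbone, a small lookup can reject an invalid pair before a run starts. This is an illustrative helper, not code from the repository; the name agent_overrides is hypothetical, and the allowed values are copied from the two lists above.

```python
# Allowed agent.embedding_name values for each agent.backbone,
# copied from the lists above.
EMBEDDINGS = {
    "resnet": [
        "mocov2-resnet50", "simsiam-resnet50", "swav-resnet50",
        "densecl-resnet50", "pixpro-resnet50", "vicregl-resnet50",
        "vfs-resnet50", "r3m-resnet50", "vip-resnet50_VIPfc",
    ],
    "vit": [
        "mocov3-vit-b16", "dino-vit-b16", "ibot-vit-b16",
        "clip-vit-b16", "mae-vit-b16",
    ],
}


def agent_overrides(backbone, embedding_name):
    """Return the two command-line overrides, or raise ValueError for a bad pair."""
    if backbone not in EMBEDDINGS:
        raise ValueError(f"unknown backbone: {backbone!r}")
    if embedding_name not in EMBEDDINGS[backbone]:
        raise ValueError(
            f"{embedding_name!r} is not listed for backbone {backbone!r}"
        )
    return [f"agent.backbone={backbone}", f"agent.embedding_name={embedding_name}"]
```

For example, agent_overrides("vit", "dino-vit-b16") returns the two override strings to append to a training command, while agent_overrides("vit", "mocov2-resnet50") raises ValueError.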

Robosuite

python train_rl.py \
agent=drqv2 \
suite=robosuite \
suite/robosuite_task=panda_door \
agent.backbone=resnet \
agent.embedding_name=mocov2-resnet50 \
replay_buffer_size=500000 suite.num_seed_frames=4000 batch_size=512 \
use_wandb=true seed=1 exp_prefix=RL
  • suite/robosuite_task can be set to panda_door, panda_lift, panda_twoarm_peginhole, panda_pickplace_can, panda_nut_assembly_square, jaco_door, jaco_lift, and jaco_twoarm_peginhole.

Franka-Kitchen

python train_rl.py \
agent=drqv2 \
suite=kitchen \
suite/kitchen_task=turn_knob \
agent.backbone=resnet \
agent.embedding_name=mocov2-resnet50 \
num_train_frames_drq=1100000 replay_buffer_size=500000 suite.num_seed_frames=4000 batch_size=512 \
use_wandb=true seed=1 exp_prefix=RL
  • suite/kitchen_task can be set to turn_knob, turn_light_on, slide_door, open_door, and open_micro.
  • We train RL agents for 1.1M environment steps on Franka-Kitchen.
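
To sweep over tasks and seeds, the three RL commands above can be generated programmatically. The sketch below is an assumption about how one might script this, not part of the repository; rl_commands is a hypothetical name, and the overrides are exactly those shown in the commands above, with num_train_frames_drq=1100000 added only for Franka-Kitchen.

```python
from itertools import product

# Task names per suite, copied from the lists above.
TASKS = {
    "metaworld": [
        "hammer", "drawer_close", "door_open", "bin_picking",
        "button_press_topdown", "window_close", "lever_pull", "coffee_pull",
    ],
    "robosuite": [
        "panda_door", "panda_lift", "panda_twoarm_peginhole",
        "panda_pickplace_can", "panda_nut_assembly_square",
        "jaco_door", "jaco_lift", "jaco_twoarm_peginhole",
    ],
    "kitchen": ["turn_knob", "turn_light_on", "slide_door", "open_door", "open_micro"],
}


def rl_commands(suite, seeds, backbone="resnet", embedding="mocov2-resnet50"):
    """Build one argument list per (task, seed) pair for train_rl.py."""
    commands = []
    for task, seed in product(TASKS[suite], seeds):
        args = [
            "python", "train_rl.py", "agent=drqv2",
            f"suite={suite}", f"suite/{suite}_task={task}",
            f"agent.backbone={backbone}", f"agent.embedding_name={embedding}",
        ]
        if suite == "kitchen":
            # Franka-Kitchen agents are trained for 1.1M environment steps.
            args.append("num_train_frames_drq=1100000")
        args += [
            "replay_buffer_size=500000", "suite.num_seed_frames=4000",
            "batch_size=512", "use_wandb=true", f"seed={seed}", "exp_prefix=RL",
        ]
        commands.append(args)
    return commands


if __name__ == "__main__":
    for cmd in rl_commands("kitchen", seeds=[1, 2, 3]):
        print(" ".join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the repository root, or printed and submitted to a job scheduler.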

Imitation learning through behavior cloning

Meta-World

python train_bc.py \
agent=bc \
suite=metaworld \
suite/metaworld_task=hammer \
agent.backbone=resnet \
agent.embedding_name=mocov2-resnet50 \
num_demos=25 \
use_wandb=true seed=1 exp_prefix=BC
  • For Meta-World, the maximum value of num_demos is 25.

Robosuite

python train_bc.py \
agent=bc \
suite=robosuite \
suite/robosuite_task=panda_door \
agent.backbone=resnet \
agent.embedding_name=mocov2-resnet50 \
num_demos=50 \
use_wandb=true seed=1 exp_prefix=BC
  • For Robosuite, the maximum value of num_demos is 50.

Franka-Kitchen

python train_bc.py \
agent=bc \
suite=kitchen \
suite/kitchen_task=turn_knob \
agent.backbone=resnet \
agent.embedding_name=mocov2-resnet50 \
num_demos=25 \
use_wandb=true seed=1 exp_prefix=BC
  • For Franka-Kitchen, the maximum value of num_demos is 25.
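
The per-suite limits on num_demos can be enforced before a behavior cloning run is launched. This is a hedged sketch, not repository code; check_num_demos is a hypothetical helper, the maxima come from the notes above, and the lower bound of 1 is an assumption.

```python
# Maximum num_demos per suite, as stated above.
MAX_DEMOS = {"metaworld": 25, "robosuite": 50, "kitchen": 25}


def check_num_demos(suite, num_demos):
    """Return the num_demos override string, or raise ValueError if out of range."""
    if suite not in MAX_DEMOS:
        raise ValueError(f"unknown suite: {suite!r}")
    limit = MAX_DEMOS[suite]
    # Lower bound of 1 is an assumption; the README only states the maximum.
    if not 1 <= num_demos <= limit:
        raise ValueError(f"num_demos for {suite} must be between 1 and {limit}")
    return f"num_demos={num_demos}"
```

For instance, check_num_demos("robosuite", 50) yields the override used in the Robosuite command above, whereas check_num_demos("metaworld", 50) raises ValueError.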

Imitation learning with a visual reward function

Meta-World

python train_vrf.py \
agent=potil \
suite=metaworld \
suite/metaworld_task=hammer \
agent.backbone=resnet \
agent.embedding_name=mocov2-resnet50 \
bc_regularize=true num_demos=1 \
use_wandb=true seed=1 exp_prefix=VRF

Robosuite

python train_vrf.py \
agent=potil \
suite=robosuite \
suite/robosuite_task=panda_door \
agent.backbone=resnet \
agent.embedding_name=mocov2-resnet50 \
bc_regularize=true num_demos=1 \
use_wandb=true seed=1 exp_prefix=VRF

Franka-Kitchen

python train_vrf.py \
agent=potil \
suite=kitchen \
suite/kitchen_task=turn_knob \
agent.backbone=resnet \
agent.embedding_name=mocov2-resnet50 \
bc_regularize=true num_demos=1 \
use_wandb=true seed=1 exp_prefix=VRF

Acknowledgement

We have modified and integrated the code from ROT and DrQ-v2 into this project.

Citation

If you find this repository useful, please consider giving it a star ⭐ and citing the paper:

@article{hu2023pre,
  title={For Pre-Trained Vision Models in Motor Control, Not All Policy Learning Methods are Created Equal},
  author={Hu, Yingdong and Wang, Renhao and Li, Li Erran and Gao, Yang},
  journal={arXiv preprint arXiv:2304.04591},
  year={2023}
}
