
FLEX: Full-Body Grasping Without Full-Body Grasps

Purva Tendulkar · Dídac Surís · Carl Vondrick

This repo is the official implementation of the paper "FLEX: Full-Body Grasping Without Full-Body Grasps".



FLEX is a generative model that synthesizes full-body avatars grasping 3D objects in a 3D environment. FLEX leverages pre-trained prior models for:

  1. Full-Body Pose - VPoser (trained on the AMASS dataset)
  2. Right-Hand Grasping - GrabNet (trained on right-handed grasps of the GRAB dataset)
  3. Pose-Ground Relation - PGPrior (trained on the AMASS dataset)

For more details, please refer to the paper or the project website. A short sketch of loading one of these priors follows.
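For instance, the VPoser prior can be loaded and queried as below. This is a minimal sketch, assuming the human_body_prior package (which ships VPoser v2.0) is installed and the checkpoint is stored under the layout described in "Pre-trained Checkpoints" below:

    import torch
    from human_body_prior.tools.model_loader import load_model
    from human_body_prior.models.vposer_model import VPoser

    # Load the VPoser v2.0 checkpoint (path per the ckpts layout below).
    vp, _ = load_model('flex/pretrained_models/ckpts/vposer_amass',
                       model_code=VPoser,
                       remove_words_in_model_weights='vp_model.',
                       disable_grad=True)

    # Decode a random 32-D latent code into a 63-D SMPL-X body pose (21 joints x 3).
    z = torch.randn(1, 32)
    pose_body = vp.decode(z)['pose_body'].reshape(1, -1)
    print(pose_body.shape)  # torch.Size([1, 63])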

Description

This implementation:

  • Can run FLEX on arbitrary objects in arbitrary scenes provided by users.
  • Can run FLEX on the test objects of the GRAB dataset (with pre-computed object centering and BPS representation).

Requirements

This package has been tested with Python 3.7.11, PyTorch 1.10.1, CUDA 11.3, and kaolin 0.12.0 (matching the installation commands below).

Installation

To install the dependencies, follow these steps:

  • Clone this repository:
    git clone https://github.com/purvaten/FLEX.git
    cd FLEX
  • Install the dependencies with the following commands:
    conda create -n flex python=3.7.11
    conda activate flex
    conda install pytorch==1.10.1 torchvision torchaudio cudatoolkit=11.3 -c pytorch
    conda install pytorch3d -c pytorch3d
    conda install meshplot
    conda install -c conda-forge jupyterlab
    pip install -r requirements.txt
    pip install kaolin==0.12.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-1.10.1_cu113.html
    
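After the installs complete, a quick sanity check helps confirm the environment (a minimal sketch; the version pins follow the commands above):

    import torch
    import pytorch3d
    import kaolin

    # Expect torch 1.10.1 built against CUDA 11.3, per the install commands above.
    print('torch:', torch.__version__, '| CUDA available:', torch.cuda.is_available())
    print('pytorch3d:', pytorch3d.__version__)
    print('kaolin:', kaolin.__version__)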

Getting started

In order to run FLEX, create a data/ directory and follow the steps below:

ReplicaGrasp Dataset

  • Download the Habitat receptacle mesh info from here. This is a dictionary where the keys are the names of the receptacles and the values are a list of [vertices, faces] for different configurations of that receptacle (e.g., doors open, doors closed, drawers open, etc.).
  • Download the main dataset from here. This is a dictionary where the keys are the names of the example instances and the values are [object_translation, object_orientation, recept_idx], where recept_idx is the index of the receptacle configuration in receptacles.npz. (A short loading sketch follows this list.)
  • Store both files under FLEX/data/replicagrasp/.
  • To visualize random instances of the dataset, run the notebook FLEX/flex/notebooks/viz_replicagrasp.ipynb.
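Both files can be inspected programmatically. The sketch below assumes only what the descriptions above state (dictionaries of pickled objects); the exact layout inside each entry may differ:

    import numpy as np

    # Both archives store pickled Python objects, so allow_pickle is required.
    receptacles = np.load('data/replicagrasp/receptacles.npz', allow_pickle=True)
    dset = np.load('data/replicagrasp/dset_info.npz', allow_pickle=True)

    # Keys are receptacle names; each value lists [vertices, faces] per configuration.
    recept_name = list(receptacles.keys())[0]
    print(recept_name, len(receptacles[recept_name]))

    # Keys are instance names; each value is [object_translation, object_orientation, recept_idx].
    inst_name = list(dset.keys())[0]
    transl, orient, recept_idx = dset[inst_name]
    print(inst_name, recept_idx)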

Dependency Files

  • Download the SMPL-X and MANO models from the SMPL-X website and MANO website.
  • Download the GRAB object mesh (.ply) files and BPS points (bps.npz) from the GRAB website. Download obj_info.npy from here.
  • Download the full-body-related data files from here.
  • The final structure of the data directory should look as below:
    FLEX
    ├── data
    │   │
    │   ├── smplx_models
    │   │       ├── mano
    │   │       │     ├── MANO_LEFT.pkl
    │   │       │     ├── MANO_RIGHT.pkl
    │   │       └── smplx
    │   │             ├── SMPLX_FEMALE.npz
    │   │             └── ...
    │   ├── obj
    │   │    ├── obj_info.npy
    │   │    ├── bps.npz
    │   │    └── contact_meshes
    │   │             ├── airplane.ply
    │   │             └── ...
    │   ├── sbj
    │   │    ├── adj_matrix_original.npy
    │   │    ├── adj_matrix_simplified.npy
    │   │    ├── faces_simplified.npy
    │   │    ├── interesting.npz
    │   │    ├── MANO_SMPLX_vertex_ids.npy
    │   │    ├── sbj_verts_region_mapping.npy
    │   │    └── vertices_simplified_correspondences.npy
    │   │
    │   └── replicagrasp
    │        ├── dset_info.npz
    │        └── receptacles.npz
    .
    .

Pre-trained Checkpoints

  • Download the VPoser prior (VPoser v2.0) from the SMPL-X website.
  • Download the checkpoints of the hand-grasping pre-trained model (coarsenet.pt and refinenet.pt) from the GRAB website.
  • Download the pose-ground prior from here.
  • Place all pre-trained models in FLEX/flex/pretrained_models/ckpts as follows:
    ckpts
    ├── vposer_amass
    │   │
    │   ├── snapshots
    │   │       └── V02_05_epoch=13_val_loss=0.03
    │   ├── V02_05.log
    │   └── V02_05.yaml
    │
    ├── coarsenet.pt
    ├── refinenet.pt
    └── pgp.pth
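
To check that the grasping and pose-ground checkpoints are readable, a quick hedged snippet can load each file (paths per the tree above; what each file actually contains is an assumption):

    import torch

    # Load each checkpoint on CPU just to confirm the files are intact.
    for name in ['coarsenet.pt', 'refinenet.pt', 'pgp.pth']:
        ckpt = torch.load(f'flex/pretrained_models/ckpts/{name}', map_location='cpu')
        print(name, type(ckpt))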

Examples

After installing the dependencies and downloading the data and the models, you should be able to run the following examples:

  • Generate whole-body grasps for ReplicaGrasp.

    python run.py \
    --obj_name stapler \
    --receptacle_name receptacle_aabb_TvStnd1_Top3_frl_apartment_tvstand \
    --ornt_name all \
    --gender 'female'

    The result will be saved in FLEX/save. The optimization for an example should take 7-8 minutes on a single RTX 2080 Ti.

  • Visualize the result by running the jupyter notebook FLEX/flex/notebooks/viz_results.ipynb.

Citation

@inproceedings{tendulkar2022flex,
    title = {FLEX: Full-Body Grasping Without Full-Body Grasps},
    author = {Tendulkar, Purva and Sur\'is, D\'idac and Vondrick, Carl},
    booktitle = {Conference on Computer Vision and Pattern Recognition ({CVPR})},
    year = {2023},
    url = {https://flex.cs.columbia.edu/}
}

Acknowledgments

This research is based on work partially supported by NSF NRI Award #2132519, and the DARPA MCS program under Federal Agreement No. N660011924032. Dídac Surís is supported by the Microsoft PhD fellowship. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the sponsors.

We thank Alexander Clegg for help with Habitat-related questions, and Harsh Agrawal for helpful discussions and feedback.

This template was adapted from the GitHub repository of GOAL.

Contact

The code of this repository was implemented by Purva Tendulkar and Dídac Surís.

For questions, please contact [email protected].


flex's Issues

Multi-gpu implementation

Hi @purvaten
I am running the code on one A100, and it takes 8 minutes to generate poses when the receptacle is receptacle_aabb_TvStnd1_Top3_frl_apartment_tvstand, which has 656 faces. Does this look normal?

I was trying to reduce the execution time by adding multi-GPU support, but I have been unable to do so successfully.
Once we get the model here, I wrap it in torch.nn.DataParallel like so:

import torch.nn as nn

all_devices = [0, 1, 2, 3]
model = nn.DataParallel(model, device_ids=all_devices)

This model holds all the pre-trained models that we load, as well as the MLP that is to be trained.

However, I get this error.

Reduced version:
File "/home/t2/FLEX/flex/tto/inf_opt.py", line 405, in gan_loss
    rh_match_loss, bm_output, rv, rf = self.get_rh_match_loss(z, transl, global_orient, w, a, extras)
File "/home/t2/FLEX/flex/tto/inf_opt.py", line 87, in get_rh_match_loss
    sbj_pose = self.gan_body.decode(z)['pose_body'].reshape(bs,-1)  # (b, 63)
File "/home/t2/FLEX/flex/models/vposer_model.py", line 126, in decode
    prec = self.decoder_net(Zin)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! (when checking argument for argument mat1 in method wrapper_addmm)

When I check, Zin is on cuda:1 while vposer's self.decoder_net weights are on cuda:0.
I cannot figure out why this is the case. Since our model contains the VPoser model and we wrap the whole model with DataParallel, PyTorch should manage the data sharing across GPUs.
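
(One plausible cause, sketched as a hypothetical illustration rather than a diagnosis of the FLEX code: DataParallel only replicates parameters and buffers registered on the wrapped nn.Module, and only scatters the forward inputs. A submodule held in a plain dict or list, or a tensor created on a hard-coded device inside forward, stays on cuda:0 and triggers exactly this error on the other replicas.)

import torch.nn as nn

class Bad(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain dict: DataParallel never sees this module, so its weights
        # stay on cuda:0 while inputs are scattered to cuda:1, cuda:2, ...
        self.nets = {'decoder': nn.Linear(32, 63).cuda()}

    def forward(self, z):
        return self.nets['decoder'](z)  # device-mismatch RuntimeError on replicas

class Good(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered submodule: replicated to each GPU by DataParallel.
        self.decoder = nn.Linear(32, 63)

    def forward(self, z):
        return self.decoder(z)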

Can you please add support for multi-gpu or even help me debug this? It would really help a lot!

Thanks!

Pretrained model vposer_grab.pt?

Hi @purvaten
By referring to the readme, I was able to download all the data and checkpoints except for vposer_grab.pt.

The flex.yaml says
best_mnet_grab: 'flex/pretrained_models/ckpts/vposer_grab.pt' # Pre-trained HAND-GRASPING (trained on GRAB right-hand) .
So I checked the GRAB website, but they only provide coarsenet.pt and refinenet.pt.

Where can I download vposer_grab.pt from?

Thanks!!

Request of the dataset

Hi,

Thank you for the excellent work.
May I ask when the dataset will be released?
Thank you!

Best,
Sida
