
hierarchicalprobabilistic3dhuman's Introduction

Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild

Akash Sengupta, Ignas Budvytis, Roberto Cipolla
ICCV 2021
[paper+supplementary][poster][results video]

This is the official code repository of the above paper, which takes a probabilistic approach to 3D human shape and pose estimation and predicts multiple plausible 3D reconstruction samples given an input image.

teaser

This repository contains inference, training and evaluation code. A few weaknesses of this approach, and future research directions, are listed below. If you find this code useful in your research, please cite the following publication:

@InProceedings{sengupta2021hierprobhuman,
               author = {Sengupta, Akash and Budvytis, Ignas and Cipolla, Roberto},
               title = {{Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild}},
               booktitle = {International Conference on Computer Vision},
               month = {October},
               year = {2021}                         
}

Installation

Requirements

  • Linux or macOS
  • Python ≥ 3.6

Instructions

We recommend using a virtual environment to install relevant dependencies:

python3 -m venv HierProbHuman
source HierProbHuman/bin/activate

Install torch and torchvision (the code has been tested with v1.6.0 of torch), as well as other dependencies:

pip install torch==1.6.0 torchvision==0.7.0
pip install -r requirements.txt

Finally, install pytorch3d, which we use for data generation during training and visualisation during inference. To do so, you will need to first install the CUB library following the instructions here. Then you may install pytorch3d - note that the code has been tested with v0.3.0 of pytorch3d, and we recommend installing this version using:

pip install "git+https://github.com/facebookresearch/pytorch3d.git@v0.3.0"

IMPORTANT: if you would like to use pytorch3d v0.5.0 or greater, you will need to slightly modify the camera models in the renderer class definition, as outlined here.
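For reference, a rough sketch of the kind of change that is typically needed (the exact edit is described in the linked instructions; the camera parameters below are illustrative placeholders, not values from this repo):

# Illustrative only: from pytorch3d v0.5.0, cameras defined with pixel-space
# focal lengths / principal points generally need the image size passed in,
# and from v0.6.0 the in_ndc flag as well.
from pytorch3d.renderer import PerspectiveCameras

cameras = PerspectiveCameras(
    focal_length=((300.0, 300.0),),      # (fx, fy) in pixels - placeholder values
    principal_point=((128.0, 128.0),),   # (cx, cy) in pixels - placeholder values
    image_size=((256, 256),),            # needed when parameters are in screen space
    in_ndc=False,                        # available from pytorch3d v0.6.0 onwards
)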

Model files

You will need to download the SMPL model. The neutral model is required for training and running the demo code. If you want to evaluate the model on datasets with gendered SMPL labels (such as 3DPW and SSP-3D), the male and female models are available here. You will need to convert the SMPL model files to be compatible with python3 by removing any chumpy objects. To do so, please follow the instructions here.
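If you prefer a script, the conversion can be done along the following lines (a minimal sketch, not the official tool; it assumes chumpy is installed so that the original Python 2 pickle can be loaded, and the file paths are examples only):

import pickle
import numpy as np

def remove_chumpy(in_path, out_path):
    # Loading the original SMPL pickle requires chumpy to be importable,
    # since the pickle references chumpy array classes.
    with open(in_path, 'rb') as f:
        data = pickle.load(f, encoding='latin1')
    # chumpy arrays expose their underlying numpy data via the .r attribute.
    cleaned = {key: np.array(value.r) if hasattr(value, 'r') else value
               for key, value in data.items()}
    with open(out_path, 'wb') as f:
        pickle.dump(cleaned, f)

remove_chumpy('basicModel_neutral_lbs_10_207_0_v1.0.0.pkl', 'model_files/smpl/SMPL_NEUTRAL.pkl')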

Download pre-trained model checkpoints for our 3D Shape/Pose network, as well as for 2D Pose HRNet-W48 from here. In addition to the neutral-gender prediction network presented in the paper, we provide pre-trained checkpoints for male and female prediction networks, which are trained with male/female SMPL shape respectively. Download these checkpoints if you wish to do gendered shape inference.

Place the SMPL model files and network checkpoints in the model_files directory, which should have the following structure. If the files are placed elsewhere, you will need to update configs/paths.py accordingly.

HierarchicalProbabilistic3DHuman
├── model_files                                  # Folder with model files
│   ├── smpl
│   │   ├── SMPL_NEUTRAL.pkl                     # Gender-neutral SMPL model
│   │   ├── SMPL_MALE.pkl                        # Male SMPL model
│   │   ├── SMPL_FEMALE.pkl                      # Female SMPL model
│   ├── poseMF_shapeGaussian_net_weights.tar     # Pose/Shape distribution predictor checkpoint
│   ├── pose_hrnet_w48_384x288.pth               # Pose2D HRNet checkpoint
│   ├── cocoplus_regressor.npy                   # Cocoplus joints regressor
│   ├── J_regressor_h36m.npy                     # Human3.6M joints regressor
│   ├── J_regressor_extra.npy                    # Extra joints regressor
│   └── UV_Processed.mat                         # DensePose UV coordinates for SMPL mesh             
└── ...
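For example, configs/paths.py would contain path variables of roughly this form (the variable names below are illustrative; check the actual file for the exact names):

# Illustrative only - see configs/paths.py for the real variable names.
SMPL_MODEL_DIR = 'model_files/smpl'
POSE_SHAPE_NET_WEIGHTS = 'model_files/poseMF_shapeGaussian_net_weights.tar'
HRNET_WEIGHTS = 'model_files/pose_hrnet_w48_384x288.pth'
DENSEPOSE_UV_MAT = 'model_files/UV_Processed.mat'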

Inference

run_predict.py is used to run inference on a given folder of input images. For example, to run inference on the demo folder, do:

python run_predict.py --image_dir ./demo/ --save_dir ./output/ --visualise_samples --visualise_uncropped

This will first detect human bounding boxes in the input images using Mask-RCNN. If your input images are already cropped and centred around the subject of interest, you may skip this step using --cropped_images as an option. The 3D Shape/Pose network is somewhat sensitive to cropping and centering - this is a good place to start troubleshooting in case of poor results.
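If you need to prepare such crops yourself, a minimal sketch of square-cropping around a known bounding box could look as follows (the paths, box format and padding factor are assumptions, not part of this repo):

import cv2

def centre_crop(image, bbox, pad_factor=1.2):
    # bbox: (x1, y1, x2, y2) around the subject; crops a padded square centred on it.
    x1, y1, x2, y2 = bbox
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half = 0.5 * pad_factor * max(x2 - x1, y2 - y1)
    h, w = image.shape[:2]
    x1c, x2c = int(max(cx - half, 0)), int(min(cx + half, w))
    y1c, y2c = int(max(cy - half, 0)), int(min(cy + half, h))
    return image[y1c:y2c, x1c:x2c]

image = cv2.imread('demo/example.png')                   # hypothetical input image
cropped = centre_crop(image, bbox=(80, 40, 240, 420))    # hypothetical bounding box
cv2.imwrite('demo_cropped/example.png', cropped)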

If the gender of the subject is known, you may wish to carry out gendered inference using the provided male/female model weights. This can be done by modifying the above command as follows:

python run_predict.py --gender male --pose_shape_weights model_files/poseMF_shapeGaussian_net_weights_male.tar --image_dir ./demo/ --save_dir ./output_male/ --visualise_samples --visualise_uncropped

(and similarly for the female model). Using gendered models for inference may result in better body shape estimates, since gender acts as a prior over 3D body shape.

Inference can be slow due to the rejection sampling procedure used to estimate per-vertex 3D uncertainty. If you are not interested in per-vertex uncertainty, you may modify predict/predict_poseMF_shapeGaussian_net.py by commenting out the code related to sampling, and use a plain texture to render meshes for visualisation (this will be cleaned up and added as an option to run_predict.py in the future).

Evaluation

run_evaluate.py is used to evaluate our method on the 3DPW and SSP-3D datasets. A description of the metrics used to measure performance is given in metrics/eval_metrics_tracker.py.
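As a simple illustration of the kind of metric involved (not the repo's implementation), a per-vertex error (PVE) in millimetres can be computed roughly as follows:

import numpy as np

def pve_mm(pred_vertices, gt_vertices):
    # pred_vertices, gt_vertices: (N, 6890, 3) SMPL vertices in metres.
    # Mean Euclidean distance per vertex, averaged over vertices and examples.
    return 1000.0 * np.linalg.norm(pred_vertices - gt_vertices, axis=-1).mean()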

Download SSP-3D from here. Update configs/paths.py with the path pointing to the un-zipped SSP-3D directory. Evaluate on SSP-3D with:

python run_evaluate.py -D ssp3d

Download 3DPW from here. You will need to preprocess the dataset first, to extract centred+cropped images and SMPL labels (adapted from SPIN):

python data/pw3d_preprocess.py --dataset_path $3DPW_DIR_PATH

This should create a subdirectory with preprocessed files, such that the 3DPW directory has the following structure:

$3DPW_DIR_PATH
      ├── test                                  
      │   ├── 3dpw_test.npz    
      │   ├── cropped_frames   
      ├── imageFiles
      └── sequenceFiles

Additionally, download HRNet 2D joint detections on 3DPW from here, and place this in $3DPW_DIR_PATH/test. Update configs/paths.py with the path pointing to $3DPW_DIR_PATH/test. Evaluate on 3DPW with:

python run_evaluate.py -D 3dpw

The number of samples used to evaluate sample-related metrics can be changed using the --num_samples option (default is 10).

Training

run_train.py is used to train our method using random synthetic training data (rendered on-the-fly during training).

Download .npz files containing SMPL training/validation body poses and textures from here. Place these files in a ./train_files directory, or update the appropriate variables in configs/paths.py with paths pointing to these files. Note that the SMPL textures are from SURREAL and MultiGarmentNet.

We use images from LSUN as random backgrounds for our synthetic training data. Specifically, images from the 10 scene categories are used. Instructions to download and extract these images are provided here. The copy_lsun_images_to_train_files_dir.py script can be used to copy LSUN background images to the ./train_files directory, which should have the following structure:

train_files
      ├── lsun_backgrounds
          ├── train
          ├── val
      ├── smpl_train_poses.npz
      ├── smpl_train_textures.npz                                  
      ├── smpl_val_poses.npz                                  
      └── smpl_val_textures.npz                                  

Finally, start training with:

python run_train.py -E experiments/exp_001

As a sanity check, the script should find 91106 training poses, 125 + 792 training textures, 397582 training backgrounds, 33347 validation poses, 32 + 76 validation textures and 3000 validation backgrounds.
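A quick way to check some of these counts yourself (a sketch; the .npz key names and the background image file extension are assumptions and may differ):

import glob
import numpy as np

train_poses = np.load('train_files/smpl_train_poses.npz')
print({key: train_poses[key].shape for key in train_poses.files})    # expect 91106 training poses

train_backgrounds = glob.glob('train_files/lsun_backgrounds/train/**/*.jpg', recursive=True)
print(len(train_backgrounds))                                        # expect 397582 training backgrounds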

Weaknesses and Future Research

The following aspects of our method may be the subject of future research:

  • Mesh interpenetrations: these occur occasionally amongst 3D mesh samples drawn from the predicted shape and pose distributions. A sample interpenetration penalty may be useful.
  • Sample diversity / distribution expressiveness: since the predicted distributions are unimodal, sample diversity may be limited.
  • Sampling speed: rejection sampling from a matrix-Fisher distribution is currently slow.
  • Non-tight clothing: body shape prediction accuracy suffers when subjects are wearing non-tight clothing, since the synthetic training data does not model clothing in 3D (only uses clothing textures). Perhaps better synthetic data (e.g. AGORA) will alleviate this issue.

Acknowledgments

Code was adapted from/influenced by the following repos - thanks to the authors!


hierarchicalprobabilistic3dhuman's Issues

run_predict.py generates output with same image

Hello,
First of all, nice job.

I am currently doing a research project on self-collision and would like to use your repo as a baseline to generate 3D avatars of people.

I ran your run_predict.py successfully, but noticed that the output image contains multiple smaller, identical images.
I was expecting something similar to teaser.png, where you have the estimated 3D mesh of the person.

Could you please tell me how to reproduce that visualisation? I would like to get the 3D mesh and use it as a mask to overlay with a point cloud.

I would appreciate your help

Thanks

Downloading time of the background file

Dear Akash,

Thank you very much for the amazing work! When I tried to download the background files, it said it would probably take two weeks due to the slow download speed in Google Colab (I tried several different high-speed internet connections). Can I ask how you downloaded them, or do you have an already-downloaded version on your Google Drive that you could share?

Thanks again! :)

Data extraction

Hi! Great job with the project.
How can I extract data such as the blendshapes so that I can export them?
Thanks for your help!

render problem

Hello,

When I run the inference step, I get the following error:
Traceback (most recent call last):
File "run_predict.py", line 126, in
gender=args.gender)
File "run_predict.py", line 90, in run_predict
visualise_samples=visualise_samples)
File "/Users/mac/Desktop/reconstruction/predict/predict_poseMF_shapeGaussian_net.py", line 197, in predict_poseMF_shapeGaussian_net
verts_features=vertex_var_colours)
File "/opt/anaconda3/envs/ll/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/Users/mac/Desktop/reconstruction/renderers/pytorch3d_textured_renderer.py", line 275, in forward
fragments = self.rasterizer(meshes_iuv, cameras=self.cameras)
File "/opt/anaconda3/envs/ll/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/anaconda3/envs/ll/lib/python3.6/site-packages/pytorch3d/renderer/mesh/rasterizer.py", line 151, in forward
cull_backfaces=raster_settings.cull_backfaces,
File "/opt/anaconda3/envs/ll/lib/python3.6/site-packages/pytorch3d/renderer/mesh/rasterize_meshes.py", line 150, in rasterize_meshes
cull_backfaces,
File "/opt/anaconda3/envs/ll/lib/python3.6/site-packages/pytorch3d/renderer/mesh/rasterize_meshes.py", line 205, in forward
cull_backfaces,
RuntimeError: NOT IMPLEMENTED

Do you have any idea about this?

Wrong output shape when rendering parts

When I try to train STRAPS, I run into a problem: I pass the vertices and cam_t to the renderer, but I cannot get the expected output. The output shape is not (B, N, W, H), just a single image of shape (3, 256, 256). Your code has the line parts, _, mask = self.renderer(vertices, self.faces, self.textures, t=cam_ts), but I do not get three return values, only one, whose shape is (3, 256, 256).

Projecting predicted vertices given predicted camera

Hi,

I'm interested in projecting the predicted vertices given the predicted camera. There is functionality in STRAPS for this:

                pred_vertices2d = orthographic_project_torch(pred_vertices, pred_cam_wp)
                pred_vertices2d = undo_keypoint_normalisation(pred_vertices2d,
                                                              proxy_rep_input_wh)

Is there something similar in HierarchicalProbabilistic3DHuman? I would like to avoid using body_vis_renderer.

Thanks!
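For reference, a generic weak-perspective projection of this kind can be written roughly as follows (a sketch only; the exact scale/translation convention and camera parameter layout used in this repo may differ):

import torch

def orthographic_project(vertices, cam_wp):
    # vertices: (B, N, 3) predicted vertices; cam_wp: (B, 3) weak-perspective camera [s, tx, ty].
    scale = cam_wp[:, 0].view(-1, 1, 1)        # (B, 1, 1)
    translation = cam_wp[:, 1:].unsqueeze(1)   # (B, 1, 2)
    return scale * (vertices[:, :, :2] + translation)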

project SMPL vertices to 3d space with respective camera matrix

Hi,

I would like to use your code for my research work, but I have encountered some problems and require some support.

My ultimate goal is to get the 3D coordinates of the vertices with my own camera matrix K. Later on, I want to perform selective body region selection, and I need the 3D coordinates of the human body to do so.

According to your code, the SMPL vertices are in the proxy representation, and I would like to somehow get the 3D coordinates out of it.

Could you tell me how to project from the proxy representation to the 3D coordinates using a specific camera matrix?

I would appreciate your help.

thanks

Running on Headless System

Hey,

After getting the STRAPS code to work, I was highly impressed and thought this program would be much better suited to the purpose of my thesis. I was wondering whether it is possible to run HP3DH on a headless SSH server, the same way it is possible in STRAPS.

Best

Training time

Hi,

Thank you for the amazing work!

I was wondering how long the synthetic data training takes, considering there is no DataParallel/DistributedDataParallel implementation.
Thank you in advance!

Training with my own dataset

Thank you for the great work.
Is it possible to train with my own dataset? If so, can you give me some preprocessing tips?

Why learn scale?

Dear Akash,

I read your paper, as well as other papers from the series ([2], [3], [4]). :) It's interesting that your work is one of the few attempts to recover an accurate shape of the person, instead of mostly focusing on 3D pose.

However, the crucial question that is still bothering me is: why do you use unnormalised loss components, such as verts3d and joints3d? I guess you already thought about that and decided to still force the model to learn the scale. The advantage I see in this approach is a regularisation effect, where you confuse the model and then hope that it will still manage to somehow figure out the scale, i.e., the height of the person.

Thanks in advance!

About the distribution of relative rotation

Hi,
Thanks for the great work! I have a question:
In Section 3.4 (Body shape and pose distribution prediction), you write:

Here, each joint is modelled independently of all the other joints. Thus, the matrix parameter of the i-th joint, F_i , is a function of the input X only.

You mention in the paper that the probability density function of each joint's relative rotation matrix is a matrix-Fisher distribution conditioned on the parents of that joint in the kinematic tree.
So I am confused about why each joint can be modelled independently, as in "each joint is modelled independently of all the other joints".

not getting 3d shape

Hello, I am running the prediction script with no errors, but I only get the HRNet pose heatmap. The other images do not have the SMPL model overlaid. Am I missing something?

Thank you.
