3d-crowdpose-estimation-based-on-mvg's Introduction

Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry

Created by He (Crane) Chen*, Pengfei Guo* and Pengfei Li while working at Johns Hopkins University.

Citation

Please cite this paper if you find the repository helpful:

@article{chen2020multi,
  title={Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry},
  author={Chen, He and Guo, Pengfei and Li, Pengfei and Lee, Gim Hee and Chirikjian, Gregory},
  journal={arXiv preprint arXiv:2007.10986},
  year={2020}
}

Introduction

This work is based on our paper Multi-person 3D Pose Estimation in Crowded Scenes Based on Multi-View Geometry, which appeared as a Spotlight at the European Conference on Computer Vision (ECCV) 2020.

Please see our paper for further details.

We propose a 3D human pose estimation method for crowded scenes based on multi-view geometry. Specifically, we aim to overcome the bottlenecks that arise when moving from multi-person 3D pose estimation to 3D pose estimation in dense crowds. The epipolar constraint is at the core of keypoint matching across views. However, the effectiveness of this formulation is frequently challenged in denser crowds. Our method is motivated by this observation.
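To make the epipolar constraint concrete, here is a minimal sketch (not the code from this repository) that measures how far a candidate keypoint in a second view lies from the epipolar line of a keypoint in the first view. It assumes the convention x'^T F x = 0 for a known fundamental matrix F; the function names and the example F are chosen for illustration only.

```python
import math

def epipolar_line(F, x):
    """Return the line l' = F x (a, b, c) in view 2 for pixel x = (u, v) in view 1."""
    h = (x[0], x[1], 1.0)  # homogeneous coordinates
    return tuple(sum(F[i][j] * h[j] for j in range(3)) for i in range(3))

def epipolar_distance(F, x, x_prime):
    """Pixel distance from x_prime (view 2) to the epipolar line of x (view 1)."""
    a, b, c = epipolar_line(F, x)
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * x_prime[0] + b * x_prime[1] + c) / norm

# Illustrative F for two cameras related by a pure horizontal translation:
# epipolar lines are image rows, so the distance is the row difference.
F_EXAMPLE = [[0.0, 0.0, 0.0],
             [0.0, 0.0, -1.0],
             [0.0, 1.0, 0.0]]
```

A candidate is usually accepted when this distance falls below a tolerance. One plausible reason the test becomes less discriminative in a dense crowd is that several people's joints can lie within the same tolerance band.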

The pipeline takes images from multiple calibrated RGB cameras as input. First, a human detector produces bounding boxes. Second, a modified single-person pose estimator (SPPE) network, which keeps multiple peaks in one heatmap to predict occluded joints, estimates the 2D joints, with particular attention to the feet joints. Third, a combinatorial optimization problem is solved to match people across views. Finally, maximum likelihood estimation (MLE) provides a good initialization of the 3D human pose, and the final result is obtained by maximum a posteriori (MAP) optimization.
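The idea of keeping multiple peaks in one heatmap can be illustrated with a small post-processing sketch. This is not the modified SPPE network itself; it only shows, under assumed values for the threshold and the number of peaks, how more than one local maximum could be read from a single joint heatmap instead of the global maximum alone.

```python
def heatmap_peaks(heatmap, threshold=0.3, max_peaks=2):
    """Return up to max_peaks strict local maxima as (row, col, score), strongest first.

    heatmap is a 2D list of scores; threshold and max_peaks are illustrative defaults.
    """
    rows, cols = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Compare against the 8-connected neighbourhood, clipped at the border.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(score > n for n in neighbours):
                peaks.append((r, c, score))
    peaks.sort(key=lambda p: p[2], reverse=True)
    return peaks[:max_peaks]
```

With `max_peaks=1` this reduces to the usual single-peak readout; a weaker second peak is a candidate location for a joint that is partly occluded by another person.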

In the pipeline, the cross-view correspondence problem is the bottleneck. We develop a graphical model for fast cross-view matching. Instead of searching exhaustively in pixel space, matching is carried out on 2D joints. We push this further by focusing on the feet joints, which speeds up the matching process significantly. The idea is to use a homography matrix to warp the ground plane between views, so that everything on the ground surface, including the feet joints, is warped to the other view together with the ground. In this light, the problem of people matching boils down to feet assignment. The metric we define to compute the cost on each edge combines three elements: foot location, stride size, and stride direction.
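The feet-based matching idea can be sketched as follows. This is a simplified illustration, not the graphical model or the exact metric from the paper: the cost is an assumed weighted sum of the three elements, the weights are placeholders, and the assignment is solved by brute force, which is only practical for a handful of people. For larger groups, a Hungarian solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice.

```python
import itertools
import math

def warp_point(H, p):
    """Apply a 3x3 ground-plane homography H (nested lists) to a 2D point p."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def feet_features(left, right):
    """Return (midpoint, stride size, stride direction in radians) of two feet."""
    mid = ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)
    dx, dy = right[0] - left[0], right[1] - left[1]
    return mid, math.hypot(dx, dy), math.atan2(dy, dx)

def edge_cost(person_a, person_b, H_ab, w_loc=1.0, w_size=1.0, w_dir=1.0):
    """Hypothetical edge cost: weighted sum of location, stride size and direction gaps.

    person_a and person_b are (left_foot, right_foot) pairs; person_a is warped
    into view B with H_ab before comparison. The weights are placeholders.
    """
    left_a, right_a = (warp_point(H_ab, f) for f in person_a)
    mid_a, size_a, dir_a = feet_features(left_a, right_a)
    mid_b, size_b, dir_b = feet_features(*person_b)
    loc_gap = math.hypot(mid_a[0] - mid_b[0], mid_a[1] - mid_b[1])
    size_gap = abs(size_a - size_b)
    d = abs(dir_a - dir_b) % (2 * math.pi)
    dir_gap = min(d, 2 * math.pi - d)  # wrap the angle difference into [0, pi]
    return w_loc * loc_gap + w_size * size_gap + w_dir * dir_gap

def match_people(view_a, view_b, H_ab):
    """Brute-force minimum-cost assignment; assumes len(view_a) <= len(view_b)."""
    best_cost, best_perm = math.inf, None
    for perm in itertools.permutations(range(len(view_b)), len(view_a)):
        cost = sum(edge_cost(view_a[i], view_b[j], H_ab) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(enumerate(best_perm)), best_cost
```

`match_people` returns pairs `(index_in_view_a, index_in_view_b)` together with the total cost. Because the cost mixes pixel distances with an angle in radians, the weights would need tuning on real data.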

Note

Our method is a three-step approach: (1) 2D pose detection, (2) matching, and (3) 3D reconstruction. Data is passed between these steps as JSON files. Each step can also be used separately or plugged into other pipelines.
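As an illustration of passing data between steps through JSON, the sketch below writes the output of one step to a file and reads it back for the next step. The record layout (keys such as `camera`, `frame`, `people`, and `joints`) and the file name are hypothetical and do not describe the schema actually used in this repository.

```python
import json
import os
import tempfile

def save_stage_output(path, record):
    """Write one step's output to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)

def load_stage_output(path):
    """Read the JSON file produced by the previous step."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

def example_round_trip():
    """Save and reload a hypothetical 2D detection record."""
    detections = {
        "camera": "C1",
        "frame": "00000050.png",
        "people": [
            {
                "bbox": [380.0, 210.0, 460.0, 670.0],          # x1, y1, x2, y2
                "joints": {
                    "left_ankle": [412.0, 655.5, 0.91],        # x, y, confidence
                    "right_ankle": [436.0, 658.0, 0.88],
                },
            }
        ],
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "detections_2d.json")  # hypothetical file name
        save_stage_output(path, detections)
        return load_stage_output(path)
```

Keeping each step's output in a file like this is what allows a step to be rerun on its own or replaced by another tool that writes the same structure.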

Prerequisites

  • Ubuntu 16.04
  • Python 3.6
  • PyTorch 1.0.1

Acknowledgement

For 2D pose detection, our proposed loss function is plugged into CrowdPose: Efficient Crowded Scenes Pose Estimation and A New Benchmark. Please check their repo for installation and further information.

3d-crowdpose-estimation-based-on-mvg's Issues

Maybe something wrong with the 2D annotations?

Thanks for the great work and for releasing the annotations!
However, after visualizing the annotations for the Wildtrack dataset, I found that some of them are wrong.
For example, in C1/00000050.png, there are some redundant detections.
[Image: wildtrack-C1_0050]

In C1/00000380.png, the feet keypoints are labeled incorrectly.
[Image: wildtrack-C1-380]

Other frames have similar annotation errors.

Code release plan

Hi, this is really wonderful work. We are looking forward to the code release.
Is there a planned date for releasing the code?

Optimization code different from paper?

Hi, I would just like to confirm whether the optimization code provided is different from what is shown in equations 6 and 7 of the paper. I was not able to find the Gaussian distributions in the code. If they are implemented, could you please let me know where I can find them? Thank you.

About 2D pose estimation

Hi He,

Thanks for the great work and releasing the code!

Is the 2D pose estimation code included in this repo? After looking into the original CrowdPose (or AlphaPose) repo, I still can't figure out how to run CrowdPose with their code. The --matching argument they mention for running CrowdPose does not seem to work. It would be great if you could provide the code for the modified SPPE and some pretrained model weights.

How to run the code?

Thanks for releasing your code!!

It would be great if you could add instructions on how to run the code, as well as a demo file.
