Diverse Sampling

Official PyTorch implementation of the ACM MM 2022 paper "Diverse Human Motion Prediction via Gumbel-Softmax Sampling from an Auxiliary Space".

[Paper & Supp] [Poster PDF] [Poster PPT] [Slides] [Video] [Audio] [Subtitles]

Authors

  1. Lingwei Dang, School of Computer Science and Engineering, South China University of Technology, China
  2. Yongwei Nie, School of Computer Science and Engineering, South China University of Technology, China
  3. Chengjiang Long, Meta Reality Lab, USA
  4. Qing Zhang, School of Computer Science and Engineering, Sun Yat-sen University, China
  5. Guiqing Li, School of Computer Science and Engineering, South China University of Technology, China

Abstract

    Diverse human motion prediction aims at predicting multiple possible future pose sequences from a sequence of observed poses. Previous approaches usually employ deep generative networks to model the conditional distribution of data, and then randomly sample outcomes from the distribution. While different results can be obtained, they are usually the most likely ones which are not diverse enough. Recent work explicitly learns multiple modes of the conditional distribution via a deterministic network, which however can only cover a fixed number of modes within a limited range. In this paper, we propose a novel sampling strategy for sampling very diverse results from an imbalanced multimodal distribution learned by a deep generative model. Our method works by generating an auxiliary space and smartly making randomly sampling from the auxiliary space equivalent to the diverse sampling from the target distribution. We propose a simple yet effective network architecture that implements this novel sampling strategy, which incorporates a Gumbel-Softmax coefficient matrix sampling method and an aggressive diversity promoting hinge loss function. Extensive experiments demonstrate that our method significantly improves both the diversity and accuracy of the samplings compared with previous state-of-the-art sampling approaches.

Overview

[Figure: poster overview of the method (mmfp0856_A0poster_horizontal.png)]

We propose a novel sampling method that converts sampling from the target distribution into random sampling of points from an auxiliary space, yielding diverse and accurate samples.
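As a rough illustration of the Gumbel-Softmax step named in the title, the sketch below draws a relaxed one-hot coefficient matrix and uses it to mix a set of basis vectors into sampled points. It is written in NumPy and is not the repository's implementation: the function `gumbel_softmax`, the names `basis`, `coeffs`, and `points`, and all sizes (50 samples, 10 basis vectors, 64 dimensions) are assumptions chosen for the example.

```python
import numpy as np


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Draw one relaxed one-hot sample per row of `logits`.

    Lower `tau` pushes each row closer to a hard one-hot vector.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise via inverse transform sampling: g = -log(-log(u)).
    u = np.clip(rng.uniform(size=logits.shape), 1e-10, 1.0 - 1e-10)
    gumbel_noise = -np.log(-np.log(u))
    y = (logits + gumbel_noise) / tau
    y = y - y.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    exp_y = np.exp(y)
    return exp_y / exp_y.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
logits = np.zeros((50, 10))            # hypothetical: 50 samples, 10 basis vectors
coeffs = gumbel_softmax(logits, tau=0.5, rng=rng)  # each row is non-negative and sums to 1
basis = rng.normal(size=(10, 64))      # hypothetical basis of the auxiliary space
points = coeffs @ basis                # one 64-dimensional point per sample
```

In a trainable model the same relaxation is usually taken from the deep learning framework so that gradients flow through the sampled coefficients; the NumPy version only shows the arithmetic.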

Dependencies

Nvidia RTX 3090
Python                 3.9.7
matplotlib             3.5.0
numpy                  1.20.3
opencv-python          4.5.4.60
pandas                 1.4.2
PyYAML                 6.0
tensorboard            2.7.0
tensorboardX           2.4.1
torch                  1.10.0+cu113
torchvision            0.11.1+cu113
scipy                  1.7.2
scikit-learn           1.0.1
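
To compare an existing environment against the versions listed above, a small check like the following may help. It is a sketch using only the standard library; the names `PINNED`, `installed_version`, and `find_mismatches` are made up for this example, and the comparison deliberately ignores local build suffixes such as `+cu113` (the CUDA 11.3 build tag).

```python
from importlib import metadata

# Versions copied from the list above (Python itself and the GPU are not checked).
PINNED = {
    "matplotlib": "3.5.0",
    "numpy": "1.20.3",
    "opencv-python": "4.5.4.60",
    "pandas": "1.4.2",
    "PyYAML": "6.0",
    "tensorboard": "2.7.0",
    "tensorboardX": "2.4.1",
    "torch": "1.10.0+cu113",
    "torchvision": "0.11.1+cu113",
    "scipy": "1.7.2",
    "scikit-learn": "1.0.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map package name -> (wanted, found) for every package that differs."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Compare only the public version, dropping any "+local" suffix.
        if found is None or found.split("+")[0] != wanted.split("+")[0]:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

Other versions may well work; a mismatch reported here is only a hint about where to look if something fails.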

Get the data and pretrained models

The dataset and pretrained models are available via the Diverse Sampling Resources Link. Download them, then:

  • unzip dataset.zip to ./dataset
  • unzip pretrained.zip to ./ckpt/pretrained
  • unzip classifier.zip to ./ckpt/classifier

The directory structure then becomes:

diverse_sampling
├─dataset
│  │  .gitignore
│  │  data_3d_h36m.npz
│  │  data_3d_h36m_test.npz
│  │  data_3d_humaneva15.npz
│  │  data_3d_humaneva15_test.npz
│  │  h36m_valid_angle.p
│  │  humaneva_valid_angle.p
│  ├─data_multi_modal
│  │      data_candi_t_his25_t_pred100_skiprate20.npz
│  │      t_his25_1_thre0.500_t_pred100_thre0.100_filtered_dlow.npz
│  └─humaneva_multi_modal
│          data_candi_t_his15_t_pred60_skiprate15.npz
│          t_his15_1_thre0.500_t_pred60_thre0.010_index_filterd.npz
└─ckpt
   ├─classifier
   │     .gitignore
   │      h36m_classifier.pth
   │      humaneva_classifier.pth
   └─pretrained
           .gitignore
           h36m_t1.pth
           h36m_t2.pth
           humaneva_t1.pth
           humaneva_t2.pth   
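
Before running evaluation or training, it can be useful to confirm that the archives were unpacked into the layout shown above. The following sketch lists any expected file that is missing. The file names are copied from the tree; the helper name `missing_files` and the idea of running it from the repository root are assumptions for the example.

```python
from pathlib import Path

# Relative paths taken from the directory tree above (.gitignore files omitted).
EXPECTED_FILES = [
    "dataset/data_3d_h36m.npz",
    "dataset/data_3d_h36m_test.npz",
    "dataset/data_3d_humaneva15.npz",
    "dataset/data_3d_humaneva15_test.npz",
    "dataset/h36m_valid_angle.p",
    "dataset/humaneva_valid_angle.p",
    "dataset/data_multi_modal/data_candi_t_his25_t_pred100_skiprate20.npz",
    "dataset/data_multi_modal/t_his25_1_thre0.500_t_pred100_thre0.100_filtered_dlow.npz",
    "dataset/humaneva_multi_modal/data_candi_t_his15_t_pred60_skiprate15.npz",
    "dataset/humaneva_multi_modal/t_his15_1_thre0.500_t_pred60_thre0.010_index_filterd.npz",
    "ckpt/classifier/h36m_classifier.pth",
    "ckpt/classifier/humaneva_classifier.pth",
    "ckpt/pretrained/h36m_t1.pth",
    "ckpt/pretrained/h36m_t2.pth",
    "ckpt/pretrained/humaneva_t1.pth",
    "ckpt/pretrained/humaneva_t2.pth",
]


def missing_files(root, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root (diverse_sampling).
    for rel in missing_files("."):
        print(f"missing: {rel}")
```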

Evaluation

  • evaluate on Human3.6M:

    python main.py --exp_name=h36m_t2 --is_load=1 --model_path=ckpt/pretrained/h36m_t2.pth

  • evaluate on HumanEva-I:

    python main.py --exp_name=humaneva_t2 --is_load=1 --model_path=ckpt/pretrained/humaneva_t2.pth

Calculate perceptual scores (FID and ACC)

  • For Human3.6M:

    python main_classifier.py --exp_name=h36m_t2

  • For HumanEva-I:

    python main_classifier.py --exp_name=humaneva_t2

Train

  • train on Human3.6M:

    • train CVAE on Human3.6M:

      python main.py --exp_name=h36m_t1 --is_train=1

    • train DiverseSampling on Human3.6M:

      python main.py --exp_name=h36m_t2 --is_train=1

  • train on HumanEva-I:

    • train CVAE on HumanEva-I:

      python main.py --exp_name=humaneva_t1 --is_train=1

    • train DiverseSampling on HumanEva-I:

      python main.py --exp_name=humaneva_t2 --is_train=1

Citation

If you use our code, please cite our work:

@inproceedings{dang2022diverse,
  title={Diverse Human Motion Prediction via Gumbel-Softmax Sampling from an Auxiliary Space},
  author={Dang, Lingwei and Nie, Yongwei and Long, Chengjiang and Zhang, Qing and Li, Guiqing},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={5162--5171},
  year={2022}
}

Acknowledgments

We follow the code framework of our previous work MSR-GCN (ICCV 2021), and some code was adapted from DLow by Ye Yuan, and GSPS by Wei Mao.

Licence

MIT

diverse_sampling's Issues

Bug in the evaluation

Hi Droliven,

If I'm not mistaken, there may be a small issue in the evaluation function. When iterating over the data generator, the evaluation is bypassed whenever the number of multi-modal GT is 1 (#L280-L281). In the final metric computation (#L369-L37), the summed ADE/FDE ... should therefore be divided by the actual number of data examples that were used in the evaluation, which is smaller than i+1. After correcting this issue, the final results from the provided pretrained models should look like this:

HumanEva
| ADE | FDE | MMADE | MMFDE |
| 0.234 | 0.247 | 0.350 | 0.327 |

Human3.6M
| ADE | FDE | MMADE | MMFDE |
| 0.378 | 0.495 | 0.483 | 0.525 |

Best,
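
The correction described in this issue can be sketched as follows. This is a minimal illustration, not the repository's evaluation code: the function name `average_metrics` and the convention of marking a bypassed example with `None` are assumptions made for the example.

```python
def average_metrics(per_example_metrics):
    """Average per-example metric dicts, skipping bypassed examples.

    A bypassed example is represented by None (a convention of this sketch).
    Returns the averages and the number of examples actually evaluated.
    """
    totals = {}
    evaluated = 0
    for metrics in per_example_metrics:
        if metrics is None:  # example was bypassed, so it must not be counted
            continue
        evaluated += 1
        for name, value in metrics.items():
            totals[name] = totals.get(name, 0.0) + value
    if evaluated == 0:
        raise ValueError("no examples were evaluated")
    averages = {name: total / evaluated for name, total in totals.items()}
    return averages, evaluated


# Three examples were iterated, but only two were evaluated.
results = [{"ADE": 0.2, "FDE": 0.4}, None, {"ADE": 0.4, "FDE": 0.6}]
averages, evaluated = average_metrics(results)
```

Dividing the same sums by the number of iterations (3 here) instead of the number of evaluated examples (2) would understate every metric.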

Issue on computing MMADE and MMFDE

Hi Droliven,

Thanks for your impressive work on human motion generation and for contributing the code to the community. I have a concern about the computation of MMADE and MMFDE. It seems that you use some of the code from GSPS (Generating Smooth Pose Sequences for Diverse Human Motion Prediction) to compute the evaluation metrics. However, when computing MMADE and MMFDE, GSPS does not include the results from the ground truth (gsps#L230-L245), while your code does (#L309-L343 and #L320-L354). Is there a particular reason for this? Thanks in advance for your attention.

Best,
Xiaoyu
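
To make the question concrete, here is a small sketch of the two variants under one common definition of the multi-modal metrics: for each multi-modal ground truth, take the best sample's error, then average over the ground truths. It is neither this repository's code nor GSPS's. The array shapes, the function names `mm_ade_fde` and `with_ground_truth`, and the toy numbers are assumptions for illustration.

```python
import numpy as np


def mm_ade_fde(pred, gt_multi):
    """pred: (S, T, D) samples; gt_multi: (M, T, D) multi-modal ground truths.

    Returns (MMADE, MMFDE) as Python floats.
    """
    # Per-frame L2 distance for every (ground truth, sample) pair: shape (M, S, T).
    dist = np.linalg.norm(pred[None] - gt_multi[:, None], axis=-1)
    mmade = dist.mean(axis=-1).min(axis=-1).mean()  # average over frames, best sample, mean over GTs
    mmfde = dist[..., -1].min(axis=-1).mean()       # last frame only, best sample, mean over GTs
    return float(mmade), float(mmfde)


def with_ground_truth(gt, gt_multi):
    """Prepend the original ground truth (T, D) to the multi-modal set."""
    return np.concatenate([gt[None], gt_multi], axis=0)


# Toy data: 2 samples, 2 frames, 1-dimensional poses.
pred = np.array([[[0.0], [0.0]], [[1.0], [1.0]]])
gt = np.array([[3.0], [3.0]])
gt_multi = np.array([[[0.0], [0.0]]])

without_gt = mm_ade_fde(pred, gt_multi)                      # (0.0, 0.0)
including_gt = mm_ade_fde(pred, with_ground_truth(gt, gt_multi))  # (1.0, 1.0)
```

In this toy case the two variants give different numbers, so results are only comparable across papers when the same convention is used.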
