
ACTOR

Official PyTorch implementation of the paper "Action-Conditioned 3D Human Motion Synthesis with Transformer VAE", ICCV 2021.

Please visit our webpage for more details.

[teaser figure]

Bibtex

If you find this code useful in your research, please cite:

@INPROCEEDINGS{petrovich21actor,
  title     = {Action-Conditioned 3{D} Human Motion Synthesis with Transformer {VAE}},
  author    = {Petrovich, Mathis and Black, Michael J. and Varol, G{\"u}l},
  booktitle = {International Conference on Computer Vision (ICCV)},
  year      = {2021}
}

Installation 👷

1. Create conda environment

conda env create -f environment.yml
conda activate actor

Or install the following packages in your PyTorch environment:

pip install tensorboard
pip install matplotlib
pip install ipdb
pip install sklearn
pip install pandas
pip install tqdm
pip install imageio
pip install pyyaml
pip install smplx
pip install chumpy

The code was tested on Python 3.8 and PyTorch 1.7.1.
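
As a quick sanity check (this snippet is not part of the repository; it just assumes the packages listed above are installed), you can verify that the main dependencies import correctly:

# Verify that the key dependencies are importable.
import torch
import matplotlib
import pandas
import tqdm
import imageio
import yaml
import smplx

print("PyTorch:", torch.__version__)            # tested with 1.7.1
print("CUDA available:", torch.cuda.is_available())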

2. Download the datasets

For all the datasets, be sure to read and follow their license agreements, and cite them accordingly.

For more information about the datasets we use in this research, please check this page, where we provide information on how we obtain/process the datasets and their citations. Please cite the original references for each of the datasets as indicated.

Please install gdown to download directly from Google Drive, then run:

bash prepare/download_datasets.sh

Update: Unfortunately, the NTU13 dataset (derived from NTU) is no longer available.

3. Download some SMPL files

bash prepare/download_smpl_files.sh

This will download the SMPL neutral model from this GitHub repo and additional files.

If you also want to use the male and female versions, you must:

  • Download the models from the SMPL website
  • Move them to models/smpl
  • Change the SMPL_MODEL_PATH variable in src/config.py accordingly.
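
As a sketch only (not a script from the repository), you can check that the downloaded neutral model loads with smplx; the models/smpl folder and the SMPL_NEUTRAL.pkl filename are assumptions based on the steps above:

import smplx

# Assumes the neutral model ended up under models/smpl (e.g. SMPL_NEUTRAL.pkl);
# adjust the path to match SMPL_MODEL_PATH in src/config.py if needed.
model = smplx.SMPL(model_path="models/smpl", gender="neutral")
output = model()              # forward pass with default (zero) pose and shape
print(output.vertices.shape)  # (1, 6890, 3) for SMPL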

4. Download the action recognition models

bash prepare/download_recognition_models.sh

Action recognition models are used to extract motion features for evaluation.

For NTU13 and HumanAct12, we use the action recognition models directly from the Action2Motion project.

For the UESTC dataset, we train an action recognition model using STGCN, with this command line:

python -m src.train.train_stgcn --dataset uestc --extraction_method vibe --pose_rep rot6d --num_epochs 100 --snapshot 50 --batch_size 64 --lr 0.0001 --num_frames 60 --view all --sampling conseq --sampling_step 1 --glob --no-translation --folder recognition_training

How to use ACTOR 🚀

NTU13

Training

python -m src.train.train_cvae --modelname cvae_transformer_rc_rcxyz_kl --pose_rep rot6d --lambda_kl 1e-5 --jointstype vertices --batch_size 20 --num_frames 60 --num_layers 8 --lr 0.0001 --glob --translation --no-vertstrans --dataset ntu13 --num_epochs 2000 --snapshot 100 --folder exps/ntu13

HumanAct12

Training

python -m src.train.train_cvae --modelname cvae_transformer_rc_rcxyz_kl --pose_rep rot6d --lambda_kl 1e-5 --jointstype vertices --batch_size 20 --num_frames 60 --num_layers 8 --lr 0.0001 --glob --translation --no-vertstrans --dataset humanact12 --num_epochs 5000 --snapshot 100 --folder exps/humanact12

UESTC

Training

python -m src.train.train_cvae --modelname cvae_transformer_rc_rcxyz_kl --pose_rep rot6d --lambda_kl 1e-5 --jointstype vertices --batch_size 20 --num_frames 60 --num_layers 8 --lr 0.0001 --glob --translation --no-vertstrans --dataset uestc --num_epochs 1000 --snapshot 100 --folder exps/uestc

Evaluation

python -m src.evaluate.evaluate_cvae PATH/TO/checkpoint_XXXX.pth.tar --batch_size 64 --niter 20

This script evaluates the trained model at epoch XXXX with 20 different seeds and writes all results to PATH/TO/evaluation_metrics_XXXX_all.yaml.

If you want a table with the mean and confidence interval, you can use this script:

python -m src.evaluate.tables.easy_table PATH/TO/evaluation_metrics_XXXX_all.yaml
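
If you prefer to aggregate the numbers yourself, the sketch below shows one way to do it; the yaml layout (a mapping from metric name to one value per evaluation seed) is an assumption, so adapt it to the actual file structure:

import numpy as np
import yaml

with open("PATH/TO/evaluation_metrics_XXXX_all.yaml") as f:
    metrics = yaml.safe_load(f)

for name, values in metrics.items():
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    # half-width of a 95% interval over the evaluation seeds
    interval = 1.96 * values.std(ddof=1) / np.sqrt(len(values))
    print(f"{name}: {mean:.4f} +/- {interval:.4f}")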

Pretrained models

You can download pretrained models with this script:

bash prepare/download_pretrained_models.sh

Visualization

Grid of stick figures

 python -m src.visualize.visualize_checkpoint PATH/TO/CHECKPOINT.tar --num_actions_to_sample 5  --num_samples_per_action 5

Each row corresponds to an action. The first column on the right shows a real motion from the dataset, the second column from the right shows its reconstruction (obtained by encoding/decoding), and all remaining columns on the left are generations from random noise.

Example

[animated example: ntugrid.gif]

Generating and rendering SMPL meshes

Additional dependencies

pip install trimesh
pip install pyrender
pip install imageio-ffmpeg
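
Before rendering a full sequence, it can help to check that pyrender offscreen rendering works on your machine (several issues below are OpenGL-related). This is a generic pyrender smoke test, not part of the repository; the EGL environment variable is a common headless workaround, not a requirement:

import os
os.environ.setdefault("PYOPENGL_PLATFORM", "egl")  # try "osmesa" if EGL is unavailable

import numpy as np
import trimesh
import pyrender

scene = pyrender.Scene()
scene.add(pyrender.Mesh.from_trimesh(trimesh.creation.icosphere()))
camera_pose = np.eye(4)
camera_pose[2, 3] = 3.0  # move the camera back from the origin
scene.add(pyrender.PerspectiveCamera(yfov=np.pi / 3.0), pose=camera_pose)
scene.add(pyrender.DirectionalLight(intensity=3.0), pose=camera_pose)
renderer = pyrender.OffscreenRenderer(viewport_width=256, viewport_height=256)
color, depth = renderer.render(scene)
print(color.shape)  # (256, 256, 3) if an OpenGL context was created successfully
renderer.delete()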

Generate motions

python -m src.generate.generate_sequences PATH/TO/CHECKPOINT.tar --num_samples_per_action 10 --cpu

It will generate 10 samples per action, and store them in PATH/TO/generation.npy.
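
A minimal sketch for inspecting the generated file; the exact contents of generation.npy (plain array vs. pickled dictionary) are an assumption here:

import numpy as np

data = np.load("PATH/TO/generation.npy", allow_pickle=True)
print(type(data), getattr(data, "shape", None), data.dtype)
# If the file wraps a dictionary in a 0-d object array, unwrap it:
if data.dtype == object and data.shape == ():
    data = data.item()
    print(list(data.keys()))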

Render motions

python -m src.render.rendermotion PATH/TO/generation.npy

It will render the sequences into the folder PATH/TO/generation/.

Examples
[rendered videos: Pickup, Raising arms, High knee running, Bending torso, Knee raising]

Overview of the available models

List of models

modeltype | architecture | losses
----------|--------------|-------
cvae      | fc           | rc
          | gru          | rcxyz
          | transformer  | kl

Construct a model

Follow this pattern: {modeltype}_{architecture}_{loss1}_..._{lossN}, i.e. join the model type, the architecture, and the loss names with underscores.

For example for the cvae model with Transformer encoder/decoder and with rc, rcxyz and kl loss, you can use: --modelname cvae_transformer_rc_rcxyz_kl.
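
Equivalently, as a tiny illustration (make_modelname is a hypothetical helper, not a function in the repository):

def make_modelname(modeltype, architecture, losses):
    # {modeltype}_{architecture}_{loss1}_{loss2}_...
    return "_".join([modeltype, architecture, *losses])

print(make_modelname("cvae", "transformer", ["rc", "rcxyz", "kl"]))
# -> cvae_transformer_rc_rcxyz_kl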

License

This code is distributed under an MIT LICENSE.

Note that our code depends on other libraries, including SMPL, SMPL-X, PyTorch3D, and uses datasets which each have their own respective licenses that must also be followed.

ACTOR's Issues

Training result on the UESTC dataset differs from the pretrained model

Hi,
first of all, thank you for releasing the nicely arranged code.
The released code really helped my understanding of the model, and I appreciate the clearly written instructions.

However, I got an undesirable result on the UESTC dataset from a model I trained from scratch.
I followed the dataset instructions and did not modify the training script:
python -m src.train.train_cvae --modelname cvae_transformer_rc_rcxyz_kl --pose_rep rot6d --lambda_kl 1e-5 --jointstype vertices --batch_size 20 --num_frames 60 --num_layers 8 --lr 0.0001 --glob --translation --no-vertstrans --dataset uestc --num_epochs 1000 --snapshot 100 --folder exps/uestc

I double-checked whether the opt.yml file saved from my experiment is identical to the pretrained model's, and found no differences in the configuration.
Also, I ran the same experiment twice and the results were identical.

This is the training log of the released pretrained model:

[screenshot: training log of the released pretrained model]

and these are the training logs from the runs I conducted:

[screenshot: training logs from my runs]

Since my training result on the HumanAct12 dataset matches the pretrained model's, I don't think a training environment issue is involved here.

Is there any possible reason for this training failure?

TypeError: 'NoneType' object is not callable

When I try to render the generated motions, I get the following error:

(actor) nankaingy@lza-PC:~/data/code/pose-shape/ACTOR$ python -m src.render.rendermotion pretrained_models/humanact12/generation.npy
Visualize generation_0, action 0: 0%| | 0/60 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/OpenGL/latebind.py", line 41, in call
return self._finalCall( *args, **named )
TypeError: 'NoneType' object is not callable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/nankaingy/data/code/pose-shape/ACTOR/src/render/rendermotion.py", line 92, in
main()
File "/home/nankaingy/data/code/pose-shape/ACTOR/src/render/rendermotion.py", line 88, in main
render_video(meshes, key, action, renderer, path, background)
File "/home/nankaingy/data/code/pose-shape/ACTOR/src/render/rendermotion.py", line 27, in render_video
img = renderer.render(background, mesh, cam, color=color)
File "/home/nankaingy/data/code/pose-shape/ACTOR/src/render/renderer.py", line 143, in render
rgb, _ = self.renderer.render(self.scene, flags=render_flags)
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/pyrender/offscreen.py", line 99, in render
return self._renderer.render(scene, flags)
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/pyrender/renderer.py", line 121, in render
self._update_context(scene, flags)
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/pyrender/renderer.py", line 709, in _update_context
p._add_to_context()
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/pyrender/primitive.py", line 324, in _add_to_context
self._vaid = glGenVertexArrays(1)
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/OpenGL/latebind.py", line 45, in call
return self._finalCall( *args, **named )
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/OpenGL/wrapper.py", line 657, in wrapperCall
result = wrappedOperation( *cArguments )
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/OpenGL/platform/baseplatform.py", line 401, in call
if self.load():
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/OpenGL/platform/baseplatform.py", line 383, in load
func = platform.PLATFORM.constructFunction(
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/OpenGL/platform/baseplatform.py", line 148, in constructFunction
if (not is_core) and not self.checkExtension( extension ):
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/OpenGL/platform/baseplatform.py", line 270, in checkExtension
result = extensions.ExtensionQuerier.hasExtension( name )
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/OpenGL/extensions.py", line 98, in hasExtension
result = registered( specifier )
File "/home/nankaingy/data/anaconda3/envs/actor/lib/python3.8/site-packages/OpenGL/extensions.py", line 105, in call
if not specifier.startswith( self.prefix ):
TypeError: startswith first arg must be bytes or a tuple of bytes, not str

Can anyone help?

Thanks in advance!

Some questions about the paper.

Hello, it's great work, and this paper gave me a lot of inspiration. I have a question: for the decoder, directly generating all frames at once can make training difficult. What method does the paper use to provide the decoder with an inductive bias so that it trains well?

Custom Training/Inference Data

Hi

Thanks for the repo and the documentation!

How can I create .pkl files for a custom dataset to train models on or evaluate current models on them?

GLError during render motions

Hello, Mathux! Thanks for your implementation. When I try to render motions from generation.npy, I get the error shown below. I have tried the instructions in mmatl/pyrender#86, but they did not work. Thank you in advance!
[screenshot: GLError traceback]

Adding clothes or skin

Hello, dear author. Thank you very much for your good work. I wanted to ask if there is a way to add clothes, skin, or a background to these 3D motions.

Thank you very much,

Sardor

How can I feed the decoder's output back in as its input?

I want to extend the output length by running the decoder twice (see the sketch after this message), because I think that if I simply increase the length from 60 to 120 frames, meaningless output will be generated after the 60th frame.

If I generate 120 frames (60 frames x 2):
input -> decoder -> output (60 frames) -> input -> decoder -> output (60 frames)
=> 120 frames total.
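
Just to make the question concrete, a rough sketch of that chaining (decode is a hypothetical stand-in for one decoder pass; how to condition the second pass on the first segment is exactly what is being asked):

import numpy as np

def decode(z, num_frames=60):
    # hypothetical placeholder for one ACTOR decoder pass
    return np.zeros((num_frames, 24 * 6))   # (frames, pose features)

z1 = np.random.randn(256)
first = decode(z1)                            # frames 0-59
z2 = np.random.randn(256)                     # e.g. re-sampled, or derived from `first`
second = decode(z2)                           # frames 60-119
motion = np.concatenate([first, second], axis=0)
print(motion.shape)                           # (120, 144)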

Question for embedding dimension

Thanks for the wonderful work @Mathux!
I really enjoyed reading the paper and testing it in my environment.
While looking at your code and the paper, I noticed that you set the embedding dimensionality of the VAE to 256, and I'm wondering if there is any reason for this.

Thanks,
Joseph

Train/val/test split for HumanAct12

Hi,
Thanks for releasing your code. I am wondering what train/val/test split you are using for HumanAct12? If I am not mistaken, it seems that you always use the entire dataset for both training and evaluation.
Thanks for your feedback and your help.

AttributeError on command visualize grid

Hello,

I am a university student at UHA (Université de Haute Alsace), and for my university project I am trying to get your ACTOR code to work on Google Colab.
But when I got to the step where I try to visualize the grid, I get this error:

[screenshot: AttributeError when visualizing the grid]
I get the same error with models trained via the train command and with the pretrained models.

Could you please tell me if you have a solution to this problem?

Cordially,
Scherrer Xavier

Shaking in the later part of the sequence

Hi, @Mathux , Thanks for your nice work!
I encountered an issue when generating motion sequences: a shaking problem appears after about 60 frames (see the attached videos below). Do you have any idea about it? Thanks!

action2_generation_1.mp4
action22_generation_0.mp4

How can I get 'smplfaces.npy' ?

When I run python -m src.render.rendermotion exps/humanact12/generation.npy,

I get 'FileNotFoundError: [Errno 2] No such file or directory: 'models/smpl/smplfaces.npy''

Convert keypoints3d into mesh

Wonderful work! I have a question: how can I convert 3D keypoints estimated by other models into the desired mesh using the parameters of the mean SMPL mesh? Please forgive me, I'm a beginner with SMPL.

KL-vanishing issue and another idea

Hi Mathis
Thanks for sharing the implementation. I have two issues:

  1. If I understand correctly, this work can be treated as a sequence VAE model used to learn distributions from observations. The KL-divergence loss term encourages the learned distribution to be as close as possible to a normal distribution. Have you ever encountered the KL-vanishing (KL loss = 0) issue? If so, how did you solve it?

  2. Instead of feeding the sampled code z only to the Transformer decoder block, how about adding it to all tokens of the decoder input sequence (along with the positional embeddings)?

Thanks

Question about interpolation in latent space

Hi, thanks for your great work!
I'm interested in concatenating two actions using your model, but I'm not sure what "interpolation in latent space" means.
Could you give more details or some guidance?
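
For reference, "interpolation in latent space" usually means linearly blending two latent codes and decoding each intermediate code; the sketch below is generic (the 256-dimensional codes are placeholders, not outputs of the actual encoder):

import numpy as np

z_a = np.random.randn(256)   # placeholder for the latent code of motion A
z_b = np.random.randn(256)   # placeholder for the latent code of motion B

# Blend the two codes at a few interpolation steps t in [0, 1].
interpolated = [(1.0 - t) * z_a + t * z_b for t in np.linspace(0.0, 1.0, 5)]
# Decoding each code in `interpolated` with the trained decoder would give
# motions that gradually morph from A to B.
print(len(interpolated), interpolated[0].shape)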

Rendering error

Hello, author, thank you for the good work; I can say it is impressive. But while running your code I came across an error.
[screenshot: rendering error]

Thank you beforehand for the reply!

missing module "optutils"

Hi, thank you for the code. I have already trained some models and would like to visualize the results. However, the scripts visualize_latent_space.py and visualize_sequence.py try to import the module optutils, which seems to be missing from the repository. Could you please add this module?
