
News

  • 05/22/2024 📢 An extension of Animatable Gaussians for human avatar relighting is available here. Welcome to check it out!
  • 03/11/2024 The code has been released. Welcome to give it a try!
  • 03/11/2024 AvatarReX dataset, a high-resolution multi-view video dataset for avatar modeling, has been released.
  • 02/27/2024 Animatable Gaussians is accepted by CVPR 2024!

Todo

  • Release the code.
  • Release AvatarReX dataset.
  • Release all the checkpoints and preprocessed dataset.

Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling

CVPR 2024

Zhe Li 1, Zerong Zheng 2, Lizhen Wang 1, Yebin Liu 1

1Tsinghua University 2NNKosmos Technology

teaser.mp4

Abstract: Modeling animatable human avatars from RGB videos is a long-standing and challenging problem. Recent works usually adopt MLP-based neural radiance fields (NeRF) to represent 3D humans, but it remains difficult for pure MLPs to regress pose-dependent garment details. To this end, we introduce Animatable Gaussians, a new avatar representation that leverages powerful 2D CNNs and 3D Gaussian splatting to create high-fidelity avatars. To associate 3D Gaussians with the animatable avatar, we learn a parametric template from the input videos, and then parameterize the template on two front & back canonical Gaussian maps where each pixel represents a 3D Gaussian. The learned template is adaptive to the wearing garments for modeling looser clothes like dresses. Such template-guided 2D parameterization enables us to employ a powerful StyleGAN-based CNN to learn the pose-dependent Gaussian maps for modeling detailed dynamic appearances. Furthermore, we introduce a pose projection strategy for better generalization given novel poses. Overall, our method can create lifelike avatars with dynamic, realistic and generalized appearances. Experiments show that our method outperforms other state-of-the-art approaches.
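To make the representation concrete, below is a minimal, illustrative sketch of the Gaussian-map idea: each pixel of a front/back canonical map stores the parameters of one 3D Gaussian. The channel layout, resolution, and variable names here are assumptions for illustration only, not the repository's actual data format.

import torch

# Illustrative only: a hypothetical Gaussian map whose channel layout is an assumption.
H, W = 512, 512
num_channels = 3 + 3 + 4 + 1 + 3   # position, scale, rotation (quaternion), opacity, color
gaussian_map = torch.randn(num_channels, H, W)

# Unpack per-pixel Gaussian parameters into flat per-Gaussian attribute tensors.
flat = gaussian_map.reshape(num_channels, -1).T   # (H*W, 14)
positions = flat[:, 0:3]     # 3D means in the canonical space
scales    = flat[:, 3:6]     # per-axis scales
rotations = flat[:, 6:10]    # quaternions
opacities = flat[:, 10:11]   # opacities
colors    = flat[:, 11:14]   # RGB (or SH DC) colors
print(positions.shape, rotations.shape)   # torch.Size([262144, 3]) torch.Size([262144, 4])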

Demo Results

We show avatars animated by challenging motions from the AMASS dataset.

basketball.mp4
More results:
football.mp4
dancing.mp4
irish_dancing.mp4

Installation

  1. Clone this repo.
git clone https://github.com/lizhe00/AnimatableGaussians.git
# or
git clone git@github.com:lizhe00/AnimatableGaussians.git
  2. Install the environment.
# install requirements
pip install -r requirements.txt

# install diff-gaussian-rasterization-depth-alpha
cd gaussians/diff_gaussian_rasterization_depth_alpha
python setup.py install
cd ../..

# install styleunet
cd network/styleunet
python setup.py install
cd ../..
  3. Download the SMPL-X model and place the pkl files in ./smpl_files/smplx.
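Optionally, a quick sanity check that the SMPL-X files are in place. The pkl file names below follow the standard SMPL-X release and are an assumption here, not something this repo prescribes:

import os

# Hypothetical check; file names are an assumption based on the standard SMPL-X release.
smplx_dir = './smpl_files/smplx'
for name in ['SMPLX_NEUTRAL.pkl', 'SMPLX_MALE.pkl', 'SMPLX_FEMALE.pkl']:
    path = os.path.join(smplx_dir, name)
    print(path, 'found' if os.path.exists(path) else 'MISSING')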

Data Preparation

AvatarReX, ActorsHQ or THuman4.0 Dataset

  1. Download AvatarReX, ActorsHQ, or THuman4.0 datasets.
  2. Data preprocessing. We provide two options below. The first is recommended if you plan to use our pretrained models, because the renderer used in preprocessing may cause slight differences.
    1. (Recommended) Download our preprocessed files from PREPROCESSED_DATASET.md, and unzip them to the root path of each character.
    2. Follow the instructions in gen_data/GEN_DATA.md to preprocess the dataset.

Note for the ActorsHQ dataset: 1) DATA PATH. A subject from the ActorsHQ dataset may include more than one sequence, but we only use the first one, i.e., Sequence1. The root path is ActorsHQ/Actor0*/Sequence1. 2) SMPL-X Registration. We provide SMPL-X fittings for the ActorsHQ dataset. You can download them from here and place smpl_params.npz at the corresponding root path of each subject.
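If training later fails with missing-file errors (see the issues about smpl_params.npz and cano_smpl_pos_map.exr further down this page), it can help to verify the layout up front. A minimal check, using only file names that appear elsewhere on this page; the full preprocessed layout is not documented here:

import os

# Hypothetical layout check for one subject's root path; adjust `root` to your data.
root = './avatar_data/avatarrex/zzr'   # example path taken from an error report below
for rel in ['smpl_params.npz',
            'smpl_pos_map/cano_smpl_pos_map.exr',
            'smpl_pos_map/init_pts_lbs.npy']:
    path = os.path.join(root, rel)
    print(path, 'found' if os.path.exists(path) else 'MISSING')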

Customized Dataset

Please refer to gen_data/GEN_DATA.md to run on your own data.

Avatar Training

Taking avatarrex_zzr from the AvatarReX dataset as an example, run:

python main_avatar.py -c configs/avatarrex_zzr/avatar.yaml --mode=train

After training, the checkpoint will be saved in ./results/avatarrex_zzr/avatar.

Avatar Animation

  1. Download the pretrained checkpoint from PRETRAINED_MODEL.md and unzip it to ./results/avatarrex_zzr/avatar, or train the network from scratch.
  2. Download the THuman4.0_POSE or AMASS dataset to acquire driving pose sequences. We list some awesome pose sequences from the AMASS dataset in configs/awesome_amass_poses.yaml (see the sketch below for browsing this list). Specify the testing pose path in configs/avatarrex_zzr/avatar.yaml#L57.
  3. Run:
python main_avatar.py -c configs/avatarrex_zzr/avatar.yaml --mode=test

You will see animation results like the example below in ./test_results/avatarrex_zzr/avatar.
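To browse the recommended AMASS sequences mentioned in step 2, a minimal sketch, assuming PyYAML is available and that configs/awesome_amass_poses.yaml is plain YAML (its exact structure is not documented here):

import yaml

# Print whatever entries the curated pose list contains (structure is an assumption).
with open('configs/awesome_amass_poses.yaml', 'r') as f:
    poses = yaml.safe_load(f)
print(poses)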

example_animation.mp4

Evaluation

We provide evaluation metrics and example code for comparison with body-only avatars in eval/comparison_body_only_avatars.py.
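For orientation, below is a generic sketch of how per-image metrics such as PSNR and LPIPS are commonly computed. This is not the repository's evaluation script; the lpips package and the normalization shown are assumptions about a typical setup:

import torch
import lpips   # assumption: pip install lpips; not necessarily in requirements.txt

def psnr(pred, gt):
    # pred, gt: float tensors in [0, 1], shape (3, H, W)
    mse = torch.mean((pred - gt) ** 2)
    return -10.0 * torch.log10(mse)

# LPIPS expects inputs in [-1, 1] with shape (N, 3, H, W).
lpips_fn = lpips.LPIPS(net='vgg')

pred = torch.rand(3, 256, 256)   # placeholder images; replace with rendered / ground-truth frames
gt = torch.rand(3, 256, 256)
print('PSNR:', psnr(pred, gt).item())
print('LPIPS:', lpips_fn(pred[None] * 2 - 1, gt[None] * 2 - 1).item())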

Acknowledgement

Our code is based on these wonderful repos:

Citation

If you find our code or data helpful to your research, please consider citing our paper.

@inproceedings{li2024animatablegaussians,
  title={Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling},
  author={Li, Zhe and Zheng, Zerong and Wang, Lizhen and Liu, Yebin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}

animatablegaussians's Issues

How can I obtain cano_smpl_pos_map.exr and init_pts_lbs.npy in the AvatarReX dataset? They are referenced at lines 27 and 31 of network/avatar.py

cano_smpl_map = cv.imread(config.opt['train']['data']['data_dir'] + '/smpl_pos_map/cano_smpl_pos_map.exr', cv.IMREAD_UNCHANGED)

self.lbs = torch.from_numpy(np.load(config.opt['train']['data']['data_dir'] + '/smpl_pos_map/init_pts_lbs.npy')).to(torch.float32).to(config.device)

ActorsHQ's camera poses do not fit 3DGS

When I used your method of processing camera parameters on the ActorsHQ dataset and applied the processed extrinsic and intrinsic matrices to the original 3DGS, the results were very blurry. Have you tried the same method before?

Driving any arbitrary action?

In the process of creating avatar animations, you may find that certain pose-driven animations deliver superior quality compared to others. For example, "THuman4.0_POSE" may provide more convincing results than "MPI_mosh". How can I improve the quality when driving with arbitrary motions?

Key parameter to reduce VRAM

Hi, thanks again for releasing the code! If I'd like to retrain the model to reduce the VRAM usage, what are some key parameters/configurations I can tune? Thanks!

smpl_params

How do I get smpl_params in my own dataset?

Relightable models

Hi,

Thank you for your amazing work!

I noticed that you recently updated your website/github for relightable models. Any plan to release either the code or a technical report for the relightable models?

best

Training Time

Hi, I wonder how long the model takes to train? I am using one RTX 3090 and have been training for about 3 days...

ERROR: Could not find a version that satisfies the requirement igl==2.2.1

Error: No matching distribution found for igl==2.2.1

Description

When trying to install the required dependencies from the requirements.txt file, I encountered the following error:

Collecting glfw==2.4.0 (from -r requirements.txt (line 1))
Using cached glfw-2.4.0-py2.py27.py3.py30.py31.py32.py33.py34.py35.py36.py37.py38-none-manylinux2014_x86_64.whl.metadata (4.8 kB)
ERROR: Could not find a version that satisfies the requirement igl==2.2.1 (from versions: none)
ERROR: No matching distribution found for igl==2.2.1

This error occurs when trying to install the `igl` package at version 2.2.1, which appears to be listed in the `requirements.txt` file.

Environments Tested

I have tried the following environments:

  1. venv with Python 3.11
  2. Anaconda with Python 3.10
  3. Anaconda with Python 3.9

Steps to Reproduce

  1. Create a new virtual environment (using pyenv or Anaconda)
  2. Install the required dependencies from the requirements.txt file

Expected Behavior

All the dependencies listed in the requirements.txt file should be installed successfully without any errors.

Additional Information

  • cuda:12.2.0
  • ubuntu22.04

Can you please let me know if you have an environment that works well for these errors?
Thank you in advance for your time.

StyleUNet Conditions

Hi, thank you for the amazing work!

I just have a question regarding the inputs of the StyleUNet.

According to the paper, your StyleUNet takes as input both the front and back posed position maps and outputs the front and back pose-dependent Gaussian maps.

Meanwhile, I noticed that only the front posed position map is used as the condition to predict both front and back gaussian maps.

pose_map = items['smpl_pos_map'][:3]

I wonder whether this is intentionally done since the outputs still look good.

Thank you in advance.
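For context on the slicing discussed above, a minimal sketch; the assumption (taken from the quoted snippet) is that the posed position map tensor stacks the front map in channels 0-2 and the back map in channels 3-5:

import torch

# Hypothetical 6-channel posed position map: front map (0:3) and back map (3:6).
smpl_pos_map = torch.randn(6, 1024, 1024)
front_only = smpl_pos_map[:3]   # what the quoted code feeds to the StyleUNet
front_back = smpl_pos_map[:6]   # both maps, as the paper figure suggests
print(front_only.shape, front_back.shape)   # torch.Size([3, 1024, 1024]) torch.Size([6, 1024, 1024])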

Error when loading data

I downloaded your preprocessed files from PREPROCESSED_DATASET.md (https://github.com/lizhe00/AnimatableGaussians/blob/master/PREPROCESSED_DATASET.md).
Here is the detailed error:
Import AvatarNet from network.avatar
[ WARN:[email protected]] global /io/opencv/modules/imgcodecs/src/loadsave.cpp (239) findDecoder imread_('media/data4/yejr/AvatarReX/avatarrex_lbn1/smpl_pos_map/cano_smpl_pos_map.exr'): can't open/read file: check file path/integrity
Traceback (most recent call last):
  File "main_avatar.py", line 829, in <module>
    trainer = AvatarTrainer(config.opt)
  File "main_avatar.py", line 48, in __init__
    self.avatar_net = AvatarNet(self.opt['model']).to(config.device)
  File "/home/yejr/Digital_Avater/AnimatableGaussians-master/network/avatar.py", line 28, in __init__
    self.cano_smpl_map = torch.from_numpy(cano_smpl_map).to(torch.float32).to(config.device)
TypeError: expected np.ndarray (got NoneType)
What should I do?
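Not an official answer, but a common checklist when cv.imread returns None for an .exr file: make sure the data_dir resolves to an existing absolute path (the warning above shows a path without a leading slash), and note that some opencv-python builds only read OpenEXR when OPENCV_IO_ENABLE_OPENEXR is set before cv2 is imported. A hedged debugging sketch:

import os
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'   # must be set before importing cv2 in some builds
import cv2 as cv

# Adjust to your own data_dir; the path below mirrors the one in the warning above.
path = '/media/data4/yejr/AvatarReX/avatarrex_lbn1/smpl_pos_map/cano_smpl_pos_map.exr'
print('exists:', os.path.exists(path))
img = cv.imread(path, cv.IMREAD_UNCHANGED)
print('loaded:', img is not None, None if img is None else img.shape)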

ValueError: Invalid data_path!

https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
warnings.warn(
Traceback (most recent call last):
  File "/nas/project/AnimatableGaussians/main_avatar.py", line 837, in <module>
    trainer.test()
  File "/home/gao/anaconda3/envs/AniGaussian/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/nas/project/AnimatableGaussians/main_avatar.py", line 535, in test
    testing_dataset = PoseDataset(**self.opt['test']['pose_data'], smpl_shape = training_dataset.smpl_data['betas'][0])
  File "/home/gao/anaconda3/envs/AniGaussian/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/nas/project/AnimatableGaussians/dataset/dataset_pose.py", line 59, in __init__
    raise ValueError('Invalid data_path!')
ValueError: Invalid data_path!

Preprocessed avatarrex dataset misses smpl_params.npz

I used the link https://github.com/lizhe00/AnimatableGaussians/blob/master/PREPROCESSED_DATASET.md to download the preprocessed AvatarReX data. However, the error message indicates that smpl_params.npz is missing.

(m3dgs) (base) duantong@user-R8428-A12:/data/duantong/mazipei/AnimatableGaussians$ python main_avatar.py -c configs/avatarrex_zzr/avatar.yaml --mode=train
# Using Pytorch3d Renderer
Import AvatarNet from network.avatar
# Parameter number of AvatarNet is 223648936
Traceback (most recent call last):
  File "main_avatar.py", line 834, in <module>
    trainer.pretrain()
  File "main_avatar.py", line 269, in pretrain
    self.dataset = MvRgbDataset(**self.opt['train']['data'])
  File "/data/duantong/mazipei/AnimatableGaussians/dataset/dataset_mv_rgb.py", line 393, in __init__
    super(MvRgbDatasetAvatarReX, self).__init__(
  File "/home/duantong/anaconda3/envs/m3dgs/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/duantong/mazipei/AnimatableGaussians/dataset/dataset_mv_rgb.py", line 40, in __init__
    self.load_smpl_data()
  File "/data/duantong/mazipei/AnimatableGaussians/dataset/dataset_mv_rgb.py", line 253, in load_smpl_data
    smpl_data = np.load(self.data_dir + '/smpl_params.npz', allow_pickle = True)
  File "/home/duantong/anaconda3/envs/m3dgs/lib/python3.8/site-packages/numpy/lib/npyio.py", line 405, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: './avatar_data/avatarrex/zzr/smpl_params.npz'

Training on our custom dataset and smpl_params.npz

Thank you for your excellent work and providing the codes.

I want to train on our own dataset. For this setting, how can I obtain smpl_params.npz? Should I use PyMAF-X as mentioned in AvatarReX? If so, how can I tune the parameters to align them with the calibrated camera parameters (K[R|T])? I am curious because the SMPL parameters obtained from PyMAF-X are estimated under a fixed virtual camera (perhaps focal length = (5000, 5000) with identity extrinsics) across all the images.

I would appreciate it if you could answer.

Details about training and testing on the AvatarReX dataset

Hi, thanks for your nice work. I have read your paper, but I did not find any information about the viewpoint settings during training and testing. I noticed that all 16 viewpoints were used during training in your code, but can you clarify which viewpoints were used during testing? For example, which viewpoints were the results in Table 2 of the paper based on? Could you provide some instructions regarding the frame and camera viewpoint configuration during testing?

Batch size > 1

Hi, thanks for releasing the code! I'm wondering if it's possible to train/test with a batch size bigger than 1? What's the reason a batch size of 1 is used for training? Thanks!

Possible code errors in raster settings

In line 52 of gaussian_renderer.py, camera_center = torch.linalg.inv(extr)[:3, 3] seems like it should be changed to camera_center = torch.linalg.inv(extr)[3, :3], although judging from the results it seems to have little impact.
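For reference, a small self-contained check of what the two indexings return for a standard 4x4 world-to-camera extrinsic E = [[R, t], [0, 1]]: the camera center in world coordinates is -R^T t, which equals inv(E)[:3, 3], while the [3, :3] indexing matches a transposed storage convention (as used for the view matrices in the original 3DGS code). Whether this repository stores extr transposed is not verified here:

import torch

# A random rigid world-to-camera extrinsic E = [[R, t], [0, 1]].
R = torch.linalg.qr(torch.randn(3, 3))[0]   # random orthogonal matrix
t = torch.randn(3)
E = torch.eye(4)
E[:3, :3] = R
E[:3, 3] = t

center = -R.T @ t   # camera center in world coordinates
print(torch.allclose(torch.linalg.inv(E)[:3, 3], center, atol=1e-5))    # True
print(torch.allclose(torch.linalg.inv(E.T)[3, :3], center, atol=1e-5))  # True for the transposed storage convention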

Details about evaluation

Great work on the new relightable extension.
As stated in the paper, Cam127 was used for the evaluation in Table IV.

The numerical results are computed on the 48-548 frames and the “Cam127” camera view in the “Actor01/Sequence1” from ActorsHQ dataset.

Questions are:

  1. Was the Cam127 used for training?
  2. Was the training conducted on the whole sequence (including 48-548)?
  3. Was the setting the same as the avatar.yaml files in this repo? e.g. the used_cam_ids

Rendered results not meeting expectation on training set

It is really a fantastic work. Congrats!
I managed to deal with all the data preprocessing and testing.
However, I found that the rendered results on avatar_lbn1 are not quite ideal, even on the training SMPL poses.
I rendered results with the command python main_avatar.py -c configs/avatarrex_lbn1/avatar.yaml --mode=test.
Does that seem reasonable to you?
Thanks very much!
