One-shot Implicit Animatable Avatars with Model-based Priors [ICCV2023]

ELICIT creates free-viewpoint motion videos from a single image by constructing an animatable NeRF avatar representation through one-shot learning.

Official repository of "One-shot Implicit Animatable Avatars with Model-based Priors".

[Arxiv] [Website]

What Can You Learn from ELICIT?

  1. A data-efficient pipeline for creating a 3D animatable avatar from a single image.
  2. A CLIP-based semantic loss that infers the full 3D appearance of the human body with the help of a rough SMPL shape.
  3. A segmentation-based sampling strategy that produces more realistic visual details and geometry for 3D avatars.
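
As a rough illustration of the second point, a semantic loss between two image embeddings is often written as a cosine distance. The sketch below is not taken from the ELICIT code: the function name `cosine_distance` is hypothetical, and the short vectors are placeholders standing in for CLIP image features, which would normally come from a CLIP image encoder.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return 1.0 - dot / (norm_a * norm_b)


# Placeholder embeddings (hypothetical values, not real CLIP features).
reference = [1.0, 0.0, 0.0]       # stands in for the input image embedding
same_direction = [2.0, 0.0, 0.0]  # a render that matches semantically
orthogonal = [0.0, 3.0, 0.0]      # a render that does not match

print(cosine_distance(same_direction, reference))  # 0.0
print(cosine_distance(orthogonal, reference))      # 1.0
```

The loss is zero when the two embeddings point in the same direction and grows as they diverge, so minimizing it pulls renders of unseen views toward the semantics of the input image.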

Installation

Please follow the Installation Instruction to set up all the required packages.

Data

Results of the experiments

We provide result videos on our webpage for the qualitative and quantitative evaluations in our paper. Checkpoints for those experiments are available in Google Drive.

Training data for re-implementation

For the datasets used in our quantitative evaluations (ZJU-MoCap, Human 3.6M), please convert the original data to the same format as ZJU-MoCap. Then use the scripts in tools to preprocess the datasets and render SMPL meshes for training.

For customized single-image data, we provide examples from the DeepFashion dataset in dataset/fashion.

See more details in Data Instruction.

Getting Started

Training

python train.py --cfg configs/elicit/zju_mocap/377/smpl_init_texture.yaml # Run SMPL Meshes initialization.
python train.py --cfg configs/elicit/zju_mocap/377/finetune.yaml # Run training on the input subject.
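
The two stages above run in order: SMPL mesh initialization first, then fine-tuning on the subject. A minimal sketch for generating both commands for a given subject follows. It assumes other subjects use the same `configs/elicit/<dataset>/<subject>/` layout as subject 377, which is not confirmed here; `training_commands` is a hypothetical helper, not part of the repository.

```python
import shlex
from pathlib import PurePosixPath
from typing import List

# Config file names of the two stages, in the order shown above.
STAGES = ("smpl_init_texture.yaml", "finetune.yaml")


def training_commands(dataset: str, subject: str) -> List[List[str]]:
    """Build the two train.py invocations for one subject (assumed layout)."""
    cfg_dir = PurePosixPath("configs/elicit") / dataset / subject
    return [["python", "train.py", "--cfg", str(cfg_dir / stage)] for stage in STAGES]


for cmd in training_commands("zju_mocap", "377"):
    print(shlex.join(cmd))
```

Each command is a list of arguments, so it can be passed to `subprocess.run(cmd, check=True)` from the repository root to execute the stages one after the other.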

We also provide checkpoints for all subjects in Google Drive. Please unzip the file into the following structure:

${ELICIT_ROOT}
    └── experiments
        └── elicit
            ├── zju_mocap
            ├── h36m
            └── fashion
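
To confirm that the checkpoints were unzipped into this structure, a small check like the following can help. It is only a sketch: `missing_checkpoint_dirs` is a hypothetical helper, and the demo builds a temporary directory instead of touching a real `${ELICIT_ROOT}`.

```python
import tempfile
from pathlib import Path
from typing import List

# Dataset folders expected under experiments/elicit, per the tree above.
EXPECTED = ("zju_mocap", "h36m", "fashion")


def missing_checkpoint_dirs(elicit_root: str) -> List[str]:
    """Return the expected dataset folders that are absent under elicit_root."""
    base = Path(elicit_root) / "experiments" / "elicit"
    return [name for name in EXPECTED if not (base / name).is_dir()]


# Demo on a temporary root that contains only the zju_mocap folder.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "experiments" / "elicit" / "zju_mocap").mkdir(parents=True)
    demo_missing = missing_checkpoint_dirs(root)

print(demo_missing)  # ['h36m', 'fashion']
```

An empty list means all three folders are in place.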

Please refer to scripts for training all the quantitative experiments on novel pose synthesis and novel view synthesis with ZJU-MoCap and Human 3.6M.

Evaluation / Rendering

We also provide the quantitative results of ELICIT and the other baselines in Google Drive. To compute correct PSNR, SSIM and LPIPS scores, please use the bounding masks in this file, which were generated by Neural Human Performer and Animatable-NeRF.
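
The mask matters because background pixels would otherwise dominate the error. The sketch below shows PSNR restricted to masked pixels. It is an illustration under stated assumptions, not the evaluation code used for the paper: the format of the provided masks is not described here, so the mask is assumed to be a 2D grid of 0/1 values aligned with a single-channel image in [0, 1], and `masked_psnr` is a hypothetical name.

```python
import math


def masked_psnr(pred, target, mask, max_val=1.0):
    """PSNR over the pixels where mask is truthy.

    pred, target and mask are equally sized 2D lists (rows of values).
    """
    sq_errors = [
        (p - t) ** 2
        for pred_row, target_row, mask_row in zip(pred, target, mask)
        for p, t, m in zip(pred_row, target_row, mask_row)
        if m
    ]
    if not sq_errors:
        raise ValueError("mask selects no pixels")
    mse = sum(sq_errors) / len(sq_errors)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)


# Tiny 2x2 example with made-up pixel values.
pred = [[0.5, 0.0], [0.0, 0.0]]
target = [[0.4, 1.0], [0.0, 0.0]]
inside_only = [[1, 0], [0, 0]]  # keeps only the top-left pixel
everything = [[1, 1], [1, 1]]   # keeps all pixels

print(masked_psnr(pred, target, inside_only))
print(masked_psnr(pred, target, everything))
```

In this toy example the large error sits outside the mask, so the masked score is much higher than the score over the full image. With real data the images would be loaded as arrays and the same restriction applied before computing SSIM and LPIPS as well.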

Evaluate novel pose synthesis.

python run.py --type movement --cfg configs/elicit/zju_mocap/377/finetune.yaml

Evaluate novel view synthesis.

python run.py --type freeview --cfg configs/elicit/zju_mocap/377/finetune.yaml freeview.use_gt_camera True

Freeview rendering on arbitrary frames.

python run.py --type freeview --cfg configs/elicit/zju_mocap/377/finetune.yaml freeview.frame_idx $FRAME_INDEX_TO_RENDER
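
The three run.py commands share the same form: a `--type`, a `--cfg`, and optional trailing `key value` overrides. To render several frames in one go, the commands can be generated as in the sketch below. `render_command` is a hypothetical helper and the frame indices are placeholders; valid indices depend on the subject.

```python
import shlex
from typing import Dict, List, Optional

CFG = "configs/elicit/zju_mocap/377/finetune.yaml"


def render_command(
    render_type: str,
    cfg: str = CFG,
    overrides: Optional[Dict[str, object]] = None,
) -> List[str]:
    """Build a run.py invocation with trailing `key value` config overrides."""
    cmd = ["python", "run.py", "--type", render_type, "--cfg", cfg]
    for key, value in (overrides or {}).items():
        cmd += [key, str(value)]
    return cmd


# Hypothetical frame indices, for illustration only.
frame_indices = [0, 10, 20]
commands = [
    render_command("freeview", overrides={"freeview.frame_idx": idx})
    for idx in frame_indices
]
for cmd in commands:
    print(shlex.join(cmd))
```

As with training, each list can be handed to `subprocess.run(cmd, check=True)` from the repository root.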

The rendered frames and video will be saved at experiments/zju_mocap/377/latest.

Citation

@inproceedings{huang2022elicit,
  title={One-shot Implicit Animatable Avatars with Model-based Priors},
  author={Huang, Yangyi and Yi, Hongwei and Liu, Weiyang and Wang, Haofan and Wu, Boxi and Wang, Wenxiao and Lin, Binbin and Zhang, Debing and Cai, Deng},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2023}
}

Acknowledgments

Our implementation is mainly based on HumanNeRF and also draws on Animatable NeRF and AvatarCLIP. We thank the authors for their open-source contributions. In addition, we thank the authors of Animatable NeRF for their help with the data preprocessing of Human 3.6M.