
SickFace


For the current SOTA, please use https://github.com/KwaiVGI/LivePortrait instead.

Overview

This repository focuses on portrait animation, specifically lip-synchronization via 3DMM control, but also allows for video-driven animation.

Note:

I will update this repo with a proper layout and instructions. So far it has been tested with Python 3.10 and CUDA 11.8 in Anaconda on Windows.

Setup

Please complete the following steps.

Clone the repository:

git clone https://github.com/Inferencer/SickFace.git
cd SickFace

We recommend creating a new conda environment:

conda create -n sickface python=3.10
conda activate sickface

Dependencies

This code requires at least Python 3.10 and PyTorch.

  1. Install PyTorch (>= 1.12.0)

    conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
    
  2. Additional dependencies can be installed via:

    pip install -r requirements.txt
    
  3. Run the Gradio UI:

    python app.py
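
If the UI fails to launch or falls back to CPU, a quick sanity check (not part of this repo) is to confirm that PyTorch was installed with working CUDA support, for example:

    # Optional sanity check: confirm PyTorch sees the CUDA device.
    import torch

    print("PyTorch version:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:", torch.cuda.get_device_name(0))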

Alternatively, you can run inference from the command line using the following:

python demo.py --checkpoint checkpoints/vox256.pt --config ./configs/vox256.yaml --source_images ./examples/myimagefile.jpg --driving_video ./examples/mydrivingfile.mp4 --relative --adapt_scale --find_best_frame --audio

If you wish to use more than one source image, you can use up to two in total by adding another image path, for example:

python demo.py --checkpoint checkpoints/vox256_2Source.pt --config ./configs/vox256.yaml  --source_images ./examples/myimagefile.jpg ./examples/myimagefile2.jpg --driving_video ./examples/drive.mp4 --relative --adapt_scale --find_best_frame --audio

You can use other file formats such as .png if you wish. Before using multiple source images, I recommend reading the Fidelity issue below; note also that the checkpoint changes to vox256_2Source.pt.
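
If you need to process several inputs, a small wrapper script can drive the same CLI from Python. The sketch below is not part of the repo and the example file names are placeholders; only the flags mirror the commands above:

    # Hypothetical batch wrapper around the demo.py CLI; paths are placeholders.
    import subprocess

    jobs = [
        ("./examples/face_a.jpg", "./examples/drive.mp4"),
        ("./examples/face_b.jpg", "./examples/drive.mp4"),
    ]

    for source_image, driving_video in jobs:
        subprocess.run(
            [
                "python", "demo.py",
                "--checkpoint", "checkpoints/vox256.pt",
                "--config", "./configs/vox256.yaml",
                "--source_images", source_image,
                "--driving_video", driving_video,
                "--relative", "--adapt_scale", "--find_best_frame", "--audio",
            ],
            check=True,
        )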

Pretrained Checkpoints

Pretrained models can be found on Google Drive.

Download the models and place them in ./checkpoints, so that they sit at checkpoints/kp_detector.pt and checkpoints/vox256.pt. Note that all pretrained checkpoints were trained using the same keypoint detector weights.
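
To confirm the downloads are intact, a minimal sketch (assuming the .pt files are ordinary torch.save checkpoints, which this README does not guarantee) is to load and inspect them:

    # Inspect the downloaded checkpoints; assumes they are standard torch.save files.
    import torch

    for path in ("checkpoints/kp_detector.pt", "checkpoints/vox256.pt"):
        ckpt = torch.load(path, map_location="cpu")
        if isinstance(ckpt, dict):
            print(path, "->", len(ckpt), "top-level entries:", list(ckpt)[:5])
        else:
            print(path, "->", type(ckpt).__name__)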

Features

Version 1 (V1)

  • Code Base: Uses code from FSRT.
  • Enhancements:
    • Added an upscaling implementation.
    • Gradio UI.
    • Possible further training on the vox512 dataset or another dataset (not sure why vox256 has remained the default in recent years).
  • 3DMM Selection: The 3DMM to be used is yet to be decided from the following list:
    • CodeTalker
    • EmoTalk
    • Emote
    • FaceDiffuser
    • FaceFormer

Version 2 (V2)

  • Upgrades: Will incorporate advancements from InvertAvatar, Portrait-4dv2, or another state-of-the-art (SOTA) model released this year.
  • 3DMM Upgrade: Potential integration of the upcoming Media2Face.

Version 3 (V3)

  • Future Goals:
    • Expected release by late 2025.
    • Aim for Gaussian-based methods.
    • Focus on one-shot methods with minimal training requirements.
    • Training constraints: Should not exceed 1 hour on an A100 GPU and should use a maximum of 30 seconds of video identity data.

Conclusion

SickFace aims to push the boundaries of portrait animation by leveraging state-of-the-art techniques and efficient training methods. Stay tuned for updates as we progress through each version!


Issues

Fidelity

I have run some quick tests using multiple source images. Using three source images instead of one increased inference time by roughly 4x. There is also a quality loss with multiple source images: in my results, the right side (three source images) lacks hair texture and shine, and there is a slight difference in the nose that is not present in the source. I have decided to include the code regardless, as an option for command-line users.

Comparison video: 1v3.mp4
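
To reproduce a rough version of this timing comparison, one option is to time the CLI with one versus three source images. This is a sketch, not the script used for the numbers above; the file names and the choice of checkpoint are placeholders:

    # Rough timing comparison of 1 vs. 3 source images; paths/checkpoint are placeholders.
    import subprocess
    import time

    def run_demo(sources):
        cmd = [
            "python", "demo.py",
            "--checkpoint", "checkpoints/vox256_2Source.pt",
            "--config", "./configs/vox256.yaml",
            "--source_images", *sources,
            "--driving_video", "./examples/drive.mp4",
            "--relative", "--adapt_scale", "--find_best_frame", "--audio",
        ]
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        return time.perf_counter() - start

    t1 = run_demo(["./examples/face_a.jpg"])
    t3 = run_demo(["./examples/face_a.jpg", "./examples/face_b.jpg", "./examples/face_c.jpg"])
    print(f"1 source: {t1:.1f}s  3 sources: {t3:.1f}s  ratio: {t3 / t1:.1f}x")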
