
comfyui_aniportrait's Introduction

Updates:

① Implemented frame interpolation to speed up generation.

② Modified the code to support chaining with the VHS nodes. I found that ComfyUI's IMAGE type requires torch float32 data, while AniPortrait heavily uses uint8 numpy images, so I dropped my own image/video upload and generation nodes in favor of the prevalent SOTA VHS image/video upload and Video Combine nodes; they are WYSIWYG, interact well, and render results instantly.
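The datatype difference behind update ② can be sketched with a minimal round-trip helper (hypothetical names, shown with numpy only; in ComfyUI the float32 array would additionally be wrapped in a torch tensor):

```python
import numpy as np

def uint8_to_comfy(frames: np.ndarray) -> np.ndarray:
    """Convert HxWxC (or NxHxWxC) uint8 images in [0, 255] to float32 in
    [0, 1], the value range ComfyUI's IMAGE type expects."""
    return frames.astype(np.float32) / 255.0

def comfy_to_uint8(frames: np.ndarray) -> np.ndarray:
    """Convert float32 images in [0, 1] back to the uint8 range that
    AniPortrait's numpy-based code uses."""
    return (np.clip(frames, 0.0, 1.0) * 255.0).round().astype(np.uint8)
```

In the nodes themselves the float32 array would typically be wrapped with `torch.from_numpy(...)` before being passed downstream; that step is omitted here to keep the sketch dependency-free.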

  • ✅ [2024/04/09] raw video to pose video with reference image (aka self-driven)
  • ✅ [2024/04/09] audio driven
  • ✅ [2024/04/09] face reenactment
  • ✅ [2024/04/22] implemented the audio2pose model and pre-trained weights for audio2video; the face reenactment and audio2video workflows have been modified. Inference up to a maximum length of 10 seconds is currently supported; you can experiment with the length hyperparameter.

You can contact me via Twitter or WeChat: GalaticKing

audio driven combined with reference image and reference video

audio2video audio2video workflow

Aniportrait_00002-audio.mp4

raw video to pose video with reference image

pose2video

Aniportrait_00004-audio.mp4

face reenactment

face_reenacment video2video workflow

AnimateDiff_00001-audio.mp4

This is an unofficial implementation of AniPortrait as a ComfyUI custom node. Since I have a routine job, I will update this project when I have time.

Aniportrait_pose2video.json

Audio driven

face reenactment

You should run

git clone https://github.com/frankchieng/ComfyUI_Aniportrait.git

(typically inside ComfyUI's custom_nodes directory), then run

pip install -r requirements.txt

download the pretrained models

StableDiffusion V1.5

sd-vae-ft-mse

image_encoder

wav2vec2-base-960h

download the weights:

denoising_unet.pth reference_unet.pth pose_guider.pth motion_module.pth audio2mesh.pt audio2pose.pt film_net_fp16.pt

./pretrained_model/
|-- image_encoder
|   |-- config.json
|   `-- pytorch_model.bin
|-- sd-vae-ft-mse
|   |-- config.json
|   |-- diffusion_pytorch_model.bin
|   `-- diffusion_pytorch_model.safetensors
|-- stable-diffusion-v1-5
|   |-- feature_extractor
|   |   `-- preprocessor_config.json
|   |-- model_index.json
|   |-- unet
|   |   |-- config.json
|   |   `-- diffusion_pytorch_model.bin
|   `-- v1-inference.yaml
|-- wav2vec2-base-960h
|   |-- config.json
|   |-- feature_extractor_config.json
|   |-- preprocessor_config.json
|   |-- pytorch_model.bin
|   |-- README.md
|   |-- special_tokens_map.json
|   |-- tokenizer_config.json
|   `-- vocab.json
|-- audio2mesh.pt
|-- audio2pose.pt
|-- denoising_unet.pth
|-- motion_module.pth
|-- pose_guider.pth
|-- reference_unet.pth
|-- film_net_fp16.pt
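Misplaced or misnamed files under pretrained_model are a common source of load errors, so it may help to verify the layout before launching ComfyUI. A hypothetical helper that checks a few of the paths from the tree above (extend the list as needed):

```python
import os

# A sample of the expected files from the directory tree above.
EXPECTED = [
    "image_encoder/config.json",
    "sd-vae-ft-mse/config.json",
    "stable-diffusion-v1-5/model_index.json",
    "stable-diffusion-v1-5/unet/config.json",
    "wav2vec2-base-960h/config.json",
    "denoising_unet.pth",
    "motion_module.pth",
    "film_net_fp16.pt",
]

def check_pretrained_layout(root: str) -> list:
    """Return the expected files that are missing under root."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(root, p))]
```

Running `check_pretrained_layout("./pretrained_model")` makes folder-name mistakes fail fast with a readable list instead of mid-workflow with a traceback.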

Tips: an intermediate audio file will be generated and then deleted. The raw-video-to-pose video (with audio) and the pose2video mp4 files will be located in ComfyUI's output directory. The originally uploaded mp4 video must be square, e.g. 512x512; otherwise the result will look weird.
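Given the square-input requirement, frames can be squared up front. A minimal numpy-only sketch (hypothetical helper; nearest-neighbour resize for illustration — a real pipeline would use a proper resampler such as Pillow's or OpenCV's):

```python
import numpy as np

def to_square(frame: np.ndarray, size: int = 512) -> np.ndarray:
    """Center-crop an HxWxC frame to a square, then nearest-neighbour
    resize it to size x size (e.g. 512x512)."""
    h, w = frame.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    square = frame[top:top + side, left:left + side]
    # Nearest-neighbour index map from the cropped side to the target size.
    idx = np.linspace(0, side - 1, size).round().astype(int)
    return square[idx][:, idx]
```

Applying this to every frame before upload avoids the distorted results mentioned above.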

I've updated diffusers from 0.24.x to 0.26.2. In diffusers/models/embeddings.py the class PositionNet was renamed to GLIGENTextBoundingboxProjection and CaptionProjection to PixArtAlphaTextProjection, so pay attention to this and modify the corresponding Python files, such as src/models/transformer_2d.py, if you have a lower version of diffusers installed.
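One way to stay compatible across both diffusers versions is to resolve the class by whichever name exists. A small hypothetical helper, demonstrated below against a dummy namespace rather than diffusers itself:

```python
def resolve_renamed(module, *names):
    """Return the first attribute of `module` found among `names`.

    Example (not executed here):
        resolve_renamed(diffusers.models.embeddings,
                        "GLIGENTextBoundingboxProjection",  # diffusers >= 0.26
                        "PositionNet")                      # older diffusers
    """
    for name in names:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"None of {names} found in {module!r}")
```

The same pattern covers the CaptionProjection / PixArtAlphaTextProjection rename.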

comfyui_aniportrait's People

Contributors

frankchieng


comfyui_aniportrait's Issues

Thanks for your node!

I tested this node and the results are better than I expected. I'm looking forward to support for resolutions other than 512*512.

CLIPVisionModelWithProjection Shape Size Error

Unfortunately, running any of the example workflows I get the following error:

Error occurred when executing AniPortrait_Pose_Gen_Video:

Error(s) in loading state_dict for CLIPVisionModelWithProjection:
size mismatch for vision_model.embeddings.class_embedding: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for vision_model.embeddings.patch_embedding.weight: copying a param with shape torch.Size([1024, 3, 14, 14]) from checkpoint, the shape in current model is torch.Size([768, 3, 32, 32]).
size mismatch for vision_model.embeddings.position_embedding.weight: copying a param with shape torch.Size([257, 1024]) from checkpoint, the shape in current model is torch.Size([50, 768]).
size mismatch for vision_model.pre_layrnorm.weight: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for vision_model.pre_layrnorm.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([768]).
[...]
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.

File "E:\COMFY\ComfyUI-robe\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\COMFY\ComfyUI-robe\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\COMFY\ComfyUI-robe\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\COMFY\ComfyUI-robe\custom_nodes\ComfyUI_Aniportrait\nodes.py", line 169, in pose_generate_video
image_enc = CLIPVisionModelWithProjection.from_pretrained(image_encoder_path).to(dtype=weight_dtype, device=device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxx\.conda\envs\comfy\Lib\site-packages\transformers\modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xxx\.conda\envs\comfy\Lib\site-packages\transformers\modeling_utils.py", line 4155, in _load_pretrained_model
raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")

Is this related to my torch or transformers version?
I'm running transformers 4.40.2 and torch 2.3.0+cu118

Can you please help me to fix it?
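The mismatched shapes in the traceback (1024 in the checkpoint vs. 768 in the instantiated model) come straight from the image encoder's config.json, which suggests the wrong image_encoder was downloaded rather than a torch/transformers version issue. A hypothetical check of the local config:

```python
import json
import os

def image_encoder_hidden_size(encoder_dir: str):
    """Read the vision hidden size from the image encoder's config.json.

    The checkpoint in the traceback above expects 1024; reading 768 here
    would mean a different CLIP vision encoder was downloaded.
    """
    with open(os.path.join(encoder_dir, "config.json")) as f:
        cfg = json.load(f)
    # Depending on how the config was saved, the key may be top-level
    # or nested under a vision_config section.
    return cfg.get("hidden_size") or cfg.get("vision_config", {}).get("hidden_size")
```

This is only a diagnostic sketch; the fix would be to re-download the image_encoder listed in the installation section above.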

Error occurred when executing AniPortraitLoader

Error occurred when executing AniPortraitLoader:

Error no file named config.json found in directory E:\AI\ComfyUI-aki-v1.1\custom_nodes\ComfyUI-AniPortrait\pretrained_model\StableDiffusion V1.5.

File "E:\AI\ComfyUI-aki-v1.1\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "E:\AI\ComfyUI-aki-v1.1\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "E:\AI\ComfyUI-aki-v1.1\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "E:\AI\ComfyUI-aki-v1.1\custom_nodes\ComfyUI-AniPortrait\nodes.py", line 120, in run
reference_unet = UNet2DConditionModel.from_pretrained(
File "E:\AI\ComfyUI-aki-v1.1\python\lib\site-packages\huggingface_hub\utils_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "E:\AI\ComfyUI-aki-v1.1\python\lib\site-packages\diffusers\models\modeling_utils.py", line 567, in from_pretrained
config, unused_kwargs, commit_hash = cls.load_config(
File "E:\AI\ComfyUI-aki-v1.1\python\lib\site-packages\huggingface_hub\utils_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "E:\AI\ComfyUI-aki-v1.1\python\lib\site-packages\diffusers\configuration_utils.py", line 374, in load_config
raise EnvironmentError(

How can I save only the generated video?

I ran the workflow successfully, but the saved video contains the original image and the face mesh alongside the result. How can I save only the generated video?
