
comfy_vid2vid's Introduction

Vid2vid Node Suite for ComfyUI

A node suite for ComfyUI that lets you load an image sequence and generate a new image sequence with a different style or content.

Original repo: https://github.com/sylym/stable-diffusion-vid2vid

Install

First, install ComfyUI.

Then run:

cd ComfyUI/custom_nodes
git clone https://github.com/sylym/comfy_vid2vid
cd comfy_vid2vid

Next, install the dependencies:

python -m pip install -r requirements.txt

For the ComfyUI portable standalone build:

# You may need to adjust "..\..\..\python_embeded\python.exe" depending on your python_embeded location
..\..\..\python_embeded\python.exe -m pip install -r requirements.txt

Usage

All nodes are classified under the vid2vid category. Example workflows (such as vid2vid_with_lora.json) are included in the repository.

Nodes

LoadImageSequence


Load image sequence from a folder.

Inputs:

  • None

Outputs:

  • IMAGE

    • Image sequence
  • MASK_SEQUENCE

    • The alpha channel of the image sequence, which is used as the mask sequence.

Parameters:

  • image_sequence_folder

    • Select the folder that contains a sequence of images. The node only lists folders inside ComfyUI's input folder.
    • The folder should only contain images of the same size.
  • sample_start_idx

    • The start index of the image sequence. The image sequence will be sorted by image names.
  • sample_frame_rate

    • The sampling interval of the image sequence. If sample_frame_rate is 2, the node samples every second image.
  • n_sample_frames

    • The number of images in the sequence. The number of images in image_sequence_folder must be greater than or equal to sample_start_idx - 1 + n_sample_frames * sample_frame_rate (see the sketch after this list).
    • If you want to use the node CheckpointLoaderSimpleSequence to generate a sequence of pictures, set n_sample_frames >= 3.
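
The frame selection is plain index arithmetic over the sorted file listing. A minimal sketch of it (a hypothetical helper, not the node's actual code):

import os

def sample_frame_paths(folder, sample_start_idx, sample_frame_rate, n_sample_frames):
    # The node sorts frames by file name.
    names = sorted(os.listdir(folder))
    needed = sample_start_idx - 1 + n_sample_frames * sample_frame_rate
    if len(names) < needed:
        raise ValueError(f"folder has {len(names)} images, needs at least {needed}")
    # Take every sample_frame_rate-th image, starting at sample_start_idx (1-based).
    picked = [names[sample_start_idx - 1 + i * sample_frame_rate]
              for i in range(n_sample_frames)]
    return [os.path.join(folder, name) for name in picked]

For example, with sample_start_idx = 1, sample_frame_rate = 2 and n_sample_frames = 4, the 1st, 3rd, 5th and 7th images of the sorted listing are used, so the folder needs at least 8 images.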

LoadImageMaskSequence


Load mask sequence from a folder.

Inputs:

  • None

Outputs:

  • MASK_SEQUENCE
    • Image mask sequence

Parameters:

  • image_sequence_folder

    • Select the folder that contains a sequence of images. The node only lists folders inside ComfyUI's input folder.
    • The folder should only contain images of the same size.
  • channel

    • The channel of the image sequence that will be used as a mask (see the sketch after this list).
  • sample_start_idx

    • The start index of the image sequence. The image sequence will be sorted by image names.
  • sample_frame_rate

    • The sampling interval of the image sequence. If sample_frame_rate is 2, the node samples every second image.
  • n_sample_frames

    • The number of images in the sequence. The number of images in image_sequence_folder must be greater than or equal to sample_start_idx - 1 + n_sample_frames * sample_frame_rate.
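
A sketch of what the channel selection amounts to, assuming frames are loaded as RGBA tensors scaled to [0, 1] (illustrative only, not the node's code):

import torch

def masks_from_frames(frames: torch.Tensor, channel: str = "alpha") -> torch.Tensor:
    # frames: [N, H, W, 4] RGBA batch in [0, 1].
    idx = {"red": 0, "green": 1, "blue": 2, "alpha": 3}[channel]
    masks = frames[..., idx]          # [N, H, W]
    if channel == "alpha":
        masks = 1.0 - masks           # ComfyUI's image loader inverts alpha for masks
    return masks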

VAEEncodeForInpaintSequence


Encode the input image sequence into a latent vector using a Variational Autoencoder (VAE) model, and attach the image mask sequence to the latent vector (sketched below).

Inputs:

  • pixels: IMAGE

    • Image sequence that will be encoded.
  • vae: VAE

    • VAE model that will be used to encode the image sequence.
  • mask_sequence: MASK_SEQUENCE

    • Image mask sequence that will be added to the latent vector. The number of images and masks must be the same.

Outputs:

  • LATENT
    • The latent vector with image mask sequence. The image mask sequence in the latent vector will only take effect when using the node KSamplerSequence.

Parameters:

  • None
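
In ComfyUI, a LATENT is a dict carrying a "samples" tensor, and inpaint-style encoders attach the mask alongside it. A rough sketch of the output's shape, assuming that convention carries over here (not the node's actual code):

import torch

def encode_for_inpaint(vae, pixels: torch.Tensor, mask_sequence: torch.Tensor) -> dict:
    # pixels: [N, H, W, 3] image batch; latents: [N, 4, H/8, W/8] for SD-style VAEs.
    latents = vae.encode(pixels)
    assert latents.shape[0] == mask_sequence.shape[0], "one mask per frame"
    # The mask only takes effect later, when KSamplerSequence reads it back.
    return {"samples": latents, "noise_mask": mask_sequence}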

DdimInversionSequence


Generate a specific noise vector by inverting the input latent vector with DDIM inversion. Usually used to improve the temporal consistency of the output image sequence (see the sketch after the parameter list).

Inputs:

  • samples: LATENT

    • The latent vector that will be inverted.
  • model: MODEL

    • Full model that will be used to invert the latent vector.
  • clip: CLIP

    • CLIP model that will be used to invert the latent vector.

Outputs:

  • NOISE
    • The noise vector that will be used to generate the image sequence.

Parameters:

  • steps
    • The number of steps to invert the latent vector.
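
DDIM inversion runs the deterministic DDIM update in reverse: starting from the clean latent, each step moves toward higher noise using the model's own noise prediction, ending in a noise vector that reproduces the input when sampled normally. A schematic sketch of that idea (unet, context and alphas_cumprod are stand-ins, not the node's actual signature):

import torch

@torch.no_grad()
def ddim_invert(unet, latents, context, alphas_cumprod, steps=50):
    # Walk the deterministic DDIM trajectory from t=0 towards t=T.
    timesteps = torch.linspace(0, len(alphas_cumprod) - 1, steps).long()
    x = latents
    for i in range(len(timesteps) - 1):
        t, t_next = timesteps[i], timesteps[i + 1]
        eps = unet(x, t, context)                           # predicted noise
        a_t, a_next = alphas_cumprod[t], alphas_cumprod[t_next]
        x0 = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()      # predicted clean latent
        x = a_next.sqrt() * x0 + (1 - a_next).sqrt() * eps  # step towards noise
    return x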

SetLatentNoiseSequence


Add a noise vector to a latent vector.

Inputs:

  • samples: LATENT

    • The latent vector to which the noise will be added.
  • noise: NOISE

    • The noise vector that will be added to the latent vector.

Outputs:

  • LATENT
    • The latent vector with noise. The noise vector in the latent vector will only take effect when using the node KSamplerSequence.

Parameters:

  • None

CheckpointLoaderSimpleSequence


Load the checkpoint model into a UNet3DConditionModel. Usually used to generate a sequence of pictures with temporal continuity.

Inputs:

  • None

Outputs:

  • ORIGINAL_MODEL

    • Model for fine-tuning, not for inference
  • CLIP

    • The CLIP model
  • VAE

    • The VAE model

Parameters:

  • ckpt_name
    • The name of the checkpoint model. The model should be in the models/checkpoints folder.

LoraLoaderSequence


Same function as the LoraLoader node, but it acts on UNet3DConditionModel. Use it after the CheckpointLoaderSimpleSequence node and before the TrainUnetSequence node. The model input and output are both of the ORIGINAL_MODEL type.


TrainUnetSequence


Fine-tune the incoming model using the latent vector and context, then convert the model to inference mode (see the sketch after the parameter list).

Inputs:

  • samples: LATENT

    • The latent vector that will be used to fine-tune the incoming model.
  • model: ORIGINAL_MODEL

    • The model that will be fine-tuned.
  • context: CONDITIONING

    • The context used for fine-tuning the input model. It typically consists of words or sentences describing the subject of the action in the latent vector and its behavior.

Outputs:

  • MODEL
    • The fine-tuned model. This model is ready for inference.

Parameters:

  • seed
    • The seed used in model fine-tuning.
  • steps
    • The number of steps to fine-tune the model. If steps is 0, the model is not fine-tuned.
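
The fine-tuning follows the standard diffusion objective, Tune-A-Video style: noise the latent at a random timestep and regress the UNet's noise prediction against that noise. A condensed sketch (noise_scheduler follows the diffusers scheduler API; this is not the node's exact code):

import torch
import torch.nn.functional as F

def finetune_unet(unet, noise_scheduler, latents, context, steps, seed, lr=3e-5):
    torch.manual_seed(seed)
    optimizer = torch.optim.AdamW(unet.parameters(), lr=lr)
    unet.train()
    for _ in range(steps):
        noise = torch.randn_like(latents)
        t = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                          (latents.shape[0],), device=latents.device)
        noisy = noise_scheduler.add_noise(latents, noise, t)
        pred = unet(noisy, t, encoder_hidden_states=context).sample
        loss = F.mse_loss(pred.float(), noise.float())  # epsilon-prediction objective
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    unet.eval()  # the node returns the model ready for inference
    return unet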

KSamplerSequence


Same function as the KSampler node, with added support for the noise vector and the image mask sequence.
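
When a mask sequence is attached to the latent, inpaint-style samplers usually re-composite after each denoising step so that only the masked region is repainted. A sketch of that per-step blend, assuming KSamplerSequence follows the same convention:

import torch

def blend_with_mask(denoised, original_renoised, noise_mask):
    # denoised: the sampler's current latent; original_renoised: the input
    # latents re-noised to the current timestep; noise_mask: 1 = repaint.
    return noise_mask * denoised + (1.0 - noise_mask) * original_renoised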


Limits

  • UNet3DConditionModel has a high demand for GPU memory. If you encounter an out-of-memory error, try reducing n_sample_frames. However, n_sample_frames must be greater than or equal to 3.

  • Some custom nodes do not support processing image sequences. The nodes listed below have been tested and are working properly:

comfy_vid2vid's People

Contributors

rttt1093, sylym


comfy_vid2vid's Issues

CLIP.__init__() got an unexpected keyword argument 'config'

Error occurred when executing CheckpointLoaderSimpleSequence:

CLIP.__init__() got an unexpected keyword argument 'config'

File "D:\ComfyUI_windows_portable1\ComfyUI\execution.py", line 145, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "D:\ComfyUI_windows_portable1\ComfyUI\execution.py", line 75, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "D:\ComfyUI_windows_portable1\ComfyUI\execution.py", line 68, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "D:\ComfyUI_windows_portable1\ComfyUI\custom_nodes\comfy_vid2vid_init_.py", line 262, in load_checkpoint
out = load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
File "D:\ComfyUI_windows_portable1\ComfyUI\custom_nodes\comfy_vid2vid\sd.py", line 36, in load_checkpoint_guess_config
clip = CLIP(config=clip_config, embedding_directory=embedding_directory)

ComfyUI/comfy/sd.py was updated: please adapt routine and import

A function inside ComfyUI was updated:
def load_lora_for_models(model, clip, lora, strength_model, strength_clip)
Please adapt D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfy_vid2vid\sd.py accordingly:

https://github.com/sylym/comfy_vid2vid/blob/main/sd.py#L263-L272
https://github.com/comfyanonymous/ComfyUI/blob/90aa59709985ffa1b02b6330b7720ed399fbf4df/comfy/sd.py#L432-446

def load_lora_for_models(model, clip, lora, strength_model, strength_clip):
    key_map = model_lora_keys_unet(model.model)
    key_map = model_lora_keys_clip(clip.cond_stage_model, key_map)
    loaded = load_lora(lora, key_map)
    new_modelpatcher = model.clone()
    k = new_modelpatcher.add_patches(loaded, strength_model)
    new_clip = clip.clone()
    k1 = new_clip.add_patches(loaded, strength_clip)
    k = set(k)
    k1 = set(k1)
    for x in loaded:
        if (x not in k) and (x not in k1):
            print("NOT LOADED", x)
    return (new_modelpatcher, new_clip)

https://github.com/sylym/comfy_vid2vid/blob/main/sd.py#L3C2-L3C2
from comfy.sd import load_model_weights, ModelPatcher, VAE, CLIP, model_lora_keys_unet, model_lora_keys_clip

Otherwise this error gets thrown:

Traceback (most recent call last):
File "D:\ComfyUI_windows_portable\ComfyUI\nodes.py", line 1647, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 883, in exec_module
File "", line 241, in call_with_frames_removed
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfy_vid2vid_init
.py", line 8, in
from .sd import load_checkpoint_guess_config, load_lora_for_models
File "D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfy_vid2vid\sd.py", line 3, in
from comfy.sd import load_model_weights, ModelPatcher, VAE, CLIP, model_lora_keys
ImportError: cannot import name 'model_lora_keys' from 'comfy.sd' (D:\ComfyUI_windows_portable\ComfyUI\comfy\sd.py)

Cannot import D:\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfy_vid2vid module for custom nodes: cannot import name 'model_lora_keys' from 'comfy.sd' (D:\ComfyUI_windows_portable\ComfyUI\comfy\sd.py)

Vid2Vid won't launch

I get a few messages:

I get this (probably unrelated)
D:\DProgram Files\Python\Python310\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py:65: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names. Available providers: 'CPUExecutionProvider'
warnings.warn(
Could not find efficiency nodes

Then this:
Traceback (most recent call last):
File "E:\ComfyUI\nodes.py", line 1698, in load_custom_node
module_spec.loader.exec_module(module)
File "", line 883, in exec_module
File "", line 241, in call_with_frames_removed
File "E:\ComfyUI\custom_nodes\comfy_vid2vid_init
.py", line 8, in
from .sd import load_checkpoint_guess_config, load_lora_for_models
File "E:\ComfyUI\custom_nodes\comfy_vid2vid\sd.py", line 3, in
from comfy.sd import load_model_weights, ModelPatcher, VAE, CLIP, model_lora_keys_unet, model_lora_keys_clip
ImportError: cannot import name 'ModelPatcher' from 'comfy.sd' (E:\ComfyUI\comfy\sd.py)

Next this
Cannot import E:\ComfyUI\custom_nodes\comfy_vid2vid module for custom nodes: cannot import name 'ModelPatcher' from 'comfy.sd' (E:\ComfyUI\comfy\sd.py)
E:\ComfyUI\custom_nodes\failfast-comfyui-extensions\extensions
E:\ComfyUI\web\extensions\failfast-comfyui-extensions
Total VRAM 12288 MB, total RAM 16172 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 3060 : cudaMallocAsync
VAE dtype: torch.bfloat16
WAS Node Suite: OpenCV Python FFMPEG support is enabled
WAS Node Suite Warning: ffmpeg_bin_path is not set in E:\ComfyUI\custom_nodes\was-node-suite-comfyui\was_suite_config.json config file. Will attempt to use system ffmpeg binaries if available.

Finally this
0.0 seconds (IMPORT FAILED): E:\ComfyUI\custom_nodes\comfy_vid2vid

If there are some lines in some code I'm supposed to update, please be specific about the filename and the place to update. I see other people with similar errors, but not the same ones, so I'm not sure if they're related. I do have FFMPEG installed on the system and it is on the PATH properly.

Couple Tweaks Needed

Just downloaded this for the first time today, so my bad if it's just an issue on my end or I'm using an unsupported diffusers version.

In Models/Attention.py & Unet.py...

from diffusers.modeling_utils should now be:

from diffusers.models.modeling_utils

In c_vid2vid's sd.py, to mirror the referenced sd.py file (at least in the latest comfyui):

Lines 2&3 should be:

from comfy import model_management, model_patcher
from comfy.sd import load_model_weights, VAE, CLIP, load_lora_for_models

I might be wrong about the latter tweak, but I received an error if I went a sub-ref below to get "ModelPatcher, model_lora_keys_unet, model_lora_keys_clip" from sd.py.

Vid2vid not showing up in node options

I downloaded vid2vid using the manager for ComfyUI, but I still don't see the node options. What else do I need to do? Run the Python requirements? Where/how would I install that?

Error when starting up ComfyUI

I get this error after installing it and starting up ComfyUI:

Traceback (most recent call last):
File "C:\WorkingFiles\ML\StableDiffusion\GitRepos\ComfyUI\nodes.py", line 1093, in load_custom_node
module_spec.loader.exec_module(module)
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "C:\WorkingFiles\ML\StableDiffusion\GitRepos\ComfyUI\custom_nodes\comfy_vid2vid\__init__.py", line 8, in <module>
from .sd import load_checkpoint_guess_config
File "C:\WorkingFiles\ML\StableDiffusion\GitRepos\ComfyUI\custom_nodes\comfy_vid2vid\sd.py", line 3, in <module>
from comfy.sd import load_torch_file, load_model_weights, ModelPatcher, VAE, CLIP
ImportError: cannot import name 'load_torch_file' from 'comfy.sd' (C:\WorkingFiles\ML\StableDiffusion\GitRepos\ComfyUI\comfy\sd.py)

Error occurred when executing CheckpointLoaderSimpleSequence:

When running CheckpointLoaderSimpleSequence i get the error

CLIP.__init__() got an unexpected keyword argument 'config'

File "C:\Users\xxxx\Desktop\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "C:\Users\xxxx\Desktop\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "C:\Users\xxxx\Desktop\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "C:\Users\xxxx\Desktop\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfy_vid2vid_init_.py", line 262, in load_checkpoint
out = load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
File "C:\Users\xxxx\Desktop\ComfyUI_windows_portable_nvidia_cu118_or_cpu\ComfyUI_windows_portable\ComfyUI\custom_nodes\comfy_vid2vid\sd.py", line 36, in load_checkpoint_guess_config
clip = CLIP(config=clip_config, embedding_directory=embedding_directory)

AttributeError: 'SparseCausalAttention' object has no attribute 'group_norm'

Getting this error:

Traceback (most recent call last):
  File "E:\code\stable-difusion\ComfyUI\execution.py", line 257, in execute     
    recursive_execute(self.server, prompt, self.outputs, x, extra_data, executed, prompt_id, self.outputs_ui)
  File "E:\code\stable-difusion\ComfyUI\execution.py", line 120, in recursive_execute
    recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed, prompt_id, outputs_ui)
  File "E:\code\stable-difusion\ComfyUI\execution.py", line 120, in recursive_execute
    recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed, prompt_id, outputs_ui)
  File "E:\code\stable-difusion\ComfyUI\execution.py", line 120, in recursive_execute
    recursive_execute(server, prompt, outputs, input_unique_id, extra_data, executed, prompt_id, outputs_ui)
  File "E:\code\stable-difusion\ComfyUI\execution.py", line 128, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "E:\code\stable-difusion\ComfyUI\execution.py", line 75, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "E:\code\stable-difusion\ComfyUI\execution.py", line 68, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "E:\code\stable-difusion\ComfyUI\custom_nodes\comfy_vid2vid\__init__.py", line 332, in train_unet
    model_train = train(copy.deepcopy(model), noise_scheduler, samples, context[0][0].squeeze(0), device, max_train_steps=steps, seed=seed)
  File "E:\code\stable-difusion\ComfyUI\custom_nodes\comfy_vid2vid\train_tuneavideo.py", line 185, in train
    model_pred = unet(noisy_latents, timesteps, encoder_hidden_states)
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\utils\operations.py", line 521, in forward
    return model_forward(*args, **kwargs)
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\utils\operations.py", line 509, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\amp\autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "E:\code\stable-difusion\ComfyUI\custom_nodes\comfy_vid2vid\tuneavideo\models\unet.py", line 349, in forward
    sample, res_samples = downsample_block(
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\code\stable-difusion\ComfyUI\custom_nodes\comfy_vid2vid\tuneavideo\models\unet_blocks.py", line 301, in forward
    hidden_states = torch.utils.checkpoint.checkpoint(
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\checkpoint.py", line 249, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\autograd\function.py", line 506, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\checkpoint.py", line 107, in forward
    outputs = run_function(*args)
  File "E:\code\stable-difusion\ComfyUI\custom_nodes\comfy_vid2vid\tuneavideo\models\unet_blocks.py", line 294, in custom_forward
    return module(*inputs, return_dict=return_dict)
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\code\stable-difusion\ComfyUI\custom_nodes\comfy_vid2vid\tuneavideo\models\attention.py", line 111, in forward
    hidden_states = block(
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\code\stable-difusion\ComfyUI\custom_nodes\comfy_vid2vid\tuneavideo\models\attention.py", line 243, in forward
    hidden_states = self.attn1(norm_hidden_states, attention_mask=attention_mask, video_length=video_length) + hidden_states
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\code\stable-difusion\ComfyUI\custom_nodes\comfy_vid2vid\tuneavideo\models\attention.py", line 278, in forward
    if self.group_norm is not None:
  File "C:\Users\fcmen\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\modules\module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'SparseCausalAttention' object has no attribute 'group_norm'    

with this workflow:

(attached workflow image: ComfyUI_00001_)

ValueError: All contexts must be equal in UNET

Using pretty much a copy of the vid2vid_with_lora.json workflow, I'm getting this error when sampling; the UNet finishes training normally.

I was trying to perform a simple test with a MikuMikuDance video on a black background. The input contexts tested were:

  • woman, dance
  • hatsune miku, dancing
  • hatsune miku, dance

I'm a little unsure whether this is something on my side only or a consequence of ComfyUI's newer updates. I tried tinkering with the tensor shapes, to no avail.
