
animatediff-cli-prompt-travel's People

Contributors

jcbrouwer, neggles, pre-commit-ci[bot], s9roll7, skquark, threeal

animatediff-cli-prompt-travel's Issues

pad_to_multiple_of

Using negative embeddings gives this in the terminal. I'm just ignoring it, but:
You are resizing the embedding layer without providing a pad_to_multiple_of parameter. This means that the new embedding dimension will be 49499. This might induce some performance reduction as *Tensor Cores* will not be available. For more details about this, or help on choosing the correct value for resizing, refer to this guide: https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#requirements-tc
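A minimal sketch of where the warning comes from and how it can be silenced, assuming the embeddings are loaded through a standard transformers tokenizer/text-encoder pair (this is not the repo's exact loading code, and the token name is made up):

from transformers import CLIPTextModel, CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

# Adding textual-inversion tokens grows the vocabulary past its original size.
tokenizer.add_tokens(["<hypothetical-negative-embedding>"])

# Without pad_to_multiple_of, the resized embedding matrix ends up with an odd row
# count (49499 in the warning) and Tensor Cores go unused. Padding to a multiple of
# 64 keeps the shape Tensor-Core friendly; it only affects performance, not results.
text_encoder.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=64)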

[Test] Upscale test!

I tried a few ways to upscale the video generated by AnimateDiff.

Webui ControlNet (tile + lineart + TemporalNet) batch img2img

Flickering is visible throughout the video, just like a normal batch img2img.

5-2x-RIFE-RIFE4.0-48fps.mp4

AnimateDiff ControlNet tile

Much cleaner than TemporalNet! However, due to high VRAM consumption, upscaling beyond 1024x1536 is not possible.

27.mp4

AnimateDiff ControlNet tile -> Webui Adetailer + NMKD YandereNeo (4x)

I upscaled the video to 1024x1536 with AnimateDiff, then used Adetailer to enhance the detail on the character's face, then upscaled it to 4K resolution with NMKD YandereNeo (4x). It became very sharp!

33.mp4

https://github.com/Bing-su/adetailer

https://openmodeldb.info/models/4x-NMKD-YandereNeo

https://huggingface.co/CiaraRowles/TemporalNet/tree/main

Help! How to stylize a video?

Can you please put together a more detailed tutorial on stylizing a video? For some reason I can't connect to Hugging Face to download the models online, so I would like to know which models need to be downloaded and the paths where they should be stored. I'd also like to know the stylize steps in more detail; the current instructions seem a bit obscure. That doesn't stop this from being a wonderful project, I'd just like more detail. Thanks!

ImportError: cannot import name 'maybe_allow_in_graph' from 'diffusers.utils'

While trying to set up the animatediff-cli project following the installation steps, I got an ImportError when running the animatediff --help command.

I did the following:
  1. Cloned the repository using git clone https://github.com/neggles/animatediff-cli
  2. Created a virtual environment using python3.10 -m venv .venv
  3. Activated the virtual environment with source .venv/bin/activate
  4. Installed Torch and the other dependencies as per the instructions.
  5. Ran animatediff --help

Received an ImportError with the following traceback:

Traceback (most recent call last):
  File "/home/user/Deep-Learning/Stable Diffusion/animatediff-cli/.venv/bin/animatediff", line 7, in <module>
    from animatediff.cli import cli
  File "/home/user/Deep-Learning/Stable Diffusion/animatediff-cli/src/animatediff/cli.py", line 12, in <module>
    from animatediff.generate import create_pipeline, run_inference
  File "/home/user/Deep-Learning/Stable Diffusion/animatediff-cli/src/animatediff/generate.py", line 13, in <module>
    from animatediff.models.unet import UNet3DConditionModel
  File "/home/user/Deep-Learning/Stable Diffusion/animatediff-cli/src/animatediff/models/unet.py", line 18, in <module>
    from .unet_blocks import (
  File "/home/user/Deep-Learning/Stable Diffusion/animatediff-cli/src/animatediff/models/unet_blocks.py", line 9, in <module>
    from animatediff.models.attention import Transformer3DModel
  File "/home/user/Deep-Learning/Stable Diffusion/animatediff-cli/src/animatediff/models/attention.py", line 10, in <module>
    from diffusers.utils import BaseOutput, maybe_allow_in_graph
ImportError: cannot import name 'maybe_allow_in_graph' from 'diffusers.utils'

Environment:

OS: Ubuntu
Python Version: 3.10
Torch Version: 2.0.1+cu118

Would appreciate any help, thanks.
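(Not a fix, just a diagnostic sketch: maybe_allow_in_graph has moved around between diffusers releases, so a diffusers version other than the one the repo pins is the usual suspect. Checking which diffusers the venv actually imports narrows it down; the snippet simply repeats the failing import.)

import diffusers
print(diffusers.__version__)

# Re-raises the same ImportError on releases where this helper is missing or has moved:
from diffusers.utils import maybe_allow_in_graph
print("maybe_allow_in_graph is importable")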

Cannot run offline?

Every time I run this program it tries to connect to 'raw.githubusercontent.com'.
I don't know if it's a problem with my settings or if that's just how the program works.
Sometimes the connection fails and an error is printed: "ConnectionError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url:
/CompVis/stable-diffusion/main/configs/stable-diffusion/v1-inference.yaml (Caused by
NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000001F8CD5B4CA0>: Failed to establish a new
connection: [Errno 11004] getaddrinfo failed'))"
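That download is diffusers fetching v1-inference.yaml while converting a single-file checkpoint. A hedged workaround sketch, assuming you are willing to pre-convert the checkpoint yourself (or patch checkpoint_to_pipeline to pass the extra argument): download that YAML once, then point the conversion at the local copy so no network access is needed. Paths below are examples only.

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "data/models/sd/your_checkpoint.safetensors",   # hypothetical checkpoint path
    original_config_file="data/v1-inference.yaml",  # the pre-downloaded YAML
    local_files_only=True,
    load_safety_checker=False,
)
pipe.save_pretrained("data/models/huggingface/your_checkpoint")  # reuse as a Diffusers dir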

Using openpose and tile causes the result color to be strange

I used the same SD model.
I have 3 keyframes in tile:
image
and 3 keyframes in openpose:
image
but the resulting color is strange, as if no VAE were being used, and the background is not black:
image

Every frame looks like this:
00000027

How can I generate results that look like the tile keyframes?
This is my prompt:
My command is: animatediff generate -c config/prompts/prompt.json -W 448 -H 640 -L 30 -C 16
prompt.zip

Color change problems when using controlnet inpaint

I tested using the inpaint ControlNet. It was an attempt to duplicate the input image, not to inpaint. The attempt was somewhat successful, but the color changed. I tried various things but couldn't find the cause of the problem. Is there a way to solve it? Thanks!

image
image

image

First try, AttributeError: 'Attention' object has no attribute 'to_to_k'

Run command: animatediff generate -c config\prompts\prompt_travel.json
In prompt_travel.json I only changed the model and the motion paths. What else do I need to change?
Beyond the "win setup" readme instructions, I had an error at first which said to run pip install mediapipe
I have not copied over any controlnet models and haven't toggled any of the fields to disable, just looking for defaults that work.

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\a\animatediff-cli-prompt-travel\src\animatediff\cli.py:322 in generate │
│ │
│ 319 │ global g_pipeline │
│ 320 │ global last_model_path │
│ 321 │ if g_pipeline is None or last_model_path != model_config.path.resolve(): │
│ ❱ 322 │ │ g_pipeline = create_pipeline( │
│ 323 │ │ │ base_model=base_model_path, │
│ 324 │ │ │ model_config=model_config, │
│ 325 │ │ │ infer_config=infer_config, │
│ │
│ C:\a\animatediff-cli-prompt-travel\src\animatediff\generate.py:321 in │
│ create_pipeline │
│ │
│ 318 │ │ logger.info(f"Loading weights from {model_path}") │
│ 319 │ │ if model_path.is_file(): │
│ 320 │ │ │ logger.debug("Loading from single checkpoint file") │
│ ❱ 321 │ │ │ unet_state_dict, tenc_state_dict, vae_state_dict = get_checkpoint_weights(mo │
│ 322 │ │ elif model_path.is_dir(): │
│ 323 │ │ │ logger.debug("Loading from Diffusers model directory") │
│ 324 │ │ │ temp_pipeline = StableDiffusionPipeline.from_pretrained(model_path) │
│ │
│ C:\a\animatediff-cli-prompt-travel\src\animatediff\utils\model.py:73 in │
│ get_checkpoint_weights │
│ │
│ 70 │
│ 71 def get_checkpoint_weights(checkpoint: Path): │
│ 72 │ temp_pipeline: StableDiffusionPipeline │
│ ❱ 73 │ temp_pipeline, _ = checkpoint_to_pipeline(checkpoint, save=False) │
│ 74 │ unet_state_dict = temp_pipeline.unet.state_dict() │
│ 75 │ tenc_state_dict = temp_pipeline.text_encoder.state_dict() │
│ 76 │ vae_state_dict = temp_pipeline.vae.state_dict() │
│ │
│ C:\a\animatediff-cli-prompt-travel\src\animatediff\utils\model.py:58 in │
│ checkpoint_to_pipeline │
│ │
│ 55 │ if target_dir is None: │
│ 56 │ │ target_dir = pipeline_dir.joinpath(checkpoint.stem) │
│ 57 │ │
│ ❱ 58 │ pipeline = StableDiffusionPipeline.from_single_file( │
│ 59 │ │ pretrained_model_link_or_path=str(checkpoint.absolute()), │
│ 60 │ │ local_files_only=True, │
│ 61 │ │ load_safety_checker=False, │
│ │
│ c:\a\animatediff-cli-prompt-travel\venv\lib\site-packages\diffusers\loaders.p │
│ y:1471 in from_single_file │
│ │
│ 1468 │ │ │ │ force_download=force_download, │
│ 1469 │ │ │ ) │
│ 1470 │ │ │
│ ❱ 1471 │ │ pipe = download_from_original_stable_diffusion_ckpt( │
│ 1472 │ │ │ pretrained_model_link_or_path, │
│ 1473 │ │ │ pipeline_class=cls, │
│ 1474 │ │ │ model_type=model_type, │
│ │
│ c:\a\animatediff-cli-prompt-travel\venv\lib\site-packages\diffusers\pipelines │
│ \stable_diffusion\convert_from_ckpt.py:1374 in download_from_original_stable_diffusion_ckpt │
│ │
│ 1371 │ │ │ vae = AutoencoderKL(**vae_config) │
│ 1372 │ │ │
│ 1373 │ │ for param_name, param in converted_vae_checkpoint.items(): │
│ ❱ 1374 │ │ │ set_module_tensor_to_device(vae, param_name, "cpu", value=param) │
│ 1375 │ else: │
│ 1376 │ │ vae = AutoencoderKL.from_pretrained(vae_path) │
│ 1377 │
│ │
│ c:\a\animatediff-cli-prompt-travel\venv\lib\site-packages\accelerate\utils\mo │
│ deling.py:269 in set_module_tensor_to_device │
│ │
│ 266 │ if "." in tensor_name: │
│ 267 │ │ splits = tensor_name.split(".") │
│ 268 │ │ for split in splits[:-1]: │
│ ❱ 269 │ │ │ new_module = getattr(module, split) │
│ 270 │ │ │ if new_module is None: │
│ 271 │ │ │ │ raise ValueError(f"{module} has no attribute {split}.") │
│ 272 │ │ │ module = new_module │
│ │
│ c:\a\animatediff-cli-prompt-travel\venv\lib\site-packages\torch\nn\modules\mo │
│ dule.py:1614 in __getattr__ │
│ │
│ 1611 │ │ │ modules = self.__dict__['_modules'] │
│ 1612 │ │ │ if name in modules: │
│ 1613 │ │ │ │ return modules[name] │
│ ❱ 1614 │ │ raise AttributeError("'{}' object has no attribute '{}'".format( │
│ 1615 │ │ │ type(self).__name__, name)) │
│ 1616 │ │
│ 1617 │ def __setattr__(self, name: str, value: Union[Tensor, 'Module']) -> None: │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'Attention' object has no attribute 'to_to_k'

traveling lora weights?

First, thanks for this amazing repo; the potential is amazing already. I find myself using prompts to steer AnimateDiff but using ControlNet to design the frames. With LoRA, would it be possible to also let LoRA weights travel? Say I have an inference with a scene change: I can use prompt travel to change the prompt to match, but being able to change which LoRA influences those frames would be huge.

Controlnet seg preprocessor detect map result is different from webui extension

The detect map produced by the controlnet_seg preprocessor differs from the one produced by the webui ControlNet extension. The webui image has a fixed color for each object, but the prompt-travel image flickers and the colors are not fixed.

Webui extension result
ezgif com-gif-maker (1)

Generated AnimateDiff video
01_27748780903300_masterpiece_best-quality_no-humans_blue-sky_white-cloud_outdoors

animatediff-cli-prompt-travel result
ezgif com-gif-maker

Generated AnimateDiff video
00_24834798825700_masterpiece_best-quality_1girl_solo_blonde-hair_blue-eyes

json config
image

mm_sd_v15_v2.safetensors does not exist or is not a file!

In generate.py, I am getting this error

Motion module /home/foo/apps/animatediff-cli-prompt-travel/data/models/Motion_Module/mm_sd_v15_v2.safetensors does not exist or is not a file!

Coming from this line of code:

            if not (motion_module.exists() and motion_module.is_file()):
                # this should never happen, but just in case...
                raise FileNotFoundError(f"Motion module {motion_module} does not exist or is not a file!")

It looks like the logic expects a .safetensors file, but what I have is a .ckpt file. Thanks.
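A hedged note rather than a fix: the check only verifies that the configured path exists, and .ckpt motion modules are referenced in other configs in this thread (e.g. mm_sd_v15.ckpt), so pointing the model config's motion_module entry at the file that is actually on disk should satisfy it. A quick way to confirm the path the check will see (path below is an example):

from pathlib import Path

motion_module = Path("data/models/Motion_Module/mm_sd_v15_v2.ckpt")  # adjust to your file
print(motion_module, "->", motion_module.exists() and motion_module.is_file())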

Question: Is there a preview mode to review some keyframes?

I have really enjoyed making videos with this project, but sometimes I feel a bit lost when trying to preview specific keyframes. For example, when I need to review the frames at 00:05, 00:10, 00:15… those moments are critical for ensuring the quality of the videos.

For now, I have to generate and upscale, then retry over and over until the frames look good.

Shape issues with controlnet_shuffle

Hello, I'm having some trouble getting the majority of ControlNets to work correctly.

So far I've tried depth, canny, lineart_anime, tile, and shuffle, but have only been able to generate a video with shuffle (but the results looked pretty weird, at least for the first part of the video).

Result GIF with shuffle

00_17276191161868842010_sci-fi-futuristic-digital-machinery-on-a-white-background_beautiful-cyberpunk-cartoon-illustration

The error I get is a shape mismatch in the ControlNet results:

Traceback
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /HUGE/Code/animatediff-cli-prompt-travel/src/animatediff/cli.py:363 in generate                  │
│                                                                                                  │
│   360 │   │   │   │   if int(k) < length:                                                        │
│   361 │   │   │   │   │   prompt_map[int(k)]=model_config.prompt_map[k]                          │
│   362 │   │   │                                                                                  │
│ ❱ 363 │   │   │   output = run_inference(                                                        │
│   364 │   │   │   │   pipeline=pipeline,                                                         │
│   365 │   │   │   │   prompt="this is dummy string",                                             │
│   366 │   │   │   │   n_prompt=n_prompt,                                                         │
│                                                                                                  │
│ /HUGE/Code/animatediff-cli-prompt-travel/src/animatediff/generate.py:443 in run_inference        │
│                                                                                                  │
│   440 │                                                                                          │
│   441 │   seed_everything(seed)                                                                  │
│   442 │                                                                                          │
│ ❱ 443 │   pipeline_output = pipeline(                                                            │
│   444 │   │   prompt=prompt,                                                                     │
│   445 │   │   negative_prompt=n_prompt,                                                          │
│   446 │   │   num_inference_steps=steps,                                                         │
│                                                                                                  │
│ /home/hans/.conda/envs/hans/lib/python3.10/site-packages/torch/utils/_contextlib.py:115 in       │
│ decorate_context                                                                                 │
│                                                                                                  │
│   112 │   @functools.wraps(func)                                                                 │
│   113 │   def decorate_context(*args, **kwargs):                                                 │
│   114 │   │   with ctx_factory():                                                                │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                                                   │
│   116 │                                                                                          │
│   117 │   return decorate_context                                                                │
│   118                                                                                            │
│                                                                                                  │
│ /HUGE/Code/animatediff-cli-prompt-travel/src/animatediff/pipelines/animation.py:958 in __call__  │
│                                                                                                  │
│    955 │   │   │   │   │                                                                         │
│    956 │   │   │   │   │   cur_prompt = get_current_prompt_embeds(context, latents.shape[2])     │
│    957 │   │   │   │   │                                                                         │
│ ❱  958 │   │   │   │   │   down_block_res_samples,mid_block_res_sample = get_controlnet_result(  │
│    959 │   │   │   │   │                                                                         │
│    960 │   │   │   │   │   # predict the noise residual                                          │
│    961 │   │   │   │   │   pred = self.unet(                                                     │
│                                                                                                  │
│ /HUGE/Code/animatediff-cli-prompt-travel/src/animatediff/pipelines/animation.py:899 in           │
│ get_controlnet_result                                                                            │
│                                                                                                  │
│    896 │   │   │   │   │   │   mod = torch.tensor(scales).to(device, dtype=cur_mid.dtype)        │
│    897 │   │   │   │   │   │                                                                     │
│    898 │   │   │   │   │   │   add = cur_mid * mod[None,None,:,None,None]                        │
│ ❱  899 │   │   │   │   │   │   _mid_block_res_samples[:, :, loc_index, :, :] = _mid_block_res_s  │
│    900 │   │   │   │   │   │                                                                     │
│    901 │   │   │   │   │   │   for ii in range(len(cur_down)):                                   │
│    902 │   │   │   │   │   │   │   add = cur_down[ii] * mod[None,None,:,None,None]               │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: shape mismatch: value tensor of shape [2, 1280, 1, 8, 8] cannot be broadcast to indexing result of shape [2, 1280, 1, 1, 1]

When I take a look at the shapes, all of the ControlNets that don't work are indeed different from shuffle, for example:

1_controlnet_shuffle
cur_down = [torch.Size([2, 320, 1, 1, 1]), torch.Size([2, 320, 1, 1, 1]), torch.Size([2, 320, 1, 1, 1]), torch.Size([2, 320, 1, 1, 1]), torch.Size([2, 640, 1, 1, 1]), torch.Size([2, 640, 1, 1, 1]), torch.Size([2,
640, 1, 1, 1]), torch.Size([2, 1280, 1, 1, 1]), torch.Size([2, 1280, 1, 1, 1]), torch.Size([2, 1280, 1, 1, 1]), torch.Size([2, 1280, 1, 1, 1]), torch.Size([2, 1280, 1, 1, 1])]
cur_mid =  torch.Size([2, 1280, 1, 1, 1])

1_controlnet_tile
cur_down = [torch.Size([2, 320, 1, 64, 64]), torch.Size([2, 320, 1, 64, 64]), torch.Size([2, 320, 1, 64, 64]), torch.Size([2, 320, 1, 32, 32]), torch.Size([2, 640, 1, 32, 32]), torch.Size([2, 640, 1, 32, 32]), 
torch.Size([2, 640, 1, 16, 16]), torch.Size([2, 1280, 1, 16, 16]), torch.Size([2, 1280, 1, 16, 16]), torch.Size([2, 1280, 1, 8, 8]), torch.Size([2, 1280, 1, 8, 8]), torch.Size([2, 1280, 1, 8, 8])]
cur_mid =  torch.Size([2, 1280, 1, 8, 8])
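Purely as an illustration of why the assignment fails (this mirrors the error message, not the repo's actual allocation logic): if the accumulator for the mid-block residuals ends up sized from shuffle's 1x1 outputs while tile produces 8x8 outputs at the same block, the indexed write cannot broadcast.

import torch

accum = torch.zeros(2, 1280, 4, 1, 1)   # hypothetical accumulator shaped like shuffle's result
tile  = torch.zeros(2, 1280, 1, 8, 8)   # tile residual for one frame at the same block
loc_index = [0]

# RuntimeError: shape mismatch: value tensor of shape [2, 1280, 1, 8, 8]
# cannot be broadcast to indexing result of shape [2, 1280, 1, 1, 1]
accum[:, :, loc_index, :, :] = accum[:, :, loc_index, :, :] + tile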

I'm using the exact same images in data/controlnet_image/test/controlnet_tile and data/controlnet_image/test/controlnet_shuffle. They're all 512x512 which is the same resolution that I'm rendering at (I also tried making the ControlNet images 256, but got the exact same result).

The pre-processed images in output/.../00_detectmap look correct for all of the ControlNets, but still only shuffle has the right shape during generation.

Any idea what I might be doing wrong?

My config
{
  "name": "blobcube",
  "path": "models/sd/v1-5-pruned-emaonly.safetensors",
  "motion_module": "models/motion-module/mm_sd_v15.ckpt",
  "compile": false,
  "seed": [
    -1
  ],
  "scheduler": "k_dpmpp_sde",
  "steps": 40,
  "guidance_scale": 20,
  "clip_skip": 2,
  "prompt_map": {
    "0": "sci-fi futuristic digital machinery on a white background, beautiful cyberpunk cartoon illustration"
  },
  "n_prompt": [
    ""
  ],
  "lora_map": {},
  "controlnet_map": {
    "input_image_dir": "controlnet_image/test",
    "max_samples_on_vram": 999,
    "save_detectmap": true,
    "controlnet_shuffle": {
      "enable": true,
      "use_preprocessor": true,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list": []
    },
    "controlnet_tile": {
      "enable": true,
      "use_preprocessor": true,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list": []
    },
    "controlnet_depth": {
      "enable": false,
      "use_preprocessor": true,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list": []
    },
    "controlnet_lineart_anime": {
      "enable": true,
      "use_preprocessor": true,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list": []
    }
  },
  "upscale_config": {
    "scheduler": "k_dpmpp_sde",
    "steps": 20,
    "strength": 0.5,
    "guidance_scale": 10,
    "controlnet_tile": {
      "enable": true,
      "controlnet_conditioning_scale": 1.0,
      "guess_mode": false,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0
    },
    "controlnet_ref": {
      "enable": true,
      "use_frame_as_ref_image": false,
      "use_1st_frame_as_ref_image": true,
      "ref_image": "none",
      "attention_auto_machine_weight": 1.0,
      "gn_auto_machine_weight": 1.0,
      "style_fidelity": 0.25,
      "reference_attn": true,
      "reference_adain": false
    }
  }
}

New motion lora support?

Curious if anyone's taking a stab at implementing the new motion LoRAs for camera movements? It strikes me that this project may handle those movements even better than the base repo, since you could in theory transition from one camera movement to another.

IndexError: list assignment index out of range

13:13:19 INFO     Creating AnimationPipeline...                                                                                                               generate.py:206
         INFO     No TI embeddings found                                                                                                                            ti.py:102
         INFO     Sending pipeline to device "cuda"                                                                                                            pipeline.py:23
         INFO     Selected data types: unet_dtype=torch.float16, tenc_dtype=torch.float16, vae_dtype=torch.bfloat16                                              device.py:90
         INFO     Using channels_last memory format for UNet and VAE                                                                                            device.py:109
         INFO     -> Selected data types: unet_dtype=torch.bfloat16,tenc_dtype=torch.bfloat16,vae_dtype=torch.bfloat16                                         pipeline.py:56
13:13:23 INFO     Saving prompt config to output directory                                                                                                         cli.py:328
         INFO     Initialization complete!                                                                                                                         cli.py:337
         INFO     Generating 1 animations from 1 prompts                                                                                                           cli.py:338
         INFO     Running generation 1 of 1 (prompt 1)                                                                                                             cli.py:347
         INFO     Generation seed: 341774366206100                                                                                                                 cli.py:357
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\src\animatediff\cli.py:364   │
│ in generate                                                                                      │
│                                                                                                  │
│   361 │   │   │   │   if int(k) < length:                                                        │
│   362 │   │   │   │   │   prompt_map[int(k)]=model_config.prompt_map[k]                          │
│   363 │   │   │                                                                                  │
│ ❱ 364 │   │   │   output = run_inference(                                                        │
│   365 │   │   │   │   pipeline=pipeline,                                                         │
│   366 │   │   │   │   prompt="this is dummy string",                                             │
│   367 │   │   │   │   n_prompt=n_prompt,                                                         │
│                                                                                                  │
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\src\animatediff\generate.py: │
│ 401 in run_inference                                                                             │
│                                                                                                  │
│   398 │                                                                                          │
│   399 │   seed_everything(seed)                                                                  │
│   400 │                                                                                          │
│ ❱ 401 │   pipeline_output = pipeline(                                                            │
│   402 │   │   prompt=prompt,                                                                     │
│   403 │   │   negative_prompt=n_prompt,                                                          │
│   404 │   │   num_inference_steps=steps,                                                         │
│                                                                                                  │
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\venv\lib\site-packages\torch │
│ \utils\_contextlib.py:115 in decorate_context                                                    │
│                                                                                                  │
│   112 │   @functools.wraps(func)                                                                 │
│   113 │   def decorate_context(*args, **kwargs):                                                 │
│   114 │   │   with ctx_factory():                                                                │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                                                   │
│   116 │                                                                                          │
│   117 │   return decorate_context                                                                │
│   118                                                                                            │
│                                                                                                  │
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\src\animatediff\pipelines\an │
│ imation.py:751 in __call__                                                                       │
│                                                                                                  │
│    748 │   │   │   │   │   }                                                                     │
│    749 │   │   │   │   │                                                                         │
│    750 │   │   │   │   │   for f in frames:                                                      │
│ ❱  751 │   │   │   │   │   │   controlnet_affected_list[f] = True                                │
│    752 │   │                                                                                     │
│    753 │   │                                                                                     │
│    754 │   │   def controlnet_is_affected( frame_index:int):                                     │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
IndexError: list assignment index out of range

using the prompt_travel.json

Please merge this fork into the main AnimateDiff-CLI

It would be great if you could submit a PR to the main repo so this could be integrated upstream; that way we could stay up to date with new features and fixes while still being able to use your wonderful work. 🙏

Question: Out of memory in Colab?

Hi, I'm running the example:

!animatediff generate -c /content/animatediff-cli-prompt-travel/config/prompts/prompt_travel.json -W 256 -H 384 -L 128 -C 16

and it stops by itself, maybe due to running out of memory. The last log:

...
Downloading mm_sd_v15_v2.ckpt: 100% 1.82G/1.82G [00:17<00:00, 102MB/s]
Loading tokenizer...
Loading text encoder...
Loading VAE...
Loading UNet...
Loaded 417.1376M-parameter motion module
Using scheduler "k_dpmpp_sde" (DPMSolverSinglestepScheduler)
Loading weights from /content/animatediff-cli-prompt-travel/data/share/Stable-diffusion/mistoonAnime_v20.safetensors
^C

Is 12 GB of RAM not enough? I tried setting -L to 64 and still got the same result.

TIA

Any tips for upscaling?

Whenever I run an upscale (usually with controlnet_tile) it seems to introduce a lot of jittering and frame-to-frame variance. Curious if anyone has had success with upscale settings to polish up generations.

An error occurs when creating more than one video using a preprocessor.

An error occurs when creating more than one video using a preprocessor.
I tested two preprocessor types, lineart_anime and softedge. One video saves normally and no error occurs, but raising -r to 2 or higher results in an error.
The images processed by the preprocessor are also saved normally.

image
image
image

Preprocessing images (controlnet_lineart_anime) 0% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/2 [ 0:00:00 < -:--:-- , ? it/s ]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\toyxy\animatediff-cli-prompt-travel\src\animatediff\cli.py:363 in generate │
│ │
│ 360 │ │ │ │ if int(k) < length: │
│ 361 │ │ │ │ │ prompt_map[int(k)]=model_config.prompt_map[k] │
│ 362 │ │ │ │
│ ❱ 363 │ │ │ output = run_inference( │
│ 364 │ │ │ │ pipeline=pipeline, │
│ 365 │ │ │ │ prompt="this is dummy string", │
│ 366 │ │ │ │ n_prompt=n_prompt, │
│ │
│ C:\Users\toyxy\animatediff-cli-prompt-travel\src\animatediff\generate.py:472 in run_inference │
│ │
│ 469 │ │ │ │ │ │ │ if frame_no < duration: │
│ 470 │ │ │ │ │ │ │ │ if frame_no not in controlnet_image_map: │
│ 471 │ │ │ │ │ │ │ │ │ controlnet_image_map[frame_no] = {} │
│ ❱ 472 │ │ │ │ │ │ │ │ controlnet_image_map[frame_no][c] = get_preprocessed_img │
│ 473 │ │ │ │ │ │ │ │ processed = True │
│ 474 │ │ │ │
│ 475 │ │ │ if save_detectmap and processed: │
│ │
│ C:\Users\toyxy\animatediff-cli-prompt-travel\src\animatediff\generate.py:172 in │
│ get_preprocessed_img │
│ │
│ 169 │ if type_str in ( "controlnet_tile", "controlnet_ip2p", "controlnet_inpaint"): │
│ 170 │ │ return img │
│ 171 │ else: │
│ ❱ 172 │ │ return get_preprocessor(type_str, device_str)(img) if use_preprocessor else img │
│ 173 │
│ 174 │
│ 175 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: 'NoneType' object is not callable

[Test] Multi ControlNet tests!

I prepared ControlNet image sequences (openpose / lineart) covering the full 16 frames and used them to generate the video. It works very well!

8.mp4
6.mp4

[Test] Stylize Video

  1. Tile -> upscale
    motion_module : mm_sd_v14.ckpt
    steps : 20
    guidance_scale : 10
    [0]
    ip_adapter_plus ("is_plus_face": false, "is_plus": true) / scale : 0.5
    controlnet_tile / controlnet_conditioning_scale : 0.75
    size : 512x512
    context : 16
    [1]
    ip_adapter_plus / scale : 0.5
    controlnet_tile / controlnet_conditioning_scale : 1.0
    size : 1024x1024
    context : 8
style_tile_sample.mp4

  2. Lineart -> upscale
    motion_module : mm_sd_v15_v2.ckpt
    steps : 20
    guidance_scale : 10
    [0]
    ip_adapter_plus / scale : 0.5
    controlnet_lineart / controlnet_conditioning_scale : 1.0
    controlnet_ip2p / controlnet_conditioning_scale : 0.5
    size : 512x512
    context : 16
    [1]
    ip_adapter_plus / scale : 0.5
    controlnet_tile / controlnet_conditioning_scale : 1.0
    controlnet_ip2p / controlnet_conditioning_scale : 0.5
    size : 768x768
    context : 8
lineart_style_sample.mp4

  3. Openpose -> upscale
    motion_module : mm_sd_v15_v2.ckpt
    steps : 20
    guidance_scale : 10
    [0]
    ip_adapter_plus / scale : 0.5
    controlnet_openpose / controlnet_conditioning_scale : 1.0
    size : 512x512
    context : 16
    [1]
    ip_adapter_plus / scale : 0.5
    controlnet_tile / controlnet_conditioning_scale : 1.0
    size : 1024x1024
    context : 8
openpose_style_sample.mp4

  4. Softedge -> upscale
    motion_module : mm-Stabilized_high.pth
    steps : 20
    guidance_scale : 10
    [0]
    ip_adapter_plus / scale : 0.5
    controlnet_softedge / controlnet_conditioning_scale : 1.0
    controlnet_ip2p / controlnet_conditioning_scale : 0.5
    size : 512x760
    context : 16
    [1]
    ip_adapter_plus / scale : 0.5
    controlnet_tile / controlnet_conditioning_scale : 1.0
    size : 768x1136
    context : 8
softedge_style_sample.mp4

RuntimeError: CUDA error: invalid configuration argument

How can I solve this problem? Here is the error info:
17:01:40 INFO NumExpr defaulting to 6 threads. utils.py:160
INFO Using generation config: config/prompts/prompt_travel.json cli.py:279
INFO Using base model: runwayml/stable-diffusion-v1-5 cli.py:295
INFO Will save outputs to ./output/2023-09-16T17-01-40-animation-videos-dreamshaper_8 cli.py:303
Preprocessing images (controlnet_openpose) 0% ━━━━━━━━ 0/3 [ 0:00:00 < -:--:-- , ? it/s ] INFO Loading openpose_full processor.py:94
Preprocessing images (controlnet_openpose) 100% ━━━━━━━━ 3/3 [ 0:00:08 < 0:00:00 , 1 it/s ]
Saving Preprocessed images (controlnet_openpose) 0% ━━━━━━━━ 0/3 [ 0:00:00 < -:--:-- , ? it/s ]
Preprocessing images (controlnet_softedge) 0% ━━━━━━━━ 0/3 [ 0:00:00 < -:--:-- , ? it/s ] 17:01:48 INFO Loading softedge_pidsafe processor.py:94
Preprocessing images (controlnet_softedge) 33% ━━━━━━━━ 1/3 [ 0:00:00 < -:--:-- , ? it/s ]
Saving Preprocessed images (controlnet_softedge) 0% ━━━━━━━━ 0/3 [ 0:00:00 < -:--:-- , ? it/s ]
17:01:49 INFO Checking motion module... generate.py:261
INFO Loading tokenizer... generate.py:275
INFO Loading text encoder... generate.py:277
17:01:50 INFO Loading VAE... generate.py:279
INFO Loading UNet... generate.py:281
17:01:59 INFO Loaded 453.20928M-parameter motion module unet.py:578
INFO Using scheduler "ddim" (DDIMScheduler) generate.py:293
INFO Loading weights from /home/pymo/animatediff-cli/data/models/sd/dreamshaper_8.safetensors generate.py:298
17:02:03 INFO Merging weights into UNet... generate.py:315
INFO Enabling xformers memory-efficient attention generate.py:330
17:02:04 INFO Creating AnimationPipeline... generate.py:342
INFO No TI embeddings found ti.py:102
INFO loading c='controlnet_openpose' model generate.py:371
17:02:05 INFO loading c='controlnet_softedge' model generate.py:371
17:02:06 INFO Sending pipeline to device "cuda" pipeline.py:22
INFO Selected data types: unet_dtype=torch.float16, tenc_dtype=torch.float16, vae_dtype=torch.bfloat16 device.py:90
INFO Using channels_last memory format for UNet and VAE device.py:111
17:02:09 INFO Saving prompt config to output directory cli.py:354
INFO Initialization complete! cli.py:362
INFO Generating 1 animations cli.py:363
INFO Running generation 1 of 1 cli.py:373
INFO Generation seed: 341774366206100 cli.py:383
0% ━━━━━━━━ 0/120 [ 0:00:00 < -:--:-- , ? it/s ]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /home/pymo/animatediff-cli/src/animatediff/cli.py:396 in generate │
│ │
│ 393 │ │ │ │ │ │
│ 394 │ │ │ │ │ prompt_map[int(k)]=pr │
│ 395 │ │ │ │
│ ❱ 396 │ │ │ output = run_inference( │
│ 397 │ │ │ │ pipeline=g_pipeline, │
│ 398 │ │ │ │ prompt="this is dummy string", │
│ 399 │ │ │ │ n_prompt=n_prompt, │
│ │
│ /home/pymo/animatediff-cli/src/animatediff/generate.py:680 in run_inference │
│ │
│ 677 │ │
│ 678 │ seed_everything(seed) │
│ 679 │ │
│ ❱ 680 │ pipeline_output = pipeline( │
│ 681 │ │ prompt=prompt, │
│ 682 │ │ negative_prompt=n_prompt, │
│ 683 │ │ num_inference_steps=steps, │
│ │
│ /home/pymo/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py:115 in │
│ decorate_context │
│ │
│ 112 │ @functools.wraps(func) │
│ 113 │ def decorate_context(*args, **kwargs): │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │ │
│ 117 │ return decorate_context │
│ 118 │
│ │
│ /home/pymo/animatediff-cli/src/animatediff/pipelines/animation.py:2348 in __call__ │
│ │
│ 2345 │ │ │ │ │ # predict the noise residual │
│ 2346 │ │ │ │ │ │
│ 2347 │ │ │ │ │ stopwatch_record("normal unet start") │
│ ❱ 2348 │ │ │ │ │ pred = self.unet( │
│ 2349 │ │ │ │ │ │ latent_model_input.to(self.unet.device, self.unet.dtype), │
│ 2350 │ │ │ │ │ │ t, │
│ 2351 │ │ │ │ │ │ encoder_hidden_states=cur_prompt, │
│ │
│ /home/pymo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1501 in _call_impl │
│ │
│ 1498 │ │ if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks │
│ 1499 │ │ │ │ or _global_backward_pre_hooks or _global_backward_hooks │
│ 1500 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1501 │ │ │ return forward_call(*args, **kwargs) │
│ 1502 │ │ # Do not call functions when jit is used │
│ 1503 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1504 │ │ backward_pre_hooks = [] │
│ │
│ /home/pymo/animatediff-cli/src/animatediff/models/unet.py:427 in forward │
│ │
│ 424 │ │ down_block_res_samples = (sample,) │
│ 425 │ │ for downsample_block in self.down_blocks: │
│ 426 │ │ │ if hasattr(downsample_block, "has_cross_attention") and downsample_block.has │
│ ❱ 427 │ │ │ │ sample, res_samples = downsample_block( │
│ 428 │ │ │ │ │ hidden_states=sample, │
│ 429 │ │ │ │ │ temb=emb, │
│ 430 │ │ │ │ │ encoder_hidden_states=encoder_hidden_states, │
│ │
│ /home/pymo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1501 in _call_impl │
│ │
│ 1498 │ │ if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks │
│ 1499 │ │ │ │ or _global_backward_pre_hooks or _global_backward_hooks │
│ 1500 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1501 │ │ │ return forward_call(*args, **kwargs) │
│ 1502 │ │ # Do not call functions when jit is used │
│ 1503 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1504 │ │ backward_pre_hooks = [] │
│ │
│ /home/pymo/animatediff-cli/src/animatediff/models/unet_blocks.py:439 in forward │
│ │
│ 436 │ │ │ │ )[0] │
│ 437 │ │ │ │ # add motion module │
│ 438 │ │ │ │ hidden_states = ( │
│ ❱ 439 │ │ │ │ │ motion_module(hidden_states, temb, encoder_hidden_states=encoder_hid │
│ 440 │ │ │ │ │ if motion_module is not None │
│ 441 │ │ │ │ │ else hidden_states │
│ 442 │ │ │ │ ) │
│ │
│ /home/pymo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1501 in _call_impl │
│ │
│ 1498 │ │ if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks │
│ 1499 │ │ │ │ or _global_backward_pre_hooks or _global_backward_hooks │
│ 1500 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1501 │ │ │ return forward_call(*args, **kwargs) │
│ 1502 │ │ # Do not call functions when jit is used │
│ 1503 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1504 │ │ backward_pre_hooks = [] │
│ │
│ /home/pymo/animatediff-cli/src/animatediff/models/motion_module.py:67 in forward │
│ │
│ 64 │ │
│ 65 │ def forward(self, input_tensor, temb, encoder_hidden_states, attention_mask=None, an │
│ 66 │ │ hidden_states = input_tensor │
│ ❱ 67 │ │ hidden_states = self.temporal_transformer(hidden_states, encoder_hidden_states, │
│ 68 │ │ │
│ 69 │ │ output = hidden_states │
│ 70 │ │ return output │
│ │
│ /home/pymo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1501 in _call_impl │
│ │
│ 1498 │ │ if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks │
│ 1499 │ │ │ │ or _global_backward_pre_hooks or _global_backward_hooks │
│ 1500 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1501 │ │ │ return forward_call(*args, **kwargs) │
│ 1502 │ │ # Do not call functions when jit is used │
│ 1503 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1504 │ │ backward_pre_hooks = [] │
│ │
│ /home/pymo/animatediff-cli/src/animatediff/models/motion_module.py:148 in forward │
│ │
│ 145 │ │ │
│ 146 │ │ # Transformer Blocks │
│ 147 │ │ for block in self.transformer_blocks: │
│ ❱ 148 │ │ │ hidden_states = block( │
│ 149 │ │ │ │ hidden_states, encoder_hidden_states=encoder_hidden_states, video_length │
│ 150 │ │ │ ) │
│ 151 │
│ │
│ /home/pymo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1501 in _call_impl │
│ │
│ 1498 │ │ if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks │
│ 1499 │ │ │ │ or _global_backward_pre_hooks or _global_backward_hooks │
│ 1500 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1501 │ │ │ return forward_call(*args, **kwargs) │
│ 1502 │ │ # Do not call functions when jit is used │
│ 1503 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1504 │ │ backward_pre_hooks = [] │
│ │
│ /home/pymo/animatediff-cli/src/animatediff/models/motion_module.py:218 in forward │
│ │
│ 215 │ │ for attention_block, norm in zip(self.attention_blocks, self.norms): │
│ 216 │ │ │ norm_hidden_states = norm(hidden_states) │
│ 217 │ │ │ hidden_states = ( │
│ ❱ 218 │ │ │ │ attention_block( │
│ 219 │ │ │ │ │ norm_hidden_states, │
│ 220 │ │ │ │ │ encoder_hidden_states=encoder_hidden_states │
│ 221 │ │ │ │ │ if attention_block.is_cross_attention │
│ │
│ /home/pymo/.local/lib/python3.10/site-packages/torch/nn/modules/module.py:1501 in _call_impl │
│ │
│ 1498 │ │ if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks │
│ 1499 │ │ │ │ or _global_backward_pre_hooks or _global_backward_hooks │
│ 1500 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1501 │ │ │ return forward_call(*args, **kwargs) │
│ 1502 │ │ # Do not call functions when jit is used │
│ 1503 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1504 │ │ backward_pre_hooks = [] │
│ │
│ /home/pymo/animatediff-cli/src/animatediff/models/motion_module.py:296 in forward │
│ │
│ 293 │ │ │ raise NotImplementedError │
│ 294 │ │ │
│ 295 │ │ # attention processor makes this easy so that's nice │
│ ❱ 296 │ │ hidden_states = self.processor(self, hidden_states, encoder_hidden_states, atten │
│ 297 │ │ │
│ 298 │ │ if self.attention_mode == "Temporal": │
│ 299 │ │ │ hidden_states = rearrange(hidden_states, "(b d) f c -> (b f) d c", d=d) │
│ │
│ /home/pymo/miniconda3/envs/animatediff/lib/python3.10/site-packages/diffusers/models/attention_processor.py:1046 in __call__ │
│ │
│ 1043 │ │ key = attn.head_to_batch_dim(key).contiguous() │
│ 1044 │ │ value = attn.head_to_batch_dim(value).contiguous() │
│ 1045 │ │ │
│ ❱ 1046 │ │ hidden_states = xformers.ops.memory_efficient_attention( │
│ 1047 │ │ │ query, key, value, attn_bias=attention_mask, op=self.attention_op, scale=att │
│ 1048 │ │ ) │
│ 1049 │ │ hidden_states = hidden_states.to(query.dtype) │
│ │
│ /home/pymo/miniconda3/envs/animatediff/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py:193 in memory_efficient_attention │
│ │
│ 190 │ │ and options. │
│ 191 │ :return: multi-head attention Tensor with shape [B, Mq, H, Kv]
│ 192 │ """ │
│ ❱ 193 │ return _memory_efficient_attention( │
│ 194 │ │ Inputs( │
│ 195 │ │ │ query=query, key=key, value=value, p=p, attn_bias=attn_bias, scale=scale │
│ 196 │ │ ), │
│ │
│ /home/pymo/miniconda3/envs/animatediff/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py:291 in _memory_efficient_attention │
│ │
│ 288 ) -> torch.Tensor: │
│ 289 │ # fast-path that doesn't require computing the logsumexp for backward computation │
│ 290 │ if all(x.requires_grad is False for x in [inp.query, inp.key, inp.value]): │
│ ❱ 291 │ │ return _memory_efficient_attention_forward( │
│ 292 │ │ │ inp, op=op[0] if op is not None else None │
│ 293 │ │ ) │
│ 294 │
│ │
│ /home/pymo/miniconda3/envs/animatediff/lib/python3.10/site-packages/xformers/ops/fmha/__init__.py:311 in _memory_efficient_attention_forward │
│ │
│ 308 │ else: │
│ 309 │ │ ensure_op_supports_or_raise(ValueError, "memory_efficient_attention", op, inp) │
│ 310 │ │
│ ❱ 311 │ out, *_ = op.apply(inp, needs_gradient=False) │
│ 312 │ return out.reshape(output_shape) │
│ 313 │
│ 314 │
│ │
│ /home/pymo/miniconda3/envs/animatediff/lib/python3.10/site-packages/xformers/ops/fmha/flash.py:2 │
│ 51 in apply │
│ │
│ 248 │ │ │ cu_seqlens_k, │
│ 249 │ │ │ max_seqlen_k, │
│ 250 │ │ ) = _convert_input_format(inp) │
│ ❱ 251 │ │ out, softmax_lse, rng_state = cls.OPERATOR( │
│ 252 │ │ │ inp.query, │
│ 253 │ │ │ inp.key, │
│ 254 │ │ │ inp.value, │
│ │
│ /home/pymo/.local/lib/python3.10/site-packages/torch/_ops.py:502 in __call__ │
│ │
│ 499 │ │ # is still callable from JIT │
│ 500 │ │ # We save the function ptr as the op attribute on │
│ 501 │ │ # OpOverloadPacket to access it here. │
│ ❱ 502 │ │ return self._op(*args, **kwargs or {}) │
│ 503 │ │
│ 504 │ # TODO: use this to make a __dir__ │
│ 505 │ def overloads(self): │
│ │
│ /home/pymo/miniconda3/envs/animatediff/lib/python3.10/site-packages/xformers/ops/fmha/flash.py:7 │
│ 9 in _flash_fwd │
│ │
│ 76 │ │ │ softmax_lse, │
│ 77 │ │ │ p, │
│ 78 │ │ │ rng_state, │
│ ❱ 79 │ │ ) = _C_flashattention.varlen_fwd( │
│ 80 │ │ │ query, │
│ 81 │ │ │ key, │
│ 82 │ │ │ value, │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: CUDA error: invalid configuration argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

[Feature Request]Options for advance controlnet settings

The ControlNet extension in webui allows the use of different control modes. From the repo wiki:
"control_mode" : see the related issue for usage. defaults to 0. Accepted values:

0 or "Balanced" : balanced, no preference between prompt and control model
1 or "My prompt is more important" : the prompt has more impact than the model
2 or "ControlNet is more important" : the controlnet model has more impact than the prompt

This would allow more control over how the ControlNet influences the output.
The pixel-perfect function is also handy, as it allows sizing the ControlNet picture to the output size of the rendering.

Hello, what should I do if I encounter this problem?

PS F:\diff\animatediff-cli-prompt-travel-main> animatediff generate -c config/prompts/000sb.json -W 256 -H 384 -L 128 -C 16
Error in sitecustomize; set PYTHONVERBOSE for traceback:
TypeError: expected str, bytes or os.PathLike object, not NoneType
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in _run_module_as_main:196 │
│ │
│ in run_code:86 │
│ │
│ in :4 │
│ │
│ 1 # -*- coding: utf-8 -*- │
│ 2 import re │
│ 3 import sys │
│ ❱ 4 from animatediff.cli import cli │
│ 5 if __name__ == '__main__': │
│ 6 │ sys.argv[0] = re.sub(r'(-script.pyw|.exe)?$', '', sys.argv[0]) │
│ 7 │ sys.exit(cli()) │
│ │
│ F:\diff\animatediff-cli-prompt-travel-main\src\animatediff\cli.py:15 in │
│ │
│ 12 from rich.logging import RichHandler │
│ 13 │
│ 14 from animatediff import version, console, get_dir │
│ ❱ 15 from animatediff.generate import (controlnet_preprocess, create_pipeline, │
│ 16 │ │ │ │ │ │ │ │ create_us_pipeline, ip_adapter_preprocess, │
│ 17 │ │ │ │ │ │ │ │ load_controlnet_models, run_inference, │
│ 18 │ │ │ │ │ │ │ │ run_upscale, save_output, │
│ │
│ F:\diff\animatediff-cli-prompt-travel-main\src\animatediff\generate.py:21 in │
│ │
│ 18 │ │ │ │ │ StableDiffusionPipeline) │
│ 19 from PIL import Image │
│ 20 from tqdm.rich import tqdm │
│ ❱ 21 from transformers import (AutoImageProcessor, CLIPImageProcessor, │
│ 22 │ │ │ │ │ │ CLIPTextModel, CLIPTokenizer, │
│ 23 │ │ │ │ │ │ UperNetForSemanticSegmentation) │
│ 24 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ImportError: cannot import name 'UperNetForSemanticSegmentation' from 'transformers' (F:\SD\sd-webui-aki\1sd-webui-aki-v4\py310\lib\site-packages\transformers\__init__.py)
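(A hedged observation: the failing import resolves to F:\SD\sd-webui-aki\...\py310\...\transformers, i.e. the WebUI's Python install rather than this repo's own venv, so either the venv isn't active or its transformers is too old for UperNetForSemanticSegmentation. A quick check:)

import sys
import transformers

print(sys.executable)            # should point into this repo's venv, not the WebUI install
print(transformers.__version__)  # UperNetForSemanticSegmentation needs a recent release
from transformers import UperNetForSemanticSegmentation  # re-raises if the version is too old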

Where should I put the controlnet model

I already have ControlNet 1.1 models in my WebUI project (the file extension is .pth). I don't see a settings path in the config, and if I generate, it downloads ControlNet models with a .safetensors extension. How do I use my existing ControlNet files?
image

[Test] Guess mode test!

I tested the newly added Guess mode support!

When Guess mode is used, the effect on the background seems to be stronger. I guess more testing is needed.

26.mp4

image

Some questions about controlNET and LoRAs.

I have six specific questions:

  1. Which LoRAs are compatible and which are incompatible with animatediff-cli-prompt-travel? So far some LoRAs I have tried work great with this.

  2. Do embeddings (textual inversion) work with this?

  3. The main page says I have to change the 999 to another number here; any suggestion on which number I should choose?

 "controlnet_map": {
    "input_image_dir" : "controlnet_image/test",
    "max_samples_on_vram": 999,
    "save_detectmap": true,
    "preprocess_on_gpu": true,
  4. How do I make ControlNet work with this? I drag the PNGs in and rename them just like the example says, but they are not used in the generation.

img1

Capture

And they are enabled:

   "controlnet_openpose":{
      "enable": true,
      "use_preprocessor":true,
      "guess_mode":false,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list":[0.5,0.4,0.3,0.2,0.1]
    },

    "controlnet_canny": {
      "enable": true,
      "use_preprocessor":true,
      "guess_mode":false,
      "controlnet_conditioning_scale": 1.0,
      "control_guidance_start": 0.0,
      "control_guidance_end": 1.0,
      "control_scale_list":[0.5,0.4,0.3,0.2,0.1]
    },

I am putting the PNGs in the openpose and canny folders, but they are completely ignored; they do not even appear in the output, while the examples have a subfolder in the output showing the ControlNet inputs used.

  1. How do i "order" the controlnet? I want canny as a background (so the entire animation has a stable background), and open pose as the pose of the subject, i suppose this is done automatically? Or i have to order the "layers" of controlNET?

  6. Any way to remove that "shutterstock" text at the bottom? I tried putting (watermark, text:1.5), logo, text in the negative prompt, but nothing; the text is still there.

Frozen when trying stylize command

Getting stuck after running the command:
animatediff stylize create-config stylize/snaptik.mp4
image

Looks similar to issue #31, however I don't have a '.' in my model name so I believe the path should be valid.

Running a standard txt2vid works fine, just seems to be a problem with stylize (vid2vid).

In the output folder there are several controlnet folders with some generated frames (around 400 or so), but the number of files doesn't appear to increase, even though PowerShell > Python shows about 7% CPU usage and 0% GPU.

Usable LoRA

Sorry to bother you. Can you point me to a publicly available LoRA model? I have tried my own LoRA model (trained on mistoonAnime_v20) but it doesn't work.
I'd like to confirm whether the problem is with my LoRA model or whether the code needs to be modified.

Where are the intermediate images located?

While producing a video with more than one prompt, we cannot see the images until the entire generation process is completed. I wonder where the images produced during generation are stored; is there a temp folder or some other place where we can see these intermediate images?

The video program has frozen or become unresponsive

The images can be extracted, but the configuration file cannot be generated, and the program continues to run indefinitely without completing. I have to force close it.
Only the model and prompts were changed in the settings; everything else remained unchanged.

1
2

'animatediff' is an internal or external command, It is not recognized as an operable program or batch file.

Hi, perhaps this is a silly question, but please help.
I followed the installation instructions, but this error came out.
Do I have to combine this library with the original AnimateDiff to run it?

(venv) D:\animatediff-cli-prompt-travel>animatediff generate -c config/prompts/prompt_travel.json -W 256 -H 384 -L 128 -C 16
'animatediff' is an internal or external command,
It is not recognized as an operable program or batch file.

(venv) D:\animatediff-cli-prompt-travel>animatediff --help
'animatediff' is an internal or external command,
It is not recognized as an operable program or batch file.

Questions about control_scale_list!

What I understand is as follows. "control_scale_list" is a parameter that adjusts the effect of the input controlnet image frame on each adjacent frame. If the list is [0.5, 0.4], the controlnet image inserted in frame 5 has a scale of 0.5 in frames 4 and 6, and a scale of 0.4 in frames 3 and 7.

And the two parameters combine as follows:

control_scale_list * controlnet_conditioning_scale = resulting scale

Therefore, if controlnet_conditioning_scale is 0, every entry in the scale list effectively becomes 0 as well. Is my understanding correct?
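A small worked example of that reading (my own sketch, not the repo's code): each list entry applies at increasing frame distance from the keyframe, and every entry is multiplied by controlnet_conditioning_scale.

controlnet_conditioning_scale = 1.0
control_scale_list = [0.5, 0.4]   # distance 1, distance 2 from the keyframe
keyframe = 5

for distance, scale in enumerate(control_scale_list, start=1):
    effective = scale * controlnet_conditioning_scale
    print(f"frames {keyframe - distance} and {keyframe + distance}: scale {effective}")
# frames 4 and 6: scale 0.5
# frames 3 and 7: scale 0.4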

image

I also wonder whether the control scale list wraps around between the first frame (0) and the last frame. I used controlnet_tile, placed two images on frames 0 and 15, and created a 16-frame video with "control_scale_list": [1.0]. In frames 0 and 1, the image placed at frame 0 was almost duplicated, but frame 15 was clearly influenced by frame 0 as well, so the front and back views were mixed.

image

Next I moved the ControlNet image from frame 0 to frame 2. ControlNet image 2 affects frames 1, 2, and 3, and ControlNet image 15 affects frames 0, 14, and 15. Is this intended? If so, I think it would be nice to add an option to keep the scale list from looping around. Thanks!

image

VAE loading error


INFO Loading vae from I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\data\data\models\Vae\kl-f8-anime2.ckpt    generate.py:350
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\venv\lib\site-packages\diffu │
│ sers\configuration_utils.py:348 in load_config │
│ │
│ 345 │ │ else: │
│ 346 │ │ │ try: │
│ 347 │ │ │ │ # Load from URL or cache if already cached │
│ ❱ 348 │ │ │ │ config_file = hf_hub_download( │
│ 349 │ │ │ │ │ pretrained_model_name_or_path, │
│ 350 │ │ │ │ │ filename=cls.config_name, │
│ 351 │ │ │ │ │ cache_dir=cache_dir, │
│ │
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\venv\lib\site-packages\huggi │
│ ngface_hub\utils\_validators.py:110 in inner_fn │
│ │
│ 107 │ │ │ kwargs.items(), # Kwargs values │
│ 108 │ │ ): │
│ 109 │ │ │ if arg_name in ["repo_id", "from_id", "to_id"]: │
│ ❱ 110 │ │ │ │ validate_repo_id(arg_value) │
│ 111 │ │ │ │
│ 112 │ │ │ elif arg_name == "token" and arg_value is not None: │
│ 113 │ │ │ │ has_token = True │
│ │
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\venv\lib\site-packages\huggi │
│ ngface_hub\utils\_validators.py:164 in validate_repo_id │
│ │
│ 161 │ │ ) │
│ 162 │ │
│ 163 │ if not REPO_ID_REGEX.match(repo_id): │
│ ❱ 164 │ │ raise HFValidationError( │
│ 165 │ │ │ "Repo id must use alphanumeric chars or '-', '
', '.', '--' and '..' are" │
│ 166 │ │ │ " forbidden, '-' and '.' cannot start or end the name, max length is 96:" │
│ 167 │ │ │ f" '{repo_id}'." │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot
start or end the name, max length is 96:
'I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\data\data\models\Vae\kl-f8-anime2.ckpt'.

During handling of the above exception, another exception occurred:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\src\animatediff\stylize.py:4 │
│ 01 in generate │
│ │
│ 398 │ │ config_org = tmp_config_path │
│ 399 │ │
│ 400 │ │
│ ❱ 401 │ output_0_dir = generate( │
│ 402 │ │ config_path=config_org, │
│ 403 │ │ width=model_config.stylize_config["0"]["width"], │
│ 404 │ │ height=model_config.stylize_config["0"]["height"], │
│ │
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\src\animatediff\cli.py:323 │
│ in generate │
│ │
│ 320 │ global g_pipeline │
│ 321 │ global last_model_path │
│ 322 │ if g_pipeline is None or last_model_path != model_config.path.resolve(): │
│ ❱ 323 │ │ g_pipeline = create_pipeline( │
│ 324 │ │ │ base_model=base_model_path, │
│ 325 │ │ │ model_config=model_config, │
│ 326 │ │ │ infer_config=infer_config, │
│ │
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\src\animatediff\generate.py: │
│ 351 in create_pipeline │
│ │
│ 348 │ if model_config.vae_path: │
│ 349 │ │ vae_path = data_dir.joinpath(model_config.vae_path) │
│ 350 │ │ logger.info(f"Loading vae from {vae_path}") │
│ ❱ 351 │ │ vae = AutoencoderKL.from_pretrained(vae_path) │
│ 352 │ │
│ 353 │ │
│ 354 │ # enable xformers if available │
│ │
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\venv\lib\site-packages\diffu │
│ sers\models\modeling_utils.py:511 in from_pretrained │
│ │
│ 508 │ │ } │
│ 509 │ │ │
│ 510 │ │ # load config │
│ ❱ 511 │ │ config, unused_kwargs, commit_hash = cls.load_config( │
│ 512 │ │ │ config_path, │
│ 513 │ │ │ cache_dir=cache_dir, │
│ 514 │ │ │ return_unused_kwargs=True, │
│ │
│ I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\venv\lib\site-packages\diffu │
│ sers\configuration_utils.py:384 in load_config │
│ │
│ 381 │ │ │ │ │ f" {pretrained_model_name_or_path}:\n{err}" │
│ 382 │ │ │ │ ) │
│ 383 │ │ │ except ValueError: │
│ ❱ 384 │ │ │ │ raise EnvironmentError( │
│ 385 │ │ │ │ │ f"We couldn't connect to '{HUGGINGFACE_CO_RESOLVE_ENDPOINT}' to load │
│ 386 │ │ │ │ │ f" in the cached files and it looks like {pretrained_model_name_or_p │
│ 387 │ │ │ │ │ f" directory containing a {cls.config_name} file.\nCheckout your int │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
OSError: We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it
looks like I:\AnimateDiff\animatediff-cli-travel\animatediff-cli-prompt-travel\data\data\models\Vae\kl-f8-anime2.ckpt is
not the path to a directory containing a config.json file.
Checkout your internet connection or see how to run the library in offline mode at
'https://huggingface.co/docs/diffusers/installation#offline-mode'.
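Two hedged observations from the traceback, plus a sketch that is not the repo's code: the resolved path contains data\data\..., which suggests the config's vae_path already starts with data/ and is being joined onto the data directory a second time; and AutoencoderKL.from_pretrained() expects a Diffusers-format directory containing config.json, whereas kl-f8-anime2.ckpt is a single-file checkpoint, which recent diffusers releases load with from_single_file() instead. The path below is an example.

from diffusers import AutoencoderKL

# Single-file VAE checkpoints go through from_single_file(); from_pretrained() wants
# a directory containing a config.json.
vae = AutoencoderKL.from_single_file("data/models/Vae/kl-f8-anime2.ckpt")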
