
imaginairy's Introduction

ImaginAIry 🤖🧠


AI imagined images. Pythonic generation of Stable Diffusion images and videos.

"just works" on Linux and macOS(M1) (and sometimes windows).

# on macOS, make sure rust is installed first
# be sure to use Python 3.10, Python 3.11 is not supported at the moment
>> pip install imaginairy
>> imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman" "a bluejay"
# Make an AI video
>> aimg videogen --start-image rocket.png
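
The same quickstart from Python, using the public API shown in the "How To" section below (a minimal sketch):

from imaginairy import ImaginePrompt, imagine_image_files

prompts = [
    ImaginePrompt("a scenic landscape"),
    ImaginePrompt("a photo of a dog"),
]
imagine_image_files(prompts, outdir="./outputs")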

Stable Video Diffusion

Rushed release of Stable Video Diffusion!

Works with Nvidia GPUs. Does not work on Mac or CPU.

On Windows you'll need to install torch 2.0 first via https://pytorch.org/get-started/locally/

Usage: aimg videogen [OPTIONS]

  AI generate a video from an image

  Example:

      aimg videogen --start-image assets/rocket-wide.png

Options:
  --start-image TEXT       Input path for image file.
  --num-frames INTEGER     Number of frames.
  --num-steps INTEGER      Number of steps.
  --model TEXT             Model to use. One of: svd, svd_xt, svd_image_decoder, svd_xt_image_decoder
  --fps INTEGER            FPS for the AI to target when generating video
  --output-fps INTEGER     FPS for the output video
  --motion-amount INTEGER  How much motion to generate. value between 0 and 255.
  -r, --repeats INTEGER    How many times to repeat the renders.   [default: 1]
  --cond-aug FLOAT         Conditional augmentation.
  --seed INTEGER           Seed for random number generator.
  --decoding_t INTEGER     Number of frames decoded at a time.
  --output_folder TEXT     Output folder.
  --help                   Show this message and exit.
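
For example, a typical invocation that sets a few of these options explicitly (the flag values here are illustrative, not recommended defaults):

aimg videogen --start-image rocket.png --model svd_xt --num-frames 25 --fps 7 --output-fps 30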


What's New

See the full changelog here.

14.3.0

  • feature: integrates spandrel for upscaling
  • fix: allow loading sdxl models from local paths.

14.2.0

  • 🎉 feature: add image prompt support via --image-prompt and --image-prompt-strength

14.1.1

  • tests: add installation tests for windows, mac, and conda
  • fix: dependency issues

14.1.0

  • 🎉 feature: make video generation smooth by adding frame interpolation
  • feature: SDXL weights in the compvis format can now be used
  • feature: allow video generation at any size specified by user
  • feature: video generations output in "bounce" format
  • feature: choose video output format: mp4, webp, or gif
  • feature: fix random seed handling in video generation
  • docs: auto-publish docs on push to master
  • build: remove imageio dependency
  • build: vendorize facexlib so we don't install its unneeded dependencies

14.0.4

14.0.3

  • fix: several critical bugs with package
  • tests: add a wheel smoketest to detect these issues in the future

14.0.0

  • 🎉 video generation using Stable Video Diffusion
    • add --videogen to any image generation to create a short video from the generated image
    • or use aimg videogen to generate a video from an image
  • 🎉 SDXL (Stable Diffusion Extra Large) models are now supported.
    • try --model opendalle or --model sdxl
    • inpainting and controlnets are not yet supported for SDXL
  • 🎉 imaginairy is now backed by the refiners library
    • This was a huge rewrite, which is why some features are not yet supported. On the plus side, refiners supports cutting-edge features (SDXL, image prompts, etc.) which will be added to imaginairy soon.
    • self-attention guidance, which makes details of images more accurate
  • 🎉 feature: larger image generations now work MUCH better and stay faithful to the same image as it looks at a smaller size. For example --size 720p --seed 1 and --size 1080p --seed 1 will produce the same image for SD15
  • 🎉 feature: loading diffusers-based models is now supported. Example: --model https://huggingface.co/ainz/diseny-pixar --model-architecture sd15
  • 🎉 feature: qrcode controlnet!

Run API server and StableStudio web interface (alpha)

Generate images via the API or web interface. Much smaller feature set compared to the command line tool.

>> aimg server

Visit http://localhost:8000/ and http://localhost:8000/docs
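
The server publishes its actual routes in the interactive docs at /docs. As a sketch of calling it from Python (the endpoint path and payload shape below are assumptions, not the confirmed API; verify against /docs first):

import requests

# NOTE: "/api/imagine" and the request/response shapes are assumptions;
# check http://localhost:8000/docs for the real routes.
resp = requests.post(
    "http://localhost:8000/api/imagine",
    json={"prompt": "a scenic landscape"},
)
resp.raise_for_status()
with open("landscape.jpg", "wb") as f:
    f.write(resp.content)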

Image Structure Control by ControlNet

(Not supported for SDXL yet)

Generate images guided by body poses, depth maps, canny edges, hed boundaries, or normal maps.

Openpose Control

imagine --control-image assets/indiana.jpg  --control-mode openpose --caption-text openpose "photo of a polar bear"

Canny Edge Control

imagine --control-image assets/lena.png  --control-mode canny "photo of a woman with a hat looking at the camera"

HED Boundary Control

imagine --control-image dog.jpg  --control-mode hed  "photo of a dalmation"

Depth Map Control

imagine --control-image fancy-living.jpg  --control-mode depth  "a modern living room"

Normal Map Control

imagine --control-image bird.jpg  --control-mode normal  "a bird"

Image Shuffle Control

Generates the image based on elements of the control image. Kind of similar to style transfer.

imagine --control-image pearl-girl.jpg  --control-mode shuffle  "a clown"

The middle image is the "shuffled" input image

Editing Instructions Control

Similar to InstructPix2Pix (below) but works with any SD 1.5 based model.

imagine --control-image pearl-girl.jpg  --control-mode edit --init-image-strength 0.01 --steps 30  --negative-prompt "" --model openjourney-v2 "make it anime" "make it at the beach" 

Add Details Control (upscaling/super-resolution)

Replaces existing details in an image. Good to use with --init-image-strength 0.2

imagine --control-image "assets/wishbone.jpg" --control-mode details "sharp focus, high-resolution" --init-image-strength 0.2 --steps 30 -w 2048 -h 2048 

Image (re)Colorization (using brightness control)

Colorize black and white images or re-color existing images.

The generated colors will be applied back to the original image. You can either provide a caption or allow the tool to generate one for you.

aimg colorize pearl-girl.jpg --caption "photo of a woman"

Instruction based image edits by InstructPix2Pix

(Broken as of 14.0.0)

Just tell imaginairy how to edit the image and it will do it for you!

Use prompt strength to control how strong the edit is. For extra control you can combine it with prompt-based masking.

# enter imaginairy shell
>> aimg
🤖🧠> edit scenic_landscape.jpg -p "make it winter" --prompt-strength 20
🤖🧠> edit dog.jpg -p "make the dog red" --prompt-strength 5
🤖🧠> edit bowl_of_fruit.jpg -p "replace the fruit with strawberries"
🤖🧠> edit freckled_woman.jpg -p "make her a cyborg" --prompt-strength 13
🤖🧠> edit bluebird.jpg -p "make the bird wear a cowboy hat" --prompt-strength 10
🤖🧠> edit flower.jpg -p "make the flower out of paper origami" --arg-schedule prompt-strength[1:11:0.3] --steps 25 --compilation-anim gif

# create a comparison gif
🤖🧠> edit pearl_girl.jpg -p "make her wear clown makeup" --compare-gif
# create an animation showing the edit with increasing prompt strengths
🤖🧠> edit mona-lisa.jpg -p "make it a color professional photo headshot" --negative-prompt "old, ugly, blurry" --arg-schedule "prompt-strength[2:8:0.5]" --compilation-anim gif
🤖🧠> edit gg-bridge.jpg -p "make it night time" --prompt-strength 15 --steps 30 --arg-schedule prompt-strength[1:15:1] --compilation-anim gif

Quick Image Edit Demo

Want to just quickly have some fun? Try edit-demo to apply some pre-defined edits.

>> aimg edit-demo pearl_girl.jpg

Prompt Based Masking by clipseg

Specify advanced text-based masks using boolean logic and strength modifiers. Mask syntax:

  • mask descriptions must be lowercase
  • keywords (AND, OR, NOT) must be uppercase
  • parentheses are supported
  • mask modifiers may be appended to any mask or group of masks. Example: (dog OR cat){+5} means that we'll select any dog or cat and then expand the size of the mask area by 5 pixels. Valid mask modifiers:
    • {+n} - expand mask by n pixels
    • {-n} - shrink mask by n pixels
    • {*n} - multiply mask strength. will expand mask to areas that weakly matched the mask description
    • {/n} - divide mask strength. will reduce mask to areas that most strongly matched the mask description. probably not useful

When writing strength modifiers, keep in mind that pixel values are between 0 and 1. For example, {*6} lifts an area that matched weakly at 0.1 up to a strength of 0.6.

>> imagine \
    --init-image pearl_earring.jpg \
    --mask-prompt "face AND NOT (bandana OR hair OR blue fabric){*6}" \
    --mask-mode keep \
    --init-image-strength .2 \
    --fix-faces \
    "a modern female president" "a female robot" "a female doctor" "a female firefighter"


>> imagine \
    --init-image fruit-bowl.jpg \
    --mask-prompt "fruit OR fruit stem{*6}" \
    --mask-mode replace \
    --mask-modify-original \
    --init-image-strength .1 \
    "a bowl of kittens" "a bowl of gold coins" "a bowl of popcorn" "a bowl of spaghetti"


Face Enhancement by CodeFormer

>> imagine "a couple smiling" --steps 40 --seed 1 --fix-faces

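The same enhancer is also reachable from Python. The import path and the fidelity keyword below appear verbatim in tracebacks and code snippets elsewhere on this page, so this sketch should be close to the real API (treat the exact semantics of fidelity as an assumption):

from PIL import Image
from imaginairy.enhancers.face_restoration_codeformer import enhance_faces

img = Image.open("a_couple_smiling.jpg")
# fidelity is a 0-1 value (the CLI exposes it as --fix-faces-fidelity)
fixed = enhance_faces(img, fidelity=0.2)
fixed.save("a_couple_smiling.fixed.jpg")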

Image Upscaling

Upscale images easily.

=== "CLI" bash aimg upscale assets/000206_856637805_PLMS40_PS7.5_colorful_smoke.jpg --upscale-model real-hat

=== "Python" ```py from imaginairy.api.upscale import upscale

img = upscale(img="assets/000206_856637805_PLMS40_PS7.5_colorful_smoke.jpg")
img.save("colorful_smoke.upscaled.jpg")

```


Upscaling uses Spandrel to make it easy to use different upscaling models. List the integrated models with aimg upscale --list-models and pick one with --upscale-model <model-name>. URLs are also accepted if you want to upscale with a different model. Control the output file format/location with --format.

from imaginairy.enhancers.upscale_realesrgan import upscale_image
from PIL import Image

img = Image.open("my-image.jpg")
big_img = upscale_image(img)
big_img.save("my-image.upscaled.jpg")

Tiled Images

>> imagine "gold coins" "a lush forest" "piles of old books" "leaves" --tile


360 degree images

imagine --tile-x -w 1024 -h 512 "360 degree equirectangular panorama photograph of the desert"  --upscale

Image-to-Image

Use depth maps for amazing "translations" of existing images.

>> imagine --init-image girl_with_a_pearl_earring_large.jpg --init-image-strength 0.05  "professional headshot photo of a woman with a pearl earring" -r 4 -w 1024 -h 1024 --steps 50


Outpainting

Given a starting image, one can generate its "surroundings".

Example: imagine --init-image pearl-earring.jpg --init-image-strength 0 --outpaint all250,up0,down600 "woman standing" (the --outpaint value is a comma-separated list of directions, each followed by a pixel amount: all, up, down, left, right)


Work with different generation models

Click to see shell command
imagine "valley, fairytale treehouse village covered, , matte painting, highly detailed, dynamic lighting, cinematic, realism, realistic, photo real, sunset, detailed, high contrast, denoised, centered, michael whelan" --steps 60 --seed 1 --arg-schedule model[sd14,sd15,sd20,sd21,openjourney-v1,openjourney-v2] --arg-schedule "caption-text[sd14,sd15,sd20,sd21,openjourney-v1,openjourney-v2]"

Prompt Expansion

You can use {} to randomly pull values from lists. A list of values separated by | and enclosed in { } will be randomly drawn from in a non-repeating fashion. Values that are surrounded by _ _ will pull from a phrase list of the same name. Folders containing .txt phraselist files may be specified via --prompt_library_path. The option may be specified multiple times. Built-in categories:

  3d-term, adj-architecture, adj-beauty, adj-detailed, adj-emotion, adj-general, adj-horror, animal, art-scene, art-movement, 
  art-site, artist, artist-botanical, artist-surreal, aspect-ratio, bird, body-of-water, body-pose, camera-brand,
  camera-model, color, cosmic-galaxy, cosmic-nebula, cosmic-star, cosmic-term, desktop-background, dinosaur, eyecolor, f-stop, 
  fantasy-creature, fantasy-setting, fish, flower, focal-length, food, fruit, games, gen-modifier, hair, hd,
  iso-stop, landscape-type, national-park, nationality, neg-weight, noun-beauty, noun-fantasy, noun-general, 
  noun-horror, occupation, painting-style, photo-term, pop-culture, pop-location, punk-style, quantity, rpg-item, scenario-desc, 
  skin-color, spaceship, style, tree-species, trippy, world-heritage-site

Examples:

imagine "a {lime|blue|silver|aqua} colored dog" -r 4 --seed 0 (note that it generates a dog of each color without repetition)

imagine "a {_color_} dog" -r 4 --seed 0 will generate four, different colored dogs. The colors will be pulled from an included phraselist of colors.

imagine "a {_spaceship_|_fruit_|hot air balloon}. low-poly" -r 4 --seed 0 will generate images of spaceships or fruits or a hot air balloon

Python example
from imaginairy.enhancers.prompt_expansion import expand_prompts

my_prompt = "a giant {_animal_}"

expanded_prompts = expand_prompts(n=10, prompt_text=my_prompt, prompt_library_paths=["./prompts"])
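
Assuming expand_prompts yields the filled-in prompt strings (a hedged reading of the call above), the results can be consumed directly:

for prompt_text in expanded_prompts:
    print(prompt_text)  # e.g. "a giant walrus"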

Credit to noodle-soup-prompts where most, but not all, of the wordlists originate.

Generate image captions (via BLIP)

>> aimg describe assets/mask_examples/bowl001.jpg
a bowl full of gold bars sitting on a table

Example Use Cases

>> aimg
# Generate endless 8k art
🤖🧠> imagine -w 1920 -h 1080 --upscale "{_art-scene_}. {_painting-style_} by {_artist_}" -r 1000 --steps 30 --model sd21v

# generate endless desktop backgrounds
🤖🧠> imagine --tile "{_desktop-background_}" -r 100

# convert a folder of images to pencil sketches
🤖🧠> edit other/images/*.jpg -p "make it a pencil sketch"

# upscale a folder of images
🤖🧠> upscale my-images/*.jpg

# generate kitchen remodel ideas
🤖🧠> imagine --control-image kitchen.jpg -w 1024 -h 1024 "{_interior-style_} kitchen" --control-mode depth -r 100 --init-image-strength 0.01 --upscale --steps 35 --caption-text "{prompt}"

Additional Features

  • Generate images either in code or from command line.
  • It just works. Proper requirements are installed. Model weights are automatically downloaded. No Hugging Face account needed. (if you have the right hardware... and aren't on Windows)
  • Noisy logs are gone (which was surprisingly hard to accomplish)
  • WeightedPrompts let you smash together separate prompts (cat-dog)
  • Prompt metadata saved into image file metadata
  • Have AI generate captions for images aimg describe <filename-or-url>
  • Interactive prompt: just run aimg

How To

For full command line instructions run aimg --help

from imaginairy import imagine, imagine_image_files, ImaginePrompt, WeightedPrompt, LazyLoadingImage

url = "https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/Thomas_Cole_-_Architect%E2%80%99s_Dream_-_Google_Art_Project.jpg/540px-Thomas_Cole_-_Architect%E2%80%99s_Dream_-_Google_Art_Project.jpg"
prompts = [
    ImaginePrompt("a scenic landscape", seed=1, upscale=True),
    ImaginePrompt("a bowl of fruit"),
    ImaginePrompt([
        WeightedPrompt("cat", weight=1),
        WeightedPrompt("dog", weight=1),
    ]),
    ImaginePrompt(
        "a spacious building", 
        init_image=LazyLoadingImage(url=url)
    ),
    ImaginePrompt(
        "a bowl of strawberries", 
        init_image=LazyLoadingImage(filepath="mypath/to/bowl_of_fruit.jpg"),
        mask_prompt="fruit OR stem{*2}",  # amplify the stem mask x2
        mask_mode="replace",
        mask_modify_original=True,
    ),
    ImaginePrompt("strawberries", tile_mode=True),
]
for result in imagine(prompts):
    # do something
    result.save("my_image.jpg")

# or

imagine_image_files(prompts, outdir="./my-art")
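
Note that the loop above writes every result to the same my_image.jpg path, so later results overwrite earlier ones. To keep them all, vary the filename per result:

for i, result in enumerate(imagine(prompts)):
    result.save(f"my_image_{i:03d}.jpg")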

Requirements

  • ~10 GB of disk space for models to download
  • A CUDA-supported graphics card with >= 11 GB VRAM (and CUDA installed) or an M1 processor.
  • Python installed. Preferably Python 3.10. (not conda)
  • For macOS, rust and setuptools-rust must be installed to compile the tokenizer library. They can be installed via: curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh and pip install setuptools-rust

Running in Docker

See the example Dockerfile (works on machines where you can pass the GPU into the container)

docker build . -t imaginairy
# you really want to map the cache or you end up wasting a lot of time and space redownloading the model weights
docker run -it --gpus all -v $HOME/.cache/huggingface:/root/.cache/huggingface -v $HOME/.cache/torch:/root/.cache/torch -v `pwd`/outputs:/outputs imaginairy /bin/bash

Running on Google Colab

Example Colab

Q&A

Q: How do I change the cache directory for where models are stored?

A: Set the HUGGINGFACE_HUB_CACHE environment variable.
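
For example, to relocate the cache to a larger drive:

export HUGGINGFACE_HUB_CACHE=/path/with/more/space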

Q: How do I free up disk space?

A: The AI models are cached in ~/.cache/ (or HUGGINGFACE_HUB_CACHE). To delete the cache remove the following folders:

  • ~/.cache/imaginairy
  • ~/.cache/clip
  • ~/.cache/torch
  • ~/.cache/huggingface

Not Supported

  • exploratory features that don't work well

imaginairy's People

Contributors

adammenges, andersbl, brycedrennan, cocktailpeanut, deadolus, dfyx, jaydrennan, jiamingsuen, kianmeng, lukepearson, maxmouchet, mebelz, mqudsi, paulmest, samsaffron, wseagar, zmactep


imaginairy's Issues

RuntimeError: CUDA out of memory.

Device 0 [NVIDIA GeForce GTX 1080] PCIe GEN 3@16x RX: 25.00 MB/s TX: 6.000 MB/s GPU 1898MHz MEM 4513MHz TEMP 72°C FAN 60% POW 189 / 200 W

miniforge3's ldm env, installed from the InvokeAI (lstein/stable-diffusion) Ubuntu .yml

Generating 60 images at a time, with the following args:

imagine "xxxxxxx" \
    --repeats 60 \
    --steps 50 \
    --prompt-strength 7.5 \
    --sampler-type k_lms \
    --width 512 \
    --height 512 \
    --fix-faces \
    --upscale

It runs smoothly at first, very fast, producing a new image in about 20 seconds (for my GPU), but after a while (usually a few minutes or so) it starts reporting errors. It looks like each iteration leaves hundreds of megabytes of VRAM unfreed.

RuntimeError: CUDA out of memory. Tried to allocate 534.00 MiB (GPU 0; 7.93 GiB total capacity; 6.19 GiB already allocated; 528.81 MiB free; 6.79 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I've tried the following and it seems to help (we went from a dozen iterations to 30+):

  • step (1): in imaginairy/utils.py, add a helper that frees CUDA memory:

import gc
import torch

def clean_memory():
    # get_device() is already defined in imaginairy/utils.py
    if get_device() == "cuda":
        gc.collect()
        torch.cuda.empty_cache()

  • step (2): in imaginairy/schema.py, call it at the end of ImaginePrompt.__init__():

...
self.tile_mode = tile_mode
clean_memory()

  • step (3): in imaginairy/enhancers/upscale_realesrgan.py, call clean_memory() at line 37.

I think it's memory fragmentation.

CLI option to select model

I have created a model with Textual Inversion, and I was wondering if there can be an easy CLI option to select the model to generate from?

v1.1.4 doesn't seem to allow generations

Hey,

Thanks for responding to my previous issue!

I upgraded my imaginairy today to v1.1.4 and found the imagine command no longer worked. When I downgraded to v1.1.2, it worked again.

k_dpm_fast seemingly not working

  1. steps always runs --steps XX + 1 (no idea whether related)
  2. images always just 'large blob noise', e.g.
    000356_1_kdpmfast40_PS7.5_treeing_walker_coonhound_by_zhang_kechun_[generated].jpg (attached image)

my replicate test was:
imagine "{_animal_} by {_artist_}" -r 10 --sampler-type k_dpm_fast --seed 1

(machinery: Imaginairy release 4.1, M1 Pro, MacOS 12.6, Python 3.10.7 in py-venv)

steps/iteration problems with k_dpm_adaptive

thanks for #73!
when testing that I ran into a problem with k_dpm_fast - it seems to initialise at --steps XX +1?

e.g. imagine "an art deco cat" --sampler-type k_dpm_fast
will run an iteration with 41 steps, not 40

when this happens from the #73 example script:
imagine_image_files(prompts, outdir="output")
it loses the plot entirely and just keeps iterating..

minor: progress display suggestion

another minor suggestion..
with your new lists I'm sometimes using, say, {artist} with a -r 40 setting..
thus a 'where we're up to' included in progress outputs would be convenient..
something like:

Generating 🖼 : "prompt"
becomes
Generating 🖼 (x/40): "prompt"

or just on the end after, say, sampler-type:k_dpm_2, repeat:x/40

I realise that it might not be as simple as it looks due to way prompts + repeats can spawn many variations on the final image counts.. but thought I'd suggest anyway.

Is there a particular reason `enhance_faces()` has to be run twice on upscaled images?

if prompt.fix_faces:
    logger.info("    Fixing 😊 's in 🖼  using CodeFormer...")
    img = enhance_faces(img, fidelity=0.2)
if prompt.upscale:
    logger.info("    Upscaling 🖼  using real-ESRGAN...")
    upscaled_img = upscale_image(img)
    if prompt.fix_faces:
        logger.info("    Fixing 😊 's in big 🖼  using CodeFormer...")
        upscaled_img = enhance_faces(upscaled_img, fidelity=0.8)

The upscaled version has significantly more "washed out" features as a consequence, with much less detail than its original scale counterpart.

Regular image with fix_faces: [image]

Upscaled image with fix_faces: [image]

It's particularly visible in the hair. Could it be an idea to move the upscaling to happen before the face fixing to avoid this?

if prompt.upscale:
    logger.info("    Upscaling 🖼  using real-ESRGAN...")
    upscaled_img = upscale_image(img)
    if prompt.fix_faces:
        logger.info("    Fixing 😊 's in big 🖼  using CodeFormer...")
        upscaled_img = enhance_faces(upscaled_img, fidelity=0.8)
elif prompt.fix_faces:
    logger.info("    Fixing 😊 's in 🖼  using CodeFormer...")
    img = enhance_faces(img, fidelity=0.2)

Or simply remove the enhance_faces() call for the upscaled img, or make it an optional flag?

treating sampler types like lists via command line?

just another suggestion..
interesting testing with same prompt + seed with different sampler types led to a thought that having the option for 'each', or 'series' (ddim+plms), or 'random' options on --sampler-type might come in handy..

not sure what syntax would be consistent with command line options but maybe using {} or _option_ as per prompt syntaxes might work e.g.:
_each_ = do 1 each of the sampler types available (currently 10, thereby creating ten images)
_random_ = choose 1 sampler randomly
'+ syntax' for 'series', thereby "--sampler-type ddim + plms + k_heun" = step through the sampler series list, thereby creating three images
or maybe [] is easier to parse, e.g. --sampler-type [ddim plms k_heun]

not sure, leave as whether a sensible/useful idea, and implementation details, for you to decide :-)

FR: Add ability to preload models

When using this library programmatically, it can take a while to load the model the first time, which makes the first image generation take significantly longer than subsequent ones. It would be nice if there were a way to preload models that you know you're going to be using so that this doesn't happen.
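
One hedged workaround: tracebacks elsewhere on this page show model loading going through imaginairy.model_manager.get_diffusion_model, so calling the loader once at startup should warm things up. The no-argument call below is an assumption; check the actual signature in model_manager.py:

from imaginairy.model_manager import get_diffusion_model

# Load the default model up front so the first generation doesn't pay the cost.
# NOTE: the real signature may require arguments (model name, etc.); this is a sketch.
model = get_diffusion_model()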

aimg describe got FileNotFoundError

aimg describe outputs/000320_526397716_ddim40_PS7.5_ID_photo_of_a_child.jpg

FileNotFoundError: [Errno 2] No such file or directory: '/home/iwater/miniconda3/envs/imaginAIry/lib/python3.10/site-packages/imaginairy/vendored/blip/configs/med_config.json'

How to switch to GPU mode ?

Hello !

I've just installed and started to use ImaginAIry on my laptop and I have a problem.
It seems to be stuck in CPU mode (every time I run it I get the "it's gonna be sloooooooow" message).
How can I change that? Thanks!

I'm running it in Windows PowerShell, with an Nvidia GPU, for info.

The option --show-work does not work

Hello,

I understand that --show-work can be used to get an image each time a step is accomplished (please correct me if I'm wrong).

When I add --show-work to the command line, the image generation fails

imagine "text" --precision full --fix-faces --steps 50 --show-work
imaginAIry received 1 prompt(s) and will repeat them 1 times to create 1 images.
Running in CPU mode. it's gonna be slooooooow.
Generating 🖼 1/1: "text" 512x512px seed:187796535 prompt-strength:7.5 steps:27 sampler-type:plms
Loading model /home/g/.cache/huggingface/transformers/d12e71b67e29abaf317bf9d0e31644872fd2072509a4b3582cbb0c30f70824e9.98fc1312797017a8bac6993df565908fd18f09319b40d9bd35457dfa1459ecf0 onto cpu backend...
Traceback (most recent call last):
File "/home/g/./.local/bin/imagine", line 8, in
sys.exit(imagine_cmd())
File "/home/g/.local/lib/python3.10/site-packages/click/core.py", line 1130, in call
return self.main(*args, **kwargs)
File "/home/g/.local/lib/python3.10/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/home/g/.local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/g/.local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/home/g/.local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/home/g/.local/lib/python3.10/site-packages/imaginairy/cmds.py", line 238, in imagine_cmd
imagine_image_files(
File "/home/g/.local/lib/python3.10/site-packages/imaginairy/api.py", line 74, in imagine_image_files
for result in imagine(
File "/home/g/.local/lib/python3.10/site-packages/imaginairy/api.py", line 246, in imagine
log_latent(init_latent_noised, "init_latent_noised")
File "/home/g/.local/lib/python3.10/site-packages/imaginairy/log_utils.py", line 28, in log_latent
_CURRENT_LOGGING_CONTEXT.log_latents(latents, description)
File "/home/g/.local/lib/python3.10/site-packages/imaginairy/log_utils.py", line 72, in log_latents
if latents.shape[1] != 4:
AttributeError: 'NoneType' object has no attribute 'shape'

Do you know why this happens?

ISSUE importing imaginairy - numpy version

>>> import imaginairy
/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: dlopen(/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torchvision/image.so, 0x0006): Symbol not found: __ZN3c106detail19maybe_wrap_dim_slowExxb
  Referenced from: <8080486D-E510-3000-AA6A-F3AD49ACC172> /Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torchvision/image.so
  Expected in:     <8ADDD67A-2C07-3290-B140-DF1BD644AB8B> /Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torch/lib/libc10.dylib
  warn(f"Failed to load image Python extension: {e}")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/imaginairy/__init__.py", line 7, in <module>
    from .api import imagine, imagine_image_files  # noqa
  File "/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/imaginairy/api.py", line 16, in <module>
    from imaginairy.enhancers.clip_masking import get_img_mask
  File "/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/imaginairy/enhancers/clip_masking.py", line 8, in <module>
    from torchvision import transforms
  File "/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torchvision/__init__.py", line 7, in <module>
    from torchvision import models
  File "/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torchvision/models/__init__.py", line 18, in <module>
    from . import quantization
  File "/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torchvision/models/quantization/__init__.py", line 3, in <module>
    from .mobilenet import *
  File "/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torchvision/models/quantization/mobilenet.py", line 1, in <module>
    from .mobilenetv2 import *  # noqa: F401, F403
  File "/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torchvision/models/quantization/mobilenetv2.py", line 6, in <module>
    from torch.ao.quantization import QuantStub, DeQuantStub
ImportError: cannot import name 'QuantStub' from 'torch.ao.quantization' (/Users/jimmygunawan/miniforge3/lib/python3.9/site-packages/torch/ao/quantization/__init__.py)

And during the pip installation of ImaginAIry it complains:

ERROR: Failed building wheel for numpy
Failed to build numpy
ERROR: Could not build wheels for numpy, which is required to install pyproject.toml-based projects
(base) jimmygunawan@192-168-1-114 ~ % pip3 install numpy
Collecting numpy
Using cached numpy-1.23.3-cp39-cp39-macosx_11_0_arm64.whl (13.4 MB)
Installing collected packages: numpy
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-macos 2.6.0 requires numpy~=1.19.2, but you have numpy 1.23.3 which is incompatible.
scipy 1.7.1 requires numpy<1.23.0,>=1.16.5, but you have numpy 1.23.3 which is incompatible.

--fix-faces doesn't work

Hello!

I'm using imaginAIry version 4.1.0 installed via pip3. Python 3.10 from Homebrew. MacBook Pro 2021 M1, macOS 12.5. I had Python initially installed via asdf, but I had problems with SD, so I reinstalled it from Homebrew.

When trying to use --fix-faces, I'm getting this error

Traceback (most recent call last):
  File "/opt/homebrew/bin/imagine", line 8, in <module>
    sys.exit(imagine_cmd())
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/homebrew/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/homebrew/lib/python3.10/site-packages/imaginairy/cmds.py", line 238, in imagine_cmd
    imagine_image_files(
  File "/opt/homebrew/lib/python3.10/site-packages/imaginairy/api.py", line 74, in imagine_image_files
    for result in imagine(
  File "/opt/homebrew/lib/python3.10/site-packages/imaginairy/api.py", line 294, in imagine
    img = enhance_faces(img, fidelity=prompt.fix_faces_fidelity)
  File "/opt/homebrew/lib/python3.10/site-packages/imaginairy/enhancers/face_restoration_codeformer.py", line 56, in enhance_faces
    net = codeformer_model()
  File "/opt/homebrew/lib/python3.10/site-packages/imaginairy/enhancers/face_restoration_codeformer.py", line 28, in codeformer_model
    checkpoint = torch.load(ckpt_path)["params_ema"]
  File "/opt/homebrew/lib/python3.10/site-packages/torch/serialization.py", line 777, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "/opt/homebrew/lib/python3.10/site-packages/torch/serialization.py", line 282, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

I've googled a bit and it seems the problem is with the Torch model (I'm not quite into that stuff, so bear with me). Any ideas where to look?

Install error on M1 with grpcio and conda

"just works" was not valid for me on my M1
With

brew install cmake protobuf rust
python3 -m venv env
. env/bin/activate
pip install imaginairy

I run into

      creating None/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/tmpzh5sjcuo
      clang -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /Users/hannes/miniforge3/include -arch arm64 -fPIC -O2 -isystem /Users/hannes/miniforge3/include -arch arm64 -I/opt/homebrew/opt/openssl@3/include -I/Users/hannes/git/ai/imaginAIry/env/include -I/Users/hannes/miniforge3/include/python3.9 -c /var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/tmpzh5sjcuo/a.c -o None/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/tmpzh5sjcuo/a.o
      Traceback (most recent call last):
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/unixccompiler.py", line 117, in _compile
          self.spawn(compiler_so + cc_args + [src, '-o', obj] +
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/_spawn_patch.py", line 54, in _commandfile_spawn
          _classic_spawn(self, command)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/ccompiler.py", line 910, in spawn
          spawn(cmd, dry_run=self.dry_run)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/spawn.py", line 91, in spawn
          raise DistutilsExecError(
      distutils.errors.DistutilsExecError: command '/opt/homebrew/opt/llvm/bin/clang' failed with exit code 1
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/commands.py", line 280, in build_extensions
          build_ext.build_ext.build_extensions(self)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/command/build_ext.py", line 449, in build_extensions
          self._build_extensions_serial()
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/command/build_ext.py", line 474, in _build_extensions_serial
          self.build_extension(ext)
        File "/Users/hannes/git/ai/imaginAIry/env/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
          _build_ext.build_extension(self, ext)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/command/build_ext.py", line 529, in build_extension
          objects = self.compiler.compile(sources,
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/_parallel_compile_patch.py", line 58, in _parallel_compile
          multiprocessing.pool.ThreadPool(BUILD_EXT_COMPILER_JOBS).map(
        File "/Users/hannes/miniforge3/lib/python3.9/multiprocessing/pool.py", line 364, in map
          return self._map_async(func, iterable, mapstar, chunksize).get()
        File "/Users/hannes/miniforge3/lib/python3.9/multiprocessing/pool.py", line 771, in get
          raise self._value
        File "/Users/hannes/miniforge3/lib/python3.9/multiprocessing/pool.py", line 125, in worker
          result = (True, func(*args, **kwds))
        File "/Users/hannes/miniforge3/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
          return list(map(*args))
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/_parallel_compile_patch.py", line 54, in _compile_single_file
          self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/commands.py", line 263, in new_compile
          return old_compile(obj, src, ext, cc_args, extra_postargs,
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/unixccompiler.py", line 120, in _compile
          raise CompileError(msg)
      distutils.errors.CompileError: command '/opt/homebrew/opt/llvm/bin/clang' failed with exit code 1
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/setup.py", line 540, in <module>
          setuptools.setup(
        File "/Users/hannes/git/ai/imaginAIry/env/lib/python3.9/site-packages/setuptools/__init__.py", line 153, in setup
          return distutils.core.setup(**attrs)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/core.py", line 148, in setup
          dist.run_commands()
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/dist.py", line 966, in run_commands
          self.run_command(cmd)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/dist.py", line 985, in run_command
          cmd_obj.run()
        File "/Users/hannes/git/ai/imaginAIry/env/lib/python3.9/site-packages/setuptools/command/install.py", line 61, in run
          return orig.install.run(self)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/command/install.py", line 546, in run
          self.run_command('build')
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/cmd.py", line 313, in run_command
          self.distribution.run_command(command)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/dist.py", line 985, in run_command
          cmd_obj.run()
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/command/build.py", line 135, in run
          self.run_command(cmd_name)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/cmd.py", line 313, in run_command
          self.distribution.run_command(command)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/dist.py", line 985, in run_command
          cmd_obj.run()
        File "/Users/hannes/git/ai/imaginAIry/env/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 79, in run
          _build_ext.run(self)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/command/build_ext.py", line 340, in run
          self.build_extensions()
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/commands.py", line 284, in build_extensions
          raise CommandError(
      commands.CommandError: Failed `build_ext` step:
      Traceback (most recent call last):
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/unixccompiler.py", line 117, in _compile
          self.spawn(compiler_so + cc_args + [src, '-o', obj] +
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/_spawn_patch.py", line 54, in _commandfile_spawn
          _classic_spawn(self, command)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/ccompiler.py", line 910, in spawn
          spawn(cmd, dry_run=self.dry_run)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/spawn.py", line 91, in spawn
          raise DistutilsExecError(
      distutils.errors.DistutilsExecError: command '/opt/homebrew/opt/llvm/bin/clang' failed with exit code 1
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/commands.py", line 280, in build_extensions
          build_ext.build_ext.build_extensions(self)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/command/build_ext.py", line 449, in build_extensions
          self._build_extensions_serial()
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/command/build_ext.py", line 474, in _build_extensions_serial
          self.build_extension(ext)
        File "/Users/hannes/git/ai/imaginAIry/env/lib/python3.9/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
          _build_ext.build_extension(self, ext)
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/command/build_ext.py", line 529, in build_extension
          objects = self.compiler.compile(sources,
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/_parallel_compile_patch.py", line 58, in _parallel_compile
          multiprocessing.pool.ThreadPool(BUILD_EXT_COMPILER_JOBS).map(
        File "/Users/hannes/miniforge3/lib/python3.9/multiprocessing/pool.py", line 364, in map
          return self._map_async(func, iterable, mapstar, chunksize).get()
        File "/Users/hannes/miniforge3/lib/python3.9/multiprocessing/pool.py", line 771, in get
          raise self._value
        File "/Users/hannes/miniforge3/lib/python3.9/multiprocessing/pool.py", line 125, in worker
          result = (True, func(*args, **kwds))
        File "/Users/hannes/miniforge3/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
          return list(map(*args))
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/_parallel_compile_patch.py", line 54, in _compile_single_file
          self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
        File "/private/var/folders/np/7ckckk8s5fd169kvnnfhjkk40000gr/T/pip-install-bfttvlxs/grpcio_2e6589fc53b54149beaca903ff651a0d/src/python/grpcio/commands.py", line 263, in new_compile
          return old_compile(obj, src, ext, cc_args, extra_postargs,
        File "/Users/hannes/miniforge3/lib/python3.9/distutils/unixccompiler.py", line 120, in _compile
          raise CompileError(msg)
      distutils.errors.CompileError: command '/opt/homebrew/opt/llvm/bin/clang' failed with exit code 1
      
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> grpcio

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

expected scalar type BFloat16 but found Float

julien@pop7550:~$ imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman"
🤖🧠 imaginAIry received 4 prompt(s) and will repeat them 1 times to create 4 images.
Running in CPU mode. it's gonna be slooooooow.
Generating 🖼  1/4: "a scenic landscape" 512x512px seed:913183355 prompt-strength:7.5 steps:40 sampler-type:plms
Loading model /home/julien/.cache/huggingface/transformers/d12e71b67e29abaf317bf9d0e31644872fd2072509a4b3582cbb0c30f70824e9.98fc1312797017a8bac6993df565908fd18f09319b40d9bd35457dfa1459ecf0 onto cpu backend...
  0%|                                                                            | 0/40 [00:04<?, ?it/s]
/usr/lib/python3/dist-packages/apport/report.py:13: DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
  import fnmatch, glob, traceback, errno, sys, atexit, locale, imp, stat
Traceback (most recent call last):
  File "/home/julien/.local/bin/imagine", line 8, in <module>
    sys.exit(imagine_cmd())
  File "/usr/lib/python3/dist-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/cmds.py", line 238, in imagine_cmd
    imagine_image_files(
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/api.py", line 74, in imagine_image_files
    for result in imagine(
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/api.py", line 248, in imagine
    samples = sampler.sample(
  File "/home/julien/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/samplers/plms.py", line 98, in sample
    noisy_latent, predicted_latent, noise_pred = self.p_sample_plms(
  File "/home/julien/.local/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/samplers/plms.py", line 139, in p_sample_plms
    noise_pred = get_noise_prediction(
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/samplers/base.py", line 130, in get_noise_prediction
    noise_pred_neutral, noise_pred_positive = denoise_func(
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/modules/diffusion/ddpm.py", line 763, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/home/julien/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/modules/diffusion/ddpm.py", line 888, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/home/julien/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/modules/diffusion/openaimodel.py", line 778, in forward
    h = module(h, emb, context)
  File "/home/julien/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/modules/diffusion/openaimodel.py", line 85, in forward
    x = layer(x, context)
  File "/home/julien/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/modules/attention.py", line 320, in forward
    x = block(x, context=context)
  File "/home/julien/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/modules/attention.py", line 266, in forward
    return checkpoint(
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/modules/diffusion/util.py", line 150, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/modules/diffusion/util.py", line 163, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/modules/attention.py", line 272, in _forward
    x = self.attn1(self.norm1(x)) + x
  File "/home/julien/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/julien/.local/lib/python3.10/site-packages/torch/nn/modules/normalization.py", line 190, in forward
    return F.layer_norm(
  File "/home/julien/.local/lib/python3.10/site-packages/imaginairy/utils.py", line 102, in _fixed_layer_norm
    return torch.layer_norm(
RuntimeError: expected scalar type BFloat16 but found Float

Error building wheel for grpcio

When running pip install imaginairy using Python 3.10.3, I get this error:

Building wheel for grpcio (setup.py) ... error
...
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/stdlib.h:165:35: warning: pointer is missing a nullability type specifier (_Nonnull, _Nullable, or _Null_unspecified) [-Wnullability-completeness]
      double   strtod(const char *, char **) __DARWIN_ALIAS(strtod);
                                         ^
      /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/stdlib.h:296:37:/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/stdlib.h note: insert '_Nonnull' if the pointer should never be null
      :165:35: note: insert '_Nullable' if the pointer may be null
      char    *devname_r(dev_t, mode_t, char *buf, int len);
                                             ^
                                               _Nonnull
      /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/stdlib.h:296:6: double   strtod(const char *, char **) __DARWIN_ALIAS(strtod);warning:
      pointer is missing a nullability type specifier (_Nonnull, _Nullable, or _Null_unspecified) [-Wnullability-completeness]                                   ^

                                          _Nullable
...

System Details

System Version: macOS 12.6 (21G115)
Model Name: MacBook Pro
Model Identifier: MacBookPro18,1
Chip: Apple M1 Pro
Total Number of Cores: 10 (8 performance and 2 efficiency)
Memory: 32 GB
System Firmware Version: 7459.141.1
OS Loader Version: 7459.141.1

Feature request: REPL mode

The most time-consuming step seems to be startup and loading the model. On my system (Ubuntu 20.04 in WSL on Windows 10, Ryzen 3600X, RTX 3080), startup takes about 11 seconds compared to about 3-4 seconds for the actual image generation (at 40 steps).

It would be nice to have a REPL-like mode where the script stays running and you can enter new prompts and options until you explicitly quit. This would speed up generating a bunch of different images a lot.

minor: maybe just clarification (prompt weights)

I see:

ImaginePrompt([
        WeightedPrompt("cat", weight=1),
        WeightedPrompt("dog", weight=1),
    ]),

in the examples, but can't quite figure out a couple of things:

  • how are these specified in command line prompts? (e.g. (dog,weight=1) or (dog:1) or something?)
  • are weights similar to mask weights e.g. 0->1?

thanks in advance

fix-faces difference after 2.2.0

the example in readme:
imagine "a couple smiling" --steps 40 --seed 1 --fix-faces
now no longer yields results as shown unless changed to:
imagine "a couple smiling" --steps 40 --seed 1 --fix-faces --fix-faces-fidelity 0

so I'm not sure if that's intended, or if the default value for an unspecified --fix-faces-fidelity is wrong, or not working?

Add command-line option for enabling the NSFW filter

Currently, the NSFW filter cannot be disabled if you're using the imagine command (at least I haven't found how). Since mostly everyone wants the filter off, it would be good if the default were off and it had an option to enable it.

OpenSSL error when running in conda environment

I tried a bunch of workarounds but I always get this error if I install /run it in a conda environment:

URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1091)>

Works fine in my main environment. I think downloading the model from github requires SSL on requests.

minor: filenames and underscore separators

It'd be handy if auto-generated filenames could consistently be 'split' on underscores. This is mostly the case, except, e.g.
imagine "cat in an art deco setting. photo realistic by {artist}" --sampler-type k_dpm_2
creates something like:
000155_458176395_k_dpm_240_PS7.5_cat_in_an_art_deco_setting.photo_realistic_by_saturno_butto[generated].jpg

where the sampler type itself has underscores (in this case k_dpm_2, but will be same for k_lms|k_dpm_2|k_dpm_2_a|k_euler|k_euler_a|k_heun) and therefore creates more items in a 'split' (compared to plain plms, ddim etc)..

maybe take underscores out of sampler names altogether.. or.. just strip them out before creating the filename?

note for install docs

just setting up a brand new M1 Mac out of the box.. and as well as the noted requirements, i.e.:
- homebrew
- python3/pip3
- rust

found that
sudo pip3 install setuptools-rust

was required for the pip installation of imaginairy to be able to do its 'rust things'..

BTW I previously had imaginairy running (and working!) on a 2015 i7 intel Macbook Pro.. granted it took 15 mins to do a 40 step 'imagine' but I thought that it worked at all was a minor miracle! :-)

still an issue if no --fix-faces-fidelity supplied

so, did a "pip install --upgrade imaginairy" to go up to v2.3.0

now the following happens with no --fix-faces-fidelity supplied

(py-venv) shaun@Shauns-MBPro14 dev % imagine "a couple smiling" --steps 40 --seed 1 --fix-faces
🤖🧠 imaginAIry received 1 prompt(s) and will repeat them 1 times to create 1 images.
Loading model onto mps:0 backend...
Generating 🖼 : "a couple smiling" 512x512px seed:1 prompt-strength:7.5 steps:40 sampler-type:plms
PLMS Sampler: 100%|█████████████████████████| 40/40 [00:51<00:00, 1.29s/it]
Fixing 😊 's in 🖼 using CodeFormer...
Enhancing 2 faces
Failed inference for CodeFormer: '>' not supported between instances of 'NoneType' and 'int'
Failed inference for CodeFormer: '>' not supported between instances of 'NoneType' and 'int'
🖼 [generated] saved to: ./outputs/generated/000015_1_plms40_PS7.5_a_couple_smiling_[generated].jpg
(py-venv) shaun@Shauns-MBPro14 dev %

adding --fix-faces-fidelity 0.2 suppresses these..

so not sure, type error in default?

change to work with less VRAM - "CUDA out of memory"

My graphics card is a NVIDIA GeForce GTX 1050 Ti, 4G memory.

When I run the imagine command I receive this error:

  File "/home/user1/.pyenv/versions/3.10.6/envs/imaginairy-3.10.6/lib/python3.10/site-packages/torch/nn/modules/module.py", line 925, in convert

    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

RuntimeError: CUDA out of memory. Tried to allocate 58.00 MiB (GPU 0; 3.95 GiB total capacity; 2.82 GiB already allocated; 69.38 MiB free; 2.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  

See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I didn't find minimum requirements in the project's README, so I am assuming a 4 GB GPU isn't enough to run this application. However, I was able to run this stable diffusion project, so I am hoping some configuration could solve this for imaginAIry.

suggested documentation fixes

don't think this is what you intended:

imagine "a {lime|blue|silver|aqua} colored dog" -r 2 --seed 0 will generate both "a red dog" and "a black dog"

typo eb = be

imagine "a {color} dog" -r 4 --seed 0 will generate four, different colored dogs. The colors will eb pulled from an included phraselist of colors.

Add heading for upgrade/update to point out the obvious for noobs (like me :-P).. i.e

Updating to latest version

use pip install --upgrade imaginairy to update to latest version

Not enough memory.

(imaginairi) C:\Users\franc>imagine "A picture of an Icelandic landscape"
🤖🧠 imaginAIry received 1 prompt(s) and will repeat them 1 times to create 1 images.
Generating 🖼  1/1: "A picture of an Icelandic landscape" 512x512px seed:495117480 prompt-strength:7.5 steps:40 sampler-
type:plms
Loading model C:\Users\franc/.cache\huggingface\transformers\d12e71b67e29abaf317bf9d0e31644872fd2072509a4b3582cbb0c30f70824e9.98fc1312797017a8bac6993df565908fd18f09319b40d9bd35457dfa1459ecf0 onto cuda backend...
Traceback (most recent call last):
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\franc\anaconda3\envs\imaginairi\Scripts\imagine.exe\__main__.py", line 7, in <module>
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\imaginairy\cmds.py", line 238, in imagine_cmd
    imagine_image_files(
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\imaginairy\api.py", line 74, in imagine_image_files
    for result in imagine(
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\imaginairy\api.py", line 130, in imagine
    model = get_diffusion_model(
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\imaginairy\model_manager.py", line 127, in get_diffusion_model
    model.num_timesteps_cond  # noqa
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\imaginairy\model_manager.py", line 50, in __getattr__
    model = load_model_from_config(
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\imaginairy\model_manager.py", line 98, in load_model_from_config
    model.to(get_device())
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\pytorch_lightning\core\mixins\device_dtype_mixin.py", line 109, in to
    return super().to(*args, **kwargs)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\torch\nn\modules\module.py", line 987, in to
    return self._apply(convert)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\torch\nn\modules\module.py", line 639, in _apply
    module._apply(fn)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\torch\nn\modules\module.py", line 639, in _apply
    module._apply(fn)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\torch\nn\modules\module.py", line 639, in _apply
    module._apply(fn)
  [Previous line repeated 4 more times]
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\torch\nn\modules\module.py", line 662, in _apply
    param_applied = fn(param)
  File "C:\Users\franc\anaconda3\envs\imaginairi\lib\site-packages\torch\nn\modules\module.py", line 985, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 4.00 GiB total capacity; 3.41 GiB already allocated; 0 bytes free; 3.46 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

(imaginairi) C:\Users\franc>

Requirements to make it work on M1

While running pip install imaginairy on Mac (M1), the dependency basicsr is built from source, and the build fails with: error: can't find Rust compiler

After installing Rust in the recommended way and restarting the terminal, the package installed successfully.

After successfully installing, there was a runtime issue:
protobuf version 3.19.5 has a bug, so protobuf needs to be downgraded/pinned (pip install protobuf==3.19.4).
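
A quick sanity check that the pin actually took effect (assuming, as above, that the downgrade is what resolves the crash):

import google.protobuf

print(google.protobuf.__version__)  # expect 3.19.4 after the pinned install
assert google.protobuf.__version__ == "3.19.4"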

Downloading... waiting for the first image to appear. :)

Random crashing on iMac M1 8GB

In general, I can generate 512x512 images using ImaginAIry, and I love how easy it is to run from the command line with various options.

However, I have been experiencing random crashing.

I also noticed that the "model loading" step might be slowing down the process each time I run the command from Terminal.

  1. Crashing randomly; not sure if it's because of the prompt or whether the seed should be specified.

[E thread_pool.cpp:113] Exception in thread pool task: mutex lock failed: Invalid argument
[E thread_pool.cpp:113] Exception in thread pool task: mutex lock failed: Invalid argument
[E thread_pool.cpp:113] Exception in thread pool task: mutex lock failed: Invalid argument
[E thread_pool.cpp:113] Exception in thread pool task: mutex lock failed: Invalid argument
[E thread_pool.cpp:113] Exception in thread pool task: mutex lock failed: Invalid argument
zsh: segmentation fault imagine "an old man carrying a chihuahua dog"

  2. Crashing when making a tileable texture (4 prompts specified; crashed after making 2 images):
  Image Generated. Timings: conditioning:0.51s sampling:272.66s safety-filter:4.32s total:290.51s
    πŸ–Ό  [generated] saved to: ./outputs/generated/000003_827498268_kdpmpp2m15_PS7.5_hawaiian_okinawa_pattern_[generated].jpg
Generating πŸ–Ό  3/3: "pile of nuts" 512x512px seed:898575060 prompt-strength:7.5 steps:15 sampler-type:k_dpmpp_2m
  7%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–Œ                                                                                                          | 1/15 [00:20<04:53, 20.97s/it]
Traceback (most recent call last):
  File "/Users/blendersushi/miniconda3/bin/imagine", line 8, in <module>
    sys.exit(imagine_cmd())
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/cmds.py", line 238, in imagine_cmd
    imagine_image_files(
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/api.py", line 74, in imagine_image_files
    for result in imagine(
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/api.py", line 308, in imagine
    samples = sampler.sample(
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/samplers/kdiff.py", line 117, in sample
    samples = self.sampler_func(
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/vendored/k_diffusion/sampling.py", line 735, in sample_dpmpp_2m
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/samplers/base.py", line 102, in forward
    noise_pred = get_noise_prediction(
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/samplers/base.py", line 150, in get_noise_prediction
    noise_pred_neutral, noise_pred_positive = denoise_func(
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/samplers/base.py", line 85, in _wrapper
    return self.inner_model(
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/vendored/k_diffusion/external.py", line 130, in forward
    eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/vendored/k_diffusion/external.py", line 160, in get_eps
    return self.inner_model.apply_model(*args, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/diffusion/ddpm.py", line 764, in apply_model
    x_recon = self.model(x_noisy, t, **cond)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/diffusion/ddpm.py", line 889, in forward
    out = self.diffusion_model(x, t, context=cc)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/diffusion/openaimodel.py", line 778, in forward
    h = module(h, emb, context)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/diffusion/openaimodel.py", line 85, in forward
    x = layer(x, context)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/attention.py", line 331, in forward
    x = block(x, context=context)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/attention.py", line 277, in forward
    return checkpoint(
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/diffusion/util.py", line 149, in checkpoint
    return CheckpointFunction.apply(func, len(inputs), *args)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/diffusion/util.py", line 162, in forward
    output_tensors = ctx.run_function(*ctx.input_tensors)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/attention.py", line 283, in _forward
    x = self.attn1(self.norm1(x)) + x
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/attention.py", line 167, in forward
    return self.forward_splitmem(x, context=context, mask=mask)
  File "/Users/blendersushi/miniconda3/lib/python3.9/site-packages/imaginairy/modules/attention.py", line 224, in forward_splitmem
    raise RuntimeError(
RuntimeError: Not enough memory, use lower resolution (max approx. 448x448). Need: 0.0GB free, Have:0.0GB free

No module named 'deep_daze'

After installation in a Python 3.10 virtualenv, the imagine CLI throws this error on first usage.

hdddrive/python/ai via v3.10.7 (ai) $ imagine 'a dog'
Traceback (most recent call last):
  File "/usr/bin/imagine", line 5, in <module>
    from deep_daze.cli import main
ModuleNotFoundError: No module named 'deep_daze'
hdddrive/python/ai via v3.10.7 (ai) $

Even after pip install deep_daze, the error persists.
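
The traceback shows the shell resolving imagine to /usr/bin/imagine rather than the virtualenv's bin directory, so a plausible (unverified) explanation is that a stale script installed by the deep-daze package is shadowing imaginairy's entry point. A quick check of which script actually runs:

import shutil

# If this prints /usr/bin/imagine instead of <venv>/bin/imagine, the
# virtualenv's entry point is being shadowed by a system-wide script.
print(shutil.which("imagine"))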

Problems generating with --init-image

Hey!

Thank you so much for this resource, it's awesome!

I've come across an issue when I try to generate using the img2img functionality. I ran the command

imagine "painting of a dog surfing" --init-image "outputs/000001_528675053_plms40_PS7.5_a_dog_with_sunglasses_surfing_a_wave.jpg"

and I got this error:

AttributeError: 'PLMSSampler' object has no attribute 'stochastic_encode'

I kept getting this error after also trying to use a seed and changing the number of steps.

CLIP Interrogation - possible solution found

In the Readme, under Todo > Image Enhancement > image describe feature, there is an entry for "https://github.com/pharmapsychotic/clip-interrogator (blip + clip)".

The ability to pass an image and phrases to CLIP is already present as enhancers.describe_image_clip.find_img_text_similarity. The ability to use BLIP is already marked as done on the Todo list. The only remaining part of the pharmapsychotic/clip-interrogator code that is not implemented here is the long lists of artists, mediums, etc.

There is a package on PyPI called clip-gaze that includes cleaned-up and expanded versions of those lists (as well as implementing a version of find_img_text_similarity). It has no dependencies beyond what is already needed in this repo.

If the lists are desirable, I suggest importing them from clip_gaze; if not, then I think that part of the Todo can be checked off.
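
For reference, the core of CLIP interrogation is just ranking candidate phrases by cosine similarity against an image embedding. A minimal sketch using open_clip (a generic illustration, not the repo's find_img_text_similarity implementation; the model and weight names are examples):

import open_clip
import torch
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

image = preprocess(Image.open("photo.jpg")).unsqueeze(0)
phrases = ["oil painting", "photograph", "watercolor", "3d render"]

with torch.no_grad():
    img_emb = model.encode_image(image)
    txt_emb = model.encode_text(tokenizer(phrases))
    img_emb /= img_emb.norm(dim=-1, keepdim=True)
    txt_emb /= txt_emb.norm(dim=-1, keepdim=True)
    sims = (img_emb @ txt_emb.T).squeeze(0)

# Highest-similarity phrases first, as clip-interrogator does with its lists
for score, phrase in sorted(zip(sims.tolist(), phrases), reverse=True):
    print(f"{score:.3f}  {phrase}")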

Error on imaginairy==6.0.0a0 Mac M1 Max

Using pip install imaginairy==6.0.0a0 on MacBook Pro M1 Max

The prompt doesn't matter; it fails on any prompt: imagine "a scenic landscape in a watercolour style artstation"

/miniforge3/lib/python3.9/site-packages/imaginairy/vendored/k_diffusion/external.py", line 97, in t_to_sigma
    log_sigma = (1 - w) * self.log_sigmas[low_idx] + w * self.log_sigmas[high_idx]

RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
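
This error generally means the index tensors (low_idx/high_idx) live on the CPU while self.log_sigmas lives on the mps device, or vice versa. The standard fix pattern, sketched generically (not a patch to imaginairy itself):

import torch

sigmas = torch.linspace(0, 1, steps=10)  # stand-in for log_sigmas on the model's device
idx = torch.tensor([2, 5])               # stand-in for low_idx/high_idx on the cpu

# Indexing across devices raises the RuntimeError above; moving the indices
# to the indexed tensor's device resolves it:
vals = sigmas[idx.to(sigmas.device)]
print(vals)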

Sizes of tensors must match except in dimension 1. Expected size 4 but got size 3 for tensor number 1 in the list.

PLMS sampler fails with this error. This happens when an init image is smaller than the prompt size in both x and y.
I solved this by changing the pillow_fit_image_within() function from:

import PIL.Image
from PIL import Image  # needed for the type hint and Image.Resampling


def pillow_fit_image_within(image: PIL.Image.Image, max_height=512, max_width=512):
    image = image.convert("RGB")
    w, h = image.size
    if w > max_width or h > max_height:
        resize_ratio = min(max_width / w, max_height / h)
        w, h = int(w * resize_ratio), int(h * resize_ratio)
        w, h = map(lambda x: x - x % 64, (w, h))  # resize to integer multiple of 64
        image = image.resize((w, h), resample=Image.Resampling.LANCZOS)
    return image

to

def pillow_fit_image_within(image: PIL.Image.Image, max_height=512, max_width=512):
    image = image.convert("RGB")
    w, h = image.size
    if w < max_width and h < max_height:
        # image is smaller than the prompt size in both dimensions: scale up
        resize_ratio = max(max_width / w, max_height / h)
    elif w > max_width or h > max_height:
        # image exceeds the prompt size in at least one dimension: scale down
        resize_ratio = min(max_width / w, max_height / h)
    else:
        return image
    w, h = int(w * resize_ratio), int(h * resize_ratio)
    w, h = map(lambda x: x - x % 64, (w, h))  # resize to integer multiple of 64
    return image.resize((w, h), resample=Image.Resampling.LANCZOS)

in order to scale the image up to match the prompt size. I've only tested this with a couple of undersized images, but it seems to work as intended so far. I'd submit a pull request, but my fork is messy.
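
To illustrate the patched behaviour with a concrete (made-up) input: a 300x200 image against the default 512x512 gives resize_ratio = max(512/300, 512/200) = 2.56, which after snapping to multiples of 64 yields 768x512, so both dimensions now cover the prompt size:

from PIL import Image

img = Image.new("RGB", (300, 200))  # undersized in both dimensions
fitted = pillow_fit_image_within(img, max_height=512, max_width=512)  # patched version above
print(fitted.size)  # (768, 512) -- scaled up, snapped to multiples of 64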

issues with clothing replacements using masks

I've been trying imaginAIry to swap clothes in photos (for example, masking a t-shirt and replacing the mask with "yellow t shirt"), but the results don't seem to work.

Sometimes the mask isn't replaced at all, and sometimes it is replaced with something that doesn't match the prompt.

I've tried changing the prompt strength and init-image strength, but can't seem to get what I want. Any suggestions on how I can tweak things to get reasonable-looking clothing replacements?
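
For anyone attempting the same workflow, this is roughly the shape of the invocation via the Python API, assuming ImaginePrompt accepts mask options mirroring the CLI's --mask-prompt/--mask-mode flags (the parameter names and the {*N} mask-grow syntax are taken on trust here and may differ):

from imaginairy import ImaginePrompt, imagine_image_files

# Assumed parameter names; adjust to the actual API
prompt = ImaginePrompt(
    "a person wearing a yellow t-shirt",
    init_image="photo.jpg",
    init_image_strength=0.35,  # lower values let the prompt change more
    mask_prompt="shirt{*4}",   # {*N} is assumed to grow the mask past the shirt edges
    mask_mode="replace",
)
imagine_image_files([prompt], outdir="./outputs")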

failed generations show up as flagged by safety checker

Hello,

First I wanted to thank you for the simplicity of the setup of this stable diffusion CLI.

Just tried it with an NVIDIA GTX 1660 Super with 6 GB of VRAM, but any prompt will output a red-ish image:

$ imagine "a photo of a dog"

(attached image: 000000_485200245_plms40_PS7.5_a_scenic_landscape_[generated])

edit: it may somehow be related to: Sygil-Dev/sygil-webui#34

Let me know if there are any details I could provide to help you :)

Thanks!
