
stable-diffusion-tensorflow's Issues

Running on Multiple GPUs

To run on several GPUs, is there a setting within stable_diffusion_tf, or do I configure it in TensorFlow/Keras and then run the stable_diffusion_tf code as usual?
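For reference, a minimal sketch of the generic Keras route, assuming stable_diffusion_tf itself has no multi-GPU setting (the class and constructor follow the README; whether inference actually distributes this way is untested):

import tensorflow as tf
from stable_diffusion_tf.stable_diffusion import StableDiffusion

# Build the model inside a MirroredStrategy scope so its variables are
# replicated across all visible GPUs.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    generator = StableDiffusion(img_height=512, img_width=512)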

InvalidArgumentError when using GPU Colab + Mixed Precision with input_image

Hello, first and foremost, thank you for this fantastic repository and the provided colab demonstrations!
When I tried the GPU Colab + Mixed Precision notebook and passed an input image argument to the generator, I received the following error:
InvalidArgumentError Traceback (most recent call last)

in <module>
7 temperature=1,
8 batch_size=1,
----> 9 input_image="/content/gen.png"
10 )

4 frames

/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/stable_diffusion.py in generate(self, prompt, batch_size, num_steps, unconditional_guidance_scale, temperature, seed, input_image, input_image_strength)
74 input_img_noise_t = timesteps[ int(len(timesteps)*input_image_strength) ]
75 latent, alphas, alphas_prev = self.get_starting_parameters(
---> 76 timesteps, batch_size, seed , input_image=input_image, input_img_noise_t=input_img_noise_t
77 )
78

/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/stable_diffusion.py in get_starting_parameters(self, timesteps, batch_size, seed, input_image, input_img_noise_t)
159 else:
160 latent = self.encoder(input_image[None])
--> 161 latent = self.add_noise(latent, input_img_noise_t)
162 latent = tf.repeat(latent , batch_size , axis=0)
163 return latent, alphas, alphas_prev

/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/stable_diffusion.py in add_noise(self, x, t)
108 sqrt_one_minus_alpha_prod = (1 - _ALPHAS_CUMPROD[t]) ** 0.5
109
--> 110 return sqrt_alpha_prod * x + sqrt_one_minus_alpha_prod * noise
111
112 def timestep_embedding(self, timesteps, dim=320, max_period=10000):

/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/traceback_utils.py in error_handler(*args, **kwargs)
151 except Exception as e:
152 filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153 raise e.with_traceback(filtered_tb) from None
154 finally:
155 del filtered_tb

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
7207 def raise_from_not_ok_status(e, name):
7208 e.message += (" name: " + name if name is not None else "")
-> 7209 raise core._status_to_exception(e) from None # pylint: disable=protected-access
7210
7211

InvalidArgumentError: cannot compute AddV2 as input #1(zero-based) was expected to be a half tensor but is a float tensor [Op:AddV2]

First, I ran the StableDiffusion generator instantiation and created the first picture, which worked as expected. Then I tried running the generator again, passing the picture made in the previous run as the input_image option, but I got the above issue.
I also adjusted the installation in the Colab demo to be from the most recent commit of this repo:
pip install git+https://github.com/divamgupta/stable-diffusion-tensorflow --upgrade --quiet

Thanks a lot in advance!
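A hedged guess at a local workaround, not maintainer-confirmed: under the mixed_float16 policy the encoder's latent comes out as float16 while the noise in add_noise is created as float32, so the AddV2 in the last traceback frame mixes dtypes. Creating the noise in the latent's dtype should satisfy it; the sketch below mirrors the method signature shown in the traceback, and _ALPHAS_CUMPROD is the module-level table from stable_diffusion.py:

import tensorflow as tf

def add_noise(self, x, t):
    # Create the noise in x's dtype so both AddV2 operands match under
    # mixed_float16 (x may be half; tf.random.normal defaults to float32).
    noise = tf.random.normal(tf.shape(x), dtype=x.dtype)
    sqrt_alpha_prod = _ALPHAS_CUMPROD[t] ** 0.5
    sqrt_one_minus_alpha_prod = (1 - _ALPHAS_CUMPROD[t]) ** 0.5
    return sqrt_alpha_prod * x + sqrt_one_minus_alpha_prod * noise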

Abnormal behavior

I followed and copied the code on your main page; however, it did not behave normally:
ValueError                                Traceback (most recent call last)
Cell In[5], line 4
      1 from stable_diffusion_tf.stable_diffusion import StableDiffusion
      2 from PIL import Image
----> 4 generator = StableDiffusion()

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/stable_diffusion.py:24, in StableDiffusion.__init__(self, img_height, img_width, jit_compile, download_weights)
     21 self.img_width = img_width
     22 self.tokenizer = SimpleTokenizer()
---> 24 text_encoder, diffusion_model, decoder, encoder = get_models(img_height, img_width, download_weights=download_weights)
     25 self.text_encoder = text_encoder
     26 self.diffusion_model = diffusion_model

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/stable_diffusion.py:238, in get_models(img_height, img_width, download_weights)
    235 latent = keras.layers.Input((n_h, n_w, 4))
    236 unet = UNetModel()
    237 diffusion_model = keras.models.Model(
--> 238     [latent, t_emb, context], unet([latent, t_emb, context])
    239 )
    241 # Create decoder
    242 latent = keras.layers.Input((n_h, n_w, 4))

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67 filtered_tb = _process_traceback_frames(e.__traceback__)
     68 # To get the full stack trace, call:
     69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
     71 finally:
     72 del filtered_tb

File /tmp/__autograph_generated_file2lo7vt15.py:84, in outer_factory.<locals>.inner_factory.<locals>.tf__call(self, inputs)
     82 layer = ag__.Undefined('layer')
     83 b = ag__.Undefined('b')
---> 84 ag__.for_stmt(ag__.ld(self).input_blocks, None, loop_body_1, get_state_3, set_state_3, ('x',), {'iterate_names': 'b'})
     86 def get_state_4():
     87     return (x,)

File /tmp/__autograph_generated_file2lo7vt15.py:80, in outer_factory.<locals>.inner_factory.<locals>.tf__call.<locals>.loop_body_1(itr_1)
     78 layer = itr
     79 x = ag__.converted_call(ag__.ld(apply), (ag__.ld(x), ag__.ld(layer)), None, fscope)
---> 80 ag__.for_stmt(ag__.ld(b), None, loop_body, get_state_2, set_state_2, ('x',), {'iterate_names': 'layer'})
     81 ag__.converted_call(ag__.ld(saved_inputs).append, (ag__.ld(x),), None, fscope)

File /tmp/__autograph_generated_file2lo7vt15.py:79, in outer_factory.<locals>.inner_factory.<locals>.tf__call.<locals>.loop_body_1.<locals>.loop_body(itr)
     77 nonlocal x
     78 layer = itr
---> 79 x = ag__.converted_call(ag__.ld(apply), (ag__.ld(x), ag__.ld(layer)), None, fscope)

File /tmp/__autograph_generated_file2lo7vt15.py:48, in outer_factory.<locals>.inner_factory.<locals>.tf__call.<locals>.apply(x, layer)
     46 x = ag__.converted_call(ag__.ld(layer), (ag__.ld(x),), None, fscope_1)
     47 ag__.if_stmt(ag__.converted_call(ag__.ld(isinstance), (ag__.ld(layer), ag__.ld(SpatialTransformer)), None, fscope_1), if_body, else_body, get_state, set_state, ('x',), 1)
---> 48 ag__.if_stmt(ag__.converted_call(ag__.ld(isinstance), (ag__.ld(layer), ag__.ld(ResBlock)), None, fscope_1), if_body_1, else_body_1, get_state_1, set_state_1, ('x',), 1)
     49 try:
     50     do_return_1 = True

File /tmp/__autograph_generated_file2lo7vt15.py:28, in outer_factory.<locals>.inner_factory.<locals>.tf__call.<locals>.apply.<locals>.if_body_1()
     26 def if_body_1():
     27     nonlocal x
---> 28     x = ag__.converted_call(ag__.ld(layer), ([ag__.ld(x), ag__.ld(emb)],), None, fscope_1)

File /tmp/__autograph_generated_filem_kzpxnn.py:11, in outer_factory.<locals>.inner_factory.<locals>.tf__call(self, inputs)
      9 retval_ = ag__.UndefinedReturnValue()
     10 (x, emb) = ag__.ld(inputs)
---> 11 h = ag__.converted_call(ag__.ld(apply_seq), (ag__.ld(x), ag__.ld(self).in_layers), None, fscope)
     12 emb_out = ag__.converted_call(ag__.ld(apply_seq), (ag__.ld(emb), ag__.ld(self).emb_layers), None, fscope)
     13 h = ag__.ld(h) + ag__.ld(emb_out)[:, None, None]

File /tmp/__autograph_generated_file612zmgqy.py:23, in outer_factory.<locals>.inner_factory.<locals>.tf__apply_seq(x, layers)
     21 x = ag__.converted_call(ag__.ld(l), (ag__.ld(x),), None, fscope)
     22 l = ag__.Undefined('l')
---> 23 ag__.for_stmt(ag__.ld(layers), None, loop_body, get_state, set_state, ('x',), {'iterate_names': 'l'})
     24 try:
     25     do_return = True

File /tmp/__autograph_generated_file612zmgqy.py:21, in outer_factory.<locals>.inner_factory.<locals>.tf__apply_seq.<locals>.loop_body(itr)
     19 nonlocal x
     20 l = itr
---> 21 x = ag__.converted_call(ag__.ld(l), (ag__.ld(x),), None, fscope)

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py:110, in GroupNormalization.build(self, input_shape)
    108 self._check_if_input_shape_is_none(input_shape)
    109 self._set_number_of_groups_for_instance_norm(input_shape)
--> 110 self._check_size_of_dimensions(input_shape)
    111 self._create_input_spec(input_shape)
    113 self._add_gamma_weight(input_shape)

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py:227, in GroupNormalization._check_size_of_dimensions(self, input_shape)
    225 dim = input_shape[self.axis]
    226 if dim < self.groups:
--> 227     raise ValueError(
    228         "Number of groups (" + str(self.groups) + ") cannot be "
    229         "more than the number of channels (" + str(dim) + ")."
    230     )
    232 if dim % self.groups != 0:
    233     raise ValueError(
    234         "Number of groups (" + str(self.groups) + ") must be a "
    235         "multiple of the number of channels (" + str(dim) + ")."
    236     )

ValueError: Exception encountered when calling layer "u_net_model_1" (type UNetModel).

in user code:

    File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/diffusion_model.py", line 199, in apply  *
        x = layer([x, emb])
    File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler  **
        raise e.with_traceback(filtered_tb) from None
    File "/tmp/__autograph_generated_filem_kzpxnn.py", line 11, in tf__call
        h = ag__.converted_call(ag__.ld(apply_seq), (ag__.ld(x), ag__.ld(self).in_layers), None, fscope)
    File "/tmp/__autograph_generated_file612zmgqy.py", line 23, in tf__apply_seq
        ag__.for_stmt(ag__.ld(layers), None, loop_body, get_state, set_state, ('x',), {'iterate_names': 'l'})
    File "/tmp/__autograph_generated_file612zmgqy.py", line 21, in loop_body
        x = ag__.converted_call(ag__.ld(l), (ag__.ld(x),), None, fscope)
    File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py", line 110, in build
        self._check_size_of_dimensions(input_shape)
    File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py", line 227, in _check_size_of_dimensions
        raise ValueError(

    ValueError: Exception encountered when calling layer "res_block_22" (type ResBlock).

    in user code:

        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/diffusion_model.py", line 31, in call  *
            h = apply_seq(x, self.in_layers)
        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/layers.py", line 41, in apply_seq  *
            x = l(x)
        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler  **
            raise e.with_traceback(filtered_tb) from None
        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py", line 110, in build
            self._check_size_of_dimensions(input_shape)
        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py", line 227, in _check_size_of_dimensions
            raise ValueError(

        ValueError: Number of groups (32) cannot be more than the number of channels (4).

Call arguments received by layer "res_block_22" (type ResBlock):
  • inputs=['tf.Tensor(shape=(None, 320, 125, 4), dtype=float32)', 'tf.Tensor(shape=(None, 1280), dtype=float32)']

Call arguments received by layer "u_net_model_1" (type UNetModel):
  • inputs=['tf.Tensor(shape=(None, 125, 125, 4), dtype=float32)', 'tf.Tensor(shape=(None, 320), dtype=float32)', 'tf.Tensor(shape=(None, 77, 768), dtype=float32)']
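Not an official answer, but a guess from the shapes in the traceback: the latent is img_height/8 x img_width/8 (125 x 125 here, i.e. a 1000 x 1000 input), and the U-Net halves it several more times, so heights and widths that are multiples of 64 keep every stage integral. A minimal sketch under that assumption:

from stable_diffusion_tf.stable_diffusion import StableDiffusion

# 1000 is not divisible by 64; 960 is, so the downsampling path stays aligned.
generator = StableDiffusion(img_height=960, img_width=960)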

How to train this?

I can't find code to train this model, neither the text encoder nor the image diffusion model.

Decoder for img2img

Do you know the structure of the Decoder for 512x512x3 => 64x64x4? It would be good to have it too for img2img functionality, probably for inpainting too, and for upscaling.

temperature parameter purpose

In

def get_x_prev_and_pred_x0(self, x, e_t, index, a_t, a_prev, temperature, seed):
    sigma_t = 0
    sqrt_one_minus_at = math.sqrt(1 - a_t)
    pred_x0 = (x - sqrt_one_minus_at * e_t) / math.sqrt(a_t)
    # Direction pointing to x_t
    dir_xt = math.sqrt(1.0 - a_prev - sigma_t**2) * e_t
    noise = sigma_t * tf.random.normal(x.shape, seed=seed) * temperature
    x_prev = math.sqrt(a_prev) * pred_x0 + dir_xt
    return x_prev, pred_x0
there is a parameter named temperature. Changing it does not affect the output, since it only scales a noise term that sigma_t = 0 zeroes out anyway. Are there future plans for it?

sigma_t always 0 => no noise being added?

I noticed in get_x_prev_and_pred_x0 that sigma_t is always set to 0, which means the noise computed later will always be zero at

https://github.com/divamgupta/stable-diffusion-tensorflow/blob/master/stable_diffusion_tf/stable_diffusion.py#L124

Furthermore, the computed noise is never added to x_prev, which means the temperature parameter has no effect either.

I notice in the DDIMSampler (and PLMS) that noise is usually added to x_prev, e.g.

https://github.com/CompVis/stable-diffusion/blob/main/ldm/models/diffusion/ddim.py#L203

I'm not sure if this is a bug (in which case the noise should be added) or if it's not required for inference (in which case the noise and sigma can just be removed from the code).
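For context, a sketch of the sigma schedule the CompVis DDIM sampler uses when its eta parameter is nonzero; names follow the linked ddim.py, not this repo. eta = 0 recovers the deterministic behaviour this implementation hardcodes:

import math

def ddim_sigma(a_t, a_prev, eta=0.0):
    # eta = 0 -> deterministic DDIM (sigma_t = 0, as here);
    # eta = 1 -> the stochastic, DDPM-like variant.
    return eta * math.sqrt((1 - a_prev) / (1 - a_t)) * math.sqrt(1 - a_t / a_prev)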

How to resolve dependency issues?

Sorry for the noob question.
But is there a requirements.txt somewhere?

Running setup.py is not working for me.
How do I install all the required packages on an M1 Mac?
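In case it helps, a guess at a minimal requirements.txt reconstructed from the imports this codebase uses; the versions and the tensorflow-macos/tensorflow-metal pair for M1 are assumptions, not taken from the repo:

numpy
Pillow
tqdm
ftfy
regex
tensorflow-macos    # plain `tensorflow` on Linux/Windows
tensorflow-metal    # GPU acceleration on Apple silicon
tensorflow-addons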

NSFW filter for results

Thanks for making this wonderful repo. I was using the video generation function in the given colab file with this prompt.

At least 35 dead from mysterious meningitis outbreak in Mexico

I ended up getting images that had nudity and genital areas exposed. Is it possible to add some sort of filter to remove such results?

Why bad results?

(dlenv) D:\SourceCodes\Diffusion\stable-diffusion-tensorflow>python text2image.py --prompt="Ruins of a castle in Scotland" --output="my_image.png"
0 1: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:54<00:00, 1.09s/it]
saved at my_image.png

[attached image: my_image.png]

Rectangular image generation?

Tried creating a generator where width != height, fails with this error:

ValueError                                Traceback (most recent call last)
<ipython-input-4-1026659721fc> in <module>
      5     img_height=640,
      6     img_width=340,
----> 7     jit_compile=False,  # You can try True as well (different performance profile)
      8 )

4 frames
/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/diffusion_model.py in loop_body_4(itr_4)
    107                     nonlocal x
    108                     b = itr_4
--> 109                     x = ag__.converted_call(ag__.ld(tf).concat, ([ag__.ld(x), ag__.converted_call(ag__.ld(saved_inputs).pop, (), None, fscope)],), dict(axis=(- 1)), fscope)
    110 
    111                     def get_state_5():

ValueError: Exception encountered when calling layer "u_net_model_1" (type UNetModel).

in user code:

    File "/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/diffusion_model.py", line 216, in call  *
        x = tf.concat([x, saved_inputs.pop()], axis=-1)

    ValueError: Dimension 2 in both shapes must be equal, but are 12 and 11. Shapes are [?,20,12] and [?,20,11]. for '{{node u_net_model_1/concat_3}} = ConcatV2[N=2, T=DT_HALF, Tidx=DT_INT32](u_net_model_1/upsample_2/padded_conv2d_129/conv2d_129/BiasAdd, u_net_model_1/spatial_transformer_18/add, u_net_model_1/concat_3/axis)' with input shapes: [?,20,12,1280], [?,20,11,1280], [] and with computed input tensors: input[2] = <-1>.


Call arguments received by layer "u_net_model_1" (type UNetModel):
  • inputs=['tf.Tensor(shape=(None, 80, 42, 4), dtype=float16)', 'tf.Tensor(shape=(None, 320), dtype=float16)', 'tf.Tensor(shape=(None, 77, 768), dtype=float16)']

Is it possible to generate rectangular images using this implementation?
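A hedged guess from the traceback rather than a confirmed answer: the concat fails on a 12-vs-11 width because 340 does not survive the repeated halving cleanly, so rectangular dimensions that are multiples of 64 should be safe:

from stable_diffusion_tf.stable_diffusion import StableDiffusion

# 640 and 320 are both multiples of 64, so every halving stays integral.
generator = StableDiffusion(img_height=640, img_width=320, jit_compile=False)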

Finetune the model on a custom dataset

I am wondering if it is possible to finetune the model on my own dataset? I know keras-cv has released something for finetuning Stable Diffusion, but I encountered installation issues (most probably because it requires TF 2.11, which is not supported on native Windows). I can run the model on TF 2.10. It would be great if we could train the model as well. Many thanks.

25+ Stable Diffusion Tutorials And Guides - Very Useful For Stable Diffusion Users

Hello dear Divam Gupta, I hope you let this thread stay to help newcomers. This is not an issue thread. Thank you.

Expert-Level Tutorials on Stable Diffusion: Master Advanced Techniques and Strategies

Greetings everyone. I am Dr. Furkan Gözükara, an Assistant Professor in the Software Engineering department of a private university (PhD in Computer Engineering). My professional programming skill is unfortunately C#, not Python :)

My LinkedIn: https://www.linkedin.com/in/furkangozukara/

Our channel address, if you'd like to subscribe: https://www.youtube.com/@SECourses

Our Discord, to get more help: https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

I am keeping this list up to date. I have ideas for new videos and am trying to find time to make them.

I am open to any criticism you have. I am constantly trying to improve the quality of my tutorial guide videos. Please leave comments with both your suggestions and what you would like to see in future videos.

All videos have manually fixed subtitles and properly prepared video chapters. You can watch with these perfect subtitles or look for the chapters you are interested in.

Since my profession is teaching, I usually do not skip any of the important parts. Therefore, you may find my videos a little bit longer.

Playlist link on YouTube: Stable Diffusion Tutorials, Automatic1111 and Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img

1.) Automatic1111 Web UI - PC - Free
Easiest Way to Install & Run Stable Diffusion Web UI on PC by Using Open Source Automatic Installer

2.) Automatic1111 Web UI - PC - Free
How to use Stable Diffusion V2.1 and Different Models in the Web UI - SD 1.5 vs 2.1 vs Anything V3

3.) Automatic1111 Web UI - PC - Free
Zero To Hero Stable Diffusion DreamBooth Tutorial By Using Automatic1111 Web UI - Ultra Detailed

4.) Automatic1111 Web UI - PC - Free
DreamBooth Got Buffed - 22 January Update - Much Better Success Train Stable Diffusion Models Web UI

5.) Automatic1111 Web UI - PC - Free
How to Inject Your Trained Subject e.g. Your Face Into Any Custom Stable Diffusion Model By Web UI

6.) Automatic1111 Web UI - PC - Free
How To Do Stable Diffusion LORA Training By Using Web UI On Different Models - Tested SD 1.5, SD 2.1

7.) Automatic1111 Web UI - PC - Free
8 GB LoRA Training - Fix CUDA & xformers For DreamBooth and Textual Inversion in Automatic1111 SD UI

8.) Automatic1111 Web UI - PC - Free
How To Do Stable Diffusion Textual Inversion (TI) / Text Embeddings By Automatic1111 Web UI Tutorial

9.) Automatic1111 Web UI - PC - Free
How To Generate Stunning Epic Text By Stable Diffusion AI - No Photoshop - For Free - Depth-To-Image

10.) Python Code - Hugging Face Diffusers Script - PC - Free
How to Run and Convert Stable Diffusion Diffusers (.bin Weights) & Dreambooth Models to CKPT File

11.) NMKD Stable Diffusion GUI - Open Source - PC - Free
Forget Photoshop - How To Transform Images With Text Prompts using InstructPix2Pix Model in NMKD GUI

12.) Google Colab Free - Cloud - No PC Is Required
Transform Your Selfie into a Stunning AI Avatar with Stable Diffusion - Better than Lensa for Free

13.) Google Colab Free - Cloud - No PC Is Required
Stable Diffusion Google Colab, Continue, Directory, Transfer, Clone, Custom Models, CKPT SafeTensors

14.) Automatic1111 Web UI - PC - Free
Become A Stable Diffusion Prompt Master By Using DAAM - Attention Heatmap For Each Used Token - Word

15.) Python Script - Gradio Based - ControlNet - PC - Free
Transform Your Sketches into Masterpieces with Stable Diffusion ControlNet AI - How To Use Tutorial

16.) Automatic1111 Web UI - PC - Free
Sketches into Epic Art with 1 Click: A Guide to Stable Diffusion ControlNet in Automatic1111 Web UI

17.) RunPod - Automatic1111 Web UI - Cloud - Paid - No PC Is Required
Ultimate RunPod Tutorial For Stable Diffusion - Automatic1111 - Data Transfers, Extensions, CivitAI

18.) Automatic1111 Web UI - PC - Free
Fantastic New ControlNet OpenPose Editor Extension & Image Mixing - Stable Diffusion Web UI Tutorial

19.) Automatic1111 Web UI - PC - Free
Automatic1111 Stable Diffusion DreamBooth Guide: Optimal Classification Images Count Comparison Test

20.) Automatic1111 Web UI - PC - Free
Epic Web UI DreamBooth Update - New Best Settings - 10 Stable Diffusion Training Compared on RunPods

21.) Automatic1111 Web UI - PC - Free
New Style Transfer Extension, ControlNet of Automatic1111 Stable Diffusion T2I-Adapter Color Control

22.) Automatic1111 Web UI - RunPod - Paid
How To Install New DreamBooth Extension On RunPod - Automatic1111 Web UI - Stable Diffusion

23.) Automatic1111 Web UI - PC - Free
Generate Text Arts & Fantastic Logos By Using ControlNet Stable Diffusion Web UI For Free Tutorial

24.) Automatic1111 Web UI - PC - Free
How To Install New DREAMBOOTH & Torch 2 On Automatic1111 Web UI PC For Epic Performance Gains Guide
(To downgrade to an older version if you don't like Torch 2: first delete venv, let it reinstall, then activate venv and run pip install -r "path_of_SD_Extension\requirements.txt")

25.) Automatic1111 Web UI - PC - Free
Training Midjourney Level Style And Yourself Into The SD 1.5 Model via DreamBooth Stable Diffusion

Apple M1 8GB Speed comparison using Python vs using DiffusionBee

I am not sure if I have installed the Python version correctly, but using DiffusionBee I believe I am getting 3-4 minutes per image at default settings.

With the Python version, it's 10 minutes per image.

I was wondering whether TensorFlow GPU support is properly installed on my M1 iMac with 8 GB memory. How do I check?
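A quick check using the standard TensorFlow API (nothing repo-specific): if tensorflow-metal is installed correctly, the M1 GPU shows up as a physical device.

import tensorflow as tf

# Should print a non-empty list when tensorflow-metal is active on Apple silicon.
print(tf.config.list_physical_devices("GPU"))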

Load model without the image size

The current version of TF Stable Diffusion needs the image width and height to load the model.

def get_models(img_height, img_width, download_weights=True):

Indeed, the graph is rebuilt whenever an image with a different size comes in.

Is it possible to load the model without specifying the width and height? It would save a lot of time...

Question about model converting & license

What is the license for the implementation?

Also, can you release the script you used to convert the model? I wish to port finetuned models to this codebase. Thanks!

Negative prompts

As implemented, is it currently possible to use negative prompts?

Why is the prompt limited to 77?

In a pipeline I replaced the PyTorch version with this implementation, but found the maximum prompt length is limited to 77. Is this a compromise for some reason?
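(A hedged note, not maintainer-confirmed: presumably the limit is inherited from the CLIP text encoder, whose positional embedding has exactly 77 slots in the original Stable Diffusion weights, so it is a property of the model rather than a choice of this port.)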

Failed while running with GPU

I ran using just the CPU; to improve performance, I wish to run using the GPU, but received the following error:

Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.

The complete message is:

File "/home/claudino/Projetos/OpenSource/stable-diffusion-tensorflow/stable_diffusion_tf/stable_diffusion.py", line 270, in get_models
    diffusion_model.load_weights(diffusion_model_weights_fpath)
  File "/home/claudino/miniconda3/envs/stable-diffusion/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/claudino/miniconda3/envs/stable-diffusion/lib/python3.10/site-packages/keras/backend.py", line 4302, in batch_set_value
    x.assign(np.asarray(value, dtype=dtype_numpy(x)))
tensorflow.python.framework.errors_impl.InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.

I can't determine the cause.

My environment:

(stable-diffusion) $> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:        22.04
Codename:       jammy

(stable-diffusion) $> nvidia-smi 
Tue Jan 17 17:31:00 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.13    Driver Version: 525.60.13    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   56C    P8     7W /  N/A |    208MiB /  6144MiB |     35%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1572      G   /usr/lib/xorg/Xorg                109MiB |
|    0   N/A  N/A      3293    C+G   ...014073573827879945,131072       96MiB |
+-----------------------------------------------------------------------------+

Cuda 11.2, tensorflow 2.10.0, cudnn 8.1.0
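A common workaround sketch for this class of error; it is usually an out-of-memory during the weight copy on small-VRAM cards. Standard TF API, but not maintainer-confirmed as the fix here:

import tensorflow as tf

# Allocate GPU memory on demand instead of all up front; run this
# before constructing the model.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)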

img2img question, guidance scale and input strength

In the DiffusionBee UI, the guidance scale ranges from 0 to 20 and the input strength from 10 to 90.
What do those map to in the following? I want to reproduce the output I get in DiffusionBee.
def generate(
    self,
    prompt,
    negative_prompt=None,
    batch_size=1,
    num_steps=25,
    unconditional_guidance_scale=7.5,
    temperature=1,
    seed=None,
    input_image=None,
    input_mask=None,
    input_image_strength=0.5,
):
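A plausible mapping, not confirmed against DiffusionBee's source: the guidance slider likely passes straight through to unconditional_guidance_scale, and an input strength of 10-90 likely scales down to input_image_strength of 0.1-0.9, e.g.:

img = generator.generate(
    "a painting of a lake",               # any prompt
    unconditional_guidance_scale=7.5,     # DiffusionBee guidance 7.5
    input_image="init.png",               # hypothetical init image path
    input_image_strength=0.5,             # DiffusionBee input strength 50
)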

[Inquiry] VAEEncoder bug in KerasCV

Hello @divamgupta! As you know, we ported the image encoder in https://github.com/keras-team/keras-cv from your library. We've supported inpainting via a method on StableDiffusion, and we're seeing a very strange issue: keras-team/keras-cv#1172 is the GitHub issue.

We've tried just passing images through the encoder/decoder and this padding shows up.

My question for you is: did you have this bug in your repo? Did you submit a fix at some point? Anything that you may know that may be helpful?

Thanks in advance for any guidance! We appreciate your efforts a ton

First generated image is different from the following ones using same settings

For some reason, the first generated image is different from the following ones using the code below:

from tensorflow import keras
from stable_diffusion_tf.stable_diffusion import Text2Image
from PIL import Image

# Prompt and seed copied from
# https://lexica.art/?prompt=715596cf-84bd-497f-8413-6e9bb8f39c5e
prompt = "cat seahorse fursona, autistic bisexual graphic designer, attractive fluffy humanoid character design, sharp focus, weirdcore voidpunk digital art by artgerm, akihiko yoshida, louis wain, simon stalenhag, wlop, noah bradley, furaffinity, artstation hd, trending on deviantart"

generator = Text2Image(img_height=512, img_width=512, jit_compile=False)
for num in range(3):
    img = generator.generate(
        prompt,
        num_steps=25,
        unconditional_guidance_scale=7.0,
        temperature=1,
        batch_size=1,
        seed=4030098432,
    )
    Image.fromarray(img[0]).save(f"output{num+1}.png")

Here are the images:

First generated image: [output1.png]

Second generated image: [output2.png]

Third generated image: [output3.png]

I get the same results even if I create a new generator for each image in the for-loop.
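A hedged debugging sketch (standard TF API): if the divergence comes from hidden op-level RNG state rather than the seed argument, resetting the global seed before each call should make all three images identical.

import tensorflow as tf

for num in range(3):
    tf.random.set_seed(4030098432)  # reset global RNG state each iteration
    img = generator.generate(prompt, num_steps=25, seed=4030098432)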

Random initialization and diffuse

Hey, thanks for putting this together =).

The code is actually substantially faster than the PyTorch counterpart on an M1 Pro, even faster than the CoreML version.
However, I am getting the same results every time; it appears that the random noise is effectively "deterministic" in the text2image function.
Here I call the diffusion twice:
https://github.com/tcapelle/stable-diffusion-tensorflow/blob/master/02_inference.ipynb
Can you explain a little bit why all these hardcoded alphas are necessary?
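For what it's worth, the hardcoded alphas appear to be the precomputed cumulative products of Stable Diffusion's fixed noise schedule, which must match the trained weights. A sketch of how such a table is derived; the "scaled linear" constants are the original Stable Diffusion config values, assumed rather than read out of this repo:

import numpy as np

# beta_start = 0.00085, beta_end = 0.012, 1000 steps ("scaled linear").
betas = np.linspace(0.00085 ** 0.5, 0.012 ** 0.5, 1000) ** 2
alphas_cumprod = np.cumprod(1.0 - betas)  # the _ALPHAS_CUMPROD table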

TF Lite convert error

Hi,
Leaving this here in case someone is also trying to convert to a TF lite model.

From the keras_cv documentation:

!pip install --upgrade keras-cv
!pip install --upgrade tensorflow

Load the model:

import time
import keras_cv
import tensorflow as tf  # needed below for tf.lite
from tensorflow import keras
import matplotlib.pyplot as plt


model = keras_cv.models.StableDiffusion(img_width=512, img_height=512)
# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model.diffusion_model)
tflite_model = converter.convert()

It seems a similar error occurs when trying to save the model.

model.diffusion_model.save(save_dir)

The error in conversion:

/usr/local/lib/python3.7/dist-packages/keras_cv/models/generative/stable_diffusion/__internal__/layers/group_normalization.py in _create_broadcast_shape(self, input_shape)
     85 
     86     def _create_broadcast_shape(self, input_shape):
---> 87         broadcast_shape = [1] * len(input_shape)
     88         broadcast_shape[self.axis] = input_shape[self.axis] // self.groups

TypeError: Exception encountered when calling layer 'group_normalization_60' (type GroupNormalization).

len is not well defined for a symbolic Tensor (Shape:0). Please call `x.shape` rather than `len(x)` for shape information.

Call arguments received by layer 'group_normalization_60' (type GroupNormalization):
  • args=('tf.Tensor(shape=(None, 64, 64, 320), dtype=float32)',)
  • kwargs=<class 'inspect._empty'>
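One avenue worth trying, sketched with the standard TFLite API; whether it gets past the GroupNormalization len() issue is unverified. Converting from a concrete function with fully specified input shapes means no symbolic shape ever reaches len():

import tensorflow as tf

fn = tf.function(lambda inputs: model.diffusion_model(inputs))
concrete_fn = fn.get_concrete_function([
    tf.TensorSpec([1, 64, 64, 4], tf.float32),   # latent
    tf.TensorSpec([1, 320], tf.float32),         # timestep embedding
    tf.TensorSpec([1, 77, 768], tf.float32),     # text context
])
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn])
tflite_model = converter.convert()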

Confusion about the new param download_weights

In stable_diffusion.py,

if download_weights:
        text_encoder_weights_fpath = keras.utils.get_file(
            origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/text_encoder.h5",
            file_hash="d7805118aeb156fc1d39e38a9a082b05501e2af8c8fbdc1753c9cb85212d6619",
        )
        diffusion_model_weights_fpath = keras.utils.get_file(
            origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/diffusion_model.h5",
            file_hash="a5b2eea58365b18b40caee689a2e5d00f4c31dbcb4e1d58a9cf1071f55bbbd3a",
        )
        decoder_weights_fpath = keras.utils.get_file(
            origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/decoder.h5",
            file_hash="6d3c5ba91d5cc2b134da881aaa157b2d2adc648e5625560e3ed199561d0e39d5",
        )

        text_encoder.load_weights(text_encoder_weights_fpath)
        diffusion_model.load_weights(diffusion_model_weights_fpath)
        decoder.load_weights(decoder_weights_fpath)

It seems that setting it to False skips the get_file(...) calls and also load_weights(...). So this option leaves the weights randomly initialized, and you would only use it if you are training from absolute scratch?
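Or, presumably, so you can load weights yourself, e.g. a finetuned checkpoint; a guess at the intent using the attributes the class exposes (the .h5 filename is hypothetical):

generator = StableDiffusion(download_weights=False)
generator.diffusion_model.load_weights("my_finetuned_diffusion_model.h5")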

How to use Inpainting function

I see some code has already been written in stable_diffusion.py, but using the input_mask param directly causes a runtime error.

How to run on CUDA?

I can't seem to figure out how to move the model to my GPU. I've tried copying the Colab notebooks, to no avail. Opening Task Manager shows the code is running on my CPU.
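A quick sanity check with the standard TF API: TensorFlow places ops on the GPU automatically when it can see one, so the usual culprits are a CPU-only TF build or missing CUDA/cuDNN libraries.

import tensorflow as tf

print(tf.test.is_built_with_cuda())            # False -> CPU-only build
print(tf.config.list_physical_devices("GPU"))  # [] -> CUDA/cuDNN not found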

Default sampler

What sampler does this implementation use by default? Is it K_LMS?

segmentation fault python text2image.py

Trying to run locally on a Mac Pro (Late 2013) with an AMD FirePro D500 3 GB, using tensorflow-metal with TensorFlow 2.10.

$python text2image.py --prompt="An astronaut riding a horse"

segmentation fault python text2image.py

Script to convert PyTorch weights

Thanks and congrats for this awesome piece of work.

Could you add to the repo whatever script was used to convert the original SD PyTorch checkpoint to the TF weights used by this repo?

That would allow people to load fine-tuned pytorch models into tensorflow for inference purposes.
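For anyone attempting it in the meantime, the core of such a script is usually a key-by-key copy with layout transposes; a hedged sketch (hypothetical filenames, not the author's actual script):

import numpy as np
import torch

state_dict = torch.load("sd-v1-4.ckpt", map_location="cpu")["state_dict"]

def to_keras_conv(w):
    # PyTorch conv kernels are (out, in, kh, kw); Keras wants (kh, kw, in, out).
    return np.transpose(w.numpy(), (2, 3, 1, 0))

# The remaining work is mapping CompVis module names onto this repo's
# layer order and calling keras_model.set_weights(...) per sub-model.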

README mentions Inpainting

Besides the README showing an image for inpainting, there does not seem to be any more information on this feature.
Am I missing something?

Feature Request: Legacy & New Encoder

Stable Diffusion 2.0 uses a new text encoder, so the PyTorch weight mapping for that model and any future models won't work any more. It's beyond my expertise, but can we add to clip_encoder.py the ability to create the text encoder model for the new SD 2.0?

Creating a choice between the legacy and new encoder is a simple bool that can be passed, but I have no clue how to create the new text encoder.

tensorflow.js

Anyone tried these models (after exporting them) with tensorflow.js?

Is the VAE used in fine-tuning available anywhere?

Hey @divamgupta! I'm trying to work on a fine-tuning workflow for your model and was wondering if the VAE used in the original training is available anywhere in TensorFlow that you are aware of.

I've poked around the repo and did not see any obvious home for it.

If not, no problem - but figured I would ask here first before finding one on my own!
