
stable-diffusion-tensorflow's Issues

Running on Multiple GPUs

To run on several GPUs, is there a setting within stable_diffusion_tf, or do I configure it in TensorFlow/Keras and then run the stable_diffusion_tf code as usual?
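For reference, a minimal sketch of the generic Keras route, assuming stable_diffusion_tf itself has no multi-GPU setting (the class and constructor follow the README; whether inference actually distributes this way is untested):

import tensorflow as tf
from stable_diffusion_tf.stable_diffusion import StableDiffusion

# Build the model inside a MirroredStrategy scope so its variables are
# replicated across all visible GPUs.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    generator = StableDiffusion(img_height=512, img_width=512)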

InvalidArgumentError when using GPU Colab + Mixed Precision with input_image

Hello, first and foremost, thank you for this fantastic repository and the provided colab demonstrations!
When I tried the GPU Colab + Mixed Precision notebook and passed an input image argument to the generator, I received the following error:
InvalidArgumentError Traceback (most recent call last)

in <module>
7 temperature=1,
8 batch_size=1,
----> 9 input_image="/content/gen.png"
10 )

4 frames

/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/stable_diffusion.py in generate(self, prompt, batch_size, num_steps, unconditional_guidance_scale, temperature, seed, input_image, input_image_strength)
74 input_img_noise_t = timesteps[ int(len(timesteps)*input_image_strength) ]
75 latent, alphas, alphas_prev = self.get_starting_parameters(
---> 76 timesteps, batch_size, seed , input_image=input_image, input_img_noise_t=input_img_noise_t
77 )
78

/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/stable_diffusion.py in get_starting_parameters(self, timesteps, batch_size, seed, input_image, input_img_noise_t)
159 else:
160 latent = self.encoder(input_image[None])
--> 161 latent = self.add_noise(latent, input_img_noise_t)
162 latent = tf.repeat(latent , batch_size , axis=0)
163 return latent, alphas, alphas_prev

/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/stable_diffusion.py in add_noise(self, x, t)
108 sqrt_one_minus_alpha_prod = (1 - _ALPHAS_CUMPROD[t]) ** 0.5
109
--> 110 return sqrt_alpha_prod * x + sqrt_one_minus_alpha_prod * noise
111
112 def timestep_embedding(self, timesteps, dim=320, max_period=10000):

/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/traceback_utils.py in error_handler(*args, **kwargs)
151 except Exception as e:
152 filtered_tb = _process_traceback_frames(e.__traceback__)
--> 153 raise e.with_traceback(filtered_tb) from None
154 finally:
155 del filtered_tb

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
7207 def raise_from_not_ok_status(e, name):
7208 e.message += (" name: " + name if name is not None else "")
-> 7209 raise core._status_to_exception(e) from None # pylint: disable=protected-access
7210
7211

InvalidArgumentError: cannot compute AddV2 as input #1(zero-based) was expected to be a half tensor but is a float tensor [Op:AddV2]

First, I ran the StableDiffusion generator instantiation and created the first picture, which worked as expected. Then I tried running the generator again, passing the picture made in the previous run as the input_image option, but I got the above issue.
I also adjusted the installation in the Colab demo to be from the most recent commit of this repo:
pip install git+https://github.com/divamgupta/stable-diffusion-tensorflow --upgrade --quiet

Thanks a lot in advance!
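A hedged guess at a local workaround, not maintainer-confirmed: under the mixed_float16 policy the encoder's latent comes out as float16 while the noise in add_noise is created as float32, so the AddV2 in the last traceback frame mixes dtypes. Creating the noise in the latent's dtype should satisfy it; the sketch below mirrors the method signature shown in the traceback, and _ALPHAS_CUMPROD is the module-level table from stable_diffusion.py:

import tensorflow as tf

def add_noise(self, x, t):
    # Create the noise in x's dtype so both AddV2 operands match under
    # mixed_float16 (x may be half; tf.random.normal defaults to float32).
    noise = tf.random.normal(tf.shape(x), dtype=x.dtype)
    sqrt_alpha_prod = _ALPHAS_CUMPROD[t] ** 0.5
    sqrt_one_minus_alpha_prod = (1 - _ALPHAS_CUMPROD[t]) ** 0.5
    return sqrt_alpha_prod * x + sqrt_one_minus_alpha_prod * noise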

Abnormal behavior

I followed and copied the code on your main page; however, it did not behave normally:
ValueError                                Traceback (most recent call last)
Cell In[5], line 4
      1 from stable_diffusion_tf.stable_diffusion import StableDiffusion
      2 from PIL import Image
----> 4 generator = StableDiffusion()

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/stable_diffusion.py:24, in StableDiffusion.__init__(self, img_height, img_width, jit_compile, download_weights)
     21 self.img_width = img_width
     22 self.tokenizer = SimpleTokenizer()
---> 24 text_encoder, diffusion_model, decoder, encoder = get_models(img_height, img_width, download_weights=download_weights)
     25 self.text_encoder = text_encoder
     26 self.diffusion_model = diffusion_model

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/stable_diffusion.py:238, in get_models(img_height, img_width, download_weights)
    235 latent = keras.layers.Input((n_h, n_w, 4))
    236 unet = UNetModel()
    237 diffusion_model = keras.models.Model(
--> 238     [latent, t_emb, context], unet([latent, t_emb, context])
    239 )
    241 # Create decoder
    242 latent = keras.layers.Input((n_h, n_w, 4))

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/keras/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67 filtered_tb = _process_traceback_frames(e.__traceback__)
     68 # To get the full stack trace, call:
     69 # tf.debugging.disable_traceback_filtering()
---> 70 raise e.with_traceback(filtered_tb) from None
     71 finally:
     72 del filtered_tb

File /tmp/__autograph_generated_file2lo7vt15.py:84, in outer_factory.<locals>.inner_factory.<locals>.tf__call(self, inputs)
     82 layer = ag__.Undefined('layer')
     83 b = ag__.Undefined('b')
---> 84 ag__.for_stmt(ag__.ld(self).input_blocks, None, loop_body_1, get_state_3, set_state_3, ('x',), {'iterate_names': 'b'})
     86 def get_state_4():
     87     return (x,)

File /tmp/__autograph_generated_file2lo7vt15.py:80, in outer_factory.<locals>.inner_factory.<locals>.tf__call.<locals>.loop_body_1(itr_1)
     78 layer = itr
     79 x = ag__.converted_call(ag__.ld(apply), (ag__.ld(x), ag__.ld(layer)), None, fscope)
---> 80 ag__.for_stmt(ag__.ld(b), None, loop_body, get_state_2, set_state_2, ('x',), {'iterate_names': 'layer'})
     81 ag__.converted_call(ag__.ld(saved_inputs).append, (ag__.ld(x),), None, fscope)

File /tmp/__autograph_generated_file2lo7vt15.py:79, in outer_factory.<locals>.inner_factory.<locals>.tf__call.<locals>.loop_body_1.<locals>.loop_body(itr)
     77 nonlocal x
     78 layer = itr
---> 79 x = ag__.converted_call(ag__.ld(apply), (ag__.ld(x), ag__.ld(layer)), None, fscope)

File /tmp/__autograph_generated_file2lo7vt15.py:48, in outer_factory.<locals>.inner_factory.<locals>.tf__call.<locals>.apply(x, layer)
     46 x = ag__.converted_call(ag__.ld(layer), (ag__.ld(x),), None, fscope_1)
     47 ag__.if_stmt(ag__.converted_call(ag__.ld(isinstance), (ag__.ld(layer), ag__.ld(SpatialTransformer)), None, fscope_1), if_body, else_body, get_state, set_state, ('x',), 1)
---> 48 ag__.if_stmt(ag__.converted_call(ag__.ld(isinstance), (ag__.ld(layer), ag__.ld(ResBlock)), None, fscope_1), if_body_1, else_body_1, get_state_1, set_state_1, ('x',), 1)
     49 try:
     50     do_return_1 = True

File /tmp/__autograph_generated_file2lo7vt15.py:28, in outer_factory.<locals>.inner_factory.<locals>.tf__call.<locals>.apply.<locals>.if_body_1()
     26 def if_body_1():
     27     nonlocal x
---> 28     x = ag__.converted_call(ag__.ld(layer), ([ag__.ld(x), ag__.ld(emb)],), None, fscope_1)

File /tmp/__autograph_generated_filem_kzpxnn.py:11, in outer_factory.<locals>.inner_factory.<locals>.tf__call(self, inputs)
      9 retval_ = ag__.UndefinedReturnValue()
     10 (x, emb) = ag__.ld(inputs)
---> 11 h = ag__.converted_call(ag__.ld(apply_seq), (ag__.ld(x), ag__.ld(self).in_layers), None, fscope)
     12 emb_out = ag__.converted_call(ag__.ld(apply_seq), (ag__.ld(emb), ag__.ld(self).emb_layers), None, fscope)
     13 h = ag__.ld(h) + ag__.ld(emb_out)[:, None, None]

File /tmp/__autograph_generated_file612zmgqy.py:23, in outer_factory.<locals>.inner_factory.<locals>.tf__apply_seq(x, layers)
     21 x = ag__.converted_call(ag__.ld(l), (ag__.ld(x),), None, fscope)
     22 l = ag__.Undefined('l')
---> 23 ag__.for_stmt(ag__.ld(layers), None, loop_body, get_state, set_state, ('x',), {'iterate_names': 'l'})
     24 try:
     25     do_return = True

File /tmp/__autograph_generated_file612zmgqy.py:21, in outer_factory.<locals>.inner_factory.<locals>.tf__apply_seq.<locals>.loop_body(itr)
     19 nonlocal x
     20 l = itr
---> 21 x = ag__.converted_call(ag__.ld(l), (ag__.ld(x),), None, fscope)

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py:110, in GroupNormalization.build(self, input_shape)
    108 self._check_if_input_shape_is_none(input_shape)
    109 self._set_number_of_groups_for_instance_norm(input_shape)
--> 110 self._check_size_of_dimensions(input_shape)
    111 self._create_input_spec(input_shape)
    113 self._add_gamma_weight(input_shape)

File ~/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py:227, in GroupNormalization._check_size_of_dimensions(self, input_shape)
    225 dim = input_shape[self.axis]
    226 if dim < self.groups:
--> 227     raise ValueError(
    228         "Number of groups (" + str(self.groups) + ") cannot be "
    229         "more than the number of channels (" + str(dim) + ")."
    230     )
    232 if dim % self.groups != 0:
    233     raise ValueError(
    234         "Number of groups (" + str(self.groups) + ") must be a "
    235         "multiple of the number of channels (" + str(dim) + ")."
    236     )

ValueError: Exception encountered when calling layer "u_net_model_1" (type UNetModel).

in user code:

    File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/diffusion_model.py", line 199, in apply  *
        x = layer([x, emb])
    File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler  **
        raise e.with_traceback(filtered_tb) from None
    File "/tmp/__autograph_generated_filem_kzpxnn.py", line 11, in tf__call
        h = ag__.converted_call(ag__.ld(apply_seq), (ag__.ld(x), ag__.ld(self).in_layers), None, fscope)
    File "/tmp/__autograph_generated_file612zmgqy.py", line 23, in tf__apply_seq
        ag__.for_stmt(ag__.ld(layers), None, loop_body, get_state, set_state, ('x',), {'iterate_names': 'l'})
    File "/tmp/__autograph_generated_file612zmgqy.py", line 21, in loop_body
        x = ag__.converted_call(ag__.ld(l), (ag__.ld(x),), None, fscope)
    File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py", line 110, in build
        self._check_size_of_dimensions(input_shape)
    File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py", line 227, in _check_size_of_dimensions
        raise ValueError(

    ValueError: Exception encountered when calling layer "res_block_22" (type ResBlock).

    in user code:

        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/diffusion_model.py", line 31, in call  *
            h = apply_seq(x, self.in_layers)
        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/stable_diffusion_tf/layers.py", line 41, in apply_seq  *
            x = l(x)
        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler  **
            raise e.with_traceback(filtered_tb) from None
        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py", line 110, in build
            self._check_size_of_dimensions(input_shape)
        File "/home/ec2-user/anaconda3/envs/art_intel/lib/python3.9/site-packages/tensorflow_addons/layers/normalizations.py", line 227, in _check_size_of_dimensions
            raise ValueError(

        ValueError: Number of groups (32) cannot be more than the number of channels (4).

Call arguments received by layer "res_block_22" (type ResBlock):
  • inputs=['tf.Tensor(shape=(None, 320, 125, 4), dtype=float32)', 'tf.Tensor(shape=(None, 1280), dtype=float32)']

Call arguments received by layer "u_net_model_1" (type UNetModel):
  • inputs=['tf.Tensor(shape=(None, 125, 125, 4), dtype=float32)', 'tf.Tensor(shape=(None, 320), dtype=float32)', 'tf.Tensor(shape=(None, 77, 768), dtype=float32)']
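Not an official answer, but a guess from the shapes in the traceback: the latent is img_height/8 x img_width/8 (125 x 125 here, i.e. a 1000 x 1000 input), and the U-Net halves it several more times, so heights and widths that are multiples of 64 keep every stage integral. A minimal sketch under that assumption:

from stable_diffusion_tf.stable_diffusion import StableDiffusion

# 1000 is not divisible by 64; 960 is, so the downsampling path stays aligned.
generator = StableDiffusion(img_height=960, img_width=960)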

How to train this?

I can't find code to train this model, neither the text encoder nor the image diffusion model.

Decoder for img2img

Do you know the structure of the Decoder for 512x512x3 => 64x64x4? It would be good to have it too for img2img functionality, probably for inpainting too, and for upscaling.

temperature parameter purpose

In

def get_x_prev_and_pred_x0(self, x, e_t, index, a_t, a_prev, temperature, seed):
    sigma_t = 0
    sqrt_one_minus_at = math.sqrt(1 - a_t)
    pred_x0 = (x - sqrt_one_minus_at * e_t) / math.sqrt(a_t)
    # Direction pointing to x_t
    dir_xt = math.sqrt(1.0 - a_prev - sigma_t**2) * e_t
    noise = sigma_t * tf.random.normal(x.shape, seed=seed) * temperature
    x_prev = math.sqrt(a_prev) * pred_x0 + dir_xt
    return x_prev, pred_x0
there is a parameter named temperature. Changing it does not affect the output, since it only scales a noise term that sigma_t = 0 zeroes out anyway. Are there future plans for it?

sigma_t always 0 => no noise being added?

I noticed in get_x_prev_and_pred_x0 that sigma_t is always set to 0, which means the noise computed later will always be zero at

https://github.com/divamgupta/stable-diffusion-tensorflow/blob/master/stable_diffusion_tf/stable_diffusion.py#L124

Furthermore, the computed noise is never added to x_prev, which means the temperature parameter has no effect either.

I notice in the DDIMSampler (and PLMS) that noise is usually added to x_prev, e.g.

https://github.com/CompVis/stable-diffusion/blob/main/ldm/models/diffusion/ddim.py#L203

I'm not sure if this is a bug (in which case the noise should be added) or if it's not required for inference (in which case the noise and sigma can just be removed from the code).
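For context, a sketch of the sigma schedule the CompVis DDIM sampler uses when its eta parameter is nonzero; names follow the linked ddim.py, not this repo. eta = 0 recovers the deterministic behaviour this implementation hardcodes:

import math

def ddim_sigma(a_t, a_prev, eta=0.0):
    # eta = 0 -> deterministic DDIM (sigma_t = 0, as here);
    # eta = 1 -> the stochastic, DDPM-like variant.
    return eta * math.sqrt((1 - a_prev) / (1 - a_t)) * math.sqrt(1 - a_t / a_prev)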

How to resolve dependency issues?

Sorry for the noob question.
But is there a requirements.txt somewhere?

Running setup.py is not working for me.
How do I install all the required packages on an M1 Mac?
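In case it helps, a guess at a minimal requirements.txt reconstructed from the imports this codebase uses; the versions and the tensorflow-macos/tensorflow-metal pair for M1 are assumptions, not taken from the repo:

numpy
Pillow
tqdm
ftfy
regex
tensorflow-macos    # plain `tensorflow` on Linux/Windows
tensorflow-metal    # GPU acceleration on Apple silicon
tensorflow-addons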

NSFW filter for results

Thanks for making this wonderful repo. I was using the video generation function in the given colab file with this prompt.

At least 35 dead from mysterious meningitis outbreak in Mexico

I ended up getting images that had nudity and genital areas exposed. Is it possible to add some sort of filter to remove such results?

Why bad results?

(dlenv) D:\SourceCodes\Diffusion\stable-diffusion-tensorflow>python text2image.py --prompt="Ruins of a castle in Scotland" --output="my_image.png"
0 1: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:54<00:00, 1.09s/it]
saved at my_image.png

[attached image: my_image.png]

Rectangular image generation?

Tried creating a generator where width != height, fails with this error:

ValueError                                Traceback (most recent call last)
<ipython-input-4-1026659721fc> in <module>
      5     img_height=640,
      6     img_width=340,
----> 7     jit_compile=False,  # You can try True as well (different performance profile)
      8 )

4 frames
/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/diffusion_model.py in loop_body_4(itr_4)
    107                     nonlocal x
    108                     b = itr_4
--> 109                     x = ag__.converted_call(ag__.ld(tf).concat, ([ag__.ld(x), ag__.converted_call(ag__.ld(saved_inputs).pop, (), None, fscope)],), dict(axis=(- 1)), fscope)
    110 
    111                     def get_state_5():

ValueError: Exception encountered when calling layer "u_net_model_1" (type UNetModel).

in user code:

    File "/usr/local/lib/python3.7/dist-packages/stable_diffusion_tf/diffusion_model.py", line 216, in call  *
        x = tf.concat([x, saved_inputs.pop()], axis=-1)

    ValueError: Dimension 2 in both shapes must be equal, but are 12 and 11. Shapes are [?,20,12] and [?,20,11]. for '{{node u_net_model_1/concat_3}} = ConcatV2[N=2, T=DT_HALF, Tidx=DT_INT32](u_net_model_1/upsample_2/padded_conv2d_129/conv2d_129/BiasAdd, u_net_model_1/spatial_transformer_18/add, u_net_model_1/concat_3/axis)' with input shapes: [?,20,12,1280], [?,20,11,1280], [] and with computed input tensors: input[2] = <-1>.


Call arguments received by layer "u_net_model_1" (type UNetModel):
  • inputs=['tf.Tensor(shape=(None, 80, 42, 4), dtype=float16)', 'tf.Tensor(shape=(None, 320), dtype=float16)', 'tf.Tensor(shape=(None, 77, 768), dtype=float16)']

Is it possible to generate rectangular images using this implementation?
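A hedged guess from the traceback rather than a confirmed answer: the concat fails on a 12-vs-11 width because 340 does not survive the repeated halving cleanly, so rectangular dimensions that are multiples of 64 should be safe:

from stable_diffusion_tf.stable_diffusion import StableDiffusion

# 640 and 320 are both multiples of 64, so every halving stays integral.
generator = StableDiffusion(img_height=640, img_width=320, jit_compile=False)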

Finetune the model on a custom dataset

I am wondering if it is possible to finetune the model on my own dataset? I know keras-cv has released something for finetuning Stable Diffusion, but I encountered installation issues (most probably because it requires TF 2.11, which is not supported on native Windows). I can run the model on TF 2.10. It would be great if we could train the model as well. Many thanks.

25+ Stable Diffusion Tutorials And Guides - Very Useful For Stable Diffusion Users

Hello dear Divam Gupta, I hope you let this thread stay to help newcomers. This is not an issue thread. Thank you.

Expert-Level Tutorials on Stable Diffusion: Master Advanced Techniques and Strategies

Greetings everyone. I am Dr. Furkan Gözükara, an Assistant Professor in the Software Engineering department of a private university (PhD in Computer Engineering). My professional programming skill is unfortunately C#, not Python :)

My LinkedIn: https://www.linkedin.com/in/furkangozukara/

Our channel address, if you'd like to subscribe: https://www.youtube.com/@SECourses

Our Discord, to get more help: https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

I am keeping this list up to date. I have ideas for new videos and am trying to find time to make them.

I am open to any criticism you have. I am constantly trying to improve the quality of my tutorial guide videos. Please leave comments with both your suggestions and what you would like to see in future videos.

All videos have manually fixed subtitles and properly prepared video chapters. You can watch with these perfect subtitles or look for the chapters you are interested in.

Since my profession is teaching, I usually do not skip any of the important parts. Therefore, you may find my videos a little bit longer.

Playlist link on YouTube: Stable Diffusion Tutorials, Automatic1111 and Google Colab Guides, DreamBooth, Textual Inversion / Embedding, LoRA, AI Upscaling, Pix2Pix, Img2Img

1.) Automatic1111 Web UI - PC - Free
Easiest Way to Install & Run Stable Diffusion Web UI on PC by Using Open Source Automatic Installer

2.) Automatic1111 Web UI - PC - Free
How to use Stable Diffusion V2.1 and Different Models in the Web UI - SD 1.5 vs 2.1 vs Anything V3

3.) Automatic1111 Web UI - PC - Free
Zero To Hero Stable Diffusion DreamBooth Tutorial By Using Automatic1111 Web UI - Ultra Detailed

4.) Automatic1111 Web UI - PC - Free
DreamBooth Got Buffed - 22 January Update - Much Better Success Train Stable Diffusion Models Web UI

5.) Automatic1111 Web UI - PC - Free
How to Inject Your Trained Subject e.g. Your Face Into Any Custom Stable Diffusion Model By Web UI

6.) Automatic1111 Web UI - PC - Free
How To Do Stable Diffusion LORA Training By Using Web UI On Different Models - Tested SD 1.5, SD 2.1

7.) Automatic1111 Web UI - PC - Free
8 GB LoRA Training - Fix CUDA & xformers For DreamBooth and Textual Inversion in Automatic1111 SD UI

8.) Automatic1111 Web UI - PC - Free
How To Do Stable Diffusion Textual Inversion (TI) / Text Embeddings By Automatic1111 Web UI Tutorial

9.) Automatic1111 Web UI - PC - Free
How To Generate Stunning Epic Text By Stable Diffusion AI - No Photoshop - For Free - Depth-To-Image

10.) Python Code - Hugging Face Diffusers Script - PC - Free
How to Run and Convert Stable Diffusion Diffusers (.bin Weights) & Dreambooth Models to CKPT File

11.) NMKD Stable Diffusion GUI - Open Source - PC - Free
Forget Photoshop - How To Transform Images With Text Prompts using InstructPix2Pix Model in NMKD GUI

12.) Google Colab Free - Cloud - No PC Is Required
Transform Your Selfie into a Stunning AI Avatar with Stable Diffusion - Better than Lensa for Free

13.) Google Colab Free - Cloud - No PC Is Required
Stable Diffusion Google Colab, Continue, Directory, Transfer, Clone, Custom Models, CKPT SafeTensors

14.) Automatic1111 Web UI - PC - Free
Become A Stable Diffusion Prompt Master By Using DAAM - Attention Heatmap For Each Used Token - Word

15.) Python Script - Gradio Based - ControlNet - PC - Free
Transform Your Sketches into Masterpieces with Stable Diffusion ControlNet AI - How To Use Tutorial

16.) Automatic1111 Web UI - PC - Free
Sketches into Epic Art with 1 Click: A Guide to Stable Diffusion ControlNet in Automatic1111 Web UI

17.) RunPod - Automatic1111 Web UI - Cloud - Paid - No PC Is Required
Ultimate RunPod Tutorial For Stable Diffusion - Automatic1111 - Data Transfers, Extensions, CivitAI

18.) Automatic1111 Web UI - PC - Free
Fantastic New ControlNet OpenPose Editor Extension & Image Mixing - Stable Diffusion Web UI Tutorial

19.) Automatic1111 Web UI - PC - Free
Automatic1111 Stable Diffusion DreamBooth Guide: Optimal Classification Images Count Comparison Test

20.) Automatic1111 Web UI - PC - Free
Epic Web UI DreamBooth Update - New Best Settings - 10 Stable Diffusion Training Compared on RunPods

21.) Automatic1111 Web UI - PC - Free
New Style Transfer Extension, ControlNet of Automatic1111 Stable Diffusion T2I-Adapter Color Control

22.) Automatic1111 Web UI - RunPod - Paid
How To Install New DreamBooth Extension On RunPod - Automatic1111 Web UI - Stable Diffusion

23.) Automatic1111 Web UI - PC - Free
Generate Text Arts & Fantastic Logos By Using ControlNet Stable Diffusion Web UI For Free Tutorial

24.) Automatic1111 Web UI - PC - Free
How To Install New DREAMBOOTH & Torch 2 On Automatic1111 Web UI PC For Epic Performance Gains Guide
(To downgrade to an older version if you don't like Torch 2: first delete venv, let it reinstall, then activate venv and run pip install -r "path_of_SD_Extension\requirements.txt")

25.) Automatic1111 Web UI - PC - Free
Training Midjourney Level Style And Yourself Into The SD 1.5 Model via DreamBooth Stable Diffusion

Apple M1 8GB Speed comparison using Python vs using DiffusionBee

I am not sure if I have installed the Python version correctly, but using DiffusionBee I believe I am getting 3-4 minutes per image at default settings.

With the Python version, it's 10 minutes per image.

I was wondering whether TensorFlow GPU support is properly installed on my M1 iMac with 8 GB memory. How do I check?
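A quick check using the standard TensorFlow API (nothing repo-specific): if tensorflow-metal is installed correctly, the M1 GPU shows up as a physical device.

import tensorflow as tf

# Should print a non-empty list when tensorflow-metal is active on Apple silicon.
print(tf.config.list_physical_devices("GPU"))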

Load model without the image size

The current version of TF Stable Diffusion needs the image width and height to load the model.

def get_models(img_height, img_width, download_weights=True):

Indeed, the graph is rebuilt whenever an image with a different size comes in.

Is it possible to load the model without specifying the width and height? It would save a lot of time...

Question about model converting & license

What is the license for the implementation?

Also, can you release the script you used to convert the model? I wish to port finetuned models to this codebase. Thanks!

Negative prompts

As implemented, is it currently possible to use negative prompts?

Why is the prompt limited to 77?

In a pipeline I replaced the PyTorch version with this implementation, but found the maximum prompt length is limited to 77. Is this a compromise for some reason?
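(A hedged note, not maintainer-confirmed: presumably the limit is inherited from the CLIP text encoder, whose positional embedding has exactly 77 slots in the original Stable Diffusion weights, so it is a property of the model rather than a choice of this port.)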

Failed while running with GPU

I ran using just the CPU; to improve performance, I wish to run using the GPU, but received the following error:

Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.

The complete message is:

File "/home/claudino/Projetos/OpenSource/stable-diffusion-tensorflow/stable_diffusion_tf/stable_diffusion.py", line 270, in get_models
    diffusion_model.load_weights(diffusion_model_weights_fpath)
  File "/home/claudino/miniconda3/envs/stable-diffusion/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/claudino/miniconda3/envs/stable-diffusion/lib/python3.10/site-packages/keras/backend.py", line 4302, in batch_set_value
    x.assign(np.asarray(value, dtype=dtype_numpy(x)))
tensorflow.python.framework.errors_impl.InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.

I can't determine the cause.

My environment:

(stable-diffusion) $> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:        22.04
Codename:       jammy

(stable-diffusion) $> nvidia-smi 
Tue Jan 17 17:31:00 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.13    Driver Version: 525.60.13    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   56C    P8     7W /  N/A |    208MiB /  6144MiB |     35%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1572      G   /usr/lib/xorg/Xorg                109MiB |
|    0   N/A  N/A      3293    C+G   ...014073573827879945,131072       96MiB |
+-----------------------------------------------------------------------------+

Cuda 11.2, tensorflow 2.10.0, cudnn 8.1.0
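A common workaround sketch for this class of error; it is usually an out-of-memory during the weight copy on small-VRAM cards. Standard TF API, but not maintainer-confirmed as the fix here:

import tensorflow as tf

# Allocate GPU memory on demand instead of all up front; run this
# before constructing the model.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)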

img2img question, guidance scale and input strength

In the DiffusionBee UI, the guidance scale ranges from 0 to 20 and the input strength from 10 to 90.
What do those map to in the following? I want to reproduce the output I get in DiffusionBee.
def generate(
    self,
    prompt,
    negative_prompt=None,
    batch_size=1,
    num_steps=25,
    unconditional_guidance_scale=7.5,
    temperature=1,
    seed=None,
    input_image=None,
    input_mask=None,
    input_image_strength=0.5,
):
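A plausible mapping, not confirmed against DiffusionBee's source: the guidance slider likely passes straight through to unconditional_guidance_scale, and an input strength of 10-90 likely scales down to input_image_strength of 0.1-0.9, e.g.:

img = generator.generate(
    "a painting of a lake",               # any prompt
    unconditional_guidance_scale=7.5,     # DiffusionBee guidance 7.5
    input_image="init.png",               # hypothetical init image path
    input_image_strength=0.5,             # DiffusionBee input strength 50
)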

[Inquiry] VAEEncoder bug in KerasCV

Hello @divamgupta! As you know, we ported the image encoder in https://github.com/keras-team/keras-cv from your library. We've supported inpainting via a method on StableDiffusion, and we're seeing a very strange issue: keras-team/keras-cv#1172 is the GitHub issue.

We've tried just passing images through the encoder/decoder and this padding shows up.

My question for you is: did you have this bug in your repo? Did you submit a fix at some point? Anything that you may know that may be helpful?

Thanks in advance for any guidance! We appreciate your efforts a ton

First generated image is different from the following ones using same settings

For some reason, the first generated image is different from the following ones using the code below:

from tensorflow import keras
from stable_diffusion_tf.stable_diffusion import Text2Image
from PIL import Image

# Prompt and seed copied from
# https://lexica.art/?prompt=715596cf-84bd-497f-8413-6e9bb8f39c5e
prompt = "cat seahorse fursona, autistic bisexual graphic designer, attractive fluffy humanoid character design, sharp focus, weirdcore voidpunk digital art by artgerm, akihiko yoshida, louis wain, simon stalenhag, wlop, noah bradley, furaffinity, artstation hd, trending on deviantart"

generator = Text2Image(img_height=512, img_width=512, jit_compile=False)
for num in range(3):
    img = generator.generate(
        prompt,
        num_steps=25,
        unconditional_guidance_scale=7.0,
        temperature=1,
        batch_size=1,
        seed=4030098432,
    )
    Image.fromarray(img[0]).save(f"output{num+1}.png")

Here are the images:

First generated image: [output1.png]

Second generated image: [output2.png]

Third generated image: [output3.png]

I get the same results even if I create a new generator for each image in the for-loop.
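A hedged debugging sketch (standard TF API): if the divergence comes from hidden op-level RNG state rather than the seed argument, resetting the global seed before each call should make all three images identical.

import tensorflow as tf

for num in range(3):
    tf.random.set_seed(4030098432)  # reset global RNG state each iteration
    img = generator.generate(prompt, num_steps=25, seed=4030098432)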

Random initialization and diffuse

Hey, thanks for putting this together =).

The code is actually substantially faster than the PyTorch counterpart on an M1 Pro, even faster than the CoreML version.
However, I am getting the same results every time; it appears that the random noise is effectively "deterministic" in the text2image function.
Here I call the diffusion twice:
https://github.com/tcapelle/stable-diffusion-tensorflow/blob/master/02_inference.ipynb
Can you explain a little bit why all these hardcoded alphas are necessary?
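For what it's worth, the hardcoded alphas appear to be the precomputed cumulative products of Stable Diffusion's fixed noise schedule, which must match the trained weights. A sketch of how such a table is derived; the "scaled linear" constants are the original Stable Diffusion config values, assumed rather than read out of this repo:

import numpy as np

# beta_start = 0.00085, beta_end = 0.012, 1000 steps ("scaled linear").
betas = np.linspace(0.00085 ** 0.5, 0.012 ** 0.5, 1000) ** 2
alphas_cumprod = np.cumprod(1.0 - betas)  # the _ALPHAS_CUMPROD table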

TF Lite convert error

Hi,
Leaving this here in case someone is also trying to convert to a TF lite model.

From the keras_cv documentation:

!pip install --upgrade keras-cv
!pip install --upgrade tensorflow

Load the model:

import time
import keras_cv
import tensorflow as tf  # needed below for tf.lite
from tensorflow import keras
import matplotlib.pyplot as plt


model = keras_cv.models.StableDiffusion(img_width=512, img_height=512)
# Convert the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model.diffusion_model)
tflite_model = converter.convert()

It seems a similar error occurs when trying to save the model.

model.diffusion_model.save(save_dir)

The error in conversion:

/usr/local/lib/python3.7/dist-packages/keras_cv/models/generative/stable_diffusion/__internal__/layers/group_normalization.py in _create_broadcast_shape(self, input_shape)
     85 
     86     def _create_broadcast_shape(self, input_shape):
---> 87         broadcast_shape = [1] * len(input_shape)
     88         broadcast_shape[self.axis] = input_shape[self.axis] // self.groups

TypeError: Exception encountered when calling layer 'group_normalization_60' (type GroupNormalization).

len is not well defined for a symbolic Tensor (Shape:0). Please call `x.shape` rather than `len(x)` for shape information.

Call arguments received by layer 'group_normalization_60' (type GroupNormalization):
  • args=('tf.Tensor(shape=(None, 64, 64, 320), dtype=float32)',)
  • kwargs=<class 'inspect._empty'>
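One avenue worth trying, sketched with the standard TFLite API; whether it gets past the GroupNormalization len() issue is unverified. Converting from a concrete function with fully specified input shapes means no symbolic shape ever reaches len():

import tensorflow as tf

fn = tf.function(lambda inputs: model.diffusion_model(inputs))
concrete_fn = fn.get_concrete_function([
    tf.TensorSpec([1, 64, 64, 4], tf.float32),   # latent
    tf.TensorSpec([1, 320], tf.float32),         # timestep embedding
    tf.TensorSpec([1, 77, 768], tf.float32),     # text context
])
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn])
tflite_model = converter.convert()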

Confusion about the new param download_weights

In stable_diffusion.py,

if download_weights:
        text_encoder_weights_fpath = keras.utils.get_file(
            origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/text_encoder.h5",
            file_hash="d7805118aeb156fc1d39e38a9a082b05501e2af8c8fbdc1753c9cb85212d6619",
        )
        diffusion_model_weights_fpath = keras.utils.get_file(
            origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/diffusion_model.h5",
            file_hash="a5b2eea58365b18b40caee689a2e5d00f4c31dbcb4e1d58a9cf1071f55bbbd3a",
        )
        decoder_weights_fpath = keras.utils.get_file(
            origin="https://huggingface.co/fchollet/stable-diffusion/resolve/main/decoder.h5",
            file_hash="6d3c5ba91d5cc2b134da881aaa157b2d2adc648e5625560e3ed199561d0e39d5",
        )

        text_encoder.load_weights(text_encoder_weights_fpath)
        diffusion_model.load_weights(diffusion_model_weights_fpath)
        decoder.load_weights(decoder_weights_fpath)

It seems that setting it to False skips the get_file(...) calls and also load_weights(...). So this option leaves the weights randomly initialized, and you would only use it if you are training from absolute scratch?
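Or, presumably, so you can load weights yourself, e.g. a finetuned checkpoint; a guess at the intent using the attributes the class exposes (the .h5 filename is hypothetical):

generator = StableDiffusion(download_weights=False)
generator.diffusion_model.load_weights("my_finetuned_diffusion_model.h5")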

How to use Inpainting function

I see some code has already been written in stable_diffusion.py, but using the input_mask param directly causes a runtime error.

How to run on CUDA?

I can't seem to figure out how to move the model to my GPU. I've tried copying the Colab notebooks, to no avail. Opening Task Manager shows the code is running on my CPU.
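A quick sanity check with the standard TF API: TensorFlow places ops on the GPU automatically when it can see one, so the usual culprits are a CPU-only TF build or missing CUDA/cuDNN libraries.

import tensorflow as tf

print(tf.test.is_built_with_cuda())            # False -> CPU-only build
print(tf.config.list_physical_devices("GPU"))  # [] -> CUDA/cuDNN not found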

Default sampler

What sampler does this implementation use by default? Is it K_LMS?

segmentation fault python text2image.py

Trying to run locally on a Mac Pro (Late 2013) with an AMD FirePro D500 3 GB, using tensorflow-metal with TensorFlow 2.10.

$python text2image.py --prompt="An astronaut riding a horse"

segmentation fault python text2image.py

Script to convert PyTorch weights

Thanks and congrats for this awesome piece of work.

Could you add to the repo whatever script was used to convert the original SD PyTorch checkpoint to the TF weights used by this repo?

That would allow people to load fine-tuned pytorch models into tensorflow for inference purposes.
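For anyone attempting it in the meantime, the core of such a script is usually a key-by-key copy with layout transposes; a hedged sketch (hypothetical filenames, not the author's actual script):

import numpy as np
import torch

state_dict = torch.load("sd-v1-4.ckpt", map_location="cpu")["state_dict"]

def to_keras_conv(w):
    # PyTorch conv kernels are (out, in, kh, kw); Keras wants (kh, kw, in, out).
    return np.transpose(w.numpy(), (2, 3, 1, 0))

# The remaining work is mapping CompVis module names onto this repo's
# layer order and calling keras_model.set_weights(...) per sub-model.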

README mentions Inpainting

Besides the README showing an image for inpainting, there does not seem to be any more information on this feature.
Am I missing something?

Feature Request: Legacy & New Encoder

Stable Diffusion 2.0 uses a new text encoder, so the PyTorch weight mapping for that model and any future models won't work any more. It's beyond my expertise, but can we add to clip_encoder.py the ability to create the text encoder model for the new SD 2.0?

Creating a choice between the legacy and new encoder is a simple bool that can be passed, but I have no clue how to create the new text encoder.

tensorflow.js

Anyone tried these models (after exporting them) with tensorflow.js?

Is the VAE used in fine-tuning available anywhere?

Hey @divamgupta! I'm trying to work on a fine-tuning workflow for your model and was wondering if the VAE used in the original training is available anywhere in TensorFlow that you are aware of.

I've poked around the repo and did not see any obvious home for it.

If not, no problem - but figured I would ask here first before finding one on my own!
