v0xie / sd-webui-cads
Greatly increase the diversity of your generated images in Automatic1111 WebUI through Condition-Annealed Sampling.
License: GNU General Public License v3.0
For example, when an error such as NaN in U-Net occurs, image generation stops, but subsequent generations still call the same hook. The stale hook can be cleared by clicking Generate and then Skip immediately afterwards.
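The underlying pattern is a denoiser hook that is registered when generation starts but only unregistered when generation finishes normally, so an exception mid-generation leaves it installed. Below is a self-contained sketch of a self-clearing hook, with a toy registry standing in for A1111's callback machinery (the real fix would go through `script_callbacks`, e.g. removing the callback in an exception path; names here are illustrative):

```python
callbacks = []  # toy stand-in for A1111's CFG-denoiser callback list

def register(cb):
    callbacks.append(cb)

def run_callbacks(params):
    # Iterate over a copy so callbacks may unregister themselves mid-run.
    for cb in list(callbacks):
        cb(params)

def make_self_clearing_hook(hook):
    """Wrap a hook so that any exception it raises unregisters it before
    propagating, instead of leaving a stale hook for the next generation."""
    def wrapped(params):
        try:
            hook(params)
        except Exception:
            if wrapped in callbacks:
                callbacks.remove(wrapped)
            raise
    register(wrapped)
    return wrapped
```

With this wrapper, a NaN-in-U-Net failure inside the hook removes the hook itself, so the next Generate starts clean without the Generate-then-Skip workaround.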
Please add a separate option to set start and stop steps for Hires. fix
Labels should be under the sliders, not above
Thanks for the great work!
Just wondering, can your repo support the DiT model (which was used in the original paper)?
Refactor the main functions (cads_linear_schedule, add_noise, on_cfg_denoiser_callback) into their own file for easier implementation into Forge, Diffusers, etc.
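For anyone attempting such a port, the core operations are small and framework-agnostic. Here is a dependency-free sketch of what `cads_linear_schedule` and `add_noise` compute, following the formulas in the CADS paper (the exact signatures and tensor handling in this repo may differ):

```python
import math
import random

def cads_linear_schedule(t, tau1, tau2):
    """Annealing coefficient gamma(t) for a normalized timestep t in [0, 1].

    t = 1 is the start of sampling (fully annealed, gamma = 0);
    t = 0 is the end of sampling (full conditioning, gamma = 1).
    """
    if t <= tau1:
        return 1.0
    if t >= tau2:
        return 0.0
    return (tau2 - t) / (tau2 - tau1)

def add_noise(cond, gamma, noise_scale, rng=random):
    """Blend Gaussian noise into the conditioning vector:

        y_hat = sqrt(gamma) * y + noise_scale * sqrt(1 - gamma) * n
    """
    return [
        math.sqrt(gamma) * y
        + noise_scale * math.sqrt(1.0 - gamma) * rng.gauss(0.0, 1.0)
        for y in cond
    ]
```

In the extension these operate on the prompt-conditioning tensors inside the CFG-denoiser callback; the pure-Python version above is only meant to pin down the math for a port.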
Allow parameters to be controlled by X/Y/Z plot scripts for plot comparisons - probably related to infotext:
Tau 1/2
Noise scale
Mixing factor
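For reference, A1111 extensions typically expose parameters to X/Y/Z plot by writing them into `p.extra_generation_params`, which gets serialized into the infotext as comma-separated `Key: value` pairs. A standalone sketch of the round trip (the key names below are hypothetical, not necessarily the extension's actual ones):

```python
def write_infotext_params(extra_generation_params, tau1, tau2, noise_scale, mixing_factor):
    """Record CADS settings the way a script would stash them in
    p.extra_generation_params (key names here are illustrative)."""
    extra_generation_params.update({
        "CADS Tau 1": tau1,
        "CADS Tau 2": tau2,
        "CADS Noise Scale": noise_scale,
        "CADS Mixing Factor": mixing_factor,
    })

def parse_infotext(infotext, key):
    """Read a numeric 'Key: value' pair back out of a comma-separated infotext line."""
    for part in infotext.split(", "):
        name, _, value = part.partition(": ")
        if name == key:
            return float(value)
    return None
```

Once the values round-trip through infotext like this, the X/Y/Z plot script can drive them via its axis options.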
The descriptions of the settings are not accurate to what they actually do
Trying to use this extension with the reference-only ControlNet enabled led to this error. The error then keeps occurring even after the extension is disabled. To clear it, you need to generate something with ControlNet off, then re-enable ControlNet while leaving CADS off.
Traceback (most recent call last):
File "G:\stable-webui\modules\call_queue.py", line 57, in f
res = list(func(*args, **kwargs))
File "G:\stable-webui\modules\call_queue.py", line 36, in f
res = func(*args, **kwargs)
File "G:\stable-webui\modules\txt2img.py", line 55, in txt2img
processed = processing.process_images(p)
File "G:\stable-webui\modules\processing.py", line 732, in process_images
res = process_images_inner(p)
File "G:\stable-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
File "G:\stable-webui\modules\processing.py", line 867, in process_images_inner
samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
File "G:\stable-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 451, in process_sample
return process.sample_before_CN_hack(*args, **kwargs)
File "G:\stable-webui\modules\processing.py", line 1140, in sample
samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
File "G:\stable-webui\modules\sd_samplers_kdiffusion.py", line 235, in sample
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "G:\stable-webui\modules\sd_samplers_common.py", line 261, in launch_sampling
return func()
File "G:\stable-webui\modules\sd_samplers_kdiffusion.py", line 235, in <lambda>
samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
File "G:\stable-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "G:\stable-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 594, in sample_dpmpp_2m
denoised = model(x, sigmas[i] * s_in, **extra_args)
File "G:\stable-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "G:\stable-webui\modules\sd_samplers_cfg_denoiser.py", line 201, in forward
devices.test_for_nans(x_out, "unet")
File "G:\stable-webui\modules\devices.py", line 136, in test_for_nans
raise NansException(message)
modules.devices.NansException: A tensor with all NaNs was produced in Unet. This could be either because there's not enough precision to represent the picture, or because your video card does not support half type. Try setting the "Upcast cross attention layer to float32" option in Settings > Stable Diffusion or using the --no-half commandline argument to fix this. Use --disable-nan-check commandline argument to disable this check.
Thanks, your sd-webui-cads is the best,
but when I try to read info from PNG Info (Hires. fix parameters, aesthetic gradients), no CADS parameters show up in PNG Info.
When I tried to run the IP-Adapter ControlNet as an experiment, I also got an error.
I can hardly see any difference between having this active or not. How can I increase the strength of the effect? Or do I need to use a specific sampler?
I basically just see the contrast go down a bit when it's active, and that's it.
The "Mixing Factor" setting says "lowering this will increase the diversity", but I don't have the feeling that setting it to a low value like 0 or 0.1 improves diversity at all.
What combination of parameters should lead to the highest diversity?
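For what it's worth, in the CADS formulation the diversity comes from the noise mixed into the conditioning, whose magnitude at each step is noise scale times sqrt(1 - gamma(t)). So the effect is strongest with a higher noise scale and a wider annealing window (lower tau 1). A rough back-of-envelope comparison under the paper's linear schedule (a sketch of the math, not a measurement of this extension):

```python
def avg_injected_noise(tau1, tau2, noise_scale, steps=100):
    """Average noise multiplier noise_scale * sqrt(1 - gamma(t)) over the
    sampling trajectory t = 1 -> 0, using the linear CADS schedule."""
    total = 0.0
    for i in range(steps):
        t = 1.0 - i / (steps - 1)  # t runs from 1 down to 0
        if t <= tau1:
            gamma = 1.0          # full conditioning, no noise
        elif t >= tau2:
            gamma = 0.0          # fully annealed, maximum noise
        else:
            gamma = (tau2 - t) / (tau2 - tau1)
        total += noise_scale * (1.0 - gamma) ** 0.5
    return total / steps

# A wider annealing window (lower tau1) plus a larger noise scale
# injects noise over more of the trajectory:
mild = avg_injected_noise(tau1=0.8, tau2=0.95, noise_scale=0.1)
strong = avg_injected_noise(tau1=0.3, tau2=0.9, noise_scale=0.25)
```

The mixing factor, by contrast, only blends the rescaled conditioning back with the noised one to stabilize quality; in the paper the diversity gain is driven primarily by the noise scale and the tau window.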
Hi,
First of all, thanks for a unique and fun extension!
One bug I just noticed: when CADS is enabled and the batch size is greater than 1, CADS is only applied to the first image of the batch. The expected behavior is for it to be applied to all image generations in the batch.
I replicated this issue even with all other extensions in my A1111 install disabled. I am using the latest dev branch commit, but I doubt that's related to this issue.
To replicate:
If this turns out to be user error somehow, my sincere apologies. Thanks!
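A common cause of this symptom is noising only the first slice of the conditioning batch (e.g. indexing `cond[0]`) instead of looping or broadcasting over the whole batch dimension. A framework-free sketch of the correct per-batch application (the extension's actual tensor code differs; this just illustrates the shape of the fix):

```python
import math
import random

def noise_conditioning(batch, gamma, noise_scale, seed=0):
    """Apply CADS-style noising to EVERY conditioning vector in the batch,
    not just batch[0]. `batch` is a list of per-image vectors."""
    rng = random.Random(seed)
    out = []
    for cond in batch:  # one entry per image in the batch
        out.append([
            math.sqrt(gamma) * y
            + noise_scale * math.sqrt(1.0 - gamma) * rng.gauss(0.0, 1.0)
            for y in cond
        ])
    return out
```

In tensor code the equivalent is to draw noise with the full batch shape so every image in the batch gets its own annealed conditioning.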
I'd like to thank you for implementing CADS from our paper.
Could you please update the citation command to the correct version?
@inproceedings{
sadat2024cads,
title={{CADS}: Unleashing the Diversity of Diffusion Models through Condition-Annealed Sampling},
author={Seyedmorteza Sadat and Jakob Buhmann and Derek Bradley and Otmar Hilliges and Romann M. Weber},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=zMoNrajk2X}
}
Using Hires. fix restarts the noising, destroying the image in the process
Images from session to session are not reproducible with the same settings. I think it's related to RNG.
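Session-to-session reproducibility of the injected noise requires seeding the noise RNG from the image seed rather than relying on global RNG state, which differs between runs. A minimal sketch of the idea in plain Python (an actual fix in the extension would use a seeded `torch.Generator`; the seed-mixing constant below is an arbitrary illustration):

```python
import random

def cads_noise(seed, step, n):
    """Deterministic noise for a given (image seed, sampling step):
    re-running with identical settings reproduces identical values."""
    rng = random.Random(seed * 100003 + step)  # fold the step into the image seed
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

Because the noise depends only on the seed and the step, two sessions with the same seed and settings draw the same noise and should produce the same image.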
I would love to see Flux model support in CADS, if that architecture supports it.
You can implement it in Diffusers.