cswry / seesr


[CVPR2024] SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution

License: Apache License 2.0

Python 99.90% Shell 0.10%

seesr stable-diffusion super-resolution

seesr's People

Stargazers

Watchers

Forkers

lucataco house-yuyu jackzhousz peterzs ip-superresolution marenan wangqi-xxxx ruitao-terry sijieliu518 mirage-ai zzksdu templeblock

seesr's Issues

Why disable shuffle in train dataloader?

Why is shuffle disabled in the training dataloader? This seems strange, since random shuffling during training is standard practice. Furthermore, when I fine-tune SeeSR on my own data, the results are even worse than the pretrained SeeSR. I use the same training setup as yours, except with my own data.

Can't run demo

Hello,
I followed your installation instructions, but when I run python gradio_seesr.py I get the following:

WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.0.1+cpu)
Python 3.8.10 (you have 3.8.18)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
Traceback (most recent call last):
File "gradio_seesr.py", line 19, in <module>
from pipelines.pipeline_seesr import StableDiffusionControlNetPipeline
File "E:\AI\SeeSR\pipelines\pipeline_seesr.py", line 25, in <module>
from torchvision.utils import save_image
ModuleNotFoundError: No module named 'torchvision'

I tried reinstalling xformers, installing torchvision, upgrading PyTorch, and upgrading Python to 3.8.18 and also to 3.10.11. After some of these steps it started asking me to install missing modules, so I kept installing them until I reached an error saying torch.cuda.is_available() should be true but is false, and I couldn't get any further. (All of this was done with the conda environment active.)

I have a 4080 desktop PC and I use other AI interfaces such as automatic1111 and ComfyUI without an issue.

I am using Miniconda3-latest.

I appreciate all the help you can give to enable me to run the demo locally.
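The xFormers warning above reports `2.0.1+cpu`, which suggests that a CPU-only PyTorch wheel is installed; that would also be consistent with `torch.cuda.is_available()` returning false. The following is a minimal sketch for reading the build tag out of such a version string. `build_tag` and `is_cpu_only` are hypothetical helper names, not part of PyTorch or SeeSR.

```python
def build_tag(version: str) -> str:
    """Return the local build tag of a PyTorch-style version string.

    The tag is whatever follows the "+" sign; an empty string means no tag.
    """
    _, _, tag = version.partition("+")
    return tag


def is_cpu_only(version: str) -> bool:
    """True when the version string carries the "+cpu" tag."""
    return build_tag(version) == "cpu"


if __name__ == "__main__":
    # The first string is the one quoted in the warning above.
    for version in ("2.0.1+cpu", "2.0.1+cu118"):
        kind = "CPU-only build" if is_cpu_only(version) else "build tag " + build_tag(version)
        print(version, "->", kind)
```

In a real environment the string would come from `torch.__version__`. If it ends in `+cpu`, reinstalling PyTorch with a CUDA build is probably the first thing to try.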

About sd-turbo results

Hello, I tried the sd-turbo based demo but got a strange output😨. What's the possible reason? All args settings remain at default.

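Is there any difference between the BERT and ViT used in the RAM model and the BERT and ViT called directly from the transformers library? The BERT and ViT in the code seem to be implemented separately.

feature_extractor is missing in sd-turbo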

I encountered the following problems when running the webui of sd-turbo:

error : OSError: Incorrect path_or_model_id: 'preset/models/models--stabilityai--sd-turbo/snapshots/1681ed09e0cff58eeb41e878a49893228b78b94c/feature_extractor'. Please provide either the path to a local folder or the repo_id of a model on the Hub.

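The sd-turbo file listing on Hugging Face (screenshot omitted) has no feature_extractor folder.

OOM error happens when using accelerate, but training works fine on a single GPU

Thank you for your excellent work. I am trying to fine-tune SeeSR. Because of GPU memory limits, I reduced the image size to 256×256. On a single GPU, running
python train_seesr.py ## arguments omitted
uses 23 GB of GPU memory and runs on a 4090. But when I try multiple GPUs with
CUDA_VISIBLE_DEVICES="0,1" accelerate launch train_seesr.py ## arguments omitted
an OOM error always occurs. I have carefully checked the shapes of the input tensors to make sure they match the single-GPU case, but I cannot find the cause. Thank you for your help!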

Does not work for people

The super-resolution result for the given test image is blurry. No errors are reported. Where might I have gone wrong?

According to the "TCA module" in the article

The work you've done on this article is truly commendable, providing me with a wealth of inspiration and insight. I have a question. Regarding the "TCA module" discussed in your article, I haven't been able to locate the exact position and module within the code. Could you assist me in identifying its location?

Missing file

Hello,
I followed the Quick Inference instructions, but when I run python test_seesr.py I get the following:

Traceback (most recent call last):
  File "test_seesr.py", line 268, in <module>
    main(args)
  File "test_seesr.py", line 167, in main
    pipeline = load_seesr_pipeline(args, accelerator, enable_xformers_memory_efficient_attention)
  File "test_seesr.py", line 83, in load_seesr_pipeline
    unet = UNet2DConditionModel.from_pretrained(args.seesr_model_path, subfolder="unet")
  File "D:\Compiler\anaconda3\envs\seesrf\lib\site-packages\diffusers\models\modeling_utils.py", line 618, in from_pretrained
    model_file = _get_model_file(
  File "D:\Compiler\anaconda3\envs\seesrf\lib\site-packages\diffusers\utils\hub_utils.py", line 284, in _get_model_file
    raise EnvironmentError(
OSError: Error no file named diffusion_pytorch_model.bin found in directory preset/models/seesr.

I checked the folder and found that the file was not in Google Drive either.
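One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that the downloaded folder holds the weights in `.safetensors` format while the installed diffusers version only looks for the `.bin` file. A small sketch for listing which weight files are actually present; `find_weight_files` is a hypothetical helper, and the two file names are the ones diffusers conventionally uses for UNet and ControlNet weights.

```python
from pathlib import Path

# Conventional diffusers weight file names for UNet/ControlNet subfolders.
WEIGHT_NAMES = (
    "diffusion_pytorch_model.bin",
    "diffusion_pytorch_model.safetensors",
)


def find_weight_files(model_dir, subfolder="unet"):
    """Return the known weight file names that exist in model_dir/subfolder."""
    folder = Path(model_dir) / subfolder
    return [name for name in WEIGHT_NAMES if (folder / name).is_file()]


if __name__ == "__main__":
    # Path taken from the error message above.
    for sub in ("unet", "controlnet"):
        print(sub, find_weight_files("preset/models/seesr", sub))
```

If only the `.safetensors` file shows up, upgrading diffusers (or converting the weights) may be worth trying; if the list is empty, the download itself is likely incomplete.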

After retraining with the latest code provided, I can't fully reproduce the metrics in the paper. For the results reported in the paper, are there any settings that need extra attention?

When the input image is 480 × 960, there is a CUDA out of memory error.

Issues with test_seesr_turbo with solutions

Great work on this - lots of improvements - this is a great, and memory efficient approach.

When I navigate to the sd-turbo model using the provided link, there is no feature_extractor. I used the SD 2.1 feature extractor instead.

I think the below is a bug? Seems to be looking for unet in two places. I assume the correct path is args.seesr_model_path?

unet = UNet2DConditionModel.from_pretrained_orig(args.pretrained_model_path, args.seesr_model_path, subfolder="unet", use_image_cross_attention=True)

SD-Turbo results look good, but have color banding/look like painting

Very good results, but I am noticing color banding not present on the input images. Added "color banding" and "oil painting" to negative prompts, but it still appears. Happens even at 12 steps. Doesn't matter if I use "wavelet" "adain" or "nofix". Anything that can be done? Thank you.

Even on a color that is the same but different shades:

Also, so I don't open another issue, was wondering if there are any ways to get the output to look closer to the input. The upscale looks 95% close, and much cleaner, but still some slight differences happen. Basically I am wondering if it is possible to get 99% close to a 1:1 super resolution of the input image but with the enhanced clarity and details.

Sometimes areas of the input that are out of focus on purpose, get forced into focus, any way to prevent that?

code question

Thank you for sharing the code. The training code does not seem to introduce the hard prompts from DAPE during training (screenshot omitted). Please let me know if I have misread it.

some questions

Hello!
When will the code be released?
thanks

About making the training data

Hi, I want to know how you made the SeeSR training data, in particular the epoch setting and the ratio of added face images.

OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory

OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory C:/SeeSR-main/preset/models/stable-diffusion-2-base.

Question about the SeeSR data preparation script

python utils_data/make_paired_data.py
--gt_path PATH_1 PATH_2 ...
--save_dir preset/datasets/train_datasets/training_for_dape
--epoch 1
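Shouldn't the save directory here be changed to preset/datasets/train_datasets/training_for_seesr? Otherwise it will overwrite the DAPE data.

DAPE fine-tuning details and convergence problem

Following the authors' experimental setup, I built paired images from DIV2K, Flickr2K, 10,000 FFHQ images, and OST, about 23,000 images in total, and fine-tuned DAPE on them with the settings from dape.yaml. The model does not converge (l_logits stays around 0.5), and at inference it produces no useful tag information (all null). How did others solve the convergence problem?

Is it not compatible with Windows or Mac?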

The model imports the module ‘triton’, but this module only has a Linux version and is not compatible with Windows or Mac. Has anyone succeeded in running it on Win10? What should I do?

Error information:

A matching Triton is not available, some optimizations will not be enabled
Traceback (most recent call last):
  File "D:\ProgramData\anaconda3\envs\SeeSR\lib\site-packages\xformers\__init__.py", line 55, in _is_triton_available
    from xformers.triton.softmax import softmax as triton_softmax  # noqa
  File "D:\ProgramData\anaconda3\envs\SeeSR\lib\site-packages\xformers\triton\softmax.py", line 11, in <module>
    import triton
ModuleNotFoundError: No module named 'triton'
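The first line of the log reads like a warning ("some optimizations will not be enabled") rather than a fatal error, so the traceback may simply be reporting that an optional dependency was skipped. The general pattern for treating a module as optional can be sketched as follows; `optional_import` is a hypothetical helper, not an xFormers or SeeSR function.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return None


# Triton-specific code paths would be guarded by this flag.
triton = optional_import("triton")
HAS_TRITON = triton is not None

if __name__ == "__main__":
    print("triton available:", HAS_TRITON)
```

With a guard like this, code that depends on Triton is skipped on platforms where it cannot be installed, and the rest of the program continues.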

encoded_inputs["attention_mask"] = encoded_inputs["attention_mask"] + [0] * difference OverflowError: cannot fit 'int' into an index-sized integer

Trying to run with 8GB VRAM

All models appear to load as expected, and the code runs up until the image is passed into the pipeline (i.e. right up to the inference point).

To avoid OOM issues I have set:

--vae_decoder_tiled_size=64
--vae_encoder_tiled_size=512
--latent_tiled_size=40
--latent_tiled_overlap=2

Issue seems to be at the tokenizer:

Traceback (most recent call last):
File "/home/outsider/Desktop/coding/SeeSR/test_seesr.py", line 284, in <module>
main(args)
File "/home/outsider/Desktop/coding/SeeSR/test_seesr.py", line 233, in main
image = pipeline(
File "/home/outsider/Desktop/coding/SeeSR/utils/vaehook.py", line 440, in wrapper
ret = fn(*args, **kwargs)
File "/home/outsider/anaconda3/envs/sd2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/outsider/Desktop/coding/SeeSR/pipelines/pipeline_seesr.py", line 944, in __call__
prompt_embeds, ram_encoder_hidden_states = self._encode_prompt(
File "/home/outsider/Desktop/coding/SeeSR/pipelines/pipeline_seesr.py", line 356, in _encode_prompt
text_inputs = self.tokenizer(
File "/home/outsider/anaconda3/envs/sd2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2561, in __call__
encodings = self._call_one(text=text, text_pair=text_pair, **all_kwargs)
File "/home/outsider/anaconda3/envs/sd2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2667, in _call_one
return self.encode_plus(
File "/home/outsider/anaconda3/envs/sd2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2740, in encode_plus
return self._encode_plus(
File "/home/outsider/anaconda3/envs/sd2/lib/python3.10/site-packages/transformers/tokenization_utils.py", line 652, in _encode_plus
return self.prepare_for_model(
File "/home/outsider/anaconda3/envs/sd2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3219, in prepare_for_model
encoded_inputs = self.pad(
File "/home/outsider/anaconda3/envs/sd2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3024, in pad
encoded_inputs = self._pad(
File "/home/outsider/anaconda3/envs/sd2/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3409, in _pad
encoded_inputs["attention_mask"] = encoded_inputs["attention_mask"] + [0] * difference
OverflowError: cannot fit 'int' into an index-sized integer
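The failing line pads the attention mask with `[0] * difference`, so this error appears whenever `difference` is too large to be used as a list length. One plausible cause, stated here as an assumption, is a tokenizer whose maximum length was never configured and therefore falls back to a huge sentinel value. The sketch below reproduces the mechanism in plain Python; `pad_attention_mask` and `SENTINEL` are illustrative names only.

```python
def pad_attention_mask(mask, max_length):
    """Right-pad an attention mask with zeros up to max_length."""
    difference = max_length - len(mask)
    return mask + [0] * difference


# Assumption: stand-in for an unset maximum length in the tokenizer config.
SENTINEL = int(1e30)

if __name__ == "__main__":
    try:
        pad_attention_mask([1, 1, 1], SENTINEL)
    except OverflowError as exc:
        print("OverflowError:", exc)

    # With a finite limit (77 is the CLIP text encoder length) padding works.
    print(len(pad_attention_mask([1, 1, 1], 77)))
```

If this is the cause, checking that the tokenizer folder under the base model contains its configuration file, or passing an explicit maximum length, would be the places to look.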

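Technical question about the sd-turbo version (seeking advice)

Hello, does the sd-turbo version require retraining the ControlNet (with the number of timesteps set to 2)? Or can the one already trained with SD 2.1 be used directly, swapping only the SD part?

About sd-turbo training

Hi, could you explain how training based on sd-turbo is actually done? I tried simply replacing sd2-base with turbo, and the trained results are blurrier than the baseline.

compared to PASD

(Originally written in Chinese for convenience.)
Is SeeSR's main improvement over PASD in the representation branch (which is trained on low-quality images), with the SD and ControlNet parts being roughly the same?
Also, does SeeSR not apply explicit enhancement to the ControlNet input the way PASD does?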

Please verify your scheduler_config.json configuration file.

The config attributes {'sigma_max': None, 'sigma_min': None, 'timestep_type': 'discrete'} were passed to DDPMScheduler, but are not expected and will be ignored. Please verify your scheduler_config.json configuration file.
The config attributes {'dropout': 0.0, 'reverse_transformer_layers_per_block': None} were passed to UNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.

Do these two error messages have any impact on the output?

Error loading pretrained ControlNet

Hi, I am training my own model and want to load a pretrained ControlNet using "--controlnet_model_name_or_path", but there is an error:

Traceback (most recent call last):
File "train_seesr.py", line 1000, in
down_block_res_samples, mid_block_res_sample = controlnet(
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/accelerate/utils/operations.py", line 687, in forward
return model_forward(*args, **kwargs)
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/accelerate/utils/operations.py", line 675, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
return func(*args, **kwargs)
File "/XXX//SeeSR-main/models/controlnet.py", line 766, in forward
sample, res_samples = downsample_block(
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/XXX//SeeSR-main/models/unet_2d_blocks.py", line 1238, in forward
hidden_states = attn(
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/diffusers/models/transformer_2d.py", line 315, in forward
hidden_states = block(
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/diffusers/models/attention.py", line 218, in forward
attn_output = self.attn2(
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/diffusers/models/attention_processor.py", line 420, in forward
return self.processor(
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/diffusers/models/attention_processor.py", line 948, in __call__
key = attn.to_k(encoder_hidden_states, scale=scale)
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/diffusers/models/lora.py", line 224, in forward
out = super().forward(hidden_states)
File "/opt/conda/envs/seesr/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1232x1024 and 768x320)
Steps: 0%| | 0/8000 [00:02<?, ?it/s]
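The shapes in the error say that 1024-wide encoder hidden states reached a `to_k` projection built for 768-wide inputs. Text embeddings are 1024-wide in Stable Diffusion 2.x and 768-wide in Stable Diffusion 1.x, so a plausible reading (an assumption, not a confirmed diagnosis) is that the pretrained ControlNet was built for SD 1.x while the base model is SD 2. The sketch below only mimics the shape rule of a linear layer; `check_linear_input` is a hypothetical helper.

```python
def check_linear_input(input_shape, in_features, out_features):
    """Apply the shape rule of a linear layer to a 2-D input.

    The last input dimension must equal in_features; the result has
    out_features columns.
    """
    rows, width = input_shape
    if width != in_features:
        raise ValueError(
            f"mat1 and mat2 shapes cannot be multiplied "
            f"({rows}x{width} and {in_features}x{out_features})"
        )
    return (rows, out_features)


if __name__ == "__main__":
    # 1232 rows of 1024-wide embeddings, as in the traceback above.
    hidden = (1232, 1024)
    print(check_linear_input(hidden, 1024, 320))  # projection sized for 1024
    try:
        check_linear_input(hidden, 768, 320)      # projection sized for 768
    except ValueError as exc:
        print(exc)
```

Comparing the `cross_attention_dim` value in the ControlNet's config.json with the one in the base UNet's config.json should show quickly whether the two checkpoints match.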

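It's too slow for large images

It's too slow for large images. What can I do?

When processing SeeSR data, the folders are created but the images are not saved into them

I have a question: when processing the SeeSR data, why are the folders created but the images not saved into the corresponding folders?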

Originally posted by @aulaywang in #20 (comment)

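About downloading the SD-2.1 base model

I followed the Hugging Face link, downloaded the file 512-base-ema.ckpt, and placed it in the project's preset/models/stable-diffusion-2-base folder. Why do I get a missing configuration file error at runtime?
Namely: OSError: Error no file named scheduler_config.json found in directory preset/models/stable-diffusion-2-base.

I could not find a scheduler_config.json file at the Hugging Face link either.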
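The error suggests the loader expects a diffusers-style folder (one subfolder per component, each with its own config file) rather than a single `.ckpt` file. A minimal sketch for checking a local folder against that layout follows. `EXPECTED` lists files that the diffusers layout for this model is known to contain, but it is not exhaustive, and `missing_entries` is a hypothetical helper.

```python
from pathlib import Path

# A non-exhaustive list of files in a diffusers-style Stable Diffusion folder.
EXPECTED = (
    "model_index.json",
    "scheduler/scheduler_config.json",
    "text_encoder/config.json",
    "tokenizer/tokenizer_config.json",
    "unet/config.json",
    "vae/config.json",
)


def missing_entries(model_dir):
    """Return the expected relative paths that are absent from model_dir."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]


if __name__ == "__main__":
    # Path taken from the error message above.
    for rel in missing_entries("preset/models/stable-diffusion-2-base"):
        print("missing:", rel)
```

On the Hugging Face model page these files live in subfolders such as `scheduler/`, so downloading the whole repository rather than only the `.ckpt` file is probably what is needed.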

Overfitting on a small dataset?

Hi, I used 500 human face images (containing only 6 identities) to train this model. When I test on the training data, the result is strange: I test with person A, but the result looks like person B. Why?

While reproducing the training, I tested intermediate checkpoints and found that the results are not good. Is this normal?

Request: SeeSR with SUPIR

As it was done with SD-Turbo, could it also be done with SUPIR as well?

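The training code raises an error when loading the SD UNet weights

I found that the keys of UNet2DConditionModel's state_dict() do not match the keys of the SD UNet's state_dict, which causes the error.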

argument about '--use_ram_encoder'

Hello, thank you for sharing the code of SeeSR!
When I read it, I found that it does not seem to perform cross-attention between 'ram_encoder_hidden_states' and the ResNet output during training.
The details are as follows (screenshots omitted); please give me some advice.

From your command

CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7," accelerate launch train_seesr.py \
--pretrained_model_name_or_path="preset/models/stable-diffusion-2-base" \
--output_dir="./experience/seesr" \
--root_folders 'preset/datasets/training_datasets' \
--ram_ft_path 'preset/models/DAPE.pth' \
--enable_xformers_memory_efficient_attention \
--mixed_precision="fp16" \
--resolution=512 \
--learning_rate=5e-5 \
--train_batch_size=2 \
--gradient_accumulation_steps=2 \
--null_text_ratio=0.5 --dataloader_num_workers=0 \
--checkpointing_steps=10000

there is no --use_ram_encoder.

Thus, use_image_cross_attention will be False.

While forwarding, it will skip to the else branch, omitting the image_encoder_hidden_states.

Please give me some instructions, thanks!
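The reasoning above rests on how an absent command-line switch is parsed. The sketch below illustrates this, assuming `--use_ram_encoder` is declared as a `store_true` flag; that declaration is an assumption made for illustration, since the actual argument definition in train_seesr.py is not quoted here.

```python
import argparse


def build_parser():
    """Build a parser with a single boolean switch."""
    parser = argparse.ArgumentParser()
    # Assumption: the flag is a plain store_true switch, so it is False
    # unless it appears on the command line.
    parser.add_argument("--use_ram_encoder", action="store_true")
    return parser


args_without = build_parser().parse_args([])
args_with = build_parser().parse_args(["--use_ram_encoder"])

if __name__ == "__main__":
    print("flag omitted :", args_without.use_ram_encoder)
    print("flag present :", args_with.use_ram_encoder)
```

Under that assumption, any code path gated on the flag stays disabled for the command shown, unless the script sets the value somewhere else.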

Question about the tag generation

In utils_data/make_tags.py, why not use the LoRA fine-tuned model? I found that during training you just encode the tag directly, without making any change.

RuntimeError: checkpoint url or path is invalid

Hi. Thank you for your wonderful project.
However, I have not succeeded in running it:

C:\Users\mikazmaj\SeeSR\pipelines\pipeline_seesr.py:42: FutureWarning: Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
  from diffusers.pipeline_utils import DiffusionPipeline
/encoder/layer/0/crossattention/self/query is tied
/encoder/layer/0/crossattention/self/key is tied
/encoder/layer/0/crossattention/self/value is tied
/encoder/layer/0/crossattention/output/dense is tied
/encoder/layer/0/crossattention/output/LayerNorm is tied
/encoder/layer/0/intermediate/dense is tied
/encoder/layer/0/output/dense is tied
/encoder/layer/0/output/LayerNorm is tied
/encoder/layer/1/crossattention/self/query is tied
/encoder/layer/1/crossattention/self/key is tied
/encoder/layer/1/crossattention/self/value is tied
/encoder/layer/1/crossattention/output/dense is tied
/encoder/layer/1/crossattention/output/LayerNorm is tied
/encoder/layer/1/intermediate/dense is tied
/encoder/layer/1/output/dense is tied
/encoder/layer/1/output/LayerNorm is tied
Loading default thretholds from .txt....
--------------
preset/models/ram_swin_large_14m.pth
--------------
Traceback (most recent call last):
  File "C:\Users\mikazmaj\SeeSR\test_seesr.py", line 265, in <module>
    main(args)
  File "C:\Users\mikazmaj\SeeSR\test_seesr.py", line 168, in main
    model = load_tag_model(args, accelerator.device)
  File "C:\Users\mikazmaj\SeeSR\test_seesr.py", line 125, in load_tag_model
    model = ram(pretrained='preset/models/ram_swin_large_14m.pth',
  File "C:\Users\mikazmaj\SeeSR\ram\models\ram_lora.py", line 325, in ram
    model, msg = load_checkpoint_swinlarge(model, pretrained, kwargs)
  File "C:\Users\mikazmaj\SeeSR\ram\models\utils.py", line 296, in load_checkpoint_swinlarge
    raise RuntimeError('checkpoint url or path is invalid')
RuntimeError: checkpoint url or path is invalid

FID problem

Hello, thank you for sharing the code of SeeSR! That's an amazing work indeed!

When I tried to calculate the metric FID using the code proposed in basicsr/metrics/fid.py, the result was quite confusing.

Here is how I calculate FID. If there is any problem, feel free to point it out.

First, I defined a function which outputs a list composed of all images.

Then, I processed SR images and GT images, respectively, using inception_v3 to extract features and calculating FID.

As for the results, with the parameter normalize_input set to False, I got FID = 118.71415109991415.
With normalize_input set to True, I got FID = 126.1874280351401.

Both are much higher than the values reported in the paper, which makes me think some step has gone wrong.
Could you please give me some advice? And should normalize_input be set to True or not?

Thanks a lot!
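For reference when comparing numbers, the standard definition of FID fits a Gaussian to the Inception features of each image set and measures the Fréchet distance between the two. Here $\mu$ and $\Sigma$ denote the feature mean and covariance; the subscripts SR and GT are labels chosen for this note, not names from basicsr.

```latex
\mathrm{FID}
  = \lVert \mu_{\mathrm{SR}} - \mu_{\mathrm{GT}} \rVert_2^2
  + \operatorname{Tr}\!\left(
      \Sigma_{\mathrm{SR}} + \Sigma_{\mathrm{GT}}
      - 2 \left( \Sigma_{\mathrm{SR}} \, \Sigma_{\mathrm{GT}} \right)^{1/2}
    \right)
```

Because the covariance is estimated from the images themselves, the score depends on how many images are in each set, so a value computed on a small test set is generally not comparable with one computed on a larger set.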

no shuffle when training ?

https://github.com/cswry/SeeSR/blob/main/train_seesr.py#L855

Metrics calculating

Excellent work! I followed the README to run inference, but the MANIQA score I get is lower than in the original paper (0.5050 vs 0.6198). I would greatly appreciate it if you could provide the calculation code.

Question about the output of vae.encode

The input to vae.encode is 'pixel_values' [2, 3, 512, 512], however, the output 'latents' is [2, 4, 64, 64]. Why is the channel dimension different?
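The two shapes are consistent with how the Stable Diffusion VAE works: it encodes RGB pixels into a 4-channel latent and downsamples each spatial side by a factor of 8, so the channel count of the latent is unrelated to the 3 colour channels of the input. The arithmetic can be sketched as follows; `latent_shape` is a hypothetical helper, not a diffusers function.

```python
def latent_shape(pixel_shape, downscale=8, latent_channels=4):
    """Map a (batch, channels, height, width) pixel shape to the latent shape.

    The input channel count is ignored: the encoder always emits
    latent_channels channels, and each spatial side shrinks by downscale.
    """
    batch, _channels, height, width = pixel_shape
    return (batch, latent_channels, height // downscale, width // downscale)


if __name__ == "__main__":
    print(latent_shape((2, 3, 512, 512)))
```

The diffusion UNet then operates entirely on this smaller 4-channel tensor, and the VAE decoder maps the result back to 3-channel pixels.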

Where is the DAPE model file?

The link you provided does not contain the DAPE model.

CUDA out of memory (training process)

Hello!
I used the following settings and an NVIDIA GeForce RTX 3090 (24GB) to run the training code. However, I ran into a CUDA out of memory error. Is it because the VRAM of the 3090ti graphics card is insufficient for training?

single gpu

CUDA_VISIBLE_DEVICES="0," accelerate launch train_seesr.py
--pretrained_model_name_or_path="preset/models/stable-diffusion-2-base"
--output_dir="./experience/seesr"
--root_folders 'preset/datasets/train_datasets/training_for_seesr'
--ram_ft_path 'preset/models/DAPE.pth'
--enable_xformers_memory_efficient_attention
--mixed_precision="fp16"
--resolution=512
--learning_rate=5e-5
--train_batch_size=1
--gradient_accumulation_steps=2
--null_text_ratio=0.5
--dataloader_num_workers=0
--checkpointing_steps=10000

What is the point of the performance comparison in the paper, with such a large model?

raise EnvironmentError( OSError: Can't load tokenizer for 'bert-base-uncased'

SeeSR-main/test_seesr.py", line 125, in load_tag_model
model = ram(pretrained='preset/models/ram_swin_large_14m.pth',

SeeSR-main/ram/models/ram_lora.py", line 319, in ram
model = RAMLora(**kwargs)

SeeSR-main/ram/models/ram_lora.py", line 107, in __init__
self.tokenizer = init_tokenizer()

SeeSR-main/ram/models/utils.py", line 131, in init_tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

python3.10/site-packages/transformers/tokenization_utils_base.py", line 1785, in from_pretrained
raise EnvironmentError( OSError: Can't load tokenizer for 'bert-base-uncased'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bert-base-uncased' is the correct path to a directory containing all relevant files for a BertTokenizer tokenizer.

I downloaded all the files in the bert-base-uncased directory as well as ram_swin_large_14m.pth, but I keep getting this error! I don't know what the problem is.

The checkpoints cannot be loaded

Hello! Your work is excellent! While replicating the experiment, I downloaded the pretrained files from Google Drive, but loading them failed. How can I solve this? Thank you!
Traceback (most recent call last):
File "test_seesr.py", line 265, in <module>
main(args)
File "test_seesr.py", line 167, in main
pipeline = load_seesr_pipeline(args, accelerator, enable_xformers_memory_efficient_attention)
File "test_seesr.py", line 83, in load_seesr_pipeline
unet = UNet2DConditionModel.from_pretrained(args.seesr_model_path, subfolder="unet")
File "/root/miniconda3/lib/python3.8/site-packages/diffusers/models/modeling_utils.py", line 646, in from_pretrained
raise ValueError(
ValueError: Cannot load <class 'models.unet_2d_condition.UNet2DConditionModel'> from preset/models/seesr because the following keys are missing:

The seesr_model_path is 'Seesr-main/preset/models/seesr'.
I don't know if it's because of the safetensors file format. The command-line parameters are the same as those in the README.

Cannot load the unet and controlnet

Thanks for your wonderful work! I'm encountering an error when trying to load the 'unet' or 'controlnet' models. The error mentions that it couldn't connect to 'https://huggingface.co/' to load the model, even though I believe my internet connection is working fine. Do you have any suggestions on how to fix this issue?

The error message is as follows:
OSError: We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it looks like preset/models/seesr is not the path to a directory containing a config.json file. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/diffusers/installation#offline-mode'.

OSError: Error no file named diffusion_pytorch_model.bin found in directory preset/models/seesr.

Thanks for your great work. I put the weights in the directory like this:
/SeeSR-main/preset/models
--DAPE.pth
--seesr
--stable-diffusion-2-base
And run the command:
python test_seesr.py
--pretrained_model_path preset/models/stable-diffusion-2-base
--prompt None
--seesr_model_path preset/models/seesr
--ram_ft_path preset/models/DAPE.pth
--image_path preset/datasets/test_datasets
--output_dir preset/datasets/output
--start_point lr
--num_inference_steps 50
--guidance_scale 5.5
--process_size 512
However, it fails with OSError: Error no file named diffusion_pytorch_model.bin found in directory preset/models/seesr.
How can I fix this problem? Thanks a lot.

How many epochs do you set when creating training datasets?

The default setting is 1; is this sufficient for fine-tuning the model?

What's the difference between from_pretrained_orig() and from_pretrained()? Blurry turbo output?

I get this error when trying to use from_pretrained_orig():

File "C:\Stuff\Apps\SeeSR\gradio_seesr_turbo.py", line 48, in <module> unet = UNet2DConditionModel.from_pretrained_orig(seesr_model_path, subfolder="unet") TypeError: UNet2DConditionModel.from_pretrained_orig() missing 1 required positional argument: 'seesr_model_path'

Changed it to from_pretrained() and it appears to work fine, however the turbo output is blurrier than I would have expected.
https://imgsli.com/MjY3ODY4/0/1

Is the blurriness just the nature of the turbo model or is it because I'm loading something wrong?
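The TypeError itself says that `from_pretrained_orig()` expects a second positional argument named `seesr_model_path`, which matches the two-path call quoted in the test_seesr_turbo issue earlier on this page. The stub below mirrors only that argument order to show why the one-path call fails; `UNetStub`, `try_load`, and the `preset/models/sd-turbo` path are hypothetical, and the stub does not load any weights.

```python
class UNetStub:
    """Hypothetical stand-in; only the argument order follows the issues."""

    @classmethod
    def from_pretrained_orig(cls, pretrained_model_path, seesr_model_path,
                             subfolder=None, **kwargs):
        # A real implementation would load weights; this just echoes inputs.
        return {
            "base": pretrained_model_path,
            "seesr": seesr_model_path,
            "subfolder": subfolder,
        }


def try_load(*paths):
    """Call the stub and return either its result or the error text."""
    try:
        return UNetStub.from_pretrained_orig(*paths, subfolder="unet")
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_load("preset/models/seesr"))                            # one path
    print(try_load("preset/models/sd-turbo", "preset/models/seesr"))  # two paths
```

If the real method follows the same order, switching to `from_pretrained()` with a single path would skip whatever the two-path loader does with the base model, which might be related to the blurrier output; that link is a guess and would need confirmation from the authors.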