yangxy / pasd
License: Apache License 2.0
I tested following README.md and got an error:
"cusolver error: CUSOLVER_STATUS_EXECUTION_FAILED, when calling cusolverDnSgetrf( handle, m, n, dA, ldda, static_cast<float*>(dataPtr.get()), ipiv, info)
. This error may appear if the input matrix contains NaN."
There is a problem with "latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]" in line 1151, pipelines/pipeline_pasd.py.
I hope you can give me some guidance, thank you very much!
Dear authors,
I have read about your PASD work, which mentions that PASD also supports the colorization task. Do the currently released PASD/PASD-light/PASD-RRDB models include colorization?
If not, will the colorization model be released? Many thanks.
Hi,
Thanks so much for making this amazing model! Would you consider making it open source by adding an OSI-approved license (e.g. MIT, Apache 2.0, or ISC)?
Thank you!
C:\PASD-main>python gradio_pasd.py
C:\PASD-main\pipelines\pipeline_pasd.py:42: FutureWarning: Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
from diffusers.pipeline_utils import DiffusionPipeline
C:\Python310\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
C:\Python310\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=None`.
warnings.warn(msg)
Traceback (most recent call last):
File "C:\PASD-main\gradio_pasd.py", line 29, in <module>
from models.pasd.unet_2d_condition import UNet2DConditionModel
File "C:\PASD-main\models\pasd\unet_2d_condition.py", line 27, in <module>
from diffusers.models.embeddings import (
ImportError: cannot import name 'PositionNet' from 'diffusers.models.embeddings' (C:\Python310\lib\site-packages\diffusers\models\embeddings.py)
When I run inference on a 1024*1024 image, I get: [Tiled VAE]: the input size is tiny and unnecessary to tile. [Tiled VAE]: Done in 12.476s, max VRAM alloc 5802.607 MB
I would like to use more of the GPU and run faster than this. How can I do that?
Also, after one inference run, some models are offloaded on the GPU.
Hi. I made a colab notebook for inference.
I think PASD is underrated compared to other diffusion prior based models like DiffBIR.
The Colab notebook does upscale but doesn't add details to the image. I have checked the same settings on the Colab and on the demo Space, and the demo Space does an excellent job of adding details.
In the Colab I get the warnings below in the last cell, then it keeps working and gives me an upscaled image:
2024-01-26 10:42:39.581017: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-26 10:42:39.581065: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-26 10:42:39.582556: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-26 10:42:39.590674: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-26 10:42:41.313011: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/local/lib/python3.10/dist-packages/torchvision/transforms/functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be removed in 0.17. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
warnings.warn(
Can you please post the details of the test dataset? I see there are no instructions specifically for the DIV2K valid dataset. I used the weights you posted to test on the DIV2K valid dataset posted by stablesr and found that the results were not the same.
Thanks for your impressive work, PASD. Now I encounter some issues while reproducing your Real-IR experiment results and would appreciate your assistance.
After training the model for 500k steps (as instructed in this repo), I tested the weights from different checkpoints, such as 50k, 100k, and 200k steps, on the benchmark dataset. However, none of the results from the weights I trained are as impressive as yours. They look blurrier and noisier and lack sharpness. (FYI, I have attached several comparison samples from the DRealSRx4 dataset, 128px -> 512px.)
After checking the code, I think it is clean. Perhaps there is a discrepancy in the configuration of the degradation model, i.e. the Real-ESRGAN parameters? Or is there a training trick I am missing?
I would greatly appreciate any ideas or assistance you can provide! Thank you again!
Do we only need to add added_prompt and negative_prompt during inference, not during training?
Can you help me?
I downloaded the pre-trained pasd_light models and put them into runs/. I also downloaded the SD1.5 model v1-5-pruned-emaonly.ckpt and put it into checkpoints/stable-diffusion-v1-5. When I run python test_pasd.py --use_pasd_light, I get this error:
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory checkpoints/stable-diffusion-v1-5.
Am I missing something that needs to be downloaded?
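The error message itself lists the file names the loader searched for. A minimal sketch, assuming the goal is only to see which of those names exist under the checkpoint directory: `find_weight_files` is a hypothetical helper (not part of PASD or diffusers), and the list of names is copied from the OSError text above.

```python
from pathlib import Path

# File names copied verbatim from the OSError message in this issue.
EXPECTED_WEIGHT_FILES = (
    "pytorch_model.bin",
    "model.safetensors",
    "tf_model.h5",
    "model.ckpt.index",
    "flax_model.msgpack",
)


def find_weight_files(directory):
    """Return relative paths of expected weight files found under `directory`.

    Hypothetical helper for illustration. Returns an empty list when the
    directory does not exist or contains none of the expected names.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    found = []
    for path in root.rglob("*"):
        if path.is_file() and path.name in EXPECTED_WEIGHT_FILES:
            found.append(str(path.relative_to(root)))
    return sorted(found)


if __name__ == "__main__":
    # Directory name taken from the issue text.
    print(find_weight_files("checkpoints/stable-diffusion-v1-5"))
```

If this prints an empty list, the directory holds none of the files named in the error. A lone v1-5-pruned-emaonly.ckpt does not match any of those names.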
Hi, may I ask about a detail of the model?
In Figure 2 of the paper, it is shown that PACA only exists in the up-sample blocks of U-Net.
Is there any specific motivation behind this design?
Looking forward to your reply.
Thanks very much
@yangxy
Hi, thanks for your meaningful work. However, when I reproduce PASD following the instructions in the README, I notice that CPU memory keeps increasing. My machine crashed after 60k iterations, with CPU memory usage reaching ~1T. Any ideas about this? Thanks for any suggestions.
BTW, I use the recommended WebDataset + torch DataLoader.
Hi, thanks for this excellent work! I would like to try on my own dataset (like hr_512, lr_128), could you please tell me which dataloader I should use to change to my own dataset path?
Thanks
Hello! Is there any package installation guidance? I tried pip install -r requirements.txt, but with the latest diffusers package some functions cannot be found. Which version of diffusers was used for the experiments in the paper? Thanks!
How can I solve it?
Very nice job! When I tried to train the model, I found that the training dataset URL is broken. Could you please update the training dataset URL? Thank you very much.
During training, the RAM keeps increasing. Is it a memory leak? But I can’t find where the problem is. Can anyone help me?
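One way to narrow this down is to measure how much the Python heap grows across a fixed number of iterations. The sketch below uses the standard-library `tracemalloc` module. `python_heap_growth` and the `step` callback are hypothetical names, and this is only a rough probe: `tracemalloc` sees allocations made through Python's allocator in the current process, so growth inside native libraries or DataLoader worker processes would not show up here.

```python
import tracemalloc


def python_heap_growth(step, iterations):
    """Run `step(i)` for `iterations` rounds and return the change in
    traced Python heap size, in bytes (hypothetical helper)."""
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for i in range(iterations):
            step(i)
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    kept = []

    def leaky_step(i):
        # Simulates a leak: every iteration keeps a 1 KiB buffer alive.
        kept.append(bytearray(1024))

    def clean_step(i):
        # The buffer is released as soon as the function returns.
        bytearray(1024)

    print("leaky:", python_heap_growth(leaky_step, 1000), "bytes")
    print("clean:", python_heap_growth(clean_step, 1000), "bytes")
```

In a real training loop, `step` would wrap one iteration over the data loader. Growth that scales with the iteration count would point at objects being retained on the Python side.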
I tried it, but it failed.
import cv2
import torch
from modelscope.outputs import OutputKeys
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
input_location = 'http://public-vigen-video.oss-cn-shanghai.aliyuncs.com/robin/results/output_test_pasd/0fbc3855c7cfdc95.png'
prompt = ''
output_image_path = 'result.png'
input = {
'image': input_location,
'prompt': prompt,
'upscale': 2,
'fidelity_scale_fg': 1.0,
'fidelity_scale_bg': 1.0,
'use_personalized_model': True,
'personalized_model_path': 'toonyou_beta3.safetensors'
}
pasd = pipeline(Tasks.image_super_resolution_pasd, model='damo/PASD_image_super_resolutions')
output = pasd(input)[OutputKeys.OUTPUT_IMG]
cv2.imwrite(output_image_path, output)
print('pipeline: the output image path is {}'.format(output_image_path))
Updated from latest changes and now get:
$ python test_pasd.py
C:\PASD\pipelines\pipeline_pasd.py:41: FutureWarning: Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
from diffusers.pipeline_utils import DiffusionPipeline
C:\Python310\lib\site-packages\torchvision\transforms\functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
warnings.warn(
clean, high-resolution, 8k
0%| | 0/20 [00:00<?, ?it/s]
too many values to unpack (expected 2)
Where is pasd_color?
Thanks for the help!
Thank you for sharing your work. I want to know how to train a colorization model on my own dataset.
PASD is a very nice piece of work!
I was trying to make a personalized-style (toonyou) photo. The original photo is 000020x2.png in the samples folder. I tried toonyou.safetensors from Civitai with pasd_rrdb/checkpoint-100000 and --use_personalized_model, leaving everything else at the defaults. I cannot generate a beautiful photo like the ones your paper shows. Can you tell me what parameters you used for the personalized model?
Thank you!
local variable 'validation_image' referenced before assignment
The text obtained from image classification and object detection is "frog" in every case. The text obtained from the BLIP network also contains "frog". What is the purpose of the repeated text?
Hi, authors,
Congrats for the nice work.
I wonder what is the model/config for colorization?
Thx a lot
python test_pasd.py
Traceback (most recent call last):
File "C:\PASD-main\test_pasd.py", line 22, in <module>
from pipelines.pipeline_pasd import StableDiffusionControlNetPipeline
File "C:\PASD-main\pipelines\pipeline_pasd.py", line 32, in <module>
from diffusers.utils import (
ImportError: cannot import name 'is_compiled_module' from 'diffusers.utils' (C:\Python310\lib\site-packages\diffusers\utils\__init__.py)
$ python test_pasd.py
Traceback (most recent call last):
File "C:\PASD-main\test_pasd.py", line 22, in <module>
from pipelines.pipeline_pasd import StableDiffusionControlNetPipeline
File "C:\PASD-main\pipelines\pipeline_pasd.py", line 32, in <module>
from diffusers.utils import (
ImportError: cannot import name 'randn_tensor' from 'diffusers.utils' (C:\Python310\lib\site-packages\diffusers\utils\__init__.py)
python test_pasd.py
C:\PASD-main\pipelines\pipeline_pasd.py:45: FutureWarning: Importing `DiffusionPipeline` or `ImagePipelineOutput` from diffusers.pipeline_utils is deprecated. Please import from diffusers.pipelines.pipeline_utils instead.
from diffusers.pipeline_utils import DiffusionPipeline
C:\Python310\lib\site-packages\torchvision\transforms\functional_tensor.py:5: UserWarning: The torchvision.transforms.functional_tensor module is deprecated in 0.15 and will be **removed in 0.17**. Please don't rely on it. You probably just need to use APIs in torchvision.transforms.functional or in torchvision.transforms.v2.functional.
warnings.warn(
Traceback (most recent call last):
File "C:\PASD-main\test_pasd.py", line 267, in <module>
main(args)
File "C:\PASD-main\test_pasd.py", line 167, in main
pipeline = load_pasd_pipeline(args, accelerator, enable_xformers_memory_efficient_attention)
File "C:\PASD-main\test_pasd.py", line 40, in load_pasd_pipeline
from models.pasd.controlnet import ControlNetModel
File "C:\PASD-main\models\pasd\controlnet.py", line 27, in <module>
from basicsr.archs.rrdbnet_arch import RRDB
ModuleNotFoundError: No module named 'basicsr.archs.rrdbnet_arch'
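Several of the tracebacks in this thread are import errors for specific names. A small sketch, assuming it is useful to check all of them at once in a given environment: `installed_version` and `can_import` are hypothetical helpers built on the standard library's `importlib`. The module and attribute names are copied from the tracebacks above. This only reports what is present, and does not say which versions PASD expects.

```python
import importlib
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def can_import(module_name, attr):
    """Return True if `module_name` imports and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError as well, which subclasses ImportError.
        return False
    return hasattr(module, attr)


# (module, name) pairs taken from the tracebacks reported in this thread.
CHECKS = [
    ("diffusers.models.embeddings", "PositionNet"),
    ("diffusers.utils", "is_compiled_module"),
    ("diffusers.utils", "randn_tensor"),
    ("basicsr.archs.rrdbnet_arch", "RRDB"),
]

if __name__ == "__main__":
    for package in ("diffusers", "basicsr", "torchvision"):
        print(f"{package} version: {installed_version(package)}")
    for module_name, attr in CHECKS:
        status = "ok" if can_import(module_name, attr) else "MISSING"
        print(f"{module_name}.{attr}: {status}")
```

Posting this output along with an issue makes it easier to tell a version mismatch from a missing package.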
Thanks for your work, it is really interesting! However, while reading your code I could not work out which part of the model is responsible for degradation removal.
On line 927 of train_pasd.py, you compute F.l1_loss(pixel_values.float(), controlnet_cond_mid.float(), reduction="mean")
So do you mean that controlnet_cond_mid is the denoised image for the diffusion model? I'm not sure I understood your idea.
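For reference, an L1 loss with reduction="mean" is the mean absolute difference between the two inputs. The plain-Python sketch below illustrates only that arithmetic on flat lists. `mean_l1` is a hypothetical name, not code from train_pasd.py, and it says nothing about what controlnet_cond_mid represents in the model.

```python
def mean_l1(prediction, target):
    """Mean absolute difference between two equal-length sequences."""
    if len(prediction) != len(target):
        raise ValueError("inputs must have the same length")
    if not prediction:
        raise ValueError("inputs must be non-empty")
    total = sum(abs(p - t) for p, t in zip(prediction, target))
    return total / len(prediction)


if __name__ == "__main__":
    # |1-1| + |2-0| + |3-5| = 4, averaged over 3 elements.
    print(mean_l1([1.0, 2.0, 3.0], [1.0, 0.0, 5.0]))
```

The loss is zero only when the two inputs match element by element, so minimizing it pulls the second argument toward the first.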
During testing, no image is generated in the output directory.
python test_pasd.py --use_personalized_model
/home/root1/anaconda3/envs/pasd/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
/home/root1/anaconda3/envs/pasd/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
/home/root1/anaconda3/envs/pasd/lib/python3.10/site-packages/diffusers/utils/outputs.py:63: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
torch.utils._pytree._register_pytree_node(
/home/root1/anaconda3/envs/pasd/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
config.json: 4.52kB [00:00, 15.6MB/s]
INFO:root:Loaded coca_ViT-L-14 model config.
INFO:root:Loading pretrained coca_ViT-L-14 weights (mscoco_finetuned_laion2B-s13B-b90k).
a dog sitting in the grass with its tongue hanging out . clean, high-resolution, 8k
cannot fit 'int' into an index-sized integer
Hi, could you share the ControlNet model producing the results in Fig. 1?
Hi! To reproduce your interesting work, I hope you can release your dataset in the format defined in webdataset.py.
Thanks a lot!