zyxelsa / inst Goto Github PK

View Code? Open in Web Editor NEW

473.0 473.0 43.0 31.6 MB

Official implementation of the paper “Inversion-Based Style Transfer with Diffusion Models” (CVPR 2023)

License: Apache License 2.0

Jupyter Notebook 89.66% Python 10.28% Shell 0.06%

inst's People

Contributors

Stargazers

Watchers

inst's Issues

Can I use several images for training?

在运行inST.py时出现了Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)

debug后发现是from torch import autocast就出错了

What license is this under?

I'm not seeing any license for this repository, can you tell me what the license it is on?

Is it possible that image x is not equal image y when training?

Thank you for your great work! I wonder whether we can use the content image x and style image y when training? Is it possible. In othe words, why image x is equal image y? Can you help me?

!!Help!!InST打不开style_image路径

是这样的，我在学校的服务器上跑。且因为Jupyter面板看不到，所以把ipynb改成了py文件去运行的。然后训练出的embedding.pt有且路径对，content_image的路径也对的。但是style_image的路径填了却报错，和content_image填一样的也报错。

想请问大佬应该怎么解决，无限感谢！！

Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the samethis may be cased by the code in line 1112 in ddpm.py, plz change the code like this, but im not sure, plz let me see the hole issue report

          this may be cased by the code in line 1112 in ddpm.py, plz change the code like this, but im not sure, plz let me see the hole issue report

Originally posted by @neverenough7 in #13 (comment)

this is my issue, I do not know how to find it`s problem. please help me solve it. Thank you

8张16G的卡，也报CUDA out of memory，得多大的显存啊

8张16G的卡，也报CUDA out of memory，得多大的显存啊，训不起来

Load pre-trained model:CLIP transformer encoder

while runing InST.ipynb, I have encountered the following problem. It seems to be failing to load CLIPProcessor when loading.
Has anyone had a similar problem, please help, thanks!

Traceback (most recent call last):
File "ml_InST.py", line 68, in
model = load_model_from_config(config, f"{ckpt}")
File "ml_InST.py", line 39, in load_model_from_config
model = instantiate_from_config(config.model)
File "/home/mll/DeepLearning/InST/ldm/util.py", line 85, in instantiate_from_config
return get_obj_from_str(config["target"])(**config.get("params", dict()), **kwargs)
File "/home/mll/DeepLearning/InST/ldm/models/diffusion/ddpm.py", line 477, in init
self.instantiate_cond_stage(cond_stage_config)
File "/home/mll/DeepLearning/InST/ldm/models/diffusion/ddpm.py", line 553, in instantiate_cond_stage
model = instantiate_from_config(config)
File "/home/mll/DeepLearning/InST/ldm/util.py", line 85, in instantiate_from_config
return get_obj_from_str(config["target"])(**config.get("params", dict()), **kwargs)
File "/home/mll/DeepLearning/InST/ldm/modules/encoders/modules.py", line 169, in init
self.processor = CLIPProcessor.from_pretrained(version)
File "/home/mll/miniconda3/envs/inst/lib/python3.8/site-packages/transformers/processing_utils.py", line 186, in from_pretrained
args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
File "/home/mll/miniconda3/envs/inst/lib/python3.8/site-packages/transformers/processing_utils.py", line 230, in _get_arguments_from_pretrained
args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
File "/home/mll/miniconda3/envs/inst/lib/python3.8/site-packages/transformers/feature_extraction_utils.py", line 308, in from_pretrained
feature_extractor_dict, kwargs = cls.get_feature_extractor_dict(pretrained_model_name_or_path, **kwargs)
File "/home/mll/miniconda3/envs/inst/lib/python3.8/site-packages/transformers/feature_extraction_utils.py", line 436, in get_feature_extractor_dict
raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it looks like openai/clip-vit-large-patch14 is not the path to a directory containing a preprocessor_config.json file.

finished

How to prepare the Personalized dataset?

Excellent work! How to prepare Personalized dataset when training the code? Could you please provide some detailed instructions, thank you!

Inconsistencies between my test results and those of the paper

Hi, expert

I used the weights you provided for testing. Why are the test results inconsistent with the results in the paper?

Torch._six

HI,
i am currently facing issue with torch._six module.
current torch version=
Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

import torch
torch.version
'2.0.0+cu117'

ModuleNotFoundError Traceback (most recent call last)
Cell In[7], line 10
8 sys.path.append(".")
9 sys.path.append('./taming-transformers')
---> 10 from taming.models import vqgan

File ~/Desktop/Difussion_InST/InST/scripts/./taming-transformers/taming/models/vqgan.py:5
2 import torch.nn.functional as F
3 import pytorch_lightning as pl
----> 5 from main import instantiate_from_config
7 from taming.modules.diffusionmodules.model import Encoder, Decoder
8 from taming.modules.vqvae.quantize import VectorQuantizer2 as VectorQuantizer

File ~/Desktop/Difussion_InST/InST/scripts/./taming-transformers/main.py:14
11 from pytorch_lightning.callbacks import ModelCheckpoint, Callback, LearningRateMonitor
12 from pytorch_lightning.utilities import rank_zero_only
---> 14 from taming.data.utils import custom_collate
17 def get_obj_from_str(string, reload=False):
18 module, cls = string.rsplit(".", 1)

File ~/Desktop/Difussion_InST/InST/scripts/./taming-transformers/taming/data/utils.py:12
10 from taming.data.helper_types import Annotation
11 # from torch._six import str as string_classes
---> 12 import torch._six.string_classes as string_classess
13 from torch.utils.data._utils.collate import np_str_obj_array_pattern, default_collate_err_msg_format
14 from tqdm import tqdm

ModuleNotFoundError: No module named 'torch._six'

NameError: name 'trainer' is not defined

这个问题不知道怎么解决，我运行的参数如下python main.py --base configs/stable-diffusion/v1-finetune.yaml -t --actual_resume ./models/sd/sd-v1-4.ckpt -n myrun --gpus 0, --data_root .\images
我反复检查但是没能解决问题，麻烦大佬
NameError: name 'trainer' is not defined

Dataset

Can you tell me which dataset you use?

Style doesn't transfer at all when running InST.ipynb with empty prompt.

Thank you for your paper, the results look amazing!!

Tried to run InST.ipynb to reproduce results published in the Comparison Data section.

As for style reference, I specified the following parameters: style_dir=andre-derain.jpg, embeddings - andre-derain_embeddings.pt_

Other parameters: content_dir=pexels-anastasiya-lobanovskaya-804952.jpg, and prompt=''

After running it with n_iter=3, got following results:

So it seems that style wasn't transferred at all in my case.

Could you please suggest whether I run InST.ipynb with the wrong parameters, or what could go not as expected in my setup?

checkpoint.clear

How to separate style and content in the reference image?

If one use a textual inversion to describe the style image, how to separate style and content in the reference image? I understand that "content" is part of the "style"(as the paper mentioned, semantics elements are part of style). However, in Fig.4, 2rd row, InST result is totally a robot as the reference image, while the content is a human. So I am confused about the evaluation: which parts of elements of the style image are we really want to transfer to content image...

How about your training dataset?

Hi expert,
As you mentioned in the paper "The training process takes about 20 minutes each image on one NVIDIA GeForce RTX3090 with a batch size of 1".
So, what the dataset you used as training data? I would like to compare with some state of art method of style transfer.

Can a cartoon or anime image be transferred back to a real image?

Can you run and test the project with a dedicated GPU of 10gb and a shared GPU of 32gb?

the results are bad

I train the embeddings.pt and use it in InST.ipynb to generate picture, but the results are bad not as good as what the author provided.
I want ask what caused it, Is it because there were other details during training?

using prompt

Hi ,
Can I use your network to transform my image to a different style (anime for example) and use also textual prompt (for example 'riding a horse') and receive an image of myself in the style of anime riding a horse?
How can I do it ? I trained the network with the anime style and tried adding a prompt in the evaluation and the prompt didn't effect the image
Thank you very much in advance,
Noa

embeddings.pt?

您好，请问embeddings.pt是只能自己运行main.py来生成吗？训练的数据是什么呢？这个能分享一下吗？

train

When I was training, I encountered the following error:

pytorch_lightning.utilities.exceptions.MisconfigurationException: No test_step() method defined to

It seems like the training has been completed, but there is no way to test it.What should I do? DoI need to modify the code？
I also want to know if the training data and test data are the same during training? Run command only have --data_root.

Thank a lot!

During the training process, I encountered the following issue. Has anyone encountered a similar problem? Please help me to solve it, thank you!

(ldm) lz@manager-Precision-7920-Tower:~/Documents/InST$ python main.py --base configs/stable-diffusion/v1-finetune.yaml -t --actual_resume ./models/sd/sd-v1-4.ckpt -n log1_shuimo --gpus 0, --data_root /home/lz/Documents/InST/style
Global seed set to 23
Running on GPUs 0,
Loading model from ./models/sd/sd-v1-4.ckpt
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Traceback (most recent call last):
File "main.py", line 582, in
model = load_model_from_config(config, opt.actual_resume)
File "main.py", line 29, in load_model_from_config
model = instantiate_from_config(config.model)
File "/home/lz/Documents/InST/ldm/util.py", line 85, in instantiate_from_config
return get_obj_from_str(config["target"])(**config.get("params", dict()), **kwargs)
File "/home/lz/Documents/InST/ldm/models/diffusion/ddpm.py", line 477, in init
self.instantiate_cond_stage(cond_stage_config)
File "/home/lz/Documents/InST/ldm/models/diffusion/ddpm.py", line 561, in instantiate_cond_stage
model = instantiate_from_config(config)
File "/home/lz/Documents/InST/ldm/util.py", line 85, in instantiate_from_config
return get_obj_from_str(config["target"])(**config.get("params", dict()), **kwargs)
File "/home/lz/Documents/InST/ldm/modules/encoders/modules.py", line 166, in init
self.tokenizer = CLIPTokenizer.from_pretrained(version)
File "/home/lz/anaconda3/envs/ldm/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 1764, in from_pretrained
raise EnvironmentError(
OSError: Can't load tokenizer for 'openai/clip-vit-large-patch14'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'openai/clip-vit-large-patch14' is the correct path to a directory containing all relevant files for a CLIPTokenizer tokenizer.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 795, in
if trainer.global_rank == 0:
NameError: name 'trainer' is not defined

someting seems to be missing in main.py

I reported an error during training, it seems that I am missing a rewrite of the “LightningModule” method.
Could you please provide a new main.py?

Error when running: $conda env create -f environment.yaml

When try to install packages under project folder, error occurs.
The error message is:

(base) PS C:\Users\bchan\Desktop\project\creativity-transfer> conda env create -f environment.yaml
Collecting package metadata (repodata.json): done
Solving environment: done
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Installing pip dependencies: / Ran pip subprocess with arguments:
['C:\\Users\\bchan\\Miniconda3\\envs\\ldm\\python.exe', '-m', 'pip', 'install', '-U', '-r', 'C:\\Users\\bchan\\Desktop\\project\\creativity-transfer\\condaenv.v7jnhji2.requirements.txt']
Pip subprocess output:

Pip subprocess error:
ERROR: File "setup.py" not found. Directory cannot be installed in editable mode: C:\Users\bchan\Desktop\project\creativity-transfer

failed

CondaEnvException: Pip failed

After some research online, by removing the last line in environment.yaml, that is, " - -e .", can avoid this error and successfully install the above dependencies.
OS: windows 11 22H2

Stochastic Inversion and embedding.pt

Dose the stochastic inversion is only used for the inference?

And this model is only trained on embedding.pt not the diffusion model parameters?

bug in embedding_manager

embedding_manager save和load模型时使用的是initial_embeddings，设置为requires_grad=False，测试了一下训练过程中没有更新这个参数，只更新了attention网络。看了一下InST的代码，这里应该是保存的string_to_token_dict而不是initial_embeddings吧。

RuntimeError: shape '[4, 1, 768]' is invalid for input of size 768

something was wrong
@zyxElsa

trainer.fit(model, data) in main.py

code: trainer.fit(model, data) in main.py
error: RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
I don't know how to solve the problem

training time for "main.py"

The paper says "The training process takes about 20 minutes each image on one NVIDIA GeForce RTX3090 with a batch size of 1."
I used two RTX3090. However, it already cost three hours and it doesn't seem to stop.

For "v1-finetune.yaml”, which paramater specifys the number of epochs in the training process?

Trained results are not so good

Thank you for your great work!
I trained an image and get results by main.py and CreativityTransfer.ipynb, but I can't get similar results to the results in the paper. Can you give specific steps or parameters in main.py?
Below is content image and my result.

Request for exemplar images

Hi, thank you for sharing the code!

Can you also share some exemplar images so that we can check whether we are on the right track of training and evaluation?

Also, I tried to transfer the style of a sample image to the newly trained style of TheStarryNight and got the following.

Is this an expected result?

What should I do to change this old man into Domhnall Gleeson?

what does 'logs/{log_dir}/checkpoints/embeddings.pt' mean?

What does 'logs/{log_dir}/checkpoints/embeddings.pt' in ipynb mean? and whether has pretrained pt file for test?

model.embedding_manager.load('./logs/{log_dir}/checkpoints/embeddings.pt')
model = model.to(device)

Thanks!

you will release yours embeddings.pt model?

I trained myself poorly，
Can this be achieved？ use a image trained a model， then we can transfer a video （or a scene） not only transfer a image （likeness trained image）

BUG

Train Step: occur some errors as the following

ConnectionRefusedError: [Errno 111] Connection refused
urllib.error.URLError: <urlopen error [Errno 111] Connection refused>
NameError: name 'trainer' is not defined
I don't know how to solve these problems. I would like to know if there is a code for linking pages in this project.I don't actually see any code like that
(不知道这个是不是因为项目里有代码链接到其他网页产生的报错，但是我并没有发现有类似的链接代码)

When training the Diffusion model（main.py）, two processes are found to be open. What is the reason

No "models/sd/sd-v1-4.ckpt"

Hello, thank you for sharing your code. But when I ran InST.ipynb for testing, I found that there was no corresponding trained model. Can you please share this test model?

Clarification on style-transfer details

First, really nice idea and results but for the style transfer case where a content reference image is given, the steps is not really explained clearly in the paper about how the content is preserved. Is the stochastic inversion related to this? If so, how?

Training doesn't work

Hi,

I run the training code main.py with reference image, but it seems doesn't work. I use the trained model to stylize the image, it always generate same images no matter what input style image I provide. And in the logs, the sampled images are all noises like this:

conda install error report

Really fantastic work! Respect firstly :-)

When I typed conda env create -f environment.yaml , error occured:

Installing pip dependencies: / Ran pip subprocess with arguments:
['/home/miniconda3/envs/ldm/bin/python', '-m', 'pip', 'install', '-U', '-r', '/data/application/creativity-transfer/condaenv.hx046q8u.requirements.txt']
Pip subprocess output:

Pip subprocess error:
ERROR: File "setup.py" not found. Directory cannot be installed in editable mode: /data/application/creativity-transfer
                                                                                                                                                                                                                                                                            failed

CondaEnvException: Pip failed

OS: Centos7

Maybe because of missing of setup.py in repo folder?

Unable to reproduce paper results，无法复现论文结果

By relying on style style screenshots in the paper to train embeddings. pt, it is impossible to achieve results that look as good as the paper,Doubt the feasibility of this method. May I ask if you can provide the embeddings. pt file for the author's training, any one is sufficient. Thank you very much!
依靠对论文里的style风格截图训练embeddings.pt，无法得到和论文一样好看的结果，怀疑这个方法的可行性。请问可以提供一下作者训练的embeddings.pt文件吗，随便一个就行。不胜感激！

zyxelsa / inst Goto Github PK

inst's People

Contributors

Stargazers

Watchers

Forkers

inst's Issues

Recommend Projects

Recommend Topics

Recommend Org