jack000 / glid-3-xl
1.4B latent diffusion model fine tuning
License: MIT License
I'm having an issue running this using what's available here. I've gotten this far, but I'm not sure how to proceed. Is the Cognitive Face API required? Can it be removed? I'd rather not have every single sample going to Microsoft if I can help it.
Using device: cuda:0
[2022-04-28 00:47:32] /home/user/Projects/glid-3-xl/sample.py - - - - - - - - - - - - - - - - - - - - - - - - - - - eprint(line:60) :: Error when calling Cognitive Face API:
status_code: 401
code: 401
message: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.
[2022-04-28 00:47:32] /home/user/Projects/glid-3-xl/sample.py - - - - - - - - - - - - - - - - - - - - - - - - - - - eprint(line:60) :: img_url:https://raw.githubusercontent.com/Microsoft/Cognitive-Face-Windows/master/Data/detection1.jpg
[2022-04-28 00:47:32] /home/user/Projects/glid-3-xl/sample.py - - - - - - - - - - - - - - - - - - - - - - - - - - - eprint(line:60) :: Error when calling Cognitive Face API:
status_code: 401
code: 401
message: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.
[2022-04-28 00:47:32] /home/user/Projects/glid-3-xl/sample.py - - - - - - - - - - - - - - - - - - - - - - - - - - - eprint(line:60) :: img_url:/data1/mingmingzhao/label/data_sets_teacher_1w/47017613_1510574400_out-video-jzc70f41fa6f7145b4b66738f81f082b65_f_1510574403268_t_1510575931221.flv_0001.jpg
[]
Traceback (most recent call last):
File "/home/user/Projects/glid-3-xl/sample.py", line 283, in <module>
ldm = torch.load(args.kl_path, map_location="cpu")
File "/home/user/.local/lib/python3.10/site-packages/torch/serialization.py", line 712, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/home/user/.local/lib/python3.10/site-packages/torch/serialization.py", line 1046, in _load
result = unpickler.load()
File "/home/user/.local/lib/python3.10/site-packages/torch/serialization.py", line 1039, in find_class
return super().find_class(mod_name, name)
ModuleNotFoundError: No module named 'ldm.models'; 'ldm' is not a package
Distro: PopOS 22.01
NVIDIA-SMI: driver version 470.86, CUDA version 11.4
GPU: NVIDIA GeForce RTX 2080 Ti
Python: 3.10.4
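A hedged note on the ModuleNotFoundError above: torch.load unpickles the KL autoencoder checkpoint by importing classes from the ldm package, so the CompVis latent-diffusion checkout must be importable as a real package (and not shadowed by a stray ldm.py module). A minimal sketch, with the checkout path and checkpoint filename as assumptions:

import sys
sys.path.insert(0, "/path/to/latent-diffusion")  # hypothetical checkout location

import torch
ldm = torch.load("kl-f8.pt", map_location="cpu")  # unpickling can now resolve ldm.models.*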
Hi, if I follow the instructions to run image_train_latent.py, it seems only one GPU is used. Can you advise on how to use multiple GPUs? Thanks.
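A hedged note: glid-3-xl builds on OpenAI's guided-diffusion, which distributes training across GPUs via MPI rather than spawning processes itself. Assuming its dist_util setup is intact, a launch like the following should put one rank on each of eight GPUs (the flags are illustrative, not verified against this script):

mpiexec -n 8 python image_train_latent.py --data_dir /path/to/data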
Hello @Jack000,
I found the finetuned "jack" model used in the latent diffusion notebook here (https://www.kaggle.com/code/litevex/lite-s-latent-diffusion-v9-with-gradio).
Could you please give some guidance on how to improve/finetune the jack model on a wider dataset such as VQGAN Pairs or others, as the quality is not good for some styles and prompts? Also, please share the expected training time (in hours) and the GPU specification if possible.
Thanks.
Shan
I have an 11GB RTX 3080 Ti and it seems to be failing. On CPU, I get the error "RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'". I hope I installed everything correctly; I had to install some additional repos like transformers and taming-transformers. This is for the CLIP guidance.
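A hedged note on the 'Half' error: PyTorch has no float16 LayerNorm kernel on CPU, so everything has to stay in float32 when running without CUDA. A minimal sketch (model, clip_model, and device are stand-ins for whatever the script actually names them):

# float16 ("Half") layer norm has no CPU kernel, so cast back to float32 off-GPU
if device.type == "cpu":
    model = model.float()            # undoes any convert_to_fp16() / .half()
    clip_model = clip_model.float()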
Hi,
I'm trying to run
python sample.py --model_path finetune.pt --batch_size 6 --num_batches 6 --text "a cyberpunk girl with a scifi neuralink device on her head"
but I don't get anything: no output is saved, and no error is raised.
FYI, I'm running the code on CPU.
I am using an 8-GPU instance, but the script uses only GPU 0, not all the GPUs. I get a CUDA out-of-memory error when I try to run CLIP guidance with the ViT-L/14 model. How can I utilize all GPUs so I don't have this limitation?
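A hedged note: sampling itself is not data-parallel here, but the memory pressure from CLIP guidance can be relieved by placing CLIP on a second GPU. A sketch with illustrative names; this splits memory across devices but does not speed up sampling:

# put the CLIP guidance model on a different GPU from the diffusion model
model = model.to("cuda:0")
clip_model = clip_model.to("cuda:1")
# tensors crossing the boundary must be moved explicitly, e.g.:
# image_embed = clip_model.encode_image(x.to("cuda:1")).float().to("cuda:0")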
Received this error after following the instructions to run sampling:
python sample.py --model_path finetune.pt --batch_size 6 --text "a cyberpunk girl with a scifi neuralink device on her head | cyberpunk anime girl"
Traceback (most recent call last):
File "sample.py", line 19, in <module>
from dalle_pytorch import DiscreteVAE, VQGanVAE
ModuleNotFoundError: No module named 'dalle_pytorch'
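sample.py imports DiscreteVAE and VQGanVAE from dalle_pytorch, so installing that package should resolve the import (the pip freeze in a later issue shows dalle-pytorch==1.6.4 in a working environment):

pip install dalle-pytorch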
When generating images, setting the seed doesn't give reproducible results; the images (and CLIP scores) are different every time the code is run with the same seed. Any idea why?
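A hedged note: torch.manual_seed alone doesn't pin down every source of randomness; Python's random module, NumPy, and nondeterministic CUDA kernels all contribute. A sketch of exhaustive seeding:

import random
import numpy as np
import torch

seed = 0
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)                    # seeds CPU and CUDA generators
torch.backends.cudnn.benchmark = False     # autotuning picks kernels nondeterministically
torch.use_deterministic_algorithms(True)   # raises on ops without deterministic variants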
I can't figure out why I'm getting this error:
python sample.py --model_path finetune.pt --batch_size 1 --num_batches 1 --text "a cyberpunk girl with a scifi neuralink device on her head"
Using device: cuda:0
Traceback (most recent call last):
File "sample.py", line 284, in <module>
ldm.to(device)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/core/mixins/device_dtype_mixin.py", line 121, in to
return super().to(*args, **kwargs)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 927, in to
return self._apply(convert)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply
module._apply(fn)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply
module._apply(fn)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply
module._apply(fn)
[Previous line repeated 3 more times]
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 602, in _apply
param_applied = fn(param)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 925, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Trying to run with CUDA_LAUNCH_BLOCKING enabled
CUDA_LAUNCH_BLOCKING=1 python sample.py --model_path finetune.pt --batch_size 1 --num_batches 1 --text "a cyberpunk girl with a scifi neuralink device on her head"
Using device: cuda:0
Traceback (most recent call last):
  (identical to the traceback above)
RuntimeError: CUDA error: unknown error
pip freeze
absl-py==1.1.0
aiohttp==3.8.1
aiosignal==1.2.0
albumentations==0.4.3
altair==4.2.0
antlr4-python3-runtime==4.8
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
asttokens==2.0.5
async-timeout==4.0.2
attrs==21.4.0
axial-positional-embedding==0.2.1
backcall==0.2.0
backports.zoneinfo==0.2.1
beautifulsoup4==4.11.1
bleach==5.0.1
blinker==1.4
blobfile==1.3.1
braceexpand==0.1.7
brotlipy @ file:///home/conda/feedstock_root/build_artifacts/brotlipy_1648854175163/work
cachetools==5.2.0
certifi==2022.6.15
cffi==1.15.0
charset-normalizer @ file:///home/conda/feedstock_root/build_artifacts/charset-normalizer_1655906222726/work
click==8.1.3
-e git+https://github.com/openai/CLIP.git@b46f5ac7587d2e1862f8b7b1573179d80dcdd620#egg=clip
commonmark==0.9.1
cryptography @ file:///home/conda/feedstock_root/build_artifacts/cryptography_1652967113783/work
DALL-E==0.1
dalle-pytorch==1.6.4
debugpy==1.6.0
decorator==5.1.1
defusedxml==0.7.1
einops==0.4.1
entrypoints==0.4
executing==0.8.3
fastjsonschema==2.15.3
filelock==3.7.1
frozenlist==1.3.0
fsspec==2022.5.0
ftfy==6.1.1
future==0.18.2
gitdb==4.0.9
GitPython==3.1.27
google-auth==2.9.0
google-auth-oauthlib==0.4.6
grpcio==1.47.0
-e git+https://github.com/Jack000/glid-3-xl@a0b5be4b04378d4d4779240d3e0a599360c1a133#egg=guided_diffusion
idna @ file:///home/conda/feedstock_root/build_artifacts/idna_1642433548627/work
imageio==2.9.0
imageio-ffmpeg==0.4.2
imgaug==0.2.6
importlib-metadata==4.12.0
importlib-resources==5.8.0
iniconfig==1.1.1
ipykernel==6.15.0
ipython==8.4.0
ipython-genutils==0.2.0
ipywidgets==7.7.1
jedi==0.18.1
Jinja2==3.1.2
joblib==1.1.0
jsonschema==4.6.1
jupyter-client==7.3.4
jupyter-core==4.10.0
jupyterlab-pygments==0.2.2
jupyterlab-widgets==1.1.1
-e git+https://github.com/CompVis/latent-diffusion.git@5a6571e384f9a9b492bbfaca594a2b00cad55279#egg=latent_diffusion
Markdown==3.3.7
MarkupSafe==2.1.1
matplotlib-inline==0.1.3
mistune==0.8.4
mkl-fft==1.3.1
mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work
mkl-service==2.4.0
multidict==6.0.2
mypy==0.961
mypy-extensions==0.4.3
nbclient==0.6.5
nbconvert==6.5.0
nbformat==5.4.0
nest-asyncio==1.5.5
networkx==2.8.4
notebook==6.4.12
numpy @ file:///opt/conda/conda-bld/numpy_and_numpy_base_1654872176621/work
oauthlib==3.2.0
omegaconf==2.1.1
opencv-python==4.1.2.30
opencv-python-headless==4.6.0.66
packaging==21.3
pandas==1.4.3
pandocfilters==1.5.0
parso==0.8.3
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.0.1
pluggy==1.0.0
prometheus-client==0.14.1
prompt-toolkit==3.0.30
protobuf==3.19.4
psutil==5.9.1
ptyprocess==0.7.0
pudb==2019.2
pure-eval==0.2.2
py==1.11.0
pyarrow==8.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser @ file:///home/conda/feedstock_root/build_artifacts/pycparser_1636257122734/work
pycryptodomex==3.15.0
pydeck==0.7.1
pyDeprecate==0.3.2
Pygments==2.12.0
Pympler==1.0.1
pyOpenSSL @ file:///home/conda/feedstock_root/build_artifacts/pyopenssl_1643496850550/work
pyparsing==3.0.9
pyrsistent==0.18.1
PySocks @ file:///home/conda/feedstock_root/build_artifacts/pysocks_1648857275402/work
pytest==7.1.2
python-dateutil==2.8.2
pytorch-lightning==1.6.4
pytz==2022.1
pytz-deprecation-shim==0.1.0.post0
PyWavelets==1.3.0
PyYAML==6.0
pyzmq==23.2.0
regex==2022.6.2
requests @ file:///home/conda/feedstock_root/build_artifacts/requests_1656534056640/work
requests-oauthlib==1.3.1
rich==12.4.4
rotary-embedding-torch==0.1.5
rsa==4.8
sacremoses==0.0.53
scikit-image==0.19.3
scipy==1.8.1
semver==2.13.0
Send2Trash==1.8.0
six @ file:///tmp/build/80754af9/six_1644875935023/work
smmap==5.0.0
soupsieve==2.3.2.post1
stack-data==0.3.0
streamlit==1.10.0
-e git+https://github.com/CompVis/taming-transformers.git@24268930bf1dce879235a7fddd0b2355b84d7ea6#egg=taming_transformers
taming-transformers-rom1504==0.0.6
tensorboard==2.9.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
terminado==0.15.0
test-tube==0.7.5
tifffile==2022.5.4
tinycss2==1.1.1
tokenizers==0.10.3
toml==0.10.2
tomli==2.0.1
toolz==0.11.2
torch==1.12.0
torch-fidelity==0.3.0
torchaudio==0.12.0
torchmetrics==0.9.2
torchvision==0.13.0
tornado==6.1
tqdm==4.64.0
traitlets==5.3.0
transformers==4.3.1
typing-extensions @ file:///opt/conda/conda-bld/typing_extensions_1647553014482/work
tzdata==2022.1
tzlocal==4.2
urllib3 @ file:///home/conda/feedstock_root/build_artifacts/urllib3_1647489083693/work
urwid==2.1.2
validators==0.20.0
watchdog==2.1.9
wcwidth==0.2.5
webdataset==0.2.5
webencodings==0.5.1
Werkzeug==2.1.2
widgetsnbextension==3.6.1
xmltodict==0.12.0
yarl==1.7.2
youtokentome==1.0.6
zipp==3.8.0
Thanks for the fantastic repo and models!!!!
I have a question about model conversion. Do you have any suggestions for converting original latent/stable diffusion checkpoints to the format this repo expects for finetuning?
Thanks in advance.
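A hedged sketch of one plausible approach, not a verified converter: CompVis-style checkpoints keep the UNet under the "model.diffusion_model." prefix and the KL autoencoder under "first_stage_model.", so the sub-state-dicts can be split out by prefix. Whether the result matches exactly what this repo's loading code expects is an assumption to verify:

import torch

# standard CompVis LDM / Stable Diffusion layout: one flat state_dict
ckpt = torch.load("ldm_checkpoint.ckpt", map_location="cpu")["state_dict"]

def extract(prefix):
    # keep only keys under `prefix`, with the prefix stripped off
    return {k[len(prefix):]: v for k, v in ckpt.items() if k.startswith(prefix)}

torch.save(extract("model.diffusion_model."), "diffusion.pt")  # the UNet
torch.save(extract("first_stage_model."), "kl.pt")             # the KL autoencoder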
Hi.
I know that the MIT license applies to the repository.
Does it also apply to the models produced or finetuned by you?
Thanks!
Hi,
Thanks for the fantastic repo and models!
It seems that in the sample.py script, the model loads CLIP and applies a CLIP embedding (glid-3-xl/guided_diffusion/unet.py, line 842 at a0b5be4). In image_train_latent.py, however, I don't see any logic regarding CLIP (image_train_inpaint.py does have it). Can anyone explain in more detail how finetune.pt was finetuned?
Also, the results don't seem to differ much whether the CLIP embedding is enabled or not.
Thanks for answering!