jack000 / glid-3-xl
1.4B latent diffusion model fine tuning
License: MIT License
I'm having an issue running this using what's available here. I've gotten this far, but I'm not sure how to proceed. Is the Cognitive Face API required? Can it be removed? I'd rather not have every single sample going to Microsoft if I can help it.
Using device: cuda:0
[2022-04-28 00:47:32] /home/user/Projects/glid-3-xl/sample.py - - - - - - - - - - - - - - - - - - - - - - - - - - - eprint(line:60) :: Error when calling Cognitive Face API:
status_code: 401
code: 401
message: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.
[2022-04-28 00:47:32] /home/user/Projects/glid-3-xl/sample.py - - - - - - - - - - - - - - - - - - - - - - - - - - - eprint(line:60) :: img_url:https://raw.githubusercontent.com/Microsoft/Cognitive-Face-Windows/master/Data/detection1.jpg
[2022-04-28 00:47:32] /home/user/Projects/glid-3-xl/sample.py - - - - - - - - - - - - - - - - - - - - - - - - - - - eprint(line:60) :: Error when calling Cognitive Face API:
status_code: 401
code: 401
message: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.
[2022-04-28 00:47:32] /home/user/Projects/glid-3-xl/sample.py - - - - - - - - - - - - - - - - - - - - - - - - - - - eprint(line:60) :: img_url:/data1/mingmingzhao/label/data_sets_teacher_1w/47017613_1510574400_out-video-jzc70f41fa6f7145b4b66738f81f082b65_f_1510574403268_t_1510575931221.flv_0001.jpg
[]
Traceback (most recent call last):
File "/home/user/Projects/glid-3-xl/sample.py", line 283, in <module>
ldm = torch.load(args.kl_path, map_location="cpu")
File "/home/user/.local/lib/python3.10/site-packages/torch/serialization.py", line 712, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/home/user/.local/lib/python3.10/site-packages/torch/serialization.py", line 1046, in _load
result = unpickler.load()
File "/home/user/.local/lib/python3.10/site-packages/torch/serialization.py", line 1039, in find_class
return super().find_class(mod_name, name)
ModuleNotFoundError: No module named 'ldm.models'; 'ldm' is not a package
Distro: PopOS 22.01
NVIDIA-SMI: driver version 470.86, CUDA version 11.4
GPU: NVIDIA GeForce RTX 2080 Ti
Python: 3.10.4
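A hedged note on the ModuleNotFoundError above: torch.load unpickles the KL autoencoder checkpoint by importing classes from the ldm package, so the CompVis latent-diffusion checkout must be importable as a real package (and not shadowed by a stray ldm.py module). A minimal sketch, with the checkout path and checkpoint filename as assumptions:

import sys
sys.path.insert(0, "/path/to/latent-diffusion")  # hypothetical checkout location

import torch
ldm = torch.load("kl-f8.pt", map_location="cpu")  # unpickling can now resolve ldm.models.*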
Hi, if I follow the instructions to run image_train_latent.py, it seems only one GPU is used. Can you advise on how to use multiple GPUs? Thanks.
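A hedged note: glid-3-xl builds on OpenAI's guided-diffusion, which distributes training across GPUs via MPI rather than spawning processes itself. Assuming its dist_util setup is intact, a launch like the following should put one rank on each of eight GPUs (the flags are illustrative, not verified against this script):

mpiexec -n 8 python image_train_latent.py --data_dir /path/to/data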
Hello @Jack000,
I found the finetuned "jack" model used in the latent diffusion notebook here (https://www.kaggle.com/code/litevex/lite-s-latent-diffusion-v9-with-gradio).
Could you please give some guidance on how to improve/finetune the jack model on a wider dataset such as VQGAN Pairs or others, as the quality is not good for some styles and prompts? Also, please share the expected training time (in hours) and the GPU specification if possible.
Thanks.
Shan
I have an 11GB RTX 3080 Ti and it seems to be failing. On CPU, I get the error "RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'". I hope I installed everything correctly; I had to install some additional repos like transformers and taming-transformers. This is for the CLIP guidance.
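A hedged note on the 'Half' error: PyTorch has no float16 LayerNorm kernel on CPU, so everything has to stay in float32 when running without CUDA. A minimal sketch (model, clip_model, and device are stand-ins for whatever the script actually names them):

# float16 ("Half") layer norm has no CPU kernel, so cast back to float32 off-GPU
if device.type == "cpu":
    model = model.float()            # undoes any convert_to_fp16() / .half()
    clip_model = clip_model.float()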
Hi,
I'm trying to run
python sample.py --model_path finetune.pt --batch_size 6 --num_batches 6 --text "a cyberpunk girl with a scifi neuralink device on her head"
but I don't get anything: no output is saved, and no error is raised.
FYI, I'm running the code on CPU.
I am using an 8-GPU instance, but the script uses only GPU 0, not all the GPUs. I get a CUDA out-of-memory error when I try to run CLIP guidance with the ViT-L/14 model. How can I utilize all GPUs so I don't have this limitation?
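A hedged note: sampling itself is not data-parallel here, but the memory pressure from CLIP guidance can be relieved by placing CLIP on a second GPU. A sketch with illustrative names; this splits memory across devices but does not speed up sampling:

# put the CLIP guidance model on a different GPU from the diffusion model
model = model.to("cuda:0")
clip_model = clip_model.to("cuda:1")
# tensors crossing the boundary must be moved explicitly, e.g.:
# image_embed = clip_model.encode_image(x.to("cuda:1")).float().to("cuda:0")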
Received this error after following the instructions to run sampling:
python sample.py --model_path finetune.pt --batch_size 6 --text "a cyberpunk girl with a scifi neuralink device on her head | cyberpunk anime girl"
Traceback (most recent call last):
File "sample.py", line 19, in <module>
from dalle_pytorch import DiscreteVAE, VQGanVAE
ModuleNotFoundError: No module named 'dalle_pytorch'
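sample.py imports DiscreteVAE and VQGanVAE from dalle_pytorch, so installing that package should resolve the import (the pip freeze in a later issue shows dalle-pytorch==1.6.4 in a working environment):

pip install dalle-pytorch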
When generating images, setting the seed doesn't give reproducible results; the images (and CLIP scores) are different every time the code is run with the same seed. Any idea why?
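A hedged note: torch.manual_seed alone doesn't pin down every source of randomness; Python's random module, NumPy, and nondeterministic CUDA kernels all contribute. A sketch of exhaustive seeding:

import random
import numpy as np
import torch

seed = 0
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)                    # seeds CPU and CUDA generators
torch.backends.cudnn.benchmark = False     # autotuning picks kernels nondeterministically
torch.use_deterministic_algorithms(True)   # raises on ops without deterministic variants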
I can't figure out why I'm getting this error:
python sample.py --model_path finetune.pt --batch_size 1 --num_batches 1 --text "a cyberpunk girl with a scifi neuralink device on her head"
Using device: cuda:0
Traceback (most recent call last):
File "sample.py", line 284, in <module>
ldm.to(device)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/pytorch_lightning/core/mixins/device_dtype_mixin.py", line 121, in to
return super().to(*args, **kwargs)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 927, in to
return self._apply(convert)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply
module._apply(fn)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply
module._apply(fn)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 579, in _apply
module._apply(fn)
[Previous line repeated 3 more times]
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 602, in _apply
param_applied = fn(param)
File "/home/moltenn/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 925, in convert
return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Trying to run with CUDA_LAUNCH_BLOCKING enabled
CUDA_LAUNCH_BLOCKING=1 python sample.py --model_path finetune.pt --batch_size 1 --num_batches 1 --text "a cyberpunk girl with a scifi neuralink device on her head"
Using device: cuda:0
Traceback (most recent call last):
  (identical to the traceback above)
RuntimeError: CUDA error: unknown error
pip freeze
absl-py==1.1.0
aiohttp==3.8.1
aiosignal==1.2.0
albumentations==0.4.3
altair==4.2.0
antlr4-python3-runtime==4.8
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
asttokens==2.0.5
async-timeout==4.0.2
attrs==21.4.0
axial-positional-embedding==0.2.1
backcall==0.2.0
backports.zoneinfo==0.2.1
beautifulsoup4==4.11.1
bleach==5.0.1
blinker==1.4
blobfile==1.3.1
braceexpand==0.1.7
brotlipy @ file:///home/conda/feedstock_root/build_artifacts/brotlipy_1648854175163/work
cachetools==5.2.0
certifi==2022.6.15
cffi==1.15.0
charset-normalizer @ file:///home/conda/feedstock_root/build_artifacts/charset-normalizer_1655906222726/work
click==8.1.3
-e git+https://github.com/openai/CLIP.git@b46f5ac7587d2e1862f8b7b1573179d80dcdd620#egg=clip
commonmark==0.9.1
cryptography @ file:///home/conda/feedstock_root/build_artifacts/cryptography_1652967113783/work
DALL-E==0.1
dalle-pytorch==1.6.4
debugpy==1.6.0
decorator==5.1.1
defusedxml==0.7.1
einops==0.4.1
entrypoints==0.4
executing==0.8.3
fastjsonschema==2.15.3
filelock==3.7.1
frozenlist==1.3.0
fsspec==2022.5.0
ftfy==6.1.1
future==0.18.2
gitdb==4.0.9
GitPython==3.1.27
google-auth==2.9.0
google-auth-oauthlib==0.4.6
grpcio==1.47.0
-e git+https://github.com/Jack000/glid-3-xl@a0b5be4b04378d4d4779240d3e0a599360c1a133#egg=guided_diffusion
idna @ file:///home/conda/feedstock_root/build_artifacts/idna_1642433548627/work
imageio==2.9.0
imageio-ffmpeg==0.4.2
imgaug==0.2.6
importlib-metadata==4.12.0
importlib-resources==5.8.0
iniconfig==1.1.1
ipykernel==6.15.0
ipython==8.4.0
ipython-genutils==0.2.0
ipywidgets==7.7.1
jedi==0.18.1
Jinja2==3.1.2
joblib==1.1.0
jsonschema==4.6.1
jupyter-client==7.3.4
jupyter-core==4.10.0
jupyterlab-pygments==0.2.2
jupyterlab-widgets==1.1.1
-e git+https://github.com/CompVis/latent-diffusion.git@5a6571e384f9a9b492bbfaca594a2b00cad55279#egg=latent_diffusion
Markdown==3.3.7
MarkupSafe==2.1.1
matplotlib-inline==0.1.3
mistune==0.8.4
mkl-fft==1.3.1
mkl-random @ file:///tmp/build/80754af9/mkl_random_1626186064646/work
mkl-service==2.4.0
multidict==6.0.2
mypy==0.961
mypy-extensions==0.4.3
nbclient==0.6.5
nbconvert==6.5.0
nbformat==5.4.0
nest-asyncio==1.5.5
networkx==2.8.4
notebook==6.4.12
numpy @ file:///opt/conda/conda-bld/numpy_and_numpy_base_1654872176621/work
oauthlib==3.2.0
omegaconf==2.1.1
opencv-python==4.1.2.30
opencv-python-headless==4.6.0.66
packaging==21.3
pandas==1.4.3
pandocfilters==1.5.0
parso==0.8.3
pexpect==4.8.0
pickleshare==0.7.5
Pillow==9.0.1
pluggy==1.0.0
prometheus-client==0.14.1
prompt-toolkit==3.0.30
protobuf==3.19.4
psutil==5.9.1
ptyprocess==0.7.0
pudb==2019.2
pure-eval==0.2.2
py==1.11.0
pyarrow==8.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser @ file:///home/conda/feedstock_root/build_artifacts/pycparser_1636257122734/work
pycryptodomex==3.15.0
pydeck==0.7.1
pyDeprecate==0.3.2
Pygments==2.12.0
Pympler==1.0.1
pyOpenSSL @ file:///home/conda/feedstock_root/build_artifacts/pyopenssl_1643496850550/work
pyparsing==3.0.9
pyrsistent==0.18.1
PySocks @ file:///home/conda/feedstock_root/build_artifacts/pysocks_1648857275402/work
pytest==7.1.2
python-dateutil==2.8.2
pytorch-lightning==1.6.4
pytz==2022.1
pytz-deprecation-shim==0.1.0.post0
PyWavelets==1.3.0
PyYAML==6.0
pyzmq==23.2.0
regex==2022.6.2
requests @ file:///home/conda/feedstock_root/build_artifacts/requests_1656534056640/work
requests-oauthlib==1.3.1
rich==12.4.4
rotary-embedding-torch==0.1.5
rsa==4.8
sacremoses==0.0.53
scikit-image==0.19.3
scipy==1.8.1
semver==2.13.0
Send2Trash==1.8.0
six @ file:///tmp/build/80754af9/six_1644875935023/work
smmap==5.0.0
soupsieve==2.3.2.post1
stack-data==0.3.0
streamlit==1.10.0
-e git+https://github.com/CompVis/taming-transformers.git@24268930bf1dce879235a7fddd0b2355b84d7ea6#egg=taming_transformers
taming-transformers-rom1504==0.0.6
tensorboard==2.9.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
terminado==0.15.0
test-tube==0.7.5
tifffile==2022.5.4
tinycss2==1.1.1
tokenizers==0.10.3
toml==0.10.2
tomli==2.0.1
toolz==0.11.2
torch==1.12.0
torch-fidelity==0.3.0
torchaudio==0.12.0
torchmetrics==0.9.2
torchvision==0.13.0
tornado==6.1
tqdm==4.64.0
traitlets==5.3.0
transformers==4.3.1
typing-extensions @ file:///opt/conda/conda-bld/typing_extensions_1647553014482/work
tzdata==2022.1
tzlocal==4.2
urllib3 @ file:///home/conda/feedstock_root/build_artifacts/urllib3_1647489083693/work
urwid==2.1.2
validators==0.20.0
watchdog==2.1.9
wcwidth==0.2.5
webdataset==0.2.5
webencodings==0.5.1
Werkzeug==2.1.2
widgetsnbextension==3.6.1
xmltodict==0.12.0
yarl==1.7.2
youtokentome==1.0.6
zipp==3.8.0
Thanks for the fantastic repo and models!!!!
I have a question about model conversion. Do you have any suggestions for converting original latent/stable diffusion checkpoints to the format this repo expects for finetuning?
Thanks in advance.
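A hedged sketch of one plausible approach, not a verified converter: CompVis-style checkpoints keep the UNet under the "model.diffusion_model." prefix and the KL autoencoder under "first_stage_model.", so the sub-state-dicts can be split out by prefix. Whether the result matches exactly what this repo's loading code expects is an assumption to verify:

import torch

# standard CompVis LDM / Stable Diffusion layout: one flat state_dict
ckpt = torch.load("ldm_checkpoint.ckpt", map_location="cpu")["state_dict"]

def extract(prefix):
    # keep only keys under `prefix`, with the prefix stripped off
    return {k[len(prefix):]: v for k, v in ckpt.items() if k.startswith(prefix)}

torch.save(extract("model.diffusion_model."), "diffusion.pt")  # the UNet
torch.save(extract("first_stage_model."), "kl.pt")             # the KL autoencoder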
Hi.
I know that the MIT license applies to the repository.
Does it also apply to the models produced or finetuned by you?
Thanks!
Hi,
Thanks for the fantastic repo and models!
It seems that in the sample.py script, the model loads CLIP and applies a CLIP embedding (glid-3-xl/guided_diffusion/unet.py, line 842 at a0b5be4). In image_train_latent.py, however, I don't see any logic regarding CLIP (image_train_inpaint.py does have it). Can anyone explain in more detail how finetune.pt was finetuned?
Also, the results don't seem to differ much whether the CLIP embedding is enabled or not.
Thanks for answering!