
Comments (5)

vladmandic commented on September 22, 2024

I've tried so hard to reproduce this, both on Windows and Linux, using the exact configuration, torch version, and even the exact model, and I can't.

The root cause is here:

Model move: device=cuda Cannot copy out of meta tensor; no data!

Meta tensors should only be in use when accelerate is active and model offloading is enabled,
in which case accelerate handles the moves to/from the meta device.

But you don't have offloading enabled, so I have no idea why anything is on the meta device.
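The failure mode in the log can be reproduced in a few lines (a minimal sketch, assuming torch 2.x): a module created on the meta device carries only shape/dtype metadata with no backing storage, so a plain `.to()` has nothing to copy.

```python
import torch

# Construct a module whose parameters live on the "meta" device:
# they have shape/dtype metadata but no backing storage.
with torch.device("meta"):
    layer = torch.nn.Linear(4, 4)

print(layer.weight.is_meta)  # True

# Moving it with .to() fails, because there is no data to copy out.
try:
    layer.to("cpu")
except (NotImplementedError, RuntimeError) as err:
    print(err)  # "Cannot copy out of meta tensor; no data! ..."
```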

from automatic.

LankyPoet commented on September 22, 2024

Thanks for trying to replicate. I just tried again with today's build. I switched to loading the model on start and set the VAE to None to vary some conditions for the retest, but no luck.

Is this significant: "Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device."? It sounds like maybe it can't complete an attempted move?

Latest log:
```
D:\sdnext>set SD_MOVE_DEBUG=true

D:\sdnext>set SD_PROMPT_DEBUG=true

D:\sdnext>call webui.bat --upgrade --debug
Using VENV: D:\sdnext\venv
13:11:38-361661 INFO Starting SD.Next
13:11:38-363662 INFO Logger: file="D:\sdnext\sdnext.log" level=DEBUG size=65 mode=create
13:11:38-365164 INFO Python 3.11.9 on Windows
13:11:38-465171 INFO Version: app=sd.next updated=2024-04-24 hash=ec3d38ad branch=dev
url=https://github.com/vladmandic/automatic/tree/dev
13:11:38-792518 INFO Updating main repository
13:11:39-937939 INFO Upgraded to version: 0137331 Thu Apr 25 10:05:56 2024 -0400
13:11:39-941938 INFO Platform: arch=AMD64 cpu=Intel64 Family 6 Model 183 Stepping 1, GenuineIntel system=Windows
release=Windows-10-10.0.22631-SP0 python=3.11.9
13:11:39-942940 DEBUG Setting environment tuning
13:11:39-943938 DEBUG HF cache folder: C:\Users\Default.LivingRoomPC.cache\huggingface\hub
13:11:39-943938 DEBUG Torch overrides: cuda=False rocm=False ipex=False diml=False openvino=False
13:11:39-944939 DEBUG Torch allowed: cuda=True rocm=True ipex=True diml=True openvino=True
13:11:39-945938 INFO nVidia CUDA toolkit detected: nvidia-smi present
13:11:39-975455 INFO Startup: standard
13:11:39-976455 INFO Verifying requirements
13:11:39-987454 INFO Verifying packages
13:11:39-987454 INFO Verifying submodules
13:11:41-527886 DEBUG Submodule: extensions-builtin/sd-extension-chainner / main
13:11:41-865603 DEBUG Submodule: extensions-builtin/sd-extension-system-info / main
13:11:42-225274 DEBUG Submodule: extensions-builtin/sd-webui-agent-scheduler / main
13:11:42-591036 DEBUG Submodule: extensions-builtin/stable-diffusion-webui-rembg / master
13:11:42-937846 DEBUG Submodule: modules/k-diffusion / master
13:11:43-303029 DEBUG Submodule: wiki / master
13:11:43-681787 DEBUG Register paths
13:11:43-732787 DEBUG Installed packages: 212
13:11:43-733787 DEBUG Extensions all: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler',
'stable-diffusion-webui-rembg']
13:11:44-087760 DEBUG Submodule: extensions-builtin\sd-extension-chainner / main
13:11:44-488453 DEBUG Submodule: extensions-builtin\sd-extension-system-info / main
13:11:44-798690 DEBUG Running extension installer: D:\sdnext\extensions-builtin\sd-extension-system-info\install.py
13:11:45-066357 DEBUG Submodule: extensions-builtin\sd-webui-agent-scheduler / main
13:11:45-354935 DEBUG Running extension installer: D:\sdnext\extensions-builtin\sd-webui-agent-scheduler\install.py
13:11:45-613733 DEBUG Submodule: extensions-builtin\stable-diffusion-webui-rembg / master
13:11:45-935920 DEBUG Running extension installer: D:\sdnext\extensions-builtin\stable-diffusion-webui-rembg\install.py
13:11:46-150947 DEBUG Extensions all: ['adetailer', 'OneButtonPrompt', 'sd-dynamic-prompts']
13:11:46-201751 DEBUG Submodule: extensions\adetailer / main
13:11:46-502357 DEBUG Running extension installer: D:\sdnext\extensions\adetailer\install.py
13:11:46-795588 DEBUG Submodule: extensions\OneButtonPrompt / main
13:11:47-209381 DEBUG Submodule: extensions\sd-dynamic-prompts / main
13:11:47-562126 DEBUG Running extension installer: D:\sdnext\extensions\sd-dynamic-prompts\install.py
13:11:47-826771 INFO Extensions enabled: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
'sd-webui-agent-scheduler', 'stable-diffusion-webui-rembg', 'adetailer', 'OneButtonPrompt',
'sd-dynamic-prompts']
13:11:47-829771 INFO Verifying requirements
13:11:47-837770 INFO Updating Wiki
13:11:47-880887 DEBUG Submodule: D:\sdnext\wiki / master
13:11:48-206546 DEBUG Setup complete without errors: 1714065108
13:11:48-220118 DEBUG Extension preload: {'extensions-builtin': 0.0, 'extensions': 0.0}
13:11:48-221118 DEBUG Starting module: <module 'webui' from 'D:\sdnext\webui.py'>
13:11:48-223120 INFO Command line args: ['--upgrade', '--debug'] upgrade=True debug=True
13:11:48-224118 DEBUG Env flags: ['SD_MOVE_DEBUG=true', 'SD_PROMPT_DEBUG=true']
13:11:56-871705 INFO Load packages: {'torch': '2.3.0+cu121', 'diffusers': '0.27.2', 'gradio': '3.43.2'}
13:11:58-903896 INFO VRAM: Detected=23.99 GB Optimization=none
13:11:58-906895 DEBUG Read: file="config.json" json=48 bytes=2354 time=0.000
13:11:58-907895 DEBUG Unknown settings: ['chainner_models_path', 'queue_button_hide_checkpoint', 'queue_button_placement',
'queue_history_retention_days']
13:11:58-908897 INFO Engine: backend=Backend.DIFFUSERS compute=cuda device=cuda attention="Scaled-Dot-Product" mode=no_grad
13:11:58-944896 INFO Device: device=NVIDIA GeForce RTX 4090 n=1 arch=sm_90 cap=(8, 9) cuda=12.1 cudnn=8801 driver=552.22
13:11:58-950895 DEBUG Read: file="html\reference.json" json=38 bytes=22313 time=0.004
13:11:59-543982 DEBUG ONNX: version=1.17.3 provider=CUDAExecutionProvider, available=['AzureExecutionProvider',
'CPUExecutionProvider']
13:11:59-679008 TRACE Trace: PROMPT
13:11:59-708010 DEBUG Importing LDM
13:11:59-725009 DEBUG Entering start sequence
13:11:59-728010 DEBUG Initializing
13:11:59-752008 INFO Available VAEs: path="D:\SDshared\models\vae" items=8
13:11:59-754009 INFO Disabled extensions: []
13:11:59-762014 DEBUG Read: file="cache.json" json=2 bytes=102176 time=0.005
13:11:59-787522 DEBUG Read: file="metadata.json" json=2439 bytes=6770211 time=0.024
13:11:59-804522 DEBUG Scanning diffusers cache: folder=D:\SDshared\models\diffusers items=0 time=0.00
13:11:59-805522 INFO Available models: path="D:\SDshared\models\checkpoints" items=249 time=0.05
13:12:00-484638 DEBUG Load extensions
13:12:00-532639 INFO Extension: script='scripts\regional_prompting.py' 13:12:00-530637 TRACE Trace: PROMPT
13:12:00-657155 INFO Extension: script='extensions-builtin\Lora\scripts\lora_script.py' 13:12:00-650156 INFO LoRA networks: available=1131 folders=6
13:12:01-398229 INFO Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py' Using sqlite file:
extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
13:12:01-950065 INFO Extension: script='extensions\adetailer\scripts\!adetailer.py' [-] ADetailer initialized. version: 24.4.2,
num models: 12
13:12:02-024582 INFO Extension: script='extensions\OneButtonPrompt\scripts\api.py' 13:12:01-995582 DEBUG Read: file="html/upscalers.json" json=4 bytes=2672 time=0.004
13:12:02-026583 INFO Extension: script='extensions\OneButtonPrompt\scripts\api.py' 13:12:02-000582 DEBUG Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.004
13:12:02-028584 INFO Extension: script='extensions\OneButtonPrompt\scripts\api.py' 13:12:02-002583 DEBUG chaiNNer models: path="D:\SDshared\models\upscale_models" defined=24 discovered=38 downloaded=48
13:12:02-030583 INFO Extension: script='extensions\OneButtonPrompt\scripts\api.py' 13:12:02-008582 DEBUG Load upscalers: total=90 downloaded=52 user=38 time=0.03 ['None', 'Lanczos', 'Nearest', 'ChaiNNer',
13:12:02-032584 INFO Extension: script='extensions\OneButtonPrompt\scripts\api.py' 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
13:12:02-034583 INFO Extension: script='extensions\OneButtonPrompt\scripts\api.py' 13:12:02-011584 DEBUG Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.000
13:12:02-036583 INFO Extension: script='extensions\OneButtonPrompt\scripts\api.py' 13:12:02-013582 DEBUG chaiNNer models: path="D:\SDshared\models\upscale_models" defined=24 discovered=38 downloaded=48
13:12:02-037583 INFO Extension: script='extensions\OneButtonPrompt\scripts\api.py' 13:12:02-016582 DEBUG Load upscalers: total=90 downloaded=52 user=38 time=0.01 ['None', 'Lanczos', 'Nearest', 'ChaiNNer',
13:12:02-040584 INFO Extension: script='extensions\OneButtonPrompt\scripts\api.py' 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
13:12:02-074093 INFO Extension: script='extensions\OneButtonPrompt\scripts\onebuttonprompt.py' 13:12:02-053582 DEBUG Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.000
13:12:02-076097 INFO Extension: script='extensions\OneButtonPrompt\scripts\onebuttonprompt.py' 13:12:02-056582 DEBUG chaiNNer models: path="D:\SDshared\models\upscale_models" defined=24 discovered=38 downloaded=48
13:12:02-079097 INFO Extension: script='extensions\OneButtonPrompt\scripts\onebuttonprompt.py' 13:12:02-060587 DEBUG Load upscalers: total=90 downloaded=52 user=38 time=0.01 ['None', 'Lanczos', 'Nearest', 'ChaiNNer',
13:12:02-082098 INFO Extension: script='extensions\OneButtonPrompt\scripts\onebuttonprompt.py' 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
13:12:02-084097 INFO Extension: script='extensions\OneButtonPrompt\scripts\onebuttonprompt.py' 13:12:02-063587 DEBUG Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.000
13:12:02-086099 INFO Extension: script='extensions\OneButtonPrompt\scripts\onebuttonprompt.py' 13:12:02-065588 DEBUG chaiNNer models: path="D:\SDshared\models\upscale_models" defined=24 discovered=38 downloaded=48
13:12:02-088098 INFO Extension: script='extensions\OneButtonPrompt\scripts\onebuttonprompt.py' 13:12:02-070587 DEBUG Load upscalers: total=90 downloaded=52 user=38 time=0.01 ['None', 'Lanczos', 'Nearest', 'ChaiNNer',
13:12:02-091097 INFO Extension: script='extensions\OneButtonPrompt\scripts\onebuttonprompt.py' 'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
13:12:02-177612 DEBUG Extensions init time: 1.69 Lora=0.12 sd-extension-chainner=0.09 sd-webui-agent-scheduler=0.65 adetailer=0.55
OneButtonPrompt=0.14 sd-dynamic-prompts=0.07
13:12:02-179611 DEBUG Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719 time=0.000
13:12:02-181611 DEBUG chaiNNer models: path="D:\SDshared\models\upscale_models" defined=24 discovered=38 downloaded=48
13:12:02-184611 DEBUG Load upscalers: total=90 downloaded=52 user=38 time=0.01 ['None', 'Lanczos', 'Nearest', 'ChaiNNer',
'ESRGAN', 'LDSR', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
13:12:02-825017 DEBUG Load styles: folder="D:\SDshared\models\styles" items=649 time=0.64
13:12:02-844574 DEBUG Creating UI
13:12:02-845574 DEBUG UI themes available: type=Standard themes=12
13:12:02-846575 INFO UI theme: type=Standard name="timeless-beige"
13:12:02-852574 DEBUG UI theme: css="D:\sdnext\javascript\timeless-beige.css" base="sdnext.css" user="None"
13:12:02-856573 DEBUG UI initialize: txt2img
13:12:03-063606 DEBUG UI initialize: img2img
13:12:03-503367 DEBUG UI initialize: control models=D:\SDshared\models\control
13:12:04-522439 DEBUG Read: file="ui-config.json" json=26 bytes=1037 time=0.011
13:12:04-647952 DEBUG UI themes available: type=Standard themes=12
13:12:05-347445 DEBUG Extension list: processed=352 installed=8 enabled=8 disabled=0 visible=352 hidden=0
13:12:05-829229 DEBUG Root paths: ['D:\sdnext']
13:12:05-956911 INFO Local URL: http://127.0.0.1:7860/
13:12:05-958910 DEBUG Gradio functions: registered=2058
13:12:05-960910 DEBUG FastAPI middleware: ['Middleware', 'Middleware']
13:12:05-964915 DEBUG Creating API
13:12:06-152566 INFO [AgentScheduler] Task queue is empty
13:12:06-153566 INFO [AgentScheduler] Registering APIs
13:12:06-310831 DEBUG Scripts setup: ['IP Adapters:0.032', 'AnimateDiff:0.012', 'ADetailer:0.1', 'Dynamic Prompts v2.17.1:0.047',
'Prompts from File:0.005', 'X/Y/Z Grid:0.017', 'One Button Prompt:0.082', 'Face:0.018',
'Image-to-Video:0.009', 'Stable Video Diffusion:0.007']
13:12:06-312831 DEBUG Model metadata: file="metadata.json" no changes
13:12:06-313831 DEBUG Model auto load disabled
13:12:06-315831 DEBUG Script callback init time: system-info.py:app_started=0.07 task_scheduler.py:app_started=0.18
13:12:06-317832 DEBUG Save: file="config.json" json=48 bytes=2278 time=0.002
13:12:06-318830 INFO Startup time: 18.09 torch=6.62 gradio=1.96 diffusers=0.07 libraries=2.84 extensions=1.69 models=0.05
face-restore=0.68 networks=0.66 ui-en=0.37 ui-txt2img=0.16 ui-img2img=0.17 ui-control=0.21 ui-extras=0.06
ui-models=0.63 ui-settings=0.31 ui-extensions=0.56 ui-defaults=0.11 launch=0.48 api=0.10 app-started=0.25
13:12:13-702862 INFO Browser session: user=None client=127.0.0.1 agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:125.0)
Gecko/20100101 Firefox/125.0
13:12:14-064734 DEBUG UI themes available: type=Standard themes=12
13:14:00-371449 DEBUG Server: alive=True jobs=0 requests=205 uptime=123 memory=0.94/79.85 backend=Backend.DIFFUSERS state=idle
13:14:18-753904 INFO Select: model="AA-RealismSDXL\fullyREALXL_v10Perfect10n [859484a7eb]"
13:14:18-762904 DEBUG Load model: existing=False
target=D:\SDshared\models\checkpoints\AA-RealismSDXL\fullyREALXL_v10Perfect10n.safetensors info=None
13:14:18-934676 DEBUG Desired Torch parameters: dtype=FP16 no-half=False no-half-vae=False upscast=False
13:14:18-935676 INFO Setting Torch parameters: device=cuda dtype=torch.float16 vae=torch.float16 unet=torch.float16
context=inference_mode fp16=True bf16=None optimization=Scaled-Dot-Product
13:14:18-936676 INFO Loading VAE: model=D:\SDshared\models\vae\SDXL\sdxl-vae-fp16-fix.safetensors source=settings
13:14:18-939676 DEBUG Diffusers VAE load config: {'low_cpu_mem_usage': False, 'torch_dtype': torch.float16, 'use_safetensors':
True, 'variant': 'fp16'}
13:14:18-944676 INFO Autodetect: vae="Stable Diffusion XL" class=StableDiffusionXLPipeline
file="D:\SDshared\models\checkpoints\AA-RealismSDXL\fullyREALXL_v10Perfect10n.safetensors" size=6617MB
13:14:19-065185 DEBUG Diffusers loading:
path="D:\SDshared\models\checkpoints\AA-RealismSDXL\fullyREALXL_v10Perfect10n.safetensors"
13:14:19-066691 INFO Autodetect: model="Stable Diffusion XL" class=StableDiffusionXLPipeline
file="D:\SDshared\models\checkpoints\AA-RealismSDXL\fullyREALXL_v10Perfect10n.safetensors" size=6617MB
13:14:21-154312 DEBUG Setting model: pipeline=StableDiffusionXLPipeline config={'low_cpu_mem_usage': True, 'torch_dtype':
torch.float16, 'load_connected_pipeline': True, 'extract_ema': False, 'original_config_file':
'configs/sd_xl_base.yaml', 'use_safetensors': True}
13:14:21-163313 DEBUG Setting model VAE: name=sdxl-vae-fp16-fix.safetensors
13:14:21-167834 DEBUG Setting model: enable VAE slicing
13:14:21-185911 TRACE Model move: device=cuda class=<class
'diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline'>
accelerate=False fn=load_diffuser
13:14:24-572490 ERROR Model move: device=cuda Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty()
instead of torch.nn.Module.to() when moving module from meta to a different device.
13:14:25-348011 INFO Load embeddings: loaded=28 skipped=28 time=0.77
13:14:25-685633 DEBUG GC: collected=185 device=cuda {'ram': {'used': 2.57, 'total': 79.85}, 'gpu': {'used': 3.32, 'total': 23.99},
'retries': 0, 'oom': 0} time=0.34
13:14:25-693640 INFO Load model: time=6.58 load=6.58 native=1024 {'ram': {'used': 2.57, 'total': 79.85}, 'gpu': {'used': 3.32,
'total': 23.99}, 'retries': 0, 'oom': 0}
13:14:25-697641 DEBUG Setting changed: sd_model_checkpoint=AA-RealismSDXL\fullyREALXL_v10Perfect10n [859484a7eb] progress=True
13:14:25-699639 DEBUG Save: file="config.json" json=48 bytes=2281 time=0.003
13:14:30-306558 INFO Select: model="AA-RealismSDXL\epicrealismXL_v6Miracle [81877f3be6]"
13:14:30-314558 DEBUG Load model: existing=False
target=D:\SDshared\models\checkpoints\AA-RealismSDXL\epicrealismXL_v6Miracle.safetensors info=None
13:14:30-321558 TRACE Model move: device=cpu class=<class
'diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline'>
accelerate=False fn=reload_model_weights
13:14:31-475354 ERROR Model move: device=cpu Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty()
instead of torch.nn.Module.to() when moving module from meta to a different device.
13:14:31-479277 TRACE Model move: device=meta class=<class
'diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline'>
accelerate=False fn=unload_model_weights
13:14:32-093587 DEBUG GC: collected=480 device=cuda {'ram': {'used': 2.57, 'total': 79.85}, 'gpu': {'used': 1.59, 'total': 23.99},
'retries': 0, 'oom': 0} time=0.31
13:14:32-100588 DEBUG Unload weights model: {'ram': {'used': 2.57, 'total': 79.85}, 'gpu': {'used': 1.59, 'total': 23.99},
'retries': 0, 'oom': 0}
13:14:32-140622 DEBUG Desired Torch parameters: dtype=FP16 no-half=False no-half-vae=False upscast=False
13:14:32-144581 INFO Setting Torch parameters: device=cuda dtype=torch.float16 vae=torch.float16 unet=torch.float16
context=inference_mode fp16=True bf16=None optimization=Scaled-Dot-Product
13:14:32-148577 INFO Loading VAE: model=D:\SDshared\models\vae\SDXL\sdxl-vae-fp16-fix.safetensors source=settings
13:14:32-150591 DEBUG Diffusers VAE load config: {'low_cpu_mem_usage': False, 'torch_dtype': torch.float16, 'use_safetensors':
True, 'variant': 'fp16'}
13:14:32-151593 INFO Autodetect: vae="Stable Diffusion XL" class=StableDiffusionXLPipeline
file="D:\SDshared\models\checkpoints\AA-RealismSDXL\epicrealismXL_v6Miracle.safetensors" size=6618MB
13:14:32-246080 DEBUG Diffusers loading: path="D:\SDshared\models\checkpoints\AA-RealismSDXL\epicrealismXL_v6Miracle.safetensors"
13:14:32-248080 INFO Autodetect: model="Stable Diffusion XL" class=StableDiffusionXLPipeline
file="D:\SDshared\models\checkpoints\AA-RealismSDXL\epicrealismXL_v6Miracle.safetensors" size=6618MB
13:14:33-507568 DEBUG Setting model: pipeline=StableDiffusionXLPipeline config={'low_cpu_mem_usage': True, 'torch_dtype':
torch.float16, 'load_connected_pipeline': True, 'extract_ema': False, 'original_config_file':
'configs/sd_xl_base.yaml', 'use_safetensors': True}
13:14:33-512568 DEBUG Setting model VAE: name=sdxl-vae-fp16-fix.safetensors
13:14:33-514570 DEBUG Setting model: enable VAE slicing
13:14:33-533563 TRACE Model move: device=cuda class=<class
'diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline'>
accelerate=False fn=load_diffuser
13:14:35-811380 ERROR Model move: device=cuda Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty()
instead of torch.nn.Module.to() when moving module from meta to a different device.
13:14:36-144532 INFO Load embeddings: loaded=28 skipped=28 time=0.33
13:14:36-466481 DEBUG GC: collected=127 device=cuda {'ram': {'used': 3.34, 'total': 79.85}, 'gpu': {'used': 3.32, 'total': 23.99},
'retries': 0, 'oom': 0} time=0.32
13:14:36-477981 INFO Load model: time=4.01 load=4.01 native=1024 {'ram': {'used': 3.34, 'total': 79.85}, 'gpu': {'used': 3.32,
'total': 23.99}, 'retries': 0, 'oom': 0}
13:14:36-482985 DEBUG Setting changed: sd_model_checkpoint=AA-RealismSDXL\epicrealismXL_v6Miracle [81877f3be6] progress=True
13:14:36-483987 DEBUG Save: file="config.json" json=48 bytes=2279 time=0.002
13:14:44-519438 INFO Available VAEs: path="D:\SDshared\models\vae" items=8
13:14:52-281645 TRACE Model move: device=cuda class=<class
'diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline'>
accelerate=False fn=reload_vae_weights
13:14:52-285643 ERROR Model move: device=cuda Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty()
instead of torch.nn.Module.to() when moving module from meta to a different device.
13:14:52-287643 INFO Settings: changed=2 ['sd_vae', 'sd_checkpoint_autoload']
13:14:52-289646 DEBUG Save: file="config.json" json=47 bytes=2219 time=0.002
13:14:59-683879 DEBUG serialize sampler index: 10 as DPM SDE
13:14:59-694388 INFO [AgentScheduler] Total pending tasks: 1
13:14:59-700897 INFO [AgentScheduler] Executing task task(n4tuan9mjyn679j)
13:14:59-736896 TRACE Model move: device=cuda class=<class
'diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline'>
accelerate=False fn=process_diffusers
13:14:59-784899 ERROR Model move: device=cuda Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty()
instead of torch.nn.Module.to() when moving module from meta to a different device.
13:14:59-790898 INFO Base: class=StableDiffusionXLPipeline
13:14:59-939025 DEBUG Sampler: sampler="DPM SDE" config={'num_train_timesteps': 1000, 'beta_start': 0.00085, 'beta_end': 0.012,
'beta_schedule': 'scaled_linear', 'prediction_type': 'epsilon', 'use_karras_sigmas': True,
'noise_sampler_seed': None, 'timestep_spacing': 'leading', 'steps_offset': 1}
13:14:59-961021 TRACE Prompt: convert=test
13:14:59-963022 TRACE Prompt: parser=Full parser [['test', 1.0]]
13:14:59-964024 TRACE Prompt: weights=[['test', 1.0]]
13:14:59-972015 TRACE Prompt: convert=test
13:14:59-974021 TRACE Prompt: parser=Full parser [['test', 1.0]]
13:14:59-975021 TRACE Prompt: weights=[['test', 1.0]]
13:14:59-985031 TRACE Prompt: convert=
13:14:59-986030 TRACE Prompt: parser=Full parser [['', 1.0]]
13:14:59-990032 TRACE Prompt: weights=[['', 1.0]]
13:14:59-998552 TRACE Prompt: convert=
13:15:00-001046 TRACE Prompt: parser=Full parser [['', 1.0]]
13:15:00-002026 TRACE Prompt: weights=[['', 1.0]]
13:15:00-004028 TRACE Prompt: section="['test']" len=1 weights=[1.0]
13:15:00-536824 TRACE Prompt: positive unpadded shape = torch.Size([1, 77, 768])
13:15:00-622943 TRACE Prompt: section="['test']" len=1 weights=[1.0]
13:15:00-794024 TRACE Prompt: positive unpadded shape = torch.Size([1, 77, 768])
13:15:00-935317 TRACE Prompt: shape=torch.Size([1, 77, 2048]) negative=torch.Size([1, 77, 2048])
13:15:00-937314 TRACE Prompt Parser: Elapsed Time 0.9872934818267822
13:15:00-938318 DEBUG Diffuser pipeline: StableDiffusionXLPipeline task=DiffusersTaskType.TEXT_2_IMAGE set={'prompt_embeds':
torch.Size([1, 77, 2048]), 'pooled_prompt_embeds': torch.Size([1, 1280]), 'negative_prompt_embeds':
torch.Size([1, 77, 2048]), 'negative_pooled_prompt_embeds': torch.Size([1, 1280]), 'guidance_scale': 4.5,
'generator': device(type='cuda'), 'num_inference_steps': 40, 'eta': 1.0, 'guidance_rescale': 0.7,
'denoising_end': None, 'output_type': 'latent', 'width': 896, 'height': 1152, 'parser': 'Full parser'}
13:15:01-066673 TRACE Model move: device=cuda class=<class
'diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl.StableDiffusionXLPipeline'>
accelerate=False fn=process_diffusers
13:15:01-097183 ERROR Model move: device=cuda Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty()
instead of torch.nn.Module.to() when moving module from meta to a different device.
Progress ?it/s 0% 0/40 00:00 ? Base
13:15:01-248642 ERROR Processing: args={'prompt_embeds': tensor([[[-3.8418, -2.4258, 4.3945, ..., 0.1616, 0.4128, -0.2703],
[-1.2676, -0.1746, -0.7104, ..., 0.9365, 0.9688, -0.9731],
[ 0.0283, -0.0571, -0.6982, ..., -0.0664, -0.8818, 0.0356],
...,
[-0.3418, -0.1042, -0.8555, ..., 0.4211, 0.1763, 0.7378],
[-0.3345, -0.1115, -0.8457, ..., 0.3616, 0.0742, 0.7363],
[-0.3525, -0.0671, -0.8452, ..., 0.4004, 0.1572, 0.7788]]],
device='cuda:0', dtype=torch.float16), 'pooled_prompt_embeds': tensor([[-1.3971e-03, -4.3506e-01,
2.2644e-01, ..., -8.8818e-01,
-2.1230e+00, 5.8411e-02]], device='cuda:0', dtype=torch.float16), 'negative_prompt_embeds':
tensor([[[-3.8418, -2.4258, 4.3945, ..., 0.1616, 0.4128, -0.2703],
[-0.3306, -0.5435, -0.4988, ..., 0.3655, -0.5273, 0.7349],
[-0.4092, -0.5815, -0.4973, ..., -0.3997, 0.2234, 0.0156],
...,
[-0.0163, -0.2251, -0.4456, ..., 0.2817, 0.2285, 0.6660],
[-0.0290, -0.2201, -0.4351, ..., 0.2432, 0.0996, 0.6514],
[-0.0563, -0.1396, -0.4011, ..., 0.2544, 0.2393, 0.6665]]],
device='cuda:0', dtype=torch.float16), 'negative_pooled_prompt_embeds': tensor([[-0.4978, 0.3616,
-0.7021, ..., -0.6875, -0.9780, 1.1533]],
device='cuda:0', dtype=torch.float16), 'guidance_scale': 4.5, 'generator': [<torch._C.Generator
object at 0x000001BD287E2510>], 'callback_on_step_end': <function
process_diffusers..diffusers_callback at 0x000001BD2AC9F1A0>, 'callback_on_step_end_tensor_inputs':
['latents', 'prompt_embeds', 'negative_prompt_embeds', 'add_text_embeds', 'add_time_ids',
'negative_pooled_prompt_embeds', 'negative_add_time_ids'], 'num_inference_steps': 40, 'eta': 1.0,
'guidance_rescale': 0.7, 'denoising_end': None, 'output_type': 'latent', 'width': 896, 'height': 1152}
Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when
checking argument for argument mat1 in method wrapper_CUDA_addmm)
13:15:01-269546 ERROR Processing: RuntimeError
╭──────────────────────────────────────────────── Traceback (most recent call last) ────────────────────────────────────────────────╮
│ D:\sdnext\modules\processing_diffusers.py:445 in process_diffusers │
│ │
│ 444 │ │ │ p.extra_generation_params['HiDiffusion'] = f'{shared.opts.hidiffusion_raunet}/{shared.opts.hidiffusion_aggressi │
│ ❱ 445 │ │ output = shared.sd_model(**base_args) # pylint: disable=not-callable │
│ 446 │ │ if isinstance(output, dict): │
│ │
│ D:\sdnext\venv\Lib\site-packages\torch\utils_contextlib.py:115 in decorate_context │
│ │
│ 114 │ │ with ctx_factory(): │
│ ❱ 115 │ │ │ return func(*args, **kwargs) │
│ 116 │
│ │
│ D:\sdnext\venv\Lib\site-packages\diffusers\pipelines\stable_diffusion_xl\pipeline_stable_diffusion_xl.py:1174 in call
│ │
│ 1173 │ │ │ │ │ added_cond_kwargs["image_embeds"] = image_embeds │
│ ❱ 1174 │ │ │ │ noise_pred = self.unet( │
│ 1175 │ │ │ │ │ latent_model_input, │
│ │
│ D:\sdnext\venv\Lib\site-packages\torch\nn\modules\module.py:1532 in _wrapped_call_impl │
│ │
│ 1531 │ │ else: │
│ ❱ 1532 │ │ │ return self._call_impl(*args, **kwargs) │
│ 1533 │
│ │
│ D:\sdnext\venv\Lib\site-packages\torch\nn\modules\module.py:1541 in _call_impl │
│ │
│ 1540 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1541 │ │ │ return forward_call(*args, **kwargs) │
│ 1542 │
│ │
│ ... 2 frames hidden ... │
│ │
│ D:\sdnext\venv\Lib\site-packages\torch\nn\modules\module.py:1541 in _call_impl │
│ │
│ 1540 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1541 │ │ │ return forward_call(*args, **kwargs) │
│ 1542 │
│ │
│ D:\sdnext\venv\Lib\site-packages\diffusers\models\embeddings.py:227 in forward │
│ │
│ 226 │ │ │ sample = sample + self.cond_proj(condition) │
│ ❱ 227 │ │ sample = self.linear_1(sample) │
│ 228 │
│ │
│ D:\sdnext\venv\Lib\site-packages\torch\nn\modules\module.py:1532 in _wrapped_call_impl │
│ │
│ 1531 │ │ else: │
│ ❱ 1532 │ │ │ return self._call_impl(*args, **kwargs) │
│ 1533 │
│ │
│ D:\sdnext\venv\Lib\site-packages\torch\nn\modules\module.py:1541 in _call_impl │
│ │
│ 1540 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ ❱ 1541 │ │ │ return forward_call(*args, **kwargs) │
│ 1542 │
│ │
│ D:\sdnext\venv\Lib\site-packages\torch\nn\modules\linear.py:116 in forward │
│ │
│ 115 │ def forward(self, input: Tensor) -> Tensor: │
│ ❱ 116 │ │ return F.linear(input, self.weight, self.bias) │
│ 117 │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
13:15:02-055835 INFO Processed: images=0 time=2.34 its=0.00 memory={'ram': {'used': 3.84, 'total': 79.85}, 'gpu': {'used': 3.34,
'total': 23.99}, 'retries': 0, 'oom': 0}
13:15:02-107181 INFO [AgentScheduler] Task queue is empty
```


vladmandic commented on September 22, 2024

> Is this significant: "Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device."? It sounds like maybe it can't complete an attempted move?

Yes, that's exactly what I was talking about.
A meta tensor has no data, only metadata (shape and dtype), so it cannot be moved.
But there should be no such tensors when running without offloading - all tensors should already be fully initialized.

Only when offloading is used does accelerate flag empty tensors as meta tensors to save a bit of memory.
And meta tensors cannot be moved directly - they need to be materialized with real storage before the move; that's what accelerate does during offloading.
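What accelerate effectively does can be sketched like this (a minimal illustration, not accelerate's actual code, assuming torch 2.x): the meta module first gets real storage on the target device via `to_empty()`, and only then are real weights loaded into it.

```python
import torch

with torch.device("meta"):
    layer = torch.nn.Linear(4, 4)

# Materialize storage on the target device; values are uninitialized,
# so this alone is not enough to run inference.
layer = layer.to_empty(device="cpu")
print(layer.weight.is_meta)  # False

# The real weights still have to be copied in afterwards,
# e.g. from a checkpoint's state_dict.
source = torch.nn.Linear(4, 4)
layer.load_state_dict(source.state_dict())
```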

Since there is no offloading, nothing does that, so the move fails. But you should not have any meta tensors to start with - where are they coming from? If I could reproduce this, I could maybe handle them somehow, but it's impossible to find them otherwise - there are thousands of tensors in a model.
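For anyone hitting this, the stray tensors can at least be located programmatically. A small diagnostic sketch (the helper name `find_meta_tensors` is mine, not part of SD.Next):

```python
import torch

def find_meta_tensors(module: torch.nn.Module) -> list[str]:
    """Return the names of all parameters and buffers on the meta device."""
    names = [n for n, p in module.named_parameters() if p.is_meta]
    names += [n for n, b in module.named_buffers() if b.is_meta]
    return names

# Example: one meta submodule hiding among normal ones.
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Linear(4, 4))
with torch.device("meta"):
    model[1] = torch.nn.Linear(4, 4)

print(find_meta_tensors(model))  # ['1.weight', '1.bias']
```

Running this against `shared.sd_model.unet` (or each pipeline component) right after load would show exactly which weights were never materialized.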


LankyPoet commented on September 22, 2024

Thanks again for looking into this. I created one more brand-new install from the latest dev release and now everything is fine for me. No further need to spin your wheels - I'll close this out.


vladmandic commented on September 22, 2024

I really wish I knew what changed.

