Comments (22)
Also, in ComfyUI the device can be detected like this (from model_management.py):
import comfy.model_management as mm

device = mm.get_torch_device()
if mm.is_device_mps(device):
    pass
if mm.is_device_cuda(device):
    pass
I can make a PR for this and the other MPS-related parts a little later if you wish :)
auto-gptq is a library used to invoke the Qwen model. It’s quite late here now, and in the next few days, I will adjust the way Qwen is invoked and try to find a way around this auto-gptq library.
I'm not in a rush at all, so it's okay.
As far as I remember, Qwen is almost no different from other models, and I was already able to load Llama into the node without auto-gptq on macOS (I had to change some small things for this).
I do not see any use of auto-gptq; maybe I am too tired already, or blind )
Lines 736 to 760 in b71c58f
I found that when I was invoking Qwen’s GPTQ model, there was an error due to a missing third-party library, so I added AutoGPTQ to the requirements at that time. If AutoGPTQ is removed, I believe the unquantized Qwen should still run (but I don’t have enough VRAM to test it, haha). I assumed that most users might face insufficient VRAM and would need the quantized model, which is why I included AutoGPTQ in the requirements. I am willing to sacrifice quantized-model support for the sake of cross-platform compatibility, so I will temporarily remove the AutoGPTQ library; the code should still run correctly. As for how to invoke the quantized model, I will think of a solution.
I have already removed AutoGPTQ from the requirements. Seeing that you had to make some minor adjustments, I want to know whether removing AutoGPTQ makes those changes unnecessary. Are there any other adjustments needed?
As for how to invoke the quantized model, I will think of a solution.
You can probably mention this in the readme, something like:
"if you want to support quantized models, you need to install auto-gptq"
similar to how ComfyUI mentions different packages for different video cards/OSes in its README file.
If needed, it is simple to check whether auto-gptq can be imported and is present.
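For example, a minimal sketch of such a check (the helper name and error message here are only illustrative):

import importlib.util

# Treat auto-gptq as an optional dependency and detect it up front
GPTQ_AVAILABLE = importlib.util.find_spec("auto_gptq") is not None

def require_gptq():
    # Call this before loading a GPTQ-quantized model
    if not GPTQ_AVAILABLE:
        raise RuntimeError(
            "Quantized (GPTQ) models need auto-gptq: pip install auto-gptq"
        )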
Are there any other adjustments needed?
In the LLM_local class, mps needs to be added to device.
The autodetected default can be changed to this:
"default": "cuda" if torch.cuda.is_available() else ("mps" if torch.backends.mps.is_available() else "cpu"),
MPS is used the same way as a CUDA device, but the latest PyTorch (2.3) supports only fp32/fp16, and maybe int8, on it. IMHO, supporting only fp32 from the start will be enough, as MacBooks usually have plenty of RAM; support for fp16/int8 can be added later.
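Put together, the device input could look roughly like this minimal sketch of a ComfyUI node (field names are illustrative; the real LLM_local definition may differ):

import torch

class LLM_local:
    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # "mps" added alongside "cuda" and "cpu"
                "device": (["cuda", "mps", "cpu"], {
                    "default": "cuda" if torch.cuda.is_available()
                    else ("mps" if torch.backends.mps.is_available() else "cpu"),
                }),
            }
        }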
Also, in ComfyUI the device can be detected like this (from model_management.py):
import comfy.model_management as mm

device = mm.get_torch_device()
if mm.is_device_mps(device):
    pass
if mm.is_device_cuda(device):
    pass
Thank you for your contribution! I have just freed myself from some tedious work, and I will take the time to fix the issues you mentioned, although there might be a bit of procrastination, haha. Thank you again for your dedication to this project!
Also, in ComfyUI the device can be detected like this (from model_management.py):

import comfy.model_management as mm

device = mm.get_torch_device()
if mm.is_device_mps(device):
    pass
if mm.is_device_cuda(device):
    pass

I can make a PR for this and the other MPS-related parts a little later if you wish :)
I missed this message earlier, but if you could provide a PR for these parts, that would be really great. Thank you for your help and support!
I have already added the code related to MPS, but I do not have the relevant equipment on hand to test it. If you find any issues with this part of the code, please let me know at any time.
AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True).mps()  # there is no ".mps()" method
should be
llama_model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True).to(device)
or
llama_model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True).to("mps")
After that, it partially works with the TinyLlama model for one run (when I just open the flow and execute, it works).
But when I press "Queue prompt" a second time I get:

cannot access local variable 'llama_device' where it is not associated with a value

I am currently looking into this and will say why soon.
When is_reload is False, llama_device is undefined on the second run. So it is not related to mps; the same error will occur for cuda too.

I have a question related to this: why are glm_tokenizer, llama_tokenizer, qwen_tokenizer not a single variable (model_tokenizer)? The same question applies to glm_model, llama_model, and so on. Also, the local variable llama_device (glm_device) can be unified into one variable too, and it needs to be global. Or was there a specific reason for this?
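A minimal sketch of the suggested unification (model_cache and load_model are hypothetical names, not the node's actual API); assigning the device on every call, not only inside the reload branch, also avoids the unbound-variable error above:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_cache = {"model": None, "tokenizer": None, "device": None}

def load_model(model_path, device, is_reload):
    # One cache instead of separate glm_*/llama_*/qwen_* variables
    if is_reload or model_cache["model"] is None:
        model_cache["tokenizer"] = AutoTokenizer.from_pretrained(
            model_path, trust_remote_code=True)
        model_cache["model"] = AutoModelForCausalLM.from_pretrained(
            model_path, trust_remote_code=True).to(device)
    # Set unconditionally so it is always defined on repeat runs
    model_cache["device"] = device
    return model_cache["model"], model_cache["tokenizer"], model_cache["device"]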
fix this bug!
Today, while modifying the code, I wrote the assignment for llama_device inside the if llama_model == "": block, which resulted in llama_device not being assigned on the second run, when the model was not unloaded.
I’ve made modifications to the code related to MPS compatibility. A new issue has arisen: when I introduce int8 and int4 precision, I have to use the bitsandbytes library, which is not adapted for MPS. Currently, I prevent the bitsandbytes library from being imported when the device does not have CUDA. However, I’m still concerned that macOS users might encounter environment dependency errors when bitsandbytes is installed. Could you please help me test it? Thank you very much!
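The guard could look something like this minimal sketch (get_quant_config and the dtype strings are illustrative, not the project's actual names):

import importlib.util
import torch

def get_quant_config(dtype):
    # int8/int4 go through bitsandbytes, which currently requires CUDA
    if (dtype in ("int8", "int4") and torch.cuda.is_available()
            and importlib.util.find_spec("bitsandbytes") is not None):
        from transformers import BitsAndBytesConfig
        return BitsAndBytesConfig(
            load_in_8bit=(dtype == "int8"),
            load_in_4bit=(dtype == "int4"),
        )
    # On MPS/CPU fall back to fp32/fp16 and never import bitsandbytes
    return None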
There was no time during the week, there was too much work.
I will check all of this over the weekend )
I really, really appreciate you! There's nothing more joyful than completing an interesting project with like-minded individuals. For this reason, I've even hidden an easter egg in the project as a little surprise, haha!
Tested the repo with the latest commits; on MPS it works.
But now on cpu I get: Placeholder storage has not been allocated on MPS device!
Will we close this issue for MPS and create a new one for CPU?
Testing on AMD (the CUDA version) is also expected today.
But now on cpu I get: Placeholder storage has not been allocated on MPS device!

This happens only if you generate on mps first and then switch to cpu. Doing the opposite, generating on cpu first and then switching to mps, gives the same error.

Error/Warning:
You are calling .generate() with the input_ids being on a device type different than your model's device. input_ids is on mps, whereas the model is on cpu. You may experience unexpected behaviors or slower generation. Please make sure that you have put input_ids to the correct device by calling for example input_ids = input_ids.to('cpu') before running .generate()
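As the warning suggests, the inputs have to follow the model's device; a minimal, self-contained sketch (the TinyLlama model id is just an example):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # example model
device = "cpu"  # or "mps" / "cuda"

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True).to(device)

# Move the inputs to wherever the model currently lives before generating
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))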
Yes, actually there is a problem with the code I wrote. I only accounted for loading the model onto the corresponding device when it’s not loaded, but I didn’t write the code to switch devices after it’s loaded. I admit I was being lazy because I originally thought no one would load a model and then switch devices, haha. I’ll immediately add the code for this part.
Fixed a bug where switching devices, dtypes, or model types would cause the model to throw an error. Now, when users switch these parameters, the model will reload.
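One way such a fix can be sketched (hypothetical names; the actual implementation may differ) is to key the cached model on every load parameter and reload when the key changes:

import torch
from transformers import AutoModelForCausalLM

_cached_key = None
_cached_model = None

def get_model(model_path, device, dtype):
    global _cached_key, _cached_model
    key = (model_path, device, dtype)
    # Reload whenever the model path, device, or dtype changes
    if key != _cached_key:
        torch_dtype = torch.float16 if dtype == "fp16" else torch.float32
        _cached_model = AutoModelForCausalLM.from_pretrained(
            model_path, trust_remote_code=True, torch_dtype=torch_dtype
        ).to(device)
        _cached_key = key
    return _cached_model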
Thank you, tested and it works now on macOS on both cpu and mps :)