Comments (3)
For multi-GPU inference, which code are you actually running? From what I can see, this error looks like a driver/CUDA-level problem rather than a bug in the code. Please post the exact location of the official code you are running and the complete error output.
from chatglm3.
I found the cause of the bug: the error is raised when I load both the BLIP-2 and GLM-6B models in the same .py file; loading either model on its own works fine. The relevant code is as follows:
import torch
from transformers import AutoModel, AutoTokenizer, Blip2ForConditionalGeneration, Blip2Processor

# Load BLIP-2 from a local checkpoint, letting accelerate place it across the available GPUs.
local_path = "./blip2-opt-2.7b"
processor = Blip2Processor.from_pretrained(local_path)
model_large = Blip2ForConditionalGeneration.from_pretrained(
    local_path, torch_dtype=torch.float16, device_map="auto"
)
model_large.eval()

# Load ChatGLM3-6B in the same process, also with device_map="auto".
tokenizer = AutoTokenizer.from_pretrained("./chatglm3-6b", trust_remote_code=True)
model_GLM = AutoModel.from_pretrained("./chatglm3-6b", trust_remote_code=True, device_map="auto")
model_GLM = model_GLM.eval()
generated_text_GLM, history = model_GLM.chat(tokenizer, "你好", history=[])
The complete traceback is as follows:
Traceback (most recent call last):
File "/home2/an/project/DataShunt+/image_caption/a-PyTorch-Tutorial-to-Image-Captioning-master-2/eval_DS_PT.py", line 73, in <module>
generated_text_GLM, history = model_GLM.chat(tokenizer, "你好", history=[])
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1042, in chat
outputs = self.generate(**inputs, **gen_kwargs, eos_token_id=eos_token_id)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/transformers/generation/utils.py", line 1452, in generate
return self.sample(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/transformers/generation/utils.py", line 2468, in sample
outputs = self(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 941, in forward
transformer_outputs = self.transformer(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 834, in forward
hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 641, in forward
layer_ret = layer(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 544, in forward
attention_output, kv_cache = self.self_attention(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 376, in forward
mixed_x_layer = self.query_key_value(hidden_states)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 103, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)
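
For context, CUBLAS_STATUS_NOT_INITIALIZED from cublasCreate is often a symptom of the GPU running out of memory when the cuBLAS handle is created (or of a CUDA/driver mismatch), rather than a bug in the calling code. A minimal diagnostic sketch, not part of the original report and assuming both models were loaded with device_map="auto" as above, is to print the device map accelerate chose for each model and the per-GPU memory usage, to see whether the two models ended up packed onto the same device:

# Hypothetical diagnostic: inspect how accelerate placed each model and how
# much memory each GPU holds after both models have been loaded.
print(model_large.hf_device_map)
print(model_GLM.hf_device_map)
for i in range(torch.cuda.device_count()):
    print(f"cuda:{i} allocated: {torch.cuda.memory_allocated(i) / 1e9:.2f} GB")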
from chatglm3.
Right, they should normally be loaded separately; otherwise the device allocation can go wrong.
from chatglm3.
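
For reference, a minimal sketch of the "load them separately" suggestion: instead of letting accelerate shard both models with device_map="auto", each model can be pinned to its own GPU with an explicit device map. The two-GPU assignment and the half-precision ChatGLM load below are assumptions for illustration, not the repository's documented fix; adjust the device indices to the hardware that is actually available.

import torch
from transformers import AutoModel, AutoTokenizer, Blip2ForConditionalGeneration, Blip2Processor

# Pin BLIP-2 entirely to GPU 0 instead of letting accelerate shard it (assumed layout).
processor = Blip2Processor.from_pretrained("./blip2-opt-2.7b")
model_large = Blip2ForConditionalGeneration.from_pretrained(
    "./blip2-opt-2.7b", torch_dtype=torch.float16, device_map={"": 0}
).eval()

# Pin ChatGLM3-6B entirely to GPU 1 so the two models do not compete for the same memory.
tokenizer = AutoTokenizer.from_pretrained("./chatglm3-6b", trust_remote_code=True)
model_GLM = AutoModel.from_pretrained(
    "./chatglm3-6b", trust_remote_code=True, torch_dtype=torch.float16, device_map={"": 1}
).eval()

response, history = model_GLM.chat(tokenizer, "你好", history=[])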
Related Issues (20)
- problems when finetuning with lora HOT 3
- [Help] A question about algorithm filing (算法备案)
- The following error is raised during p-tuning-v2 fine-tuning HOT 4
- Does the concatenation format conflict with the chat interface's processing logic? HOT 2
- LoRA fine-tuning reports an error HOT 1
- Different implementations of RMSNorm
- LoRA fine-tuning reports an error HOT 1
- bug when running inference_hf.py after finetuning with lora.
- Is tool calling not supported in API mode? HOT 1
- Running composition_demo, inference only uses a single CPU and is extremely slow. What causes this? HOT 1
- Running web_demo_gradio.py under basic_demo fails with ModuleNotFoundError: No module named 'peft' HOT 3
- Is the one in langchain_demo not streaming?
- Where does the position_ids argument of ChatGLMForConditionalGeneration.forward get passed in?
- Judging from how the chat interface internally calls generate, the input_ids produced by the concatenation described above do not match your id definitions for special tokens such as <|user|> and <|assistant|>. Is this only for compatibility with the generic generate interface, and does it cost model performance? HOT 3
- When calling chatglm3-6b-32k through chatglm.cpp, infinite-loop output is triggered very easily, and setting repetition_penalty to 2 still has no effect HOT 1
- openai_api_request.py does not run successfully HOT 1
- Getting requirements to build wheel did not run successfully. HOT 1
- ChatGLM3-6B errors out when run again after fine-tuning HOT 1
- But doing that, the code reverts automatically and the added code gets wiped out > Did you manage to solve this?
- After LoRA fine-tuning with finetune_demo/finetune_hf.py, running inference with finetune_demo/inference_hf.py shows no response after the weights are loaded HOT 3