Comments (3)
For multi-GPU inference, which code are you actually running? From what I can see, this error looks like a driver/CUDA-level problem rather than a bug in the code. Please post the exact location of the official code you are running and the complete error output.
from chatglm3.
I found the cause of the bug: the error is raised when I load both the BLIP-2 and GLM-6B models in the same .py file; loading either model on its own works fine. The relevant code is as follows:
import torch
from transformers import AutoModel, AutoTokenizer, Blip2ForConditionalGeneration, Blip2Processor

# Load BLIP-2 from a local checkpoint, letting accelerate place it across the available GPUs.
local_path = "./blip2-opt-2.7b"
processor = Blip2Processor.from_pretrained(local_path)
model_large = Blip2ForConditionalGeneration.from_pretrained(
    local_path, torch_dtype=torch.float16, device_map="auto"
)
model_large.eval()

# Load ChatGLM3-6B in the same process, also with device_map="auto".
tokenizer = AutoTokenizer.from_pretrained("./chatglm3-6b", trust_remote_code=True)
model_GLM = AutoModel.from_pretrained("./chatglm3-6b", trust_remote_code=True, device_map="auto")
model_GLM = model_GLM.eval()
generated_text_GLM, history = model_GLM.chat(tokenizer, "你好", history=[])
The complete traceback is as follows:
Traceback (most recent call last):
File "/home2/an/project/DataShunt+/image_caption/a-PyTorch-Tutorial-to-Image-Captioning-master-2/eval_DS_PT.py", line 73, in <module>
generated_text_GLM, history = model_GLM.chat(tokenizer, "你好", history=[])
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1042, in chat
outputs = self.generate(**inputs, **gen_kwargs, eos_token_id=eos_token_id)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/transformers/generation/utils.py", line 1452, in generate
return self.sample(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/transformers/generation/utils.py", line 2468, in sample
outputs = self(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 941, in forward
transformer_outputs = self.transformer(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 834, in forward
hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 641, in forward
layer_ret = layer(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 544, in forward
attention_output, kv_cache = self.self_attention(
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home2/an/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 376, in forward
mixed_x_layer = self.query_key_value(hidden_states)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home2/an/anaconda3/envs/GLM1/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 103, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)
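
For context, CUBLAS_STATUS_NOT_INITIALIZED from cublasCreate is often a symptom of the GPU running out of memory when the cuBLAS handle is created (or of a CUDA/driver mismatch), rather than a bug in the calling code. A minimal diagnostic sketch, not part of the original report and assuming both models were loaded with device_map="auto" as above, is to print the device map accelerate chose for each model and the per-GPU memory usage, to see whether the two models ended up packed onto the same device:

# Hypothetical diagnostic: inspect how accelerate placed each model and how
# much memory each GPU holds after both models have been loaded.
print(model_large.hf_device_map)
print(model_GLM.hf_device_map)
for i in range(torch.cuda.device_count()):
    print(f"cuda:{i} allocated: {torch.cuda.memory_allocated(i) / 1e9:.2f} GB")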
from chatglm3.
Right, they should normally be loaded separately; otherwise the device allocation can go wrong.
from chatglm3.
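
For reference, a minimal sketch of the "load them separately" suggestion: instead of letting accelerate shard both models with device_map="auto", each model can be pinned to its own GPU with an explicit device map. The two-GPU assignment and the half-precision ChatGLM load below are assumptions for illustration, not the repository's documented fix; adjust the device indices to the hardware that is actually available.

import torch
from transformers import AutoModel, AutoTokenizer, Blip2ForConditionalGeneration, Blip2Processor

# Pin BLIP-2 entirely to GPU 0 instead of letting accelerate shard it (assumed layout).
processor = Blip2Processor.from_pretrained("./blip2-opt-2.7b")
model_large = Blip2ForConditionalGeneration.from_pretrained(
    "./blip2-opt-2.7b", torch_dtype=torch.float16, device_map={"": 0}
).eval()

# Pin ChatGLM3-6B entirely to GPU 1 so the two models do not compete for the same memory.
tokenizer = AutoTokenizer.from_pretrained("./chatglm3-6b", trust_remote_code=True)
model_GLM = AutoModel.from_pretrained(
    "./chatglm3-6b", trust_remote_code=True, torch_dtype=torch.float16, device_map={"": 1}
).eval()

response, history = model_GLM.chat(tokenizer, "你好", history=[])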
Related Issues (20)
- problems when finetuning with lora HOT 3
- [Help] A question about algorithm filing (算法备案)
- The following error is raised during p-tuning-v2 fine-tuning HOT 4
- Does the concatenation format conflict with the chat interface's processing logic? HOT 2
- LoRA fine-tuning reports an error HOT 1
- Different implementations of RMSNorm
- LoRA fine-tuning reports an error HOT 1
- bug when running inference_hf.py after finetuning with lora.
- Is tool calling not supported in API mode? HOT 1
- Running composition_demo, inference only uses a single CPU and is extremely slow. What causes this? HOT 1
- Running web_demo_gradio.py under basic_demo fails with ModuleNotFoundError: No module named 'peft' HOT 3
- Is the one in langchain_demo not streaming?
- Where does the position_ids argument of ChatGLMForConditionalGeneration.forward get passed in?
- Judging from how the chat interface internally calls generate, the input_ids produced by the concatenation described above do not match your id definitions for special tokens such as <|user|> and <|assistant|>. Is this only for compatibility with the generic generate interface, and does it cost model performance? HOT 3
- When calling chatglm3-6b-32k through chatglm.cpp, infinite-loop output is triggered very easily, and setting repetition_penalty to 2 still has no effect HOT 1
- openai_api_request.py does not run successfully HOT 1
- Getting requirements to build wheel did not run successfully. HOT 1
- ChatGLM3-6B errors out when run again after fine-tuning HOT 1
- But doing that, the code reverts automatically and the added code gets wiped out > Did you manage to solve this?
- After LoRA fine-tuning with finetune_demo/finetune_hf.py, running inference with finetune_demo/inference_hf.py shows no response after the weights are loaded HOT 3