Comments (22)
Marking this. We'll get this resolved as soon as possible.
from chatglm3.
I'm seeing the same problem.
from chatglm3.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1162,0,0], thread: [0,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
(the same assertion repeats for threads [1,0,0] through [31,0,0])
CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
A similar problem:
from chatglm3.
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1038, in chat
outputs = self.generate(**inputs, **gen_kwargs, eos_token_id=eos_token_id)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 1522, in generate
return self.greedy_search(
File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 2339, in greedy_search
outputs = self(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 937, in forward
transformer_outputs = self.transformer(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 830, in forward
hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 640, in forward
layer_ret = layer(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 542, in forward
layernorm_output = self.input_layernorm(hidden_states)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 190, in forward
hidden_states = hidden_states * torch.rsqrt(variance + self.eps)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
I'm seeing the same problem.
from chatglm3.
Same problem here.
from chatglm3.
How can this be fixed?
from chatglm3.
GLM team, can you fix this quickly???
from chatglm3.
Same here, I'm also hitting this problem.
from chatglm3.
I'm also running into this problem. How do I fix it?
from chatglm3.
Hitting the same thing, looking for a solution.
from chatglm3.
I haven't been able to reproduce this issue so far. The CUDA error here is reported asynchronously; could someone set the CUDA_LAUNCH_BLOCKING=1 environment variable and pinpoint where the error actually occurs?
from chatglm3.
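For anyone trying this: the variables must be set before the process initializes CUDA, or they have no effect. A minimal sketch (it only configures environment variables; the comment about importing torch afterwards is the important part):

```python
import os

# These must be set before torch is imported / CUDA is initialized in this
# process, otherwise the settings do nothing.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # report CUDA errors at the real call site
os.environ["TORCH_USE_CUDA_DSA"] = "1"    # device-side assertions (needs a build with DSA support)

# import torch  # import torch only after the variables are set
```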
I reproduced this problem in one scenario. One possible cause is that the input sequence length exceeds the model's maximum position embedding length, so indexing goes out of range.
A friendlier error message for this case has been added in composite_demo.
from chatglm3.
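Based on that explanation, a host-side guard can reject over-long inputs with a readable error before the CUDA kernel asserts. This is an illustrative sketch, not model code; `check_seq_length` and `max_positions` are hypothetical names, and the real limit would come from the model config:

```python
def check_seq_length(num_tokens: int, max_positions: int) -> None:
    """Raise a readable error instead of letting a device-side assert fire.

    num_tokens    -- length of the tokenized prompt plus conversation history
    max_positions -- size of the model's position-embedding table
    """
    if num_tokens > max_positions:
        raise ValueError(
            f"Input is {num_tokens} tokens but the position embedding table "
            f"only covers {max_positions}; truncate the conversation history."
        )

check_seq_length(4096, 8192)  # within range, no error
```

Calling this before `generate()` turns the opaque `index out of bounds` crash into an actionable message.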
I've hit this problem too, many times. It appears after several rounds of conversation, and GPU memory is not exhausted.
from chatglm3.
Same problem. Streaming output with model.stream_chat works fine, but generating the whole string with model.chat raises the error. Stepping into the source with a breakpoint, it looks like an array index going out of bounds.
Debugging shows:
Failing file: .cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py
Failing line: line 723, words_embeddings = self.word_embeddings(input_ids)
Error message: pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Indexing.cu:1239: block: [28,0,0], thread: [63,0,0] Assertion srcIndex < srcSelectDimSize failed.
CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
from chatglm3.
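That `Indexing.cu` assertion (`srcIndex < srcSelectDimSize`) means some value in `input_ids` indexes past the embedding matrix. A quick host-side check can locate the offending tokens before the kernel ever runs; this is an illustrative helper, not part of the model code, and 65024 is used here as ChatGLM3's padded vocabulary size:

```python
def out_of_range_tokens(token_ids, vocab_size):
    """Return (position, token_id) pairs that would overflow an
    embedding table with `vocab_size` rows."""
    return [
        (pos, tok)
        for pos, tok in enumerate(token_ids)
        if not (0 <= tok < vocab_size)
    ]

# Example: token id 70000 overflows a 65024-row embedding table
print(out_of_range_tokens([1, 5, 70000], 65024))  # -> [(2, 70000)]
```

An empty result means the embedding lookup itself is not the culprit and the out-of-range index comes from elsewhere (e.g. position ids).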
TORCH_USE_CUDA_DSA
See my error report: #243
from chatglm3.
Marking this. We'll get this resolved as soon as possible.
See my error report: #243
from chatglm3.
Hitting the same thing, looking for a solution.
After setting the TORCH_USE_CUDA_DSA=1 environment variable, the error is as follows:
[2023-11-10 14:15:27,541] ERROR in app: Exception on /chat [POST]
Traceback (most recent call last):
File "/data/miniconda3/lib/python3.8/site-packages/flask/app.py", line 2190, in wsgi_app
response = self.full_dispatch_request()
File "/data/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1486, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/data/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1484, in full_dispatch_request
rv = self.dispatch_request()
File "/data/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1469, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "/data/project/chatglm3-code/flask_stream_glm3_main-code2.py", line 149, in chatGLM
response, history = model.chat(tokenizer, query, history=history, max_length=max_len, temperature=temperature)
File "/data/miniconda3/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/useradmin/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1104, in chat
outputs = self.generate(**inputs, **gen_kwargs, eos_token_id=eos_token_id)
File "/data/miniconda3/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/miniconda3/lib/python3.8/site-packages/transformers/generation/utils.py", line 1648, in generate
return self.sample(
File "/data/miniconda3/lib/python3.8/site-packages/transformers/generation/utils.py", line 2730, in sample
outputs = self(
File "/data/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/useradmin/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 938, in forward
transformer_outputs = self.transformer(
File "/data/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/useradmin/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 824, in forward
rotary_pos_emb = rotary_pos_emb[position_ids]
RuntimeError: CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA
to enable device-side assertions.
from chatglm3.
Still not fixed?
from chatglm3.
The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
File "/home/xx/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 178, in apply_rotary_pos_emb
)
x_out2 = x_out2.flatten(3)
return torch.cat((x_out2, x_pass), dim=-1)
~~~~~~~~~ <--- HERE
RuntimeError: CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA
to enable device-side assertions.
done....
from chatglm3.
Could you share, for the failing forward pass, the shape of the input_ids passed to the model and the result of input_ids.tolist()? If the length exceeds the maximum positional embedding length, then it's an input-length problem. If not, I can try with the same input_ids.
from chatglm3.
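To collect what the maintainer asks for, one can log the shape and raw ids just before the failing generate call. With a torch tensor that is simply `input_ids.shape` and `input_ids.tolist()`; the hypothetical helper below shows the same idea on a plain nested list, and the example ids are illustrative only:

```python
def describe_input_ids(input_ids):
    """Report batch size, sequence length and raw ids for a 2-D id list."""
    rows = len(input_ids)
    cols = len(input_ids[0]) if rows else 0
    print(f"input_ids shape: ({rows}, {cols})")
    print(f"input_ids: {input_ids}")
    return rows, cols

describe_input_ids([[64790, 64792, 30910]])  # -> (1, 3)
```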
Same problem:
Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
from chatglm3.
This is a bad case that is difficult to fix in the current version. The issue has been moved to Discussions and recorded as a direction for a future model upgrade.
from chatglm3.