Comments (22)
Marking this. We'll get this resolved as soon as possible.
from chatglm3.
I'm seeing the same problem.
from chatglm3.
../aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [1162,0,0], thread: [0,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
(the same assertion repeats for threads [1,0,0] through [31,0,0])
CUDA error: device-side assert triggered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
A similar problem:
from chatglm3.
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1038, in chat
outputs = self.generate(**inputs, **gen_kwargs, eos_token_id=eos_token_id)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 1522, in generate
return self.greedy_search(
File "/opt/conda/lib/python3.10/site-packages/transformers/generation/utils.py", line 2339, in greedy_search
outputs = self(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 937, in forward
transformer_outputs = self.transformer(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 830, in forward
hidden_states, presents, all_hidden_states, all_self_attentions = self.encoder(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 640, in forward
layer_ret = layer(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 542, in forward
layernorm_output = self.input_layernorm(hidden_states)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 190, in forward
hidden_states = hidden_states * torch.rsqrt(variance + self.eps)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
I'm seeing the same problem.
from chatglm3.
Same problem here.
from chatglm3.
How can this be fixed?
from chatglm3.
GLM team, can you fix this quickly???
from chatglm3.
Same here, I'm also hitting this problem.
from chatglm3.
I'm also running into this problem. How do I fix it?
from chatglm3.
Hitting the same thing, looking for a solution.
from chatglm3.
I haven't been able to reproduce this issue so far. The CUDA error here is reported asynchronously; could someone set the CUDA_LAUNCH_BLOCKING=1 environment variable and pinpoint where the error actually occurs?
from chatglm3.
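For anyone trying this: the variables must be set before the process initializes CUDA, or they have no effect. A minimal sketch (it only configures environment variables; the comment about importing torch afterwards is the important part):

```python
import os

# These must be set before torch is imported / CUDA is initialized in this
# process, otherwise the settings do nothing.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # report CUDA errors at the real call site
os.environ["TORCH_USE_CUDA_DSA"] = "1"    # device-side assertions (needs a build with DSA support)

# import torch  # import torch only after the variables are set
```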
I reproduced this problem in one scenario. One possible cause is that the input sequence length exceeds the model's maximum position embedding length, so indexing goes out of range.
A friendlier error message for this case has been added in composite_demo.
from chatglm3.
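Based on that explanation, a host-side guard can reject over-long inputs with a readable error before the CUDA kernel asserts. This is an illustrative sketch, not model code; `check_seq_length` and `max_positions` are hypothetical names, and the real limit would come from the model config:

```python
def check_seq_length(num_tokens: int, max_positions: int) -> None:
    """Raise a readable error instead of letting a device-side assert fire.

    num_tokens    -- length of the tokenized prompt plus conversation history
    max_positions -- size of the model's position-embedding table
    """
    if num_tokens > max_positions:
        raise ValueError(
            f"Input is {num_tokens} tokens but the position embedding table "
            f"only covers {max_positions}; truncate the conversation history."
        )

check_seq_length(4096, 8192)  # within range, no error
```

Calling this before `generate()` turns the opaque `index out of bounds` crash into an actionable message.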
I've hit this problem too, many times. It appears after several rounds of conversation, and GPU memory is not exhausted.
from chatglm3.
Same problem. Streaming output with model.stream_chat works fine, but generating the whole string with model.chat raises the error. Stepping into the source with a breakpoint, it looks like an array index going out of bounds.
Debugging shows:
Failing file: .cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py
Failing line: line 723, words_embeddings = self.word_embeddings(input_ids)
Error message: pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\cuda\Indexing.cu:1239: block: [28,0,0], thread: [63,0,0] Assertion srcIndex < srcSelectDimSize failed.
CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
from chatglm3.
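That `Indexing.cu` assertion (`srcIndex < srcSelectDimSize`) means some value in `input_ids` indexes past the embedding matrix. A quick host-side check can locate the offending tokens before the kernel ever runs; this is an illustrative helper, not part of the model code, and 65024 is used here as ChatGLM3's padded vocabulary size:

```python
def out_of_range_tokens(token_ids, vocab_size):
    """Return (position, token_id) pairs that would overflow an
    embedding table with `vocab_size` rows."""
    return [
        (pos, tok)
        for pos, tok in enumerate(token_ids)
        if not (0 <= tok < vocab_size)
    ]

# Example: token id 70000 overflows a 65024-row embedding table
print(out_of_range_tokens([1, 5, 70000], 65024))  # -> [(2, 70000)]
```

An empty result means the embedding lookup itself is not the culprit and the out-of-range index comes from elsewhere (e.g. position ids).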
TORCH_USE_CUDA_DSA
See my error report: #243
from chatglm3.
Marking this. We'll get this resolved as soon as possible.
See my error report: #243
from chatglm3.
Hitting the same thing, looking for a solution.
After setting the TORCH_USE_CUDA_DSA=1 environment variable, the error is as follows:
[2023-11-10 14:15:27,541] ERROR in app: Exception on /chat [POST]
Traceback (most recent call last):
File "/data/miniconda3/lib/python3.8/site-packages/flask/app.py", line 2190, in wsgi_app
response = self.full_dispatch_request()
File "/data/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1486, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/data/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1484, in full_dispatch_request
rv = self.dispatch_request()
File "/data/miniconda3/lib/python3.8/site-packages/flask/app.py", line 1469, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "/data/project/chatglm3-code/flask_stream_glm3_main-code2.py", line 149, in chatGLM
response, history = model.chat(tokenizer, query, history=history, max_length=max_len, temperature=temperature)
File "/data/miniconda3/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/useradmin/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 1104, in chat
outputs = self.generate(**inputs, **gen_kwargs, eos_token_id=eos_token_id)
File "/data/miniconda3/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/miniconda3/lib/python3.8/site-packages/transformers/generation/utils.py", line 1648, in generate
return self.sample(
File "/data/miniconda3/lib/python3.8/site-packages/transformers/generation/utils.py", line 2730, in sample
outputs = self(
File "/data/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/useradmin/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 938, in forward
transformer_outputs = self.transformer(
File "/data/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/useradmin/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 824, in forward
rotary_pos_emb = rotary_pos_emb[position_ids]
RuntimeError: CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA
to enable device-side assertions.
from chatglm3.
Still not fixed?
from chatglm3.
The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
File "/home/xx/.cache/huggingface/modules/transformers_modules/chatglm3-6b/modeling_chatglm.py", line 178, in apply_rotary_pos_emb
)
x_out2 = x_out2.flatten(3)
return torch.cat((x_out2, x_pass), dim=-1)
~~~~~~~~~ <--- HERE
RuntimeError: CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA
to enable device-side assertions.
done....
from chatglm3.
Could you share, for the failing forward pass, the shape of the input_ids passed to the model and the result of input_ids.tolist()? If the length exceeds the maximum positional embedding length, then it's an input-length problem. If not, I can try with the same input_ids.
from chatglm3.
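To collect what the maintainer asks for, one can log the shape and raw ids just before the failing generate call. With a torch tensor that is simply `input_ids.shape` and `input_ids.tolist()`; the hypothetical helper below shows the same idea on a plain nested list, and the example ids are illustrative only:

```python
def describe_input_ids(input_ids):
    """Report batch size, sequence length and raw ids for a 2-D id list."""
    rows = len(input_ids)
    cols = len(input_ids[0]) if rows else 0
    print(f"input_ids shape: ({rows}, {cols})")
    print(f"input_ids: {input_ids}")
    return rows, cols

describe_input_ids([[64790, 64792, 30910]])  # -> (1, 3)
```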
Same problem:
Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds"
failed.
from chatglm3.
This is a bad case that is difficult to fix in the current version. The issue has been moved to Discussions and recorded as a direction for a future model upgrade.
from chatglm3.