Comments (1)
After detailed testing, I found the cause: a request with a 4461-token query filled up GPU memory, and then the log printed:
This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (8192). Depending on the model, you may observe exceptions, performance degradation, or nothing at all.
Right after that, the error is raised.
from qwen.
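A minimal sketch of a guard for the situation described above: check the prompt's token count against the model's maximum length before calling generation, and shrink the number of new tokens to fit. The function name `plan_generation` and the default of 512 requested tokens are hypothetical; only the 8192 limit and the 4461-token prompt come from the comment. The token count is assumed to come from your tokenizer (for example, the length of its `input_ids`). This addresses only the length warning, not GPU memory usage itself.

```python
def plan_generation(prompt_tokens: int,
                    max_length: int = 8192,
                    requested_new_tokens: int = 512) -> int:
    """Return how many new tokens fit without exceeding max_length.

    prompt_tokens: number of tokens in the prompt (assumed to be counted
    by the caller with the model's own tokenizer).
    max_length: the model's predefined maximum length (8192 in the log).
    requested_new_tokens: hypothetical default for the desired output size.
    """
    if prompt_tokens < 0 or requested_new_tokens < 0:
        raise ValueError("token counts must be non-negative")

    remaining = max_length - prompt_tokens
    if remaining <= 0:
        # The prompt alone already fills or exceeds the context window.
        raise ValueError(
            f"prompt has {prompt_tokens} tokens, limit is {max_length}"
        )

    # Never ask for more tokens than the window has left.
    return min(requested_new_tokens, remaining)


if __name__ == "__main__":
    # The 4461-token query from the comment leaves 8192 - 4461 tokens free.
    print(plan_generation(4461, requested_new_tokens=5000))
```

The returned value could then be passed as the new-token limit of the generation call, so that prompt plus output stays within the window.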
Related Issues (20)
- [BUG] <title>Adding regular tokens is not supported HOT 1
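- How can I modify the model's architecture? HOT 1
- [BUG] <title> Garbled output with vLLM inference HOT 2
- Can Qwen's open-source models output logprobs? HOT 3
- [BUG] docker_openai_api.sh reports can't open file 'openai_api.py' HOT 1
- Why is GPU memory usage so low during inference? HOT 1
- [BUG] <title>Qwen2-7b-instruct: loss drops to 0 with SFT-FT; how can this be fixed? HOT 2
- What advantages does LLM function calling have over traditional NLP approaches? HOT 2
- [BUG] The function call example in the Bailian (百炼) documentation is incorrect HOT 1
- Why do Qwen/finetune.py and Qwen/eval/evaluate_ceval.py use different tokenizer padding_side values? HOT 1
- [BUG] Qwen 1.8B raises an error during multi-threaded inference HOT 2
- [BUG] <title> model_max_length 32768 not work HOT 4
- [BUG] <title> Does the QWenModel module in QWenLMHeadModel process text information? HOT 1
- pad_token differs between the official inference script and the model files HOT 1
- Differences between Qwen-Chat-RLHF and Qwen-Chat HOT 1
- [BUG] Garbled output after increasing the context length HOT 1
- [BUG] <title>Running a QLoRA-fine-tuned model for inference on an Nvidia Jetson Orin NX dev board fails with: QuantLinear() is not supported HOT 1
- After AWQ quantization, output does not stop properly; inference without quantization works normally HOT 1
- Is it possible to fine-tune the LLM with a local knowledge base? HOT 2
- Why is the output garbled when running qwen-7b-int4 with vLLM inference?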