Comments (6)
Run it with -v and check whether it was compiled correctly.
from chatglm.cpp.
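I see the same thing in my tests: the gap between Windows and Linux is very noticeable. The -v output shows that AVX, AVX2, etc. cannot be used on Windows, while on Linux they work normally.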
@JianbangZ @zhangtao103239
Thanks for the replies!
Full build log:
build_log.txt
Testing on a server was indeed much faster: about 1 minute to the first character, then roughly 0.3 s per character on average.
CPU: Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz
The AVX/AVX2 instruction sets are now supported on Windows. Pull the latest code and give it a try.
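"The AVX/AVX2 instruction sets are now supported on Windows. Pull the latest code and give it a try."
Thank you very much! On my Windows machine the measured speed improved noticeably: the first token takes about 10-30 s, and after that it generates 5-10 tokens per second. Given my hardware, though, this speed is only barely usable. Thanks again to the author for the contribution!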