Comments (4)
@luyifanlu Hi,
What exactly did you test, and on which dataset did you run the test?
from glm-130b.
@luyifanlu Hi,
What exactly did you test, and on which dataset did you run the test?
Just a single "2+3 = ", and the output was a long pile of written-out arithmetic.
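If the input contains no MASK token, the model by default appends [gMASK] to the end of the input and performs long-text generation. In that mode GLM-130B behaves like a GPT-style generative model, so output consisting of written-out arithmetic is normal; you can compare it with what the BLOOM 176B model generates. For this scenario we recommend using GLM's short-text blank-filling ability instead:
Input: the answer to 2+3 is [MASK].
Output: the answer to 2+3 is 5 .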
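The default behavior described above can be sketched roughly as follows. This is only an illustration and not the GLM-130B repository's actual code: the helper name `prepare_prompt`, the `MASK_TOKENS` tuple, and the single-space separator are all assumptions, and only the two mask tokens mentioned in this thread are checked.

```python
# Hypothetical sketch of the "append [gMASK] when no mask is present" rule.
# Names and the space separator are assumptions made for illustration.
MASK_TOKENS = ("[MASK]", "[gMASK]")  # only the tokens named in this thread


def prepare_prompt(text: str, default_mask: str = "[gMASK]") -> str:
    """Return the prompt unchanged if it already has a mask token,
    otherwise append the generation mask to the end."""
    if any(token in text for token in MASK_TOKENS):
        return text
    return f"{text} {default_mask}"


# No mask given: the prompt falls back to open-ended generation.
print(prepare_prompt("2+3 ="))
# Explicit [MASK]: the prompt is left as a short blank-filling query.
print(prepare_prompt("the answer to 2+3 is [MASK]."))
```

Under these assumptions, the first call turns the bare prompt into a long-generation request, which is why "2+3 = " produces a long continuation, while the second call keeps the blank-filling form.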
Thank you for your feedback!
For reference, this is the result returned by the OpenAI GPT-3 API:
GPT-3 (Davinci): 2+3 = 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .