
Comments (10)

wangye01inf commented on May 30, 2024

@Copilot-X What does the code you are running look like? In theory, loading the model in bf16/fp16 should need only about 12 GB of GPU memory.
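
The 12 GB figure follows from simple arithmetic: bf16 and fp16 store each parameter in 2 bytes. A minimal sketch of that estimate, assuming a nominal 6e9 parameters for the 6B model (the exact count is not given in this thread) and counting weights only, so activations, the KV cache and framework overhead come on top:

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumption: a nominal parameter count for a "6B" model.
N_PARAMS_6B = 6e9

# Bytes per parameter for common storage formats.
FORMATS = [("fp32", 4), ("bf16/fp16", 2), ("int8", 1), ("4-bit", 0.5)]

for name, nbytes in FORMATS:
    print(f"{name:>9}: {weight_memory_gb(N_PARAMS_6B, nbytes):.1f} GB")
```

The gap between this estimate and the roughly 13 GB reported in the replies below is plausibly the runtime overhead that the weights-only figure leaves out.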

from yi.

Liangdi commented on May 30, 2024

I ran the demo. After loading the model, GPU memory usage was about 13 GB, and inference added roughly another 500 MB.


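Copilot-X commented on May 30, 2024

> I ran the demo. After loading the model, GPU memory usage was about 13 GB, and inference added roughly another 500 MB.

Do you have the loading and inference code? I'd like to compare it with mine.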


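Liangdi commented on May 30, 2024

>> I ran the demo. After loading the model, GPU memory usage was about 13 GB, and inference added roughly another 500 MB.

> Do you have the loading and inference code? I'd like to compare it with mine.

It's just the one in the repository: https://github.com/01-ai/Yi/blob/main/demo/text_generation.py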


learninmou commented on May 30, 2024

The model currently uses the bfloat16 data type, so the 6B model needs at least about 13 GB of GPU memory.


DumoeDss commented on May 30, 2024

How much GPU memory do the 200K-context 6B and 34B models each need?
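
This question is not answered in the thread. As a rough way to reason about it, long contexts mainly add KV-cache memory on top of the weights. The sketch below applies the usual KV-cache size formula (keys and values, per layer, per KV head, per token). The layer count, number of key/value heads and head dimension used here are illustrative placeholders, not confirmed values for Yi; the real ones should be read from each model's `config.json`.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    seq_len: int,
    bytes_per_value: int = 2,  # 2 bytes for bf16/fp16
    batch_size: int = 1,
) -> int:
    """Size of the key/value cache for a decoder-only transformer, in bytes."""
    # Factor 2: one tensor for keys and one for values in every layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value * batch_size


# Hypothetical architecture values, for illustration only.
example = kv_cache_bytes(n_layers=32, n_kv_heads=4, head_dim=128, seq_len=200_000)
print(f"KV cache at 200K tokens: {example / 1e9:.1f} GB")
```

The total requirement is then roughly weights plus KV cache plus activation and framework overhead, so a full 200K-token context can cost about as much memory again as the bf16 weights themselves under these assumed values.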


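mwmif commented on May 30, 2024

image


In Yi\demo\text_generation.py, after adding two parameters (this requires installing some extra dependencies; without them you get an error), it runs even with 4 GB of GPU memory, but it is extremely slow.
You still have to rely on an optimized runtime such as llama.cpp; otherwise devices with little GPU memory are basically unusable.

image

ChatGLM3 6B likewise only became fast after being quantized to 4-bit with chatglm.cpp; with the official quantization scheme, a reply also took more than ten minutes.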
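
The screenshots are not preserved, so the two parameters this comment added are unknown. One common way to fit a model of this size into a few GB of GPU memory with Hugging Face Transformers is bitsandbytes quantization plus automatic device placement; the sketch below shows that approach as an assumption, not as the commenter's actual change. It needs the `transformers`, `accelerate` and `bitsandbytes` packages, and the model path in the usage comment is a placeholder.

```python
def bnb_options(bits: int) -> dict:
    """Keyword arguments for transformers.BitsAndBytesConfig for 4- or 8-bit loading."""
    if bits == 4:
        return {"load_in_4bit": True}
    if bits == 8:
        return {"load_in_8bit": True}
    raise ValueError("bits must be 4 or 8")


def load_quantized(model_path: str, bits: int = 4):
    """Load a causal LM with quantized weights; returns (tokenizer, model)."""
    # Heavy imports stay local so bnb_options() is usable without them installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        device_map="auto",  # let accelerate place layers on GPU, then CPU if needed
        quantization_config=BitsAndBytesConfig(**bnb_options(bits)),
    )
    return tokenizer, model


# Usage (placeholder path, downloads or reads the checkpoint):
# tokenizer, model = load_quantized("path/or/hub-id/of/the/6B-chat-model", bits=4)
```

If `device_map="auto"` ends up placing layers on the CPU because the GPU is too small, generation can become very slow, which would be consistent with the speeds reported in this thread.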


cutoutsy commented on May 30, 2024

What is the inference speed, in tokens/s?
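
No throughput numbers are given in the thread. For anyone who wants to measure it on their own hardware, a minimal sketch: time one generation call and divide the number of newly produced tokens by the elapsed time. The `generate` callable and `fake_generate` stand-in are hypothetical; in practice `generate` would wrap the model's real generation call, which for decoder-only models in Hugging Face Transformers returns the prompt tokens followed by the new ones.

```python
import time
from typing import Callable, List, Sequence, Tuple


def measure_tokens_per_second(
    generate: Callable[[Sequence[int]], Sequence[int]],
    prompt_ids: Sequence[int],
) -> Tuple[int, float]:
    """Run one generation and return (new_token_count, tokens_per_second)."""
    start = time.perf_counter()
    output_ids = generate(prompt_ids)
    elapsed = time.perf_counter() - start
    # The output is assumed to contain the prompt followed by the new tokens.
    new_tokens = len(output_ids) - len(prompt_ids)
    return new_tokens, new_tokens / elapsed


def fake_generate(prompt_ids: Sequence[int]) -> List[int]:
    """Stand-in for a real model: waits briefly, then appends 20 dummy tokens."""
    time.sleep(0.05)
    return list(prompt_ids) + [0] * 20


count, speed = measure_tokens_per_second(fake_generate, [1, 2, 3])
print(f"{count} new tokens at {speed:.1f} tokens/s")
```

A single call also includes prompt processing time, so averaging several runs with a fixed number of new tokens gives a steadier figure.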


ZhaoFancy commented on May 30, 2024

The release of the Chat version specifically added this content.


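garbe-github-support commented on May 30, 2024

Following the code in the README, I used the 6B chat model (11 GB) with 8 GB of GPU memory on a 3070 Ti.
It runs, but very, very slowly: it took more than 10 minutes.
On the same machine, though, chatglm3-6b, which is also an 11 GB model, runs quickly: it starts producing output within a few seconds and finishes in a minute or two.
Could it be because this one outputs everything at once?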

