Comments (5)
I have no tools to measure it, so I handled it manually. I prompted 'please produce a 500-token Star Wars story', then copy-pasted the generated text into https://platform.openai.com/tokenizer to count the tokens, timing the generation with a stopwatch.
I got the following results:
673 tokens in 50 sec: 13.46 t/sec
318 tokens in 25 sec: 12.72 t/sec
256 tokens in 23 sec: 11.13 t/sec
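For anyone repeating the measurement, the arithmetic above is just tokens divided by elapsed seconds; a minimal Python sketch using the (tokens, seconds) pairs quoted above:

```python
# Recompute tokens/sec from the manual stopwatch measurements above.
measurements = [(673, 50), (318, 25), (256, 23)]  # (tokens, seconds)

for tokens, seconds in measurements:
    rate = tokens / seconds
    print(f"{tokens} tokens in {seconds} sec: {rate:.2f} t/sec")
```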
By the way, I noticed:
Usage: mlc_chat [--help] [--version] [--device-name VAR] [--artifact-path VAR] [--model VAR] [--dtype VAR] [--params VAR] [--evaluate]
Optional arguments:
-h, --help shows help message and exits
-v, --version prints version information and exits
--device-name [default: "auto"]
--artifact-path [default: "dist"]
--model [default: "vicuna-v1-7b"]
--dtype [default: "auto"]
--params [default: "auto"]
--evaluate
Is it just a matter of documentation, or can we already play with these arguments?
from mlc-llm.
Hey, thanks for the data! This is super valuable to us!
We updated mlc_chat_cli this morning to include a /stats command. Would you mind updating the conda environment to pick up this change?
To include this update, you will have to remove the package and install it again (conda update doesn't work for some reason):
conda remove mlc-chat-nightly
conda install -c conda-forge -c mlc-ai mlc-chat-nightly
Then the help message will show up when the program initializes, and you may use /stats to get some details:
Thanks a bunch!
OK, thanks! Now with /stats:
USER: /stats
encode: 35.4 tok/s, decode: 16.7 tok/s
USER: continue
ASSISTANT: In this epic (... removed ...)
USER: /stats
encode: 11.1 tok/s, decode: 14.0 tok/s
USER: continue
ASSISTANT: Sure, here's the continuation:
(... removed ...)
USER: /stats
encode: 37.7 tok/s, decode: 17.0 tok/s
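A quick way to summarize several /stats readouts like the ones above, sketched in Python (the regex assumes the exact "encode: X tok/s, decode: Y tok/s" format shown in the log):

```python
import re

# Average the encode/decode tok/s figures printed by /stats across turns.
stats_lines = [
    "encode: 35.4 tok/s, decode: 16.7 tok/s",
    "encode: 11.1 tok/s, decode: 14.0 tok/s",
    "encode: 37.7 tok/s, decode: 17.0 tok/s",
]

pattern = re.compile(r"encode: ([\d.]+) tok/s, decode: ([\d.]+) tok/s")
pairs = [tuple(map(float, pattern.match(line).groups())) for line in stats_lines]
encode = [e for e, _ in pairs]
decode = [d for _, d in pairs]

print(f"mean encode: {sum(encode) / len(encode):.1f} tok/s")
print(f"mean decode: {sum(decode) / len(decode):.1f} tok/s")
```

The encode rate varies a lot between turns (prompt length dominates it), while the decode rate stays in a narrow band, so the decode average is the more meaningful single number for this hardware.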
Thank you for sharing the information! We are currently gathering data points on runnable devices and their speed. Would you be willing to assist us in this effort by sharing the tokens/sec data from your GTX 1060?
Thanks a lot for your swift response! The data is super valuable to us!