
Comments (5)

dbddv01 commented on May 18, 2024

I have no tools to measure it, so I did it manually. I prompted 'please produce 500 tokens story of starwars', then copy-pasted the generated text into https://platform.openai.com/tokenizer to count the tokens, and timed each run with my stopwatch.
I got the following results:
673 tokens in 50 sec: 13.46 tok/s
318 tokens in 25 sec: 12.72 tok/s
256 tokens in 23 sec: 11.13 tok/s
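
The figures above are simply the token count divided by the elapsed time. A minimal Python sketch of that arithmetic, for anyone repeating the manual measurement (the function name `tokens_per_second` is illustrative and not part of mlc_chat; note the counts come from the OpenAI tokenizer, which may not match the model's own tokenizer):

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Return throughput in tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


# Manual measurements from the three runs above: (tokens, seconds).
runs = [(673, 50), (318, 25), (256, 23)]

# Round to two decimals, matching the precision of the reported figures.
rates = [round(tokens_per_second(tokens, seconds), 2) for tokens, seconds in runs]

for (tokens, seconds), rate in zip(runs, rates):
    print(f"{tokens} tokens in {seconds} sec: {rate:.2f} tok/s")
```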

By the way, I noticed this help output:
Usage: mlc_chat [--help] [--version] [--device-name VAR] [--artifact-path VAR] [--model VAR] [--dtype VAR] [--params VAR] [--evaluate]

Optional arguments:
-h, --help shows help message and exits
-v, --version prints version information and exits
--device-name [default: "auto"]
--artifact-path [default: "dist"]
--model [default: "vicuna-v1-7b"]
--dtype [default: "auto"]
--params [default: "auto"]
--evaluate

Is it only the documentation that's missing, i.e. can we already play with these arguments?

from mlc-llm.

junrushao commented on May 18, 2024

Hey thanks for the data! This is super valuable to us!

We updated mlc_chat_cli this morning to include a /stats command. Would you mind updating your conda environment to pick up this change?

To include this update, you will have to remove the package and install again (conda update doesn't work for some reason):

conda remove mlc-chat-nightly
conda install -c conda-forge -c mlc-ai mlc-chat-nightly

Then the help message will show up when the program starts, and you can use /stats to get some details:

image

Thanks a bunch!


dbddv01 commented on May 18, 2024

OK, thanks! Here are the numbers with /stats:

USER: /stats
encode: 35.4 tok/s, decode: 16.7 tok/s
USER: continue
ASSISTANT: In this epic (... removed ...)
USER: /stats
encode: 11.1 tok/s, decode: 14.0 tok/s
USER: continue
ASSISTANT: Sure, here's the continuation:
(... removed ...)
USER: /stats
encode: 37.7 tok/s, decode: 17.0 tok/s
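
To summarize several /stats readings like the ones above, a small script can pull the numbers out of a pasted session and average them. This is only a sketch: it assumes the output keeps the exact `encode: X tok/s, decode: Y tok/s` shape shown here, and `parse_stats` is a hypothetical helper, not something shipped with mlc_chat_cli.

```python
import re
from statistics import mean

# Assumed line shape, taken from the session above:
#   encode: 35.4 tok/s, decode: 16.7 tok/s
STATS_PATTERN = re.compile(
    r"encode:\s*([\d.]+)\s*tok/s,\s*decode:\s*([\d.]+)\s*tok/s"
)


def parse_stats(log_text: str) -> list[tuple[float, float]]:
    """Return (encode, decode) tok/s pairs found in a pasted chat log."""
    return [(float(enc), float(dec)) for enc, dec in STATS_PATTERN.findall(log_text)]


log = """
USER: /stats
encode: 35.4 tok/s, decode: 16.7 tok/s
USER: /stats
encode: 11.1 tok/s, decode: 14.0 tok/s
USER: /stats
encode: 37.7 tok/s, decode: 17.0 tok/s
"""

samples = parse_stats(log)
avg_encode = mean(enc for enc, _ in samples)
avg_decode = mean(dec for _, dec in samples)
print(f"{len(samples)} samples, mean encode {avg_encode:.1f} tok/s, "
      f"mean decode {avg_decode:.1f} tok/s")
```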


junrushao commented on May 18, 2024

Thank you for sharing the information! We are currently gathering data points on runnable devices and their speed. Would you be willing to assist us in this effort by sharing the tokens/sec data on your GTX 1060?


junrushao commented on May 18, 2024

Thanks a lot for your swift response! The data is super valuable to us!

