Comments (6)
@ollmer Thank you for pointing that out! The reason is that the official stablelm2
model has been updated and some of its parameters have changed, so what we have at huggingface.co/mlc-ai
may be obsolete. We will upload an updated version soon.
from mlc-llm.
@saurav-pwh-old Hi, as a temporary workaround I think you can add --model-type stablelm to your command to override it.
@tlopex can you look a bit into this model?
Hi, I have the same issue. When I tried to use --model-type stablelm, I got a new error:
TypeError: StableLmConfig.__init__() missing 2 required positional arguments: 'layer_norm_eps' and 'partial_rotary_factor'
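For anyone puzzled by this message: one plausible reading is that the config being loaded does not contain keys that the config class treats as required, so the constructor is called without them. The sketch below reproduces that kind of failure with a toy dataclass. `ToyStableLmConfig`, `missing_fields`, and the values in `old_config` are hypothetical and only for illustration; this is not the actual mlc-llm code.

```python
from dataclasses import dataclass, fields


@dataclass
class ToyStableLmConfig:
    """Hypothetical stand-in for a model config; not the real mlc-llm class."""

    hidden_size: int
    num_hidden_layers: int
    layer_norm_eps: float
    partial_rotary_factor: float


def missing_fields(config_dict: dict) -> list:
    """Return the required field names that are absent from config_dict."""
    return [f.name for f in fields(ToyStableLmConfig) if f.name not in config_dict]


# A config dict that lacks the two keys named in the error (values are made up).
old_config = {"hidden_size": 2048, "num_hidden_layers": 24}

# Checking up front gives a clearer picture than the raw TypeError.
print("missing keys:", missing_fields(old_config))

try:
    ToyStableLmConfig(**old_config)
except TypeError as err:
    # Python reports the absent required arguments by name.
    print("constructor failed:", err)
```

If this is what is happening, comparing the keys in the model's config file against what the selected model type expects should show which ones are absent.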
Hello, everyone! Sorry for the long wait. Thanks to @MasterJH5574's help, I have uploaded the stablelm2_1.6b models below:
I tested it here:
(mlc-prebuilt) tlopex@tlopex-OMEN-by-HP-Laptop-17-ck1xxx:~/mlc-llm$ python -m mlc_llm chat HF://mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC --device "cuda:0" --overrides context_window_size=4096 --opt "O2"
[2024-05-24 18:40:07] INFO config.py:106: Overriding context_window_size from None to 4096
[2024-05-24 18:40:09] INFO auto_device.py:79: Found device: cuda:0
[2024-05-24 18:40:09] INFO chat_module.py:362: Downloading model from HuggingFace: HF://mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC
[2024-05-24 18:40:09] INFO download.py:42: [Git] Cloning https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC.git to /tmp/tmpq4lci24o/tmp
[2024-05-24 18:40:12] INFO download.py:78: [Git LFS] Downloading 0 files with Git LFS: []
0it [00:00, ?it/s]
[2024-05-24 18:40:18] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_3.bin to /tmp/tmpq4lci24o/tmp/params_shard_3.bin
[2024-05-24 18:40:21] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_2.bin to /tmp/tmpq4lci24o/tmp/params_shard_2.bin
[2024-05-24 18:40:26] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_4.bin to /tmp/tmpq4lci24o/tmp/params_shard_4.bin
[2024-05-24 18:40:32] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_0.bin to /tmp/tmpq4lci24o/tmp/params_shard_0.bin
[2024-05-24 18:40:33] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_6.bin to /tmp/tmpq4lci24o/tmp/params_shard_6.bin
[2024-05-24 18:40:36] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_1.bin to /tmp/tmpq4lci24o/tmp/params_shard_1.bin
[2024-05-24 18:40:36] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_5.bin to /tmp/tmpq4lci24o/tmp/params_shard_5.bin
[2024-05-24 18:40:36] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_7.bin to /tmp/tmpq4lci24o/tmp/params_shard_7.bin
[2024-05-24 18:40:38] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_8.bin to /tmp/tmpq4lci24o/tmp/params_shard_8.bin
[2024-05-24 18:40:40] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_9.bin to /tmp/tmpq4lci24o/tmp/params_shard_9.bin
[2024-05-24 18:40:41] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_10.bin to /tmp/tmpq4lci24o/tmp/params_shard_10.bin
[2024-05-24 18:40:46] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_12.bin to /tmp/tmpq4lci24o/tmp/params_shard_12.bin
[2024-05-24 18:40:46] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_13.bin to /tmp/tmpq4lci24o/tmp/params_shard_13.bin
[2024-05-24 18:40:48] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_11.bin to /tmp/tmpq4lci24o/tmp/params_shard_11.bin
[2024-05-24 18:40:51] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_15.bin to /tmp/tmpq4lci24o/tmp/params_shard_15.bin
[2024-05-24 18:40:51] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_14.bin to /tmp/tmpq4lci24o/tmp/params_shard_14.bin
[2024-05-24 18:40:52] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_16.bin to /tmp/tmpq4lci24o/tmp/params_shard_16.bin
[2024-05-24 18:40:52] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_17.bin to /tmp/tmpq4lci24o/tmp/params_shard_17.bin
[2024-05-24 18:40:56] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_18.bin to /tmp/tmpq4lci24o/tmp/params_shard_18.bin
[2024-05-24 18:40:57] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_20.bin to /tmp/tmpq4lci24o/tmp/params_shard_20.bin
[2024-05-24 18:40:58] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_21.bin to /tmp/tmpq4lci24o/tmp/params_shard_21.bin
[2024-05-24 18:41:00] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_22.bin to /tmp/tmpq4lci24o/tmp/params_shard_22.bin
[2024-05-24 18:41:01] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_23.bin to /tmp/tmpq4lci24o/tmp/params_shard_23.bin
[2024-05-24 18:41:03] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_24.bin to /tmp/tmpq4lci24o/tmp/params_shard_24.bin
[2024-05-24 18:41:05] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_26.bin to /tmp/tmpq4lci24o/tmp/params_shard_26.bin
[2024-05-24 18:41:05] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_25.bin to /tmp/tmpq4lci24o/tmp/params_shard_25.bin
[2024-05-24 18:41:08] INFO download.py:154: Downloaded https://huggingface.co/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC/resolve/main/params_shard_19.bin to /tmp/tmpq4lci24o/tmp/params_shard_19.bin
100%|██████████| 27/27 [00:55<00:00, 2.05s/it]
[2024-05-24 18:41:08] INFO download.py:155: Moving /tmp/tmpq4lci24o/tmp to /home/tlopex/.cache/mlc_llm/model_weights/mlc-ai/stablelm-2-zephyr-1_6b-q4f16_1-MLC
[2024-05-24 18:41:08] INFO chat_module.py:781: Now compiling model lib on device...
[2024-05-24 18:41:08] INFO jit.py:43: MLC_JIT_POLICY = ON. Can be one of: ON, OFF, REDO, READONLY
[2024-05-24 18:41:08] INFO jit.py:160: Using cached model lib: /home/tlopex/.cache/mlc_llm/model_lib/489dc4831dc725c82bd025a54da84013.so
[2024-05-24 18:41:09] INFO model_metadata.py:96: Total memory usage: 1756.66 MB (Parameters: 882.66 MB. KVCache: 0.00 MB. Temporary buffer: 874.00 MB)
[2024-05-24 18:41:09] INFO model_metadata.py:105: To reduce memory usage, tweak `prefill_chunk_size`, `context_window_size` and `sliding_window_size`
You can use the following special commands:
/help print the special commands
/exit quit the cli
/stats print out the latest stats (token/sec)
/reset restart a fresh chat
/set [overrides] override settings in the generation config. For example,
`/set temperature=0.5;max_gen_len=100;stop=end,stop`
Note: Separate stop words in the `stop` option with commas (,).
Multi-line input: Use escape+enter to start a new line.
<|user|>: Hello!
<|assistant|>:
Hello! How can I assist you today?
So I believe you all can use it as well. Enjoy trying it!
Thanks @tlopex !