Comments (10)
So I tried to run the model you suggested, but I got an error again.
zohran@alienware-m17-r3 ~/Downloads/llama.cpp-b3400 $ ./llama-cli -m /home/zohran/Downloads/Llama-2-13B-chat-GGUF/llama-2-13b-chat.Q8_0.gguf -p "How are you?"
Log start
main: build = 0 (unknown)
main: built with cc (Gentoo Hardened 14.1.1_p20240622 p2) 14.1.1 20240622 for x86_64-pc-linux-gnu
main: seed = 1721145009
gguf_init_from_file: invalid magic characters 'vers'
llama_model_load: error loading model: llama_model_loader: failed to load model from /home/zohran/Downloads/Llama-2-13B-chat-GGUF/llama-2-13b-chat.Q8_0.gguf
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/home/zohran/Downloads/Llama-2-13B-chat-GGUF/llama-2-13b-chat.Q8_0.gguf'
main: error: unable to load model
from llama.cpp.
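A note on the error in the log above: a valid GGUF model starts with the 4-byte ASCII magic "GGUF", and "vers" is how a git-lfs pointer file begins ("version https://git-lfs..."), which is what a plain `git clone` leaves behind when git-lfs is not installed. A minimal sketch reproducing the symptom, assuming that is the cause here (the fake file is only for illustration):

```shell
# A valid GGUF model begins with the ASCII magic "GGUF"; a git-lfs pointer
# file begins with "version https://git-lfs.github.com/spec/v1", so its
# first four bytes are "vers" -- exactly the magic reported in the error.
printf 'version https://git-lfs.github.com/spec/v1\n' > pointer.gguf
head -c 4 pointer.gguf; echo    # prints: vers
```

If that is what happened, downloading the single .gguf file directly from the repo's "Files" page (or running `git lfs pull` inside the clone) should replace the pointer with the real model.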
I understand how to do it now, no worries 👍
I asked somewhere else, with people better able to explain. It's not that complicated...
from llama.cpp.
You should download a model first, mate.
If you check your models dir, you probably won't find llama-7b there.
from llama.cpp.
Sorry, I'm just starting with llama. So I cloned this model, for example:
https://huggingface.co/THUDM/glm-4-9b
But I don't see any .gguf file. I guess I have to generate it?
How am I supposed to do that?
from llama.cpp.
For this kind of model, you need to convert it to GGUF; use convert-hf-to-gguf.py to do it. You can find the details in the documentation.
However, if you just want a quick start, try a ready-quantised model like this one: https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF
from llama.cpp.
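The conversion step mentioned above can be sketched roughly as follows. Assumptions: a llama.cpp checkout with its Python requirements installed (`pip install -r requirements.txt`), `./glm-4-9b` as the cloned model directory, and a placeholder output name; recent builds spell the script with underscores, `convert_hf_to_gguf.py`. The guard only exists so the snippet degrades to a message outside a llama.cpp checkout:

```shell
# Sketch: convert a Hugging Face model directory to a GGUF file.
# Paths and output name are placeholders, not from the thread.
if [ -f convert_hf_to_gguf.py ]; then
    python convert_hf_to_gguf.py ./glm-4-9b --outfile glm-4-9b-f16.gguf --outtype f16
else
    echo "convert_hf_to_gguf.py not found: run this inside the llama.cpp source tree"
fi
```

The resulting f16 file can then be quantised further with `llama-quantize` if it is too large.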
On Hugging Face, there is demo code for llama.cpp (the "Use this model" button at the top-right corner). You could give it a try:
./llama-cli --hf-repo "TheBloke/Llama-2-13B-chat-GGUF" -m llama-2-13b-chat.Q2_K.gguf -p "How are you?" -n 128
from llama.cpp.
Okay, I understand how to download now.
So I have a question: which model do you recommend? I would like to integrate an AI assistant into the Linux distribution I am making, and I would like to teach the assistant how to manage the system with my tools. Do you think the one you gave me is good for that? Basically, I would like this assistant to be able to run commands.
I would also like the AI to stop talking so much when it answers xD. How can I allow the AI to run bash commands on my system?
from llama.cpp.
The terminal always shows extra text, and I want to avoid that (the <|im_end|> marker, for instance):
> Hello
Hi there! How can I help you today?
<|im_end|>
In this example, the
>
from llama.cpp.
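On keeping <|im_end|> out of the terminal: llama-cli's `-r` / `--reverse-prompt` option stops generation when the model produces a given string, so the raw marker is not streamed to the screen. A sketch, with the model path as a placeholder and a guard so it degrades to a message when the binary is absent:

```shell
# Sketch: -r halts generation (and, with -i, returns control to the user)
# as soon as the model emits the given string, keeping the raw <|im_end|>
# marker out of the visible output. Model path is a placeholder.
if [ -x ./llama-cli ]; then
    ./llama-cli -m ./llama-2-13b-chat.Q8_0.gguf -i -r "<|im_end|>" -p "How are you?"
else
    echo "llama-cli not found: build llama.cpp first (e.g. make llama-cli)"
fi
```

Using a chat template that matches the model's training format also tends to reduce the stray tokens in the first place.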
TBH, I don't think you're going to get a good answer to that question here. You're clearly new to this stuff and have a lot of homework to do. What you want to do is extremely complicated and probably well out of reach for your skill level. My advice is to start doing a lot of research and attempt far easier projects first. You're asking how to design a car when you don't know how to drive. Also, this is not the appropriate place for these questions. I figured this comment would be more helpful to you than silence. Good luck
from llama.cpp.
Let me know when you're done so I can check it out!
from llama.cpp.