Comments (7)
Same problem here. I was really hoping this would work with my fine-tune as I switch to newer models, but no luck.
from llama.cpp.
Notebook that shows the problem:
https://colab.research.google.com/drive/1GSlcZFE_kokYRFAqL81jVVZm04IbX5L6?usp=sharing
I got past that: the convert script works with the L3 model, but the quantize script stopped working today (on Colab).
!./llama.cpp/quantize /content/models/"model".gguf "model"-q6_k.gguf q6_k
/bin/bash: line 1: ./llama.cpp/quantize: No such file or directory
> I got past that: the convert script works with the L3 model, but the quantize script stopped working today (on Colab).
> !./llama.cpp/quantize /content/models/"model".gguf "model"-q6_k.gguf q6_k
> /bin/bash: line 1: ./llama.cpp/quantize: No such file or directory
That's because the quantize binary is now called llama-quantize.
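Since the rename means older notebooks break on newer checkouts (and vice versa), a small helper can pick whichever name is present. This is only a sketch: the function name `find_quantize` is made up for illustration, and the directory to search is an assumption, because where the binary ends up depends on how llama.cpp was built.

```shell
# Hypothetical helper: print the path of the quantize binary found in a
# directory, trying the new name first and the old name second.
# The directory is passed in because the build output location is an
# assumption that varies with the build method.
find_quantize() {
  dir="$1"
  for name in llama-quantize quantize; do
    if [ -x "$dir/$name" ]; then
      printf '%s\n' "$dir/$name"
      return 0
    fi
  done
  # Neither name exists: signal failure so the caller can stop early.
  return 1
}

# Example use in a notebook cell (paths as in the command above):
#   QUANTIZE="$(find_quantize ./llama.cpp)" || echo "quantize binary not found"
#   "$QUANTIZE" /content/models/"model".gguf "model"-q6_k.gguf q6_k
```

If neither name is found, the binary was probably not built yet or sits in a different directory.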
> I got past that: the convert script works with the L3 model, but the quantize script stopped working today (on Colab).
How? With the same Colab notebook? Did you change something?
I tried again but I got the same result!
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
pip install -r requirements.txt
pip install ./gguf-py/
python convert-hf-to-gguf.py --outtype f16 --outfile /content/Phi-3-small-128k-instruct.f16.gguf /content/Phi-3-small-128k-instruct
Same as before:
INFO:hf-to-gguf:Loading model: Phi-3-small-128k-instruct
ERROR:hf-to-gguf:Model Phi3SmallForCausalLM is not supported
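To avoid waiting for the conversion to fail, one rough pre-check is to search the convert script for the architecture name from the error. This is a sketch under an assumption: it presumes the script lists supported architectures as quoted string literals, which may not hold for every version. The function name `arch_supported` is hypothetical.

```shell
# Hypothetical helper: succeed if the architecture name appears as a
# double-quoted string literal in the given convert script.
# Assumption: supported architectures are written as quoted names in the script.
arch_supported() {
  script="$1"
  arch="$2"
  # -F matches the text literally; the surrounding quotes keep a shorter
  # architecture name from matching a longer one that merely contains it.
  grep -qF "\"$arch\"" "$script"
}

# Example use from inside the llama.cpp checkout:
#   arch_supported convert-hf-to-gguf.py Phi3SmallForCausalLM \
#     || echo "Phi3SmallForCausalLM is not listed in this convert script"
```

A missing name here only suggests the model is unsupported in that checkout; the converter's own error remains the authoritative answer.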
I get the same error, whether running locally or through https://huggingface.co/spaces/ggml-org/gguf-my-repo