Comments (5)
I am not sure we want to invest in `--merge`, since llama_model_loader now supports sharded model loading.
from llama.cpp.
> I am not sure we want to invest in `--merge`, since llama_model_loader now supports sharded model loading.
No problem - it's just that I'm used to using cat and assumed it would follow a similar syntax, rather than having the arguments expand so that argv[2] is dbrx-16x12b-instruct-q4_0-00001-of-00010.gguf and argv[3] is dbrx-16x12b-instruct-q4_0-00002-of-00010.gguf.
Probably a simple check on argc that exactly 2 file arguments were given, with a warning message if not, would stop people doing the same.
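A minimal sketch of the argument-count check suggested above. The helper names `positional_args` and `check_file_args` are hypothetical and are not taken from gguf-split. The sketch also assumes that every option starts with `--` and takes no value, which may not match the real tool.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of gguf-split): collect the positional
// arguments after the program name, skipping anything that starts with "--".
// Assumption: options take no value of their own.
std::vector<std::string> positional_args(int argc, const char * const * argv) {
    std::vector<std::string> files;
    for (int i = 1; i < argc; ++i) {
        const std::string arg = argv[i];
        if (arg.rfind("--", 0) == 0) {
            continue; // looks like an option, not a file
        }
        files.push_back(arg);
    }
    return files;
}

// Returns an empty string when exactly one input and one output were given,
// otherwise a warning message the caller can print before exiting.
std::string check_file_args(const std::vector<std::string> & files) {
    if (files.size() == 2) {
        return "";
    }
    return "expected exactly 2 file arguments (input, output), got "
        + std::to_string(files.size());
}
```

With a check like this, a shell glob that expands to ten shard names would produce a warning instead of silently treating the second shard as the output file.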
Yes feel free to submit a PR
> Yes feel free to submit a PR
I'll have a look on Monday. What is the best way to deal with this:
1. Just check that there are 2 file arguments?
2. Don't clobber files by default and add a `--clobber` option?
3. Have an interactive "Do you really want to overwrite XXX? (y/n)" option?
(3) can be problematic when running in scripts, though I could add a `-y` option to force "yes"?
user@host:~/models$ ls -alh
-rw-rw-r-- 1 user group 43G May 2 18:43 Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00001-of-00002.gguf
-rw-rw-r-- 1 user group 38G Apr 17 19:23 Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00002-of-00002.gguf
user@host:~/models$ gguf-split --merge Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00001-of-00002.gguf Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00002-of-00002.gguf Mixtral-8x22B-Instruct-v0.1.Q4_K_M.gguf
gguf_merge: Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00001-of-00002.gguf -> Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00002-of-00002.gguf
gguf_merge: reading metadata Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00001-of-00002.gguf done
gguf_merge: reading metadata Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00002-of-00002.gguf ...gguf_init_from_file: invalid magic characters ''
gguf_merge: failed to load input GGUF from Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00001-of-00002.gguf
user@host:~/models$ ls -alh
-rw-rw-r-- 1 user group 43G May 2 18:43 Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00001-of-00002.gguf
-rw-rw-r-- 1 user group 0 May 2 19:43 Mixtral-8x22B-Instruct-v0.1.Q4_K_M-00002-of-00002.gguf
cool, now i get to download that whole 38GB again... no idea why a program would overwrite the file it's using as input...