Comments (2)
Given that a Q4-quantized 34B model requires about 20 GB of RAM, a Q4-quantized 340B model should need roughly ten times that, about 200 GB,
so it should be able to run on a computer with 256 GB of RAM.
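
A quick sanity check of that estimate, as a rough sketch only: it assumes RAM use scales linearly with parameter count from the 34B / 20 GB figure quoted above, and it ignores KV cache, context length, and runtime overhead. The function name and defaults are made up for illustration and are not part of llama.cpp.

```python
def estimate_q4_ram_gb(params_billions, ref_params_billions=34.0, ref_ram_gb=20.0):
    """Linearly scale a reference point (34B parameters -> 20 GB) to another model size.

    Assumption: memory grows in proportion to parameter count at the same
    quantization level. Real usage will be higher once context is allocated.
    """
    return ref_ram_gb * params_billions / ref_params_billions


estimate = estimate_q4_ram_gb(340)
print(f"Estimated RAM for 340B at Q4: {estimate:.0f} GB")
print(f"Fits in 256 GB: {estimate < 256}")
```

By this estimate there would be about 56 GB of headroom on a 256 GB machine, which has to cover the context and everything else running on the system.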
from llama.cpp.
Well, even if it's not something that most people can run at home, it would still be really useful for people who can deploy it. Big GPUs can be rented in the cloud. This model feels to me like it's going to be a game changer!
llama.cpp is simply the least headache-inducing way of running any LLM. Renting hardware for this model is going to be expensive, and not having to fiddle with jank is nice. I also wonder how well the AMD MI300X would perform.