Comments (7)
I find that the layers of Phi-2 and Phi-3 are named differently: in Phi-2 the fused attention layer is called Wqkv, and llama.cpp converts LoRA weights to GGML fine, while in Phi-3 the same layer is called qkv_proj. Could this naming difference be the reason llama.cpp fails to convert it to GGML?
from llama.cpp.
Any update on this? I am running into the same issue: the LoRA runs correctly with transformers, but after converting with llama.cpp it gives nonsense output.
Hope this will be fixed as soon as possible.
The reason is that llama.cpp treats Phi-3 as the llama architecture, i.e., it splits the merged qkv_proj into separate q_proj, k_proj and v_proj layers. One workaround, posted by @Raibows at vllm-project/vllm#4715, is to convert the tensor weights of your adapter/LoRA checkpoint to match; he provides a script at https://gist.github.com/Raibows/079713a060f0c49c8f3b47c227aff722.
I have tested it and it successfully converts the LoRA weights to GGML, but there is another problem: Ollama cannot apply these GGML LoRA weights to Phi-3-instruct. I think we need to somehow merge the LoRA weights back into the base model...
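For anyone curious what that conversion amounts to, here is a minimal sketch of the core idea: the fused qkv_proj LoRA pair is split row-wise into per-projection pairs, mirroring how llama.cpp splits the merged weight. The dimensions below are made up for illustration (they are not Phi-3's real config), and this is not the Raibows script itself, just the underlying tensor manipulation:

```python
import numpy as np

# Hypothetical toy dimensions, chosen only for illustration.
hidden, rank = 8, 2
q_dim, k_dim, v_dim = 8, 4, 4  # row counts of the fused qkv_proj output

def split_qkv_lora(lora_A, lora_B, q_dim, k_dim, v_dim):
    """Split a fused qkv_proj LoRA pair into per-projection pairs.

    lora_A has shape (rank, hidden) and is shared by all three
    projections; lora_B has shape (q_dim + k_dim + v_dim, rank) and
    is sliced row-wise, matching how llama.cpp splits the merged
    qkv_proj weight into q_proj, k_proj and v_proj.
    """
    b_q = lora_B[:q_dim]
    b_k = lora_B[q_dim:q_dim + k_dim]
    b_v = lora_B[q_dim + k_dim:]
    return {
        "q_proj": (lora_A.copy(), b_q),
        "k_proj": (lora_A.copy(), b_k),
        "v_proj": (lora_A.copy(), b_v),
    }

rng = np.random.default_rng(0)
A = rng.standard_normal((rank, hidden))
B = rng.standard_normal((q_dim + k_dim + v_dim, rank))
parts = split_qkv_lora(A, B, q_dim, k_dim, v_dim)

# Sanity check: the per-projection deltas, stacked back together,
# reproduce the fused delta B @ A exactly.
fused_delta = B @ A
stacked = np.vstack([parts[n][1] @ parts[n][0]
                     for n in ("q_proj", "k_proj", "v_proj")])
assert np.allclose(stacked, fused_delta)
```

A real conversion would also have to rename the checkpoint keys (e.g. `...qkv_proj.lora_B.weight` into three `q_proj`/`k_proj`/`v_proj` entries) and read the true q/k/v row counts from the model config; grouped-query models like Phi-3 have smaller k/v dims than q.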
Anyone?
I have the same issue...
This issue was closed because it has been inactive for 14 days since being marked as stale.