Comments (2)
I was just looking for this. All I found was how to train a LoRA in llama.cpp: https://github.com/ggerganov/llama.cpp/tree/master/examples/finetune . I just saw a demo of LoRAX https://github.com/predibase/lorax , which lets you use multiple LoRAs on the fly and turn them off independently. The llama.cpp documentation says you can adjust the scaling of a LoRA, which is nice. It would be great if you could adjust the scaling when sending the prompt. Wishful thinking, but being able to convert the LoRAs that are already out there would be great too.
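For anyone wondering what "adjusting the scaling" means, here is a minimal conceptual sketch in plain Python. It is not llama.cpp's implementation; the helper names `matmul` and `apply_lora` are made up for illustration, and it assumes the usual alpha/rank factor is already folded into a single `scale` value. The idea is that the adapter contributes a low-rank update `B @ A` that gets multiplied by the scale before it is added to the base weight, so a scale of 0 effectively turns the adapter off.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_b, lora_a, scale):
    """Return weight + scale * (lora_b @ lora_a) without modifying the inputs.

    Hypothetical helper for illustration only; `scale` is assumed to already
    include any alpha/rank factor.
    """
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight and a rank-1 adapter (B is 2x1, A is 1x2).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]
lora_a = [[0.5, 0.5]]

off = apply_lora(base, lora_b, lora_a, scale=0.0)   # adapter disabled
full = apply_lora(base, lora_b, lora_a, scale=1.0)  # adapter fully applied
```

Changing the scale per request would amount to recomputing (or not baking in) this update each time, which is why it would be convenient to be able to pass it along with the prompt.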
from llama.cpp.
But isn't this already possible today? See here:
![image](https://private-user-images.githubusercontent.com/12037164/337994289-7fbecedc-57e8-4a47-95d8-004a7dd80499.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTkxNjk4NDIsIm5iZiI6MTcxOTE2OTU0MiwicGF0aCI6Ii8xMjAzNzE2NC8zMzc5OTQyODktN2ZiZWNlZGMtNTdlOC00YTQ3LTk1ZDgtMDA0YTdkZDgwNDk5LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MjMlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjIzVDE5MDU0MlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWNkMjZjODFjN2MwM2ZmNWQ3MTY1Y2VlM2FkZTNhZmRlNmIyMDRhODBlNTM4OWZlODdiZmU2OTI0YTk1NTc2YmYmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.VN-_mzR6-I_uNaZ5w1NDywwW72Fp11kF-AWRVadRiQA)
Check the llama.cpp server here: https://github.com/ggerganov/llama.cpp/tree/master/examples/server