Comments (9)
Some features of llama.cpp require using ggml. If llama.cpp is built as a shared library linking statically to ggml, then applications cannot use these features without linking their own copy of ggml - which may not work at all. Is it really that much of a problem to bundle the ggml shared library alongside llama.cpp?
@mudler BUILD_SHARED_LIBS is the default in most cases (see lines 31 to 41 in f675b20).
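A minimal sketch of that default (shared) configuration and of checking which libraries the result depends on; the build directory layout and the library names (libllama.so, libggml.so) are assumptions and vary by platform and version:

```sh
# Configure with defaults: BUILD_SHARED_LIBS is typically ON, so ggml and
# llama.cpp are produced as separate shared libraries.
cmake -B build
cmake --build build --config Release

# The llama.cpp shared library should list a ggml shared object among its
# runtime dependencies (Linux; use otool -L on macOS).
ldd build/src/libllama.so

# To deploy, ship the ggml shared object(s) alongside libllama.so so that
# applications using ggml features resolve the same copy at load time.
```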
from llama.cpp.
Yes, I agree with you. I got libc++ statically linked, but gomp and openblas are always dynamically linked. You can re-close this if you want.
from llama.cpp.
It should be linked statically without the -DBUILD_SHARED_LIBS=ON parameter (edit: BUILD_SHARED_LIBS is the default in most cases). What is not possible at the moment is to build llama.cpp as a shared library and ggml as a static library, but I am not sure that's really a problem.
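In other words, leaving the shared option off yields static archives for both ggml and llama.cpp. A rough sketch, with the output artifact names being assumptions:

```sh
# Build everything as static archives (do not pass -DBUILD_SHARED_LIBS=ON,
# or pass it explicitly as OFF).
cmake -B build -DBUILD_SHARED_LIBS=OFF
cmake --build build --config Release

# Expect .a archives (e.g. libllama.a and libggml*.a) instead of shared
# objects; an application then links these archives at build time.
find build -name '*.a'
```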
from llama.cpp.
Thanks a lot for the reply @slaren!
This flag indeed links GGML statically into llama.cpp. However, in this configuration, llama.cpp is also generated as a static library. I'm trying to compile llama.cpp as a dynamic library.
For me, the ideal option would be to have both GGML_BUILD_SHARED_LIBS and LLAMA_BUILD_SHARED_LIBS.
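For illustration only, the configure call such a split would allow might look like the sketch below; note that GGML_BUILD_SHARED_LIBS and LLAMA_BUILD_SHARED_LIBS are the options proposed in this comment and do not exist in llama.cpp's CMake:

```sh
# Hypothetical invocation (these two options are a proposal, not real flags):
# ggml linked statically into llama.cpp, llama.cpp built as a shared library.
cmake -B build -DGGML_BUILD_SHARED_LIBS=OFF -DLLAMA_BUILD_SHARED_LIBS=ON
cmake --build build --config Release
```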
from llama.cpp.
I'm hitting the same issue here - however, it's more dodgy in my case as I'm also linking gRPC - and before the mentioned PR, this is what the resulting binaries' linked libraries looked like, and everything was working fine:
linux-vdso.so.1 (0x00007ffe3a6b6000)
libgomp.so.1 => /lib64/libgomp.so.1 (0x00007f784dbfc000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f784d800000)
libm.so.6 => /lib64/libm.so.6 (0x00007f784db15000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f784dae6000)
libc.so.6 => /lib64/libc.so.6 (0x00007f784d400000)
/lib64/ld-linux-x86-64.so.2 (0x00007f784f882000)
Now, with shared mode on, there are more shared libraries used, but that badly breaks my linking:
/usr/lib64/gcc/x86_64-suse-linux/13/../../../../x86_64-suse-linux/bin/ld: ../../bin/grpc-server: hidden symbol `_ZN6google8protobuf2io19EpsCopyOutputStream16WriteRawFallbackEPKviPh' in /home/mudler/_git/LocalAI/backend/cpp/grpc/installed_packages/lib64/libprotobuf.a(coded_stream.cc.o) is referenced by DSO
And yeah, the symbol is hidden in libprotobuf.a - but the pity is that I can't always control how libprotobuf is built (indeed, it now fails on macOS with Homebrew).
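One way to confirm where the symbol hiding happens is to inspect the archive directly; a sketch assuming the libprotobuf.a path from the error above and a GNU binutils toolchain:

```sh
# Show the protobuf symbol and its ELF visibility inside the static archive.
# A ".hidden" marking means the symbol cannot be re-exported by a shared
# object that links this archive, which is exactly what the linker rejects.
objdump -t -C backend/cpp/grpc/installed_packages/lib64/libprotobuf.a \
  | grep -i writerawfallback
```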
It should be linked statically without the -DBUILD_SHARED_LIBS=ON parameter. What is not possible at the moment is to build llama.cpp as a shared library and ggml as a static library, but I am not sure that's really a problem.
Sadly that has no effect here - I never enabled that explicitly before. I'm going to try disabling it explicitly and see if that helps.
Update: Thanks for the hint @slaren! Specifying -DBUILD_SHARED_LIBS=OFF did the trick here - I guess -DBUILD_SHARED_LIBS=ON is now implied by default?
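If in doubt about which default applies on a given checkout, the effective value can be read back from the CMake cache after configuring; a small sketch (the build directory name is an assumption):

```sh
cmake -B build                                  # configure without setting the flag
grep '^BUILD_SHARED_LIBS' build/CMakeCache.txt  # prints the value actually in effect
```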
from llama.cpp.
Some features of llama.cpp require using ggml. If llama.cpp is built as a shared library linking statically to ggml, then applications cannot use these features without linking their own copy of ggml - which may not work at all. Is it really that much of a problem to bundle the ggml shared library alongside llama.cpp?
In my case it is, because I'm using llama.cpp with gRPC, which is static by default - and mixing static gRPC with shared ggml and llama.cpp breaks linking, as some of the protobuf symbols are hidden when referenced from a DSO.
@mudler BUILD_SHARED_LIBS is the default in most cases (see lines 31 to 41 in f675b20).
Gotcha - thanks @slaren! It clicked for me now thanks to your previous comment. I thought it was defaulting to ON before as well when I was looking at the PR diff, but it looks like it wasn't (at least here), so now I had to disable it explicitly to get things going. I have yet to see what LocalAI's CI says, but locally at least it now builds fine, as before.
from llama.cpp.
Thanks folks for what you've shared.
Some features of llama.cpp require using ggml. If llama.cpp is built as a shared library linking statically to ggml, then applications cannot use these features without linking their own copy of ggml
I don't believe that was the case unless I missed something. I used to use pretty much all of the llama.cpp API by compiling llama.cpp as a shared library, statically linking ggml, at least on Windows, macOS, and Linux systems.
Is it really that much of a problem to bundle the ggml shared library alongside llama.cpp?
Good point. I would not argue that it is a real problem. Plus I understand the willingness to keep ggml as a separate binary for the long run and/or for building apps using llama.cpp, Whisper.cpp, and probably other upcoming projects.
-> I'm reviewing my compilation and deployment pipelines to follow this approach.
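As a deployment sketch for that approach on Linux, assuming the application binary and the bundled libraries are shipped together (paths, library names, the dist/myapp binary, and the use of patchelf are all assumptions):

```sh
# Copy the llama.cpp and ggml shared objects next to the application.
mkdir -p dist/lib
cp build/src/libllama.so build/ggml/src/libggml*.so dist/lib/

# Make the (hypothetical) application binary look them up relative to itself.
patchelf --set-rpath '$ORIGIN/lib' dist/myapp
ldd dist/myapp   # verify that libllama/libggml now resolve from dist/lib
```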
If you don't mind, I'm marking this case as resolved.
from llama.cpp.
I'm having the same issue. Tried with both GCC and Clang in MSYS2.
Even when setting GGML_STATIC to true, there are still dependencies on libc++, pthread, etc.
LLAMA_STATIC from before was working and doing what we needed.
What are the flags now to get llama-server linked statically to libc++, etc?
from llama.cpp.
LLAMA_STATIC never really worked here, as it always linked against libgomp and failed linking; same with GGML_STATIC (but maybe that was caused by my requirements?). Did you try setting BUILD_SHARED_LIBS to OFF?
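For reference, one possible combination for a mostly-static llama-server build; this is a sketch, not a drop-in replacement for the old LLAMA_STATIC, and whether it links depends on the toolchain and on static versions of libraries such as libgomp being available:

```sh
cmake -B build \
  -DBUILD_SHARED_LIBS=OFF \
  -DCMAKE_EXE_LINKER_FLAGS="-static-libgcc -static-libstdc++"
cmake --build build --config Release --target llama-server

# Check which runtime libraries are still pulled in dynamically.
ldd build/bin/llama-server
```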
from llama.cpp.
Related Issues (20)
- Bug: The inference speed of building with HIPBLAS (gfx1100) is very slow, only 2~5t/s HOT 10
- Feature Request: Installable package via winget HOT 1
- I have found a solution to the problem myself HOT 3
- @ggerganov!!! Why can't llama.cpp be used on the server side to provide one-to-many services? HOT 1
- Bug: Gemma-2 not supported on b3262 HOT 12
- Bug: Unable to generate the model output correctly HOT 5
- Show: FUTO-org Keyboard with llama.cpp-powered auto-correction and on-device finetuning
- Bug: Failed to allocate memory on the 2nd GPU for loading large model HOT 8
- Bug: ld: symbol(s) not found for architecture arm64
- Bug: Docker ROCm crashs, only works on metal compiled. HOT 2
- Refactor: gemma 9b not work with lama.cpp b3259 HOT 10
- NEW UI options don't reflect the OLD UI options.
- device Vulkan0 does not support 16-bit storage.
- [feature request] Ability to import/export sessions from the UI. HOT 8
- Feature Request: Support for CodeSage
- Failed to create gguf for Viking 13b although support for viking pre-tokenizer was added in one of the recent llama.cpp's releases HOT 1
- Bug: GGML assert with bf16, RTX3090 HOT 1
- Investigate gemma 2 generation quality HOT 62
- STILL no way to convert phi-3-small to GGUF HOT 10
- Bug: Server Embedding Segmentation Fault HOT 2