Comments (8)
Hi, can you test this build?
from llama.cpp.
I can't test it myself, but if this is the case, then I misunderstood deviceID: it is the same for different GPUs of the same model.
Can you build llama.cpp with the LLAMA_VULKAN_DEBUG option (or define GGML_VULKAN_DEBUG) to see if there is something like "...have the same device id" in the log?
If that is the case then I think deviceUUID should be used instead. I will submit a fix later today.
@Adriankhl I was afraid you might ask that :) Unfortunately, I have zero clue how to build things on Windows.
- I think you are exactly right, though, given what I see about that data structure.
- I did test the build right before this commit and saw two entries in the command line output, so I'm confident it's not something like my GPU being offline.
It's strange that Vulkan doesn't expose something like a serial ID for a GPU.
Not sure if "pipelineCacheUUID" is enough :/
There seems to be something in https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkPhysicalDeviceIDProperties.html but it looks harder to acquire.
@richardanaya Thank you for reporting this. I don't have two GPUs of the same type, so I didn't catch this bug. A workaround for you is setting the environment variable GGML_VK_VISIBLE_DEVICES=0,1.
I can't test it myself, but if this is the case, then it means I understand deviceID incorrectly - it is the same for different gpu of the same model.
Can you build llama.cpp with the LLAMA_VULKAN_DEBUG option (or define GGML_VULKAN_DEBUG) to see if there is something like "...have the same device id" in the log?
Alternatively, can you download vulkaninfo, run it on your system, and upload the output? That should show both deviceID and deviceUUID.
thanks @0cc4m
It does look as suspected.
```
GPU0:
    VkPhysicalDeviceProperties:
        apiVersion = 1.3.277 (4206869)
        driverVersion = 2.0.299 (8388907)
        vendorID = 0x1002
        deviceID = 0x744c
        deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
        deviceName = AMD Radeon RX 7900 XTX
        pipelineCacheUUID = 3396acc0-ea27-542c-926b-685c47c98626
    VkPhysicalDeviceIDProperties:
        deviceUUID = 00000000-4700-0000-0000-000000000000
        driverUUID = 414d442d-5749-4e2d-4452-560000000000
        deviceLUID = fa460100-00000000
        deviceNodeMask = 1
        deviceLUIDValid = true
GPU2:
    VkPhysicalDeviceProperties:
        apiVersion = 1.3.277 (4206869)
        driverVersion = 2.0.299 (8388907)
        vendorID = 0x1002
        deviceID = 0x744c
        deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
        deviceName = AMD Radeon RX 7900 XTX
        pipelineCacheUUID = 3396acc0-ea27-542c-926b-685c47c98626
    VkPhysicalDeviceIDProperties:
        deviceUUID = 00000000-0300-0000-0000-000000000000
        driverUUID = 414d442d-5749-4e2d-4452-560000000000
        deviceLUID = 1b700100-00000000
        deviceNodeMask = 1
        deviceLUIDValid = true
```
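To make the difference concrete, here is a minimal sketch of why keying on deviceID collapses these two cards into one while keying on deviceUUID keeps both. It is not the llama.cpp code: `GpuInfo`, `dedup_by_device_id`, `dedup_by_uuid`, and `sample_devices` are hypothetical names made up for illustration, and the values are copied from the vulkaninfo output above rather than queried from Vulkan.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the fields shown by vulkaninfo; this is not a
// llama.cpp or Vulkan type.
struct GpuInfo {
    std::string name;
    uint32_t device_id;            // VkPhysicalDeviceProperties::deviceID
    std::array<uint8_t, 16> uuid;  // VkPhysicalDeviceIDProperties::deviceUUID
};

// Keeps the first device seen for each distinct deviceID. Two cards of the
// same model share a deviceID, so the second one is dropped.
std::vector<GpuInfo> dedup_by_device_id(const std::vector<GpuInfo>& devices) {
    std::vector<GpuInfo> unique;
    for (const GpuInfo& d : devices) {
        bool seen = false;
        for (const GpuInfo& u : unique) {
            if (u.device_id == d.device_id) { seen = true; break; }
        }
        if (!seen) unique.push_back(d);
    }
    return unique;
}

// Keeps the first device seen for each distinct deviceUUID, which differs
// between the two physical cards in the output above.
std::vector<GpuInfo> dedup_by_uuid(const std::vector<GpuInfo>& devices) {
    std::vector<GpuInfo> unique;
    for (const GpuInfo& d : devices) {
        bool seen = false;
        for (const GpuInfo& u : unique) {
            if (u.uuid == d.uuid) { seen = true; break; }
        }
        if (!seen) unique.push_back(d);
    }
    return unique;
}

// The two RX 7900 XTX entries from the vulkaninfo output above.
std::vector<GpuInfo> sample_devices() {
    GpuInfo a{"AMD Radeon RX 7900 XTX", 0x744c, {}};
    a.uuid[4] = 0x47;  // deviceUUID = 00000000-4700-0000-0000-000000000000
    GpuInfo b{"AMD Radeon RX 7900 XTX", 0x744c, {}};
    b.uuid[4] = 0x03;  // deviceUUID = 00000000-0300-0000-0000-000000000000
    assert(a.device_id == b.device_id && a.uuid != b.uuid);
    return {a, b};
}
```

With this sample data, `dedup_by_device_id(sample_devices())` returns one entry and `dedup_by_uuid(sample_devices())` returns two, which matches the symptom reported in this thread. In real code the UUID would presumably come from `vkGetPhysicalDeviceProperties2` with a `VkPhysicalDeviceIDProperties` struct chained in; that part is omitted here.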
@Adriankhl It works! I see two graphics cards.