Comments (11)
Duplicate of bentoml/BentoML#3985
qq: Can you check whether the following path exists on Linux: /opt/rocm/libexec/rocm_smi?
Yep (looks like I have two ROCm installations for some reason, one under /opt/rocm and another under /opt/rocm-5.5.0, with the latter being used):
~ $ ls -al /opt/rocm/libexec/rocm_smi
total 176
drwxr-xr-x 2 root root 4096 Jun 23 00:50 .
drwxr-xr-x 6 root root 4096 Jun 24 14:28 ..
-rwxrwxr-x 1 root root 148963 Apr 19 03:03 rocm_smi.py
-rw-r--r-- 1 root root 19229 Apr 19 03:03 rsmiBindings.py
~ $ ls -al /opt/rocm-5.5.0/libexec/rocm_smi/
total 176
drwxr-xr-x 2 root root 4096 Jun 23 00:50 .
drwxr-xr-x 6 root root 4096 Jun 24 14:28 ..
-rwxrwxr-x 1 root root 148963 Apr 19 03:03 rocm_smi.py
-rw-r--r-- 1 root root 19229 Apr 19 03:03 rsmiBindings.py
Is /opt/rocm-5.5.0 symlinked back to /opt/rocm?
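For reference, a quick way to check this from Python (a minimal sketch; `readlink /opt/rocm` in a shell works just as well):

```python
import os

# If /opt/rocm is a symlink, realpath() resolves it to the versioned
# install (e.g. /opt/rocm-5.5.0); otherwise it returns the path unchanged.
print(os.path.islink("/opt/rocm"))
print(os.path.realpath("/opt/rocm"))
```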
I have preliminary support for AMD GPUs on the main branch. You can try it out:
pip install git+https://github.com/bentoml/openllm@main
This has been included in 0.1.19.
Awesome!! Thanks!
I installed 0.1.19 from PyPI and tried to run the StarCoder model, but I still get "No GPU available, therefore this command is disabled". I was not expecting my GPU to be able to load StarCoder (not enough vRAM), but it should at least try to.
Installation & execution steps:
~$ python3 -m venv env
~$ source env/bin/activate
~$ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
~$ pip3 install openllm==0.1.19
~$ pip install "openllm[starcoder]"
~$ openllm start starcoder
PyTorch reports that GPU is available:
~$ python3 -c "import torch;print(torch.cuda.is_available())"
True
rocminfo output:
~$ rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 7 5825U with Radeon Graphics
Uuid: CPU-XX
Marketing Name: AMD Ryzen 7 5825U with Radeon Graphics
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2000
BDFID: 0
Internal Node ID: 0
Compute Unit: 16
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges: 1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 12351708(0xbc78dc) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 12351708(0xbc78dc) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 12351708(0xbc78dc) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx90c
Uuid: GPU-XX
Marketing Name: AMD Radeon Graphics
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
L2: 1024(0x400) KB
Chip ID: 5607(0x15e7)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2000
BDFID: 1792
Internal Node ID: 1
Compute Unit: 8
SIMDs per CU: 4
Shader Engines: 1
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges: 4
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 4194304(0x400000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx90c:xnack-
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
Can you set CUDA_VISIBLE_DEVICES=0?
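For example (a minimal sketch; the key detail, assuming the usual CUDA/HIP behavior, is that the variable must be set before the first `import torch`):

```python
import os

# Set before any CUDA/HIP initialization, i.e. before torch is imported;
# otherwise the runtime may already have enumerated the visible devices.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

try:
    import torch
    print(torch.cuda.is_available())
except ImportError:
    print("torch is not installed in this environment")
```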
Just tried it, no luck. Are there any logs or output I could provide that would help?
In any case, it could be an issue with my setup; if it works on another AMD setup, I suggest this issue gets closed.
I don't have an AMD GPU on hand right now, so I'm just going off the assumption that ROCm is set up correctly.
See OpenLLM/src/openllm/_strategies.py, line 201 in 8ac2755.
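The detection presumably needs to locate the rocm_smi bindings on disk. A hypothetical sketch of such a probe (the env-var names `ROCM_HOME`/`ROCM_PATH` and the hard-coded /opt/rocm default are my assumptions for illustration, not necessarily what `_strategies.py` actually does):

```python
import os

def find_rocm_smi_dir():
    """Return the first existing rocm_smi bindings directory, or None.

    Checks ROCM_HOME/ROCM_PATH before the conventional /opt/rocm prefix.
    (Hypothetical helper for illustration only.)
    """
    candidates = [
        os.environ.get("ROCM_HOME", ""),
        os.environ.get("ROCM_PATH", ""),
        "/opt/rocm",
    ]
    for root in candidates:
        path = os.path.join(root, "libexec", "rocm_smi")
        if root and os.path.isdir(path):
            return path
    return None

print(find_rocm_smi_dir())
```

Note that a versioned-only install (e.g. /opt/rocm-5.5.0 with no /opt/rocm symlink) would make a hard-coded /opt/rocm probe fail, which could explain the "No GPU available" message above.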
This is tentatively supported, since I don't really have access to the hardware to test it out.