A nice list of tests that I would like to implement in order to more easily make sure

Add unit/integration testing (autoawq, open)

casper-hansen commented on June 26, 2024

Add unit/integration testing

from autoawq.

Comments (4)

bdambrosio commented on June 26, 2024

I'll take a shot at some of these. If nothing else, I'll learn a lot.
Are you OK with API-level Python tests?
I'm especially interested in multi-GPU, but I'll start with some simpler ones.
Also, is there any hope of ever getting this to run on T4s (Kaggle...)? I'd be willing to dive pretty deep, and I have the skills, but I don't know enough about that level of CUDA to know if it's even remotely possible.

casper-hansen commented on June 26, 2024

I would love some help here for implementing the tests. T4 has compute capability 7.5, so it is not compatible with the AWQ CUDA kernel for running the quantized layers as they require 8.0 (Ampere architecture or later).

EDIT: To add support for earlier GPUs, you would have to implement a completely new CUDA kernel because the current one utilizes tensor cores that are 10x faster than CUDA cores. GPUs that are less than 8.0 in compute capability do not have tensor cores (I believe), so it cannot install or run the current CUDA kernel.

bdambrosio commented on June 26, 2024

Ok, will work on tests.

Switching from Tensor Cores to CUDA cores doesn't sound totally out of the realm of possibility, especially since I'm only interested in inference for that task, but I won't even think about it for a while.
Thanks.

wanzhenchn commented on June 26, 2024

> I would love some help here for implementing the tests. T4 has compute capability 7.5, so it is not compatible with the AWQ CUDA kernel for running the quantized layers as they require 8.0 (Ampere architecture or later).
>
> EDIT: To add support for earlier GPUs, you would have to implement a completely new CUDA kernel because the current one utilizes tensor cores that are 10x faster than CUDA cores. GPUs that are less than 8.0 in compute capability do not have tensor cores (I believe), so it cannot install or run the current CUDA kernel.

@casper-hansen, @bdambrosio
Actually, the T4 GPU also has Tensor Cores (hardware-specific). However, its compute capability is 7.5, as shown in the GPU list.

The real reason AWQ requires sm_80 or higher is that the gemm_cuda_gen.cu kernel uses the '.m16n8k16' MMA shape, which is only available on GPU architecture sm_80 or higher.
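
For the API-level Python tests discussed above, one way to handle the sm_80 requirement is to skip kernel tests on older GPUs such as the T4 instead of letting them fail. The following is only a rough sketch, not AutoAWQ code: the names `MIN_CAPABILITY`, `meets_min_capability`, `current_capability`, `gpu_is_supported`, and `QuantizedKernelTest` are hypothetical, and the (8, 0) minimum is an assumption taken from this thread. The only library call it relies on is `torch.cuda.get_device_capability`, and torch is imported lazily so the helper still works on machines without it.

```python
import unittest
from typing import Optional, Tuple

# Assumption from this thread: the kernel needs sm_80 (compute capability 8.0) or newer.
MIN_CAPABILITY: Tuple[int, int] = (8, 0)


def meets_min_capability(capability: Tuple[int, int],
                         minimum: Tuple[int, int] = MIN_CAPABILITY) -> bool:
    """Compare (major, minor) pairs lexicographically, e.g. (7, 5) < (8, 0)."""
    return tuple(capability) >= tuple(minimum)


def current_capability() -> Optional[Tuple[int, int]]:
    """Return the capability of CUDA device 0, or None without torch or a GPU."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return tuple(torch.cuda.get_device_capability(0))


def gpu_is_supported() -> bool:
    """True only when a CUDA device is present and meets the assumed minimum."""
    capability = current_capability()
    return capability is not None and meets_min_capability(capability)


class QuantizedKernelTest(unittest.TestCase):  # hypothetical test case name
    @unittest.skipUnless(gpu_is_supported(), "needs a CUDA GPU with sm_80 or newer")
    def test_placeholder(self):
        # A real test would load and run a quantized model here; omitted in this sketch.
        self.assertTrue(gpu_is_supported())
```

On a T4, which reports (7, 5), the test would be reported as skipped rather than failed, which keeps a CI run on Kaggle-class hardware readable.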
