I don't have any issue, but I just saw this repo and wanted to say that this looks rea

This looks awesome! (CLOSED)

autogptq commented on May 20, 2024

This looks awesome!

Comments (2)

PanQiWei commented on May 20, 2024

Thanks so much for your like and support for this project! ❤️

Ooba made his own fork for stability, but this does not support the latest GPTQ methods; specifically, --act-order. If a user wants to use --act-order, they have to link qwopqwop's latest GPTQ-for-LlaMa in instead.

Yes, I also experimented with the two new features, --act-order and --true-sequential, on a GPT-J model and found that they can truly narrow the gap between the quantized model and the original one.

Thus, I added the --act-order option to BaseQuantizeConfig and renamed it to desc_act;
As for --true-sequential, it has been added to this project as a non-optional feature, as you can see in this example
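
For readers unfamiliar with the flag, here is a minimal sketch of the idea behind --act-order (desc_act): weight columns are processed in order of descending importance, and the original column order is restored afterwards. This is a toy illustration in plain Python, not AutoGPTQ's actual code. The importance scores, the weight matrix, and all function names are hypothetical, and the quantization step itself is left out.

```python
# Toy illustration of activation ordering ("act-order" / desc_act).
# Assumption: each weight column has one scalar importance score, for
# example the matching diagonal entry of the layer's Hessian estimate.

def act_order_permutation(importance):
    """Return column indices sorted by descending importance."""
    return sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)

def invert_permutation(perm):
    """Return the permutation that undoes `perm`."""
    inverse = [0] * len(perm)
    for new_pos, old_pos in enumerate(perm):
        inverse[old_pos] = new_pos
    return inverse

def permute_columns(matrix, perm):
    """Reorder the columns of a row-major matrix (a list of rows)."""
    return [[row[j] for j in perm] for row in matrix]

importance = [0.2, 3.0, 0.7, 1.5]   # hypothetical per-column scores
weights = [[10, 11, 12, 13],
           [20, 21, 22, 23]]        # hypothetical 2x4 weight matrix

perm = act_order_permutation(importance)
reordered = permute_columns(weights, perm)   # columns in processing order
# (a real quantizer would quantize `reordered` column by column here)
restored = permute_columns(reordered, invert_permutation(perm))
```

With these made-up scores the most important column (index 1) is handled first, and applying the inverse permutation gives back the original layout, so the rest of the model sees the columns in their usual order.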

Here is my rough schedule and to-do list of features for this project for the rest of this month, though it might change:

I will spend ~2 hours per day on this project in my spare time
Test and add support for more CausalLMs in transformers
The faster CUDA kernels from an earlier version of the gptq-for-llama cuda branch will be added back, but they may not be compatible with desc_act, as described here
Triton integration. I'm really new to triton, so I've turned to qwopqwop200 for advice and help; current and future discussions will all be in here. It would also be wonderful if you have more tips about developing with triton, and even better if you contribute directly to this project. 😄

And in particular I think it could be really good if oobabooga implemented your code in text-generation-webui.

AutoGPTQ is still at a very early stage and many APIs might change substantially in the future, but I will open a pull request to text-generation-webui as soon as AutoGPTQ becomes stable.