What happened? Trying to quantize <a href="https://huggingface.co/

Bug: Can't quantize 405B Mega merge about llama.cpp HOT 4 CLOSED

bartowski1182 commented on August 16, 2024 2

Bug: Can't quantize 405B Mega merge

from llama.cpp.

Comments (4)

slaren commented on August 16, 2024 1

#7359 broke models with more than 256 layers.

bartowski1182 commented on August 16, 2024 1

Ooo I see... On purpose or as a consequence of supporting that model? Could it be patched or is it a hard limit?

compilade commented on August 16, 2024

> On purpose or as a consequence of supporting that model? Could it be patched or is it a hard limit?

It's a consequence of keeping llama_hparams trivially copyable with a size known at compile time, while also having layer-wise hyper-parameters. Increasing the limit to 512 would make llama_hparams take 6.16 KiB instead of 3.16 KiB, but the size is pretty much the only thing that changes.

Making the layer-wise hparams take less space when not needed is something which I'll likely fix eventually, so that the limit only applies to models which need layer-wise hparams.
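
To illustrate the trade-off described above, here is a minimal sketch of a fixed-capacity hyper-parameter struct. It is not the real llama_hparams: the name hparams_sketch, the field names, and the choice of three 32-bit per-layer values are assumptions made for illustration. It shows why a compile-time cap keeps the struct trivially copyable, and how raising the cap grows every instance.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a hyper-parameter struct. The field names and the
// choice of three per-layer values are assumptions, not the real llama_hparams.
template <std::size_t MaxLayers>
struct hparams_sketch {
    std::uint32_t n_layer;                           // layers actually used by the model
    std::array<std::uint32_t, MaxLayers> n_head;     // per-layer value (assumed)
    std::array<std::uint32_t, MaxLayers> n_head_kv;  // per-layer value (assumed)
    std::array<std::uint32_t, MaxLayers> n_ff;       // per-layer value (assumed)
};

// Fixed-size arrays keep the struct trivially copyable, so it can be copied
// byte-for-byte and its size is known at compile time.
static_assert(std::is_trivially_copyable<hparams_sketch<256>>::value,
              "sketch must stay trivially copyable");

// A model fits only if its layer count does not exceed the compile-time cap.
template <std::size_t MaxLayers>
constexpr bool fits_layer_limit(std::uint32_t n_layer) {
    return n_layer <= MaxLayers;
}

// Extra bytes paid by every instance when the cap is raised from 256 to 512.
constexpr std::size_t extra_bytes_for_512 =
    sizeof(hparams_sketch<512>) - sizeof(hparams_sketch<256>);
```

With this sketch, fits_layer_limit<256>(257) is false while fits_layer_limit<512>(257) is true. On common implementations, where a std::array of N 32-bit values occupies 4 * N bytes, extra_bytes_for_512 comes to 3072 bytes (3 KiB), the same difference as between the 3.16 KiB and 6.16 KiB figures quoted in the comment.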

Haus1 commented on August 16, 2024

> Ooo I see... On purpose or as a consequence of supporting that model? Could it be patched or is it a hard limit?

It appears to be an arbitrary limit, even though an int64 can handle an absurd 9x10^18 before overflowing. I don't know why, but this seems to be fairly unique to the machine learning space, even though it makes the code needlessly brittle and user-hostile.
