
Feature Request: Support for NVEmbed about llama.cpp HOT 2 OPEN

christianazinn commented on June 27, 2024

Feature Request: Support for NVEmbed

from llama.cpp.

Comments (2)

iamlemec commented on June 27, 2024

It looks like NVEmbed is basically Mistral but with non-causal attention and "latent attention" pooling. I hadn't seen latent attention pooling before, but judging from the modeling code on HF, it's just another attention layer on top of the last hidden states.
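To make the "another attention layer on top of the last hidden states" idea concrete, here is a minimal pure-Python sketch. It is an illustration, not NVEmbed's actual code: it assumes the token hidden states act as queries and a small learned latent array acts as both keys and values, it uses a single head, and it leaves out the learned projections and MLP that a real layer would have. The name `latent_attention_pool` is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def latent_attention_pool(hidden, latents):
    """Pool token states into one embedding via attention over latents.

    hidden:  seq_len x d list of last-layer token states (used as queries)
    latents: n_latents x d list of learned vectors (used as keys and values;
             this role assignment is an assumption of the sketch)
    """
    d = len(hidden[0])
    scale = 1.0 / math.sqrt(d)
    attended = []
    for query in hidden:
        # Scaled dot-product score of this token against every latent.
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in latents]
        weights = softmax(scores)
        # Weighted sum of latent vectors gives the attended token state.
        attended.append(
            [sum(w * value[j] for w, value in zip(weights, latents)) for j in range(d)]
        )
    # Mean over the sequence dimension yields a single embedding vector.
    n_tokens = len(attended)
    return [sum(row[j] for row in attended) / n_tokens for j in range(d)]
```

The only part that differs from plain mean pooling is the attention step before the final average, which is why it should slot in as one more pooling type.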

Right now in llama.cpp, we can tell causal-by-default models like Mistral to use non-causal attention. If we get #7477 merged, that will allow general pooling on these models. The only catch is we don't have latent pooling implemented, but it should be quite straightforward.
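As a rough illustration of the two pieces mentioned here, the sketch below shows how a causal mask differs from a non-causal one and what a simple mean pooling step looks like. This is plain Python for explanation only and is not how llama.cpp implements either; the function names `attention_mask` and `mean_pool` are hypothetical.

```python
def attention_mask(n_tokens, causal):
    """Return an n_tokens x n_tokens grid; True means row token i may attend to column token j."""
    # Causal: a token sees itself and earlier tokens only.
    # Non-causal: every token sees every other token.
    return [[(j <= i) or not causal for j in range(n_tokens)] for i in range(n_tokens)]


def mean_pool(hidden):
    """Average a seq_len x d list of token states into one d-dimensional embedding."""
    n_tokens = len(hidden)
    d = len(hidden[0])
    return [sum(row[j] for row in hidden) / n_tokens for j in range(d)]
```

With non-causal attention every position contributes context to every other one, so pooling over all token states (rather than taking only the last token) becomes a reasonable way to form the embedding.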


christianazinn commented on June 27, 2024

> If we get #7477 merged, that will allow general pooling on these models. The only catch is we don't have latent pooling implemented, but it should be quite straightforward.

Thanks, will wait for that to be merged.

