Hi, First of all thank you for this plugin. Relieved us of the pain of going throu

Combining elastiknn with standard ES queries. about elastiknn HOT 6 CLOSED

alexklibisz commented on May 13, 2024

Combining elastiknn with standard ES queries.

from elastiknn.

Comments (6)

alexklibisz commented on May 13, 2024

Hi, this shows how I've gotten this type of combination query working in the past: http://elastiknn.klibisz.com/api/#running-nearest-neighbors-query-on-a-filtered-subset-of-documents. If that doesn't fit your use case, can you post some example docs to try your query?

alexklibisz commented on May 13, 2024

@joseph-macraty I happened to run into an issue with the combined query while working on something else. I found that my original example in the docs technically works, but it will actually evaluate over all of the docs, instead of just the ones matching a filter. I updated the example linked above so that it will only run knn on the docs matching a filter.
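For readers following along, here is a minimal sketch of what a "filter first, then knn" search body can look like. It is not copied from the linked docs: it assumes the standard Elasticsearch `rescore` block wrapped around an `elastiknn_nearest_neighbors` query, and the helper name `filtered_knn_body`, the index fields `color` and `vec`, and the parameter values are all hypothetical. Check the linked docs for the query shape that matches your Elastiknn version.

```python
import json


def filtered_knn_body(filter_query, field, vector, similarity, k=10, window_size=100):
    """Build a search body that scores only filtered docs by vector similarity.

    Assumption: the filter runs as the main query, and the Elastiknn query
    runs as a rescore query over the top `window_size` hits per shard.
    """
    return {
        "size": k,
        # Standard ES query that selects the subset of documents.
        "query": filter_query,
        "rescore": {
            # Only the top `window_size` filtered hits per shard are rescored.
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    "elastiknn_nearest_neighbors": {
                        "field": field,
                        "vec": {"values": vector},
                        "model": "exact",
                        "similarity": similarity,
                    }
                },
                # Ignore the filter's score; rank by vector similarity only.
                "query_weight": 0,
                "rescore_query_weight": 1,
            },
        },
    }


# Hypothetical usage: restrict to docs with color == "blue", then rank by L2.
body = filtered_knn_body(
    filter_query={"term": {"color": "blue"}},
    field="vec",
    vector=[0.1, 0.2, 0.3],
    similarity="l2",
)
print(json.dumps(body, indent=2))
```

The resulting dict can be sent as the body of a `_search` request. Note that `window_size` caps how many filtered documents are rescored on each shard, so it should be at least as large as the filtered subset you want searched.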

joseph-macraty commented on May 13, 2024

Hi @alexklibisz ,
Thanks! It's working perfectly now:)

alexklibisz commented on May 13, 2024

@joseph-macraty If I may ask, what's your use case for Elastiknn? (Just trying to get a sense of how people are using it in practice, since I don't use Elasticsearch at work myself.)

joseph-macraty commented on May 13, 2024

We are working on text-based search engines for different use cases. Here's how we are currently using Elastiknn:

We had developed a couple of BERT-based models for search and were testing them out with Elastic Cloud. We were satisfied with them and wanted to use them in production. We initially thought we couldn't use ES because it did not support approximate vector search. We explored other options like Faiss/Annoy, but using them would have required modifying a non-trivial amount of our existing pipeline/codebase (we were using ES earlier). They also added a significant computational cost.

That is when I came across one of your comments on the ES repo. It was really smooth to set up, and we got up and running in <1hr. Our current index has about 5.8 million documents (768-dimensional vectors), and on an Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz with 8GB RAM, we have an average time of 1.56s. Recall has been great too, with only a minor drop.

All in all, I think that unless you have hundreds of millions of documents or require really fast search (and can afford the added effort and computational resources), Elastiknn is the best option. For a large number of use cases, Elastiknn could replace Faiss/Annoy. In our initial searches, though, Elastiknn never came up, and we could easily have missed it. Is there any way to increase the visibility of this excellent repo? I think semantic search is an upcoming field, so there are only a few blogs on it, and all of them use the other, more popular ANN implementations. If there are any ways we can contribute (writing blogs?), we would love to!

alexklibisz commented on May 13, 2024

That's all great to hear. Great motivation for me to keep chipping away at this. Mind if I ask what company you're at?

The original source of this idea was a problem very similar to the one you described. That was at an old job, and I no longer have the problem day-to-day, but I got a lot better with Java/Scala/Gradle/etc. in my most recent job, so I've given this another pass.

In terms of visibility, I'm planning to do an "Introducing Elastiknn"-style blog post. The plan has been to do that after I get it integrated with the ann-benchmarks project. That seems to be table stakes for any ANN solution nowadays. It's been tough because the JVM is painfully slow compared to all of the C/C++/in-memory implementations used in that project. I'm pretty confident I can speed up one remaining bottleneck, and that will make a big difference. Then, once it's merged into ann-benchmarks, I'll do a more celebratory writeup on Medium or something.
