Comments (5)

greptile-apps commented on July 22, 2024

Filtering in Lantern can be done using standard SQL WHERE clauses. Here is a small example:

```sql
CREATE TABLE small_world (id integer, vector real[3]);
INSERT INTO small_world (id, vector) VALUES (0, '{0,0,0}'), (1, '{0,0,1}');

-- Create an index on the vector column
CREATE INDEX ON small_world USING lantern_hnsw (vector);

-- Filter with a WHERE clause
SELECT id, vector FROM small_world WHERE id = 1;
```

For more details, refer to the README.md file.

References

/sql
/src/hooks
/test/sql
/README.md

from lantern.

dqii commented on July 22, 2024

Hi @lukebuehler! The example above filters on id = 1.
There's another example in the docs, which filters on published_at < 2010: https://lantern.dev/docs/develop/query
Does that help? Let me know if you were looking for something else!

Ngalstyan4 commented on July 22, 2024

Hi @lukebuehler,

The filtering example from the pgvector docs you linked above will work with no changes for Lantern indexes as well!
That is, the index is consulted for the nearest vectors, and those vectors are then filtered with the given SQL predicate.

If the predicate excludes a large portion of the rows, vector searches with pgvector will often return no results.
Unlike pgvector, Lantern keeps searching and returning results from the index until the given LIMIT is reached.
Currently, this is done by recursively searching for more and more elements, but I just opened a PR (#322) that makes this filtered search more efficient by loading elements from the vector index in a streaming fashion.
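
As a rough sketch of the kind of query this applies to, the statement below combines a nearest-neighbor ordering with an ordinary predicate. It assumes the small_world table and lantern_hnsw index from the first comment. The `<->` operator and the ORDER BY ... LIMIT shape follow the Lantern README. The predicate `id > 0` is only an illustration and does not come from the linked docs.

```sql
-- Assumes the small_world table and its lantern_hnsw index from the first comment.
-- The predicate "id > 0" is illustrative only.
SELECT id, vector
FROM small_world
WHERE id > 0                        -- ordinary SQL predicate, applied to candidates from the index
ORDER BY vector <-> ARRAY[0, 0, 0]  -- nearest-neighbor ordering that the index can serve
LIMIT 1;                            -- the search continues until this many rows pass the filter
```

With a small table the planner may still prefer a sequential scan, so running EXPLAIN on the query is a reasonable way to check whether the index is actually used.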

Let me know if you have any other questions!

lukebuehler commented on July 22, 2024

This is helpful, thanks! Good to know that you are using a recursive post-filter in Lantern. I missed the query example for filtering.

In pgvector, if you have an index on the WHERE column, the data is pre-filtered for exact search, and you can also create multiple partial indexes on a column, which you can then select with a WHERE clause. However, that only works with =, not with other comparisons. So it's nice that you do a recursive search for non-equality predicates!

As for partial indexes, does Lantern support them? They would be helpful for multi-tenancy.

Just my 2c: I often look for filtering documentation for various vector DBs, and it is often not very clear. I think a doc page that explains the basic conditions under which it pre-filters, post-filters, uses a partial index (or whatever your features are), etc. is really helpful. LanceDB has a decent page.

Ngalstyan4 commented on July 22, 2024

> As for partial indexes, does lantern support it? It would be helpful for multi-tenanting.

Partial indexes are supported, exactly as in btree or pgvector indexes! Here is an example from our tests
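
The linked test is not reproduced in this thread, so here is a minimal sketch of what a per-tenant partial index could look like, using the standard PostgreSQL CREATE INDEX ... WHERE syntax. The table and column names (`documents`, `tenant_id`, `embedding`) and the index name are hypothetical and are not taken from the Lantern test suite.

```sql
-- Hypothetical multi-tenant table; all names here are illustrative.
CREATE TABLE documents (
    id        integer,
    tenant_id integer,
    embedding real[3]
);

-- One partial index per tenant: only rows with tenant_id = 1 are indexed.
CREATE INDEX documents_tenant_1_idx ON documents
    USING lantern_hnsw (embedding)
    WHERE tenant_id = 1;

-- A query whose WHERE clause matches the index predicate lets the planner
-- consider the partial index for the nearest-neighbor ordering.
SELECT id
FROM documents
WHERE tenant_id = 1
ORDER BY embedding <-> ARRAY[0, 0, 0]
LIMIT 5;
```

One index per tenant keeps each graph small, at the cost of managing many indexes. Whether that trade-off pays off depends on the number of tenants and the size of their data.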

Thanks for the feedback on the docs! Will improve them before closing this issue.
Let me know if you have any other questions!
