Comments (5)
Filtering in Lantern can be done using standard SQL WHERE clauses. Here is a small example:
CREATE TABLE small_world (id integer, vector real[3]);
INSERT INTO small_world (id, vector) VALUES (0, '{0,0,0}'), (1, '{0,0,1}');
-- Create an index on the vector column
CREATE INDEX ON small_world USING lantern_hnsw (vector);
-- Filter with a WHERE clause
SELECT id, vector FROM small_world WHERE id = 1;
For more details, refer to the README.md file.
References
/sql
/src/hooks
/test/sql
/README.md
from lantern.
Hi @lukebuehler! The example above does filter on id = 1.
There's another example in the docs that filters on published_at < 2010: https://lantern.dev/docs/develop/query
Does that help? Let me know if you were looking for something else!
Hi @lukebuehler,
The filtering example from the pgvector docs you linked above will work with no changes for lantern indexes as well!
That is - the index will be consulted for nearest vectors and the vectors will be filtered after the fact with the given SQL predicate.
If the filter filters out a large portion of the rows, vector searches with pgvector will often return no results.
Unlike pgvector, lantern will continue searching and returning results from the index until the given LIMIT is reached.
Currently, this is done by recursively searching for more and more elements, but I just opened a PR (#322) that makes this filtered search more efficient by loading elements from the vector index in a streaming fashion.
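As a rough sketch of what such a filtered search looks like, reusing the small_world table from the first comment: the predicate goes in WHERE, the ordering uses the `<->` distance operator, and LIMIT sets how many rows are wanted. The query vector and the `id > 0` predicate are made up for illustration.

```sql
-- Filtered nearest-neighbor search on the small_world table from above.
-- The query vector and the predicate are illustrative assumptions.
SELECT id, vector
FROM small_world
WHERE id > 0                               -- ordinary SQL predicate
ORDER BY vector <-> ARRAY[0,0,0]::real[]   -- distance to the query vector
LIMIT 1;                                   -- number of rows wanted
```

With a very selective predicate, the LIMIT is what tells the search how long to keep pulling candidates from the index.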
Let me know if you have any other questions!
This is helpful, thanks! Good to know that you are using a recursive post-filter in lantern. I missed the query example for filtering.
In pgvector, if you have an index on the where column, the data is pre-filtered for exact search, and you can also create multiple partial indexes on a column which you can then select with a where clause. However, that only works with =, not with other comparisons. So it's nice that you do recursive search for non-equality predicates!
As for partial indexes, does lantern support it? It would be helpful for multi-tenanting.
Just my 2c: I often look for filtering documentation for various vector DBs, and it is often not very clear. I think a doc page that explains the basic conditions under which lantern pre-filters, post-filters, uses a partial index (or whatever your features are), etc. would be really helpful. LanceDB has a decent page.
As for partial indexes, does lantern support it? It would be helpful for multi-tenanting.
Partial indexes are supported, exactly as in btree or pgvector indexes! Here is an example from our tests
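The linked test is not reproduced here. As a minimal sketch of the multi-tenant idea, assuming a hypothetical docs table with a tenant_id column (all names below are illustrative, not taken from the lantern tests), partial indexes use the standard PostgreSQL syntax:

```sql
-- Hypothetical multi-tenant table; names are assumptions for illustration.
CREATE TABLE docs (id integer, tenant_id integer, embedding real[3]);

-- One partial index per tenant, using standard PostgreSQL partial-index syntax.
CREATE INDEX docs_tenant_1_idx ON docs USING lantern_hnsw (embedding) WHERE tenant_id = 1;
CREATE INDEX docs_tenant_2_idx ON docs USING lantern_hnsw (embedding) WHERE tenant_id = 2;

-- A query whose WHERE clause matches an index predicate can use that index.
SELECT id
FROM docs
WHERE tenant_id = 1
ORDER BY embedding <-> ARRAY[0,0,1]::real[]
LIMIT 5;
```

Each index then only covers one tenant's rows, so a search for tenant 1 does not have to walk past other tenants' vectors.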
Thanks for the feedback on the docs! Will improve them before closing this issue.
Let me know if you have any other questions!