Comments (5)
@tvondra To be pedantic, what kind of embeddings are these (number of dimensions and per-dimension data type, e.g. fp32)? My rough math says 768-dim, fp32.
Also, this looks really promising 😄 The reason I'm being pedantic is that I think it'd be good to test the contexts across different dimensionalities. VectorDBBench has a few large datasets at 768 and 1536 dimensions; it may be worth trying the 1536-dim one to see whether you still see a similar gain (my gut says you will).
from pgvector.
Yes, vector(768) with FP32.
I'd probably expect this to be even more beneficial for larger vectors.
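For a sense of scale, here is a rough back-of-the-envelope sketch of how much data is moved per vector copy at the two dimensionalities mentioned above. It assumes the storage formula from the pgvector README (4 bytes per dimension plus an 8-byte header for the `vector` type); the helper name `vector_bytes` is made up for illustration.

```python
def vector_bytes(dims: int, bytes_per_dim: int = 4, header_bytes: int = 8) -> int:
    """Approximate size of one fp32 vector value, assuming 4 * dims + 8 bytes."""
    return bytes_per_dim * dims + header_bytes


# Compare the two dimensionalities discussed in this thread.
for dims in (768, 1536):
    print(f"vector({dims}): {vector_bytes(dims)} bytes per copy")
```

Under that assumption a 1536-dim vector is roughly twice the size of a 768-dim one, which is consistent with the expectation that avoiding copies matters more for larger vectors.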
Hi @tvondra, nice find! From some initial testing (and the profile), it looks like this affects the on-disk phase of the index build and can likely be applied to inserts as well. I also think we may be able to avoid copying vectors entirely if they are outside of the max candidate distance, so I will take a look at that as part of this.
Pushed a version with less copying to the hnsw-less-copy branch. Initial results for 100k 1536-dimension random vectors on my local machine (using 4 processes and 64MB for maintenance_work_mem): 189 sec before, 165 sec after.
Edit: I think there's room for further optimization, as we don't need to load the element at all (for search or builds/inserts) if it's outside the max candidate distance.
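The copy-avoidance idea can be sketched roughly as follows. This is not pgvector's C implementation, only a minimal Python illustration under stated assumptions: `stored` is a hypothetical dict standing in for vectors on index pages, `ef` is the candidate list size, and squared L2 is used as the distance. The distance is computed in place, and a vector is copied only when it would actually enter the candidate set.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_nearest(query, stored, ef):
    """Keep the ef nearest elements, copying a vector only if it qualifies.

    stored: dict mapping element id -> vector (hypothetical stand-in for pages).
    Returns (sorted list of (distance, id), number of vector copies made).
    """
    heap = []    # max-heap on distance, implemented with negated distances
    copies = 0
    for elem_id, vec in stored.items():
        d = squared_l2(query, vec)  # computed without copying the vector
        if len(heap) == ef and d >= -heap[0][0]:
            continue  # outside the max candidate distance: skip the copy
        copy = list(vec)  # copy only elements that enter the candidate set
        copies += 1
        if len(heap) == ef:
            heapq.heapreplace(heap, (-d, elem_id, copy))
        else:
            heapq.heappush(heap, (-d, elem_id, copy))
    return sorted((-neg_d, i) for neg_d, i, _ in heap), copies


stored = {1: [0, 0], 2: [1, 0], 3: [5, 5], 4: [0.5, 0]}
nearest, copies = select_nearest([0, 0], stored, ef=2)
print(nearest, copies)
```

In this toy run, element 3 is rejected before any copy is made, so only three of the four vectors are copied.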
Merged the branch in the commit above. There still may be some room to tune the memory context, but this should get most of the benefit from my testing.