Comments (5)
@saulius I'm going to take some vacation shortly but when I get back I'll have a go at that change.
from croaring-rs.
Hi @e-dard,
I haven't run any benchmarks, but I still think we can expect this behaviour. The reason being that by materializing Bitmap
as a vector you allocate a chunk of linear memory for all Bitmap's values. It's hard or even impossible to beat iteration through linear memory which may as well even fit in one of your CPU caches. The obvious trade off here is of course that you have to allocate such amount of memory upfront. However, with Roaring Bitmap's iterator we just allocate a small iterator instance and keep a single bitmap index value in memory throughout iteration. So this requires constant, small amount of memory. Moreover, doing iteration requires at least a few additional function calls like iterator.current_value
and iterator.has_value
which of course have an impact on iteration performance as well. So you have a memory/performance trade off here. Hope this helps!
from croaring-rs.
That explanation makes a lot of sense to me. One suggestion I have is to provide an API that would allow one to provide a buffer to write the contents of the bitmap into.
So, as well as to_vec()
there would also be into_vec(&mut buf: Vec<u32>)
or something similar.
Do you think that would be a good change?
from croaring-rs.
That would definitely be a good change. croaring
has roaring_read_uint32_iterator
function for exactly this use case and it's definitely faster than unbounded iteration:
./iteration_benchmark ../../benchmarks/realdata/census1881/*
Data:
cardinality: 988653
Cycles/element:
roaring_advance_uint32_iterator(): 14.492519 13.798000 13.844965 13.889676 13.953517
roaring_read_uint32_iterator(bufsize=1): 19.190871 19.074035 20.081280 18.888611 19.243656
roaring_read_uint32_iterator(bufsize=4): 8.188877 8.147134 8.211180 7.934367 8.095384
roaring_read_uint32_iterator(bufsize=16): 5.059551 4.899865 5.011902 4.999459 5.105984
roaring_read_uint32_iterator(bufsize=128): 3.762317 3.737760 3.823841 3.847504 3.778094
roaring_read_uint32_iterator(bufsize=1024): 3.629963 3.755138 3.880000 3.829131 3.664821
from croaring-rs.
Please try https://docs.rs/croaring/latest/croaring/bitmap/struct.BitmapIterator.html#method.next_many in croaring
0.5.2
.
from croaring-rs.
Related Issues (20)
- Does it support java Roaring64NavigableMap? HOT 3
- Support for native serialization format HOT 1
- Possible Breaking changes to make HOT 7
- croaring-rs in RoaringBitmap organisation HOT 2
- Build failed HOT 5
- macOS Sonoma builds fail HOT 2
- Unable to build on macOS HOT 1
- Treemap::andnot_inplace clears the treemap when comparing to empty treemap HOT 2
- Expose roaring_read_uint32_iterator safely HOT 2
- Serialization always starts with same bytes HOT 1
- Consider supporting a later version of the bindgen HOT 1
- Deserialize on invalid bytes produces invalid Bitmap HOT 1
- Make -march=native entirely optional to the build in croaring-sys HOT 1
- failed to build on macOS 11.0.1 (Big Sur) HOT 2
- Failed to build on Apple Silicon HOT 1
- How to manually open AVX2 support in CRoaring? HOT 1
- build failing on musl libc HOT 1
- Bundle pre-bindgen'd bindings for -sys package HOT 2
- Support for frozen methods
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from croaring-rs.