Code Monkey home page Code Monkey logo

Comments (5)

e-dard avatar e-dard commented on May 23, 2024 1

@saulius I'm going to take some vacation shortly but when I get back I'll have a go at that change.

from croaring-rs.

saulius avatar saulius commented on May 23, 2024

Hi @e-dard,

I haven't run any benchmarks, but I still think we can expect this behaviour. The reason being that by materializing Bitmap as a vector you allocate a chunk of linear memory for all Bitmap's values. It's hard or even impossible to beat iteration through linear memory which may as well even fit in one of your CPU caches. The obvious trade off here is of course that you have to allocate such amount of memory upfront. However, with Roaring Bitmap's iterator we just allocate a small iterator instance and keep a single bitmap index value in memory throughout iteration. So this requires constant, small amount of memory. Moreover, doing iteration requires at least a few additional function calls like iterator.current_value and iterator.has_value which of course have an impact on iteration performance as well. So you have a memory/performance trade off here. Hope this helps!

from croaring-rs.

e-dard avatar e-dard commented on May 23, 2024

That explanation makes a lot of sense to me. One suggestion I have is to provide an API that would allow one to provide a buffer to write the contents of the bitmap into.

So, as well as to_vec() there would also be into_vec(&mut buf: Vec<u32>) or something similar.

Do you think that would be a good change?

from croaring-rs.

saulius avatar saulius commented on May 23, 2024

That would definitely be a good change. croaring has roaring_read_uint32_iterator function for exactly this use case and it's definitely faster than unbounded iteration:

./iteration_benchmark ../../benchmarks/realdata/census1881/*
Data:
  cardinality: 988653
Cycles/element:
  roaring_advance_uint32_iterator(): 14.492519 13.798000 13.844965 13.889676 13.953517
  roaring_read_uint32_iterator(bufsize=1): 19.190871 19.074035 20.081280 18.888611 19.243656
  roaring_read_uint32_iterator(bufsize=4): 8.188877 8.147134 8.211180 7.934367 8.095384
  roaring_read_uint32_iterator(bufsize=16): 5.059551 4.899865 5.011902 4.999459 5.105984
  roaring_read_uint32_iterator(bufsize=128): 3.762317 3.737760 3.823841 3.847504 3.778094
  roaring_read_uint32_iterator(bufsize=1024): 3.629963 3.755138 3.880000 3.829131 3.664821

from croaring-rs.

saulius avatar saulius commented on May 23, 2024

Please try https://docs.rs/croaring/latest/croaring/bitmap/struct.BitmapIterator.html#method.next_many in croaring 0.5.2.

from croaring-rs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.