Code Monkey home page Code Monkey logo

rangemap's Introduction

rangemap

Crate Docs Build status Rust

RangeMap and RangeInclusiveMap are map data structures whose keys are stored as ranges. Contiguous and overlapping ranges that map to the same value are coalesced into a single range.

Corresponding RangeSet and RangeInclusiveSet structures are also provided.

Different kinds of ranges

RangeMap and RangeInclusiveMap correspond to the Range and RangeInclusive types from the standard library respectively. For some applications the choice of range type may be obvious, or even be dictated by pre-existing design decisions. For other applications the choice may seem arbitrary, and be guided instead by convenience or aesthetic preference.

If the choice is not obvious in your case, consider these differences:

  • If your key type K represents points on a continuum (e.g. f64), and the choice of which of two adjacent ranges "owns" the value where they touch is largely arbitrary, then it may be more natural to work with half-open Ranges like 0.0..1.0 and 1.0..2.0. If you were to use closed RangeInclusives here instead, then to represent two such adjacent ranges you would need to subtract some infinitesimal (which may depend, as it does in the case of f64, on the specific value of K) from the end of the earlier range. (See the last point below for more on this problem.)
  • If you need to represent ranges that include the maximum value in the key domain (e.g. 255u8) then you will probably want to use RangeInclusives like 128u8..=255u8. Sometimes it may be possible to instead work around this by using a wider key type than the values you are actually trying to represent (K=u16 even though you are only trying to represent ranges covering u8) but in these cases the key domain often represents discrete objects rather than points on a continuum, and so RangeInclusive may be a more natural way to express these ranges anyway.
  • If you are using RangeInclusive, then it must be possible to define successor and predecessor functions for your key type K, because adjacent ranges can not be detected (and thereby coalesced) simply by testing their ends for equality. For key types that represent points on a continuum, defining these functions may be awkward and error-prone. For key types that represent discrete objects, this is usually much more straightforward.

Example: use with Chrono

use chrono::offset::TimeZone;
use chrono::{Duration, Utc};
use rangemap::RangeMap;

fn main() {
    let people = ["Alice", "Bob", "Carol"];
    let mut roster = RangeMap::new();

    // Set up initial roster.
    let start_of_roster = Utc.ymd(2019, 1, 7);
    let mut week_start = start_of_roster;
    for _ in 0..3 {
        for person in &people {
            let next_week = week_start + Duration::weeks(1);
            roster.insert(week_start..next_week, person);
            week_start = next_week;
        }
    }

    // Bob is covering Alice's second shift (the fourth shift overall).
    let fourth_shift_start = start_of_roster + Duration::weeks(3);
    let fourth_shift_end = fourth_shift_start + Duration::weeks(1);
    roster.insert(fourth_shift_start..fourth_shift_end, &"Bob");

    for (range, person) in roster.iter() {
        println!("{} ({}): {}", range.start, range.end - range.start, person);
    }

    // Output:
    // 2019-01-07UTC (P7D): Alice
    // 2019-01-14UTC (P7D): Bob
    // 2019-01-21UTC (P7D): Carol
    // 2019-01-28UTC (P14D): Bob
    // 2019-02-11UTC (P7D): Carol
    // 2019-02-18UTC (P7D): Alice
    // 2019-02-25UTC (P7D): Bob
    // 2019-03-04UTC (P7D): Carol
}

Crate features

By default this crate has no dependencies on other crates.

If you enable the serde1 feature it will introduce a dependency on the serde crate and provide Serialize and Deserialize implementations for all map and set types in this crate.

You can enable the serde1 feature in your Cargo.toml file like so:

[dependencies]
rangemap = { version = "1", features = ["serde1"] }

Building without the Rust standard library

This crate can work without the full standard library available (e.g. when running on bare metal without an operating system) but relies on the presence of a global allocator — i.e. it links the core and alloc crates, but not std.

Presently there is no functionality in the crate that require the standard library. Such functionality will likely be introduced in the future, and will be gated behind a default-on std feature.

See The Rust Programming Language book for general information about operating without the standard library.

rangemap's People

Contributors

bors[bot] avatar ecclarke42 avatar eliageretto avatar genusistimelord avatar h33p avatar hecrj avatar jasonwhite avatar jeffparsons avatar kevinaboos avatar luukvanderduim avatar modprog avatar nmathewson avatar xfbs avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

rangemap's Issues

`Eq` requirement for `V`

Is Eq really needed for the value type? Std maps like HashMap and BTreeMap does not pose any assumptions on the values except when implementing PartialEq and Eq, respectively.

RangeMap::overlapping_mut?

Thanks for this crate! Also thanks for adding RangeMap::overlapping recently; this is super useful.

Do you see any reason why we cannot also add RangeMap::overlapping_mut to mutate the values? BTreeMap has range_mut, so it should be easy to implement. I can't see how this would break any of the internal invariants of splitting or coalescing. As of now, the only way to mutate a value is by inserting a new value. This is quite cumbersome when the value is a Vec<_> that we might want to append to, for example.

If this seems reasonable, I'm happy to implement it.

Expose `len` and `is_empty` methods?

It would be great it rangemap could expose len and is_empty methods from underlying BTreeMap. But ideally get to the feature parity with BTreeMap itself if possible

Implement `From<[_; N]>` for range sets and maps

It would be nice if we implemented From<[T; N]> for the set types and From<[(K, V); N]> for the map types.

This would allow simplification of some code and easy initialization of maps and ranges from static data.

Missing contains_keys_in_range and get_in_range

Hello,

I've just discovered your crate and it suits all my needs but one. I need to know whether the maps (in my case) contains any key in a provided range.

For example:

let mut map = RangeMap::new();

// Fill map.
map.insert(10..20, "a");
map.insert(50..70, "b");

// Checks.
map.contains_keys_in_range(5..9); // false
map.contains_keys_in_range(5..35); // true
map.contains_kesy_in_range(20..45); // false

Also, it would be very useful to retrieve all elements that collide with the range:

let mut map = RangeMap::new();

// Fill map.
map.insert(10..20, "a");
map.insert(50..70, "b");

// Get.
map.get_in_range(5..9); // []
map.get_in_range(5..35); // ["a"]
map.get_in_range(20..45); // []

Note: feel free to change the method names.

Use K: RangeTrait instead of Range<T> as a key.

I would like to use my own struct as a range so I can keep the methods I have implemented on it.

struct MySpecialRange(...);
impl MySpecialRange {
    fn my_special_function(){}
}

let range_set = RangeSet<MySpecialRange>::new();

Unfortunately, at the moment RangeMap is implemented using core::ops::Range which is a struct so this is not possible.
Instead we could make RangeSet generic and bounded by RangeTrait where RangeTrait provides the same functionality as core::ops::Range.
This would be mean you could use any type that implemented RangeTrait as a key and my use-case above would work, given I implemented RangeTrait on it.

Add benchmarks

Not urgent in the short term, but essential if experimenting with different internal storage or other optimizations.

Serialize like HashMap

Currently, RangeMap and RangeInclusiveMap serialize (using serde) like

((key1start, key1end), value1),
((key2start, key2end), value2),
.
.
((keyNstart, keyNend), valueN)

However, HashMap serializes like

key1:value1,
key2:value2,
.
.
keyN:valueN

Since RangeMap and RangeInclusiveMap are analogous to HashMap, their serialization should preferably be consistent with the convention followed in the implementations coded within the serde crate.

I propose that the serialization be changed to

(key1start,key1end):value1,
(key2start,key2end):value2,
.
.
(keyNstart,keyNend):valueN

which will make it consistent with the convention followed in the serde crate.

(I understand that this will be a breaking change and will probably come with a major version change)

Conform to borrowing convention from `std`

BTreeMap has signatures like...

    pub fn get<Q: ?Sized>(&self, key: &Q) -> Option<&V>
        where K: Borrow<Q>,
              Q: Ord

Using Borrow in RangeMap and RangeSet would be better than the plain borrows it has now.

some possible inefficiences

rangemap

This is one my favorate data structures.
Thank you for implementing it.

I believe ranges are under utilized in practise in general.

I implemented the "same" thing in C++ years ago.

My implementation has some efficiency gaps and I think yours does too.

I'm not good at Rust, but when I went to reimplement
my C++ open and perfected, I ran into, something
I think you "messed up" too.

Specifically I think there need not be any linear searches.

Including here:

https://github.com/jeffparsons/rangemap/blob/master/src/map.rs#L155

    while let Some((stored_range_start_wrapper, stored_value)) = self

while I have not finished my code, I believe you can limit yourself
to two binary searches -- for the start and the end.

Also here, I think your implementation sometimes does an insert
where it can reuse an existing range.

I realize this structure has surprisingly many cases to handle and
making it perfect is not trivial.

e.g. you should avoid this:

// Delete the stored range, and then add back between

    // Delete the stored range, and then add back between
    // 0 and 2 subranges at the ends of the range to insert.
    self.btm.remove(&stored_range_start_wrapper);

Can we reduce dev-dependencies?

The dependency on skeptic is pulling in a huge number of dependencies.

It would be very nice if this could be configured away somehow.

I know it doesn't affect the binary, but in a security-conscious organisation anything admitted onto build-servers needs to pass a security review, and the "skeptic"-dependency pulls in tens of crates to verify.

Observing range reductions and splits as the result of an insertion

Motivation

I have a use-case where I need to be able to observe and mutate the ranges that:

  1. Are the result of a split, or
  2. whose size is reduced.

That is, I don't care about the ranges that are completely replaced by the new one, only the ones that get reduced in size or are split in half. This would be useful to ensure that the ranges in the map always have a unique value. This is also useful to entirely avoid the V: Clone requirement for insertions (.clone() may not have enough context for the type).

As far as I can tell, there isn't a way to do this efficiently (either before or after the insertion). Happy to be proven wrong on this!

API

I admit this might be a pretty niche use-case, but if you think it has a spot in this crate, then here's a possible interface for it:

pub enum MaybeSplit<'a, T> {
    /// The range needs to be split. Give a reference to one side of the split,
    /// so it can be cloned.
    Split(&'a T),
    /// The range size needs to be reduced. Allow the user to get ownership and
    /// remap the value.
    Reduced(T),
}

impl<K, V> RangeMap<K, V>
where
    K: Ord + Clone,
    V: Eq,
{
    pub fn insert_with<F>(&mut self, range: Range<K>, value: V, f: F)
    where
        F: FnMut(MaybeSplit<V>) -> V,
    {
        todo!()
    }
}

Here, f would be called if the insertion would trigger one of the two scenarios above. f would take the range's value and map it to a new value. Given the circumstances where it might be called, I believe it can only be called up to 2 times.

RangeMap::insert could also be implemented in terms of insert_with, where f would just clone in the MaybeSplit::Split case.

Example

let mut map: RangeMap<u32, u32> = RangeMap::new();
let mut id = 0;

map.insert(0..10, id);
id += 1;
map.insert(10..20, id);
id += 1;

map.insert_with(8..12, id, |v| {
    match v {
        // Give a new ID to ranges that get split.
        MaybeSplit::Split(_) => {
            let new_id = id;
            id += 1;
            new_id
        }
        // Keep the same value for reduced ranges.
        MaybeSplit::Reduced(v) => v,
    }
});
id += 1;

I'm definitely open to feedback on this interface. I think insertions/deletions are the only legitimate case where V needs Clone, so maybe this could also be a way to avoid that requirement. A similar interface could be implemented for RangeMap::remove.

As always, I'm happy to implement all of this if this seems reasonable.

cannot insert a range that the start equal with the end

in src/map.rs

/// Panics if range `start >= end`.
    pub fn insert(&mut self, range: Range<K>, value: V) {
        // We don't want to have to make empty ranges make sense;
        // they don't represent anything meaningful in this structure.
        assert!(range.start < range.end);

I think we should allow the range params that the start equal with the end, it's meaningful in real business scence,
would you mind I change the code for this?

`DoubleEndedIterator` for rangemap iterators

This is mentioned in #74, but I figured it make sense to track this in it's own issue.

It would be quite nice to support DoubleEndedIterator for the rangemap iterator types. This allows for walking over the ranges within the sets or the range-value tuples of the maps in reverse order.

Cannot insert a range that includes the maximum value of a given key type

I want to use rangemap using u16 indices, where the multiple entries in the map are expected to fill the full 0-65535 range, but since insert() only accepts an exclusive range, and 65536 is out of range for my index type (K), it is impossible to create an entry that spans the full range. I am forced to use a different type, which is error-prone, or be off by one.

Easiest solution would be to change all uses of Range with RangeInclusive and bump the minor version.

Flesh out README

Describe what's in the box, and why you might want it.

  • The types provided
  • The basics of how they're implemented (BTreeMap)
  • Basic example with numbers
  • More complex example with DateTime from Chrono
  • References to similar libraries
  • Minimum Rust version — find way to keep in sync with CI

Fix CI

Currently, the CI system is broken (perhaps due to a new dependency?). We should get this fixed, I think bumping the MSRV to 1.63 should do the trick.

`gaps` documentation is ambiguous w/r/t empty ranges

The documentation says:

    /// Gets an iterator over all the maximally-sized ranges
    /// contained in `outer_range` that are not covered by
    /// any range stored in the map.

This indicates that gaps will find holes in the range-map that intersect outer_range... We had an engineer check if a range was completely covered by the RangeMap by if range_map.gaps(&check_range).next().is_none() {...}, which seems like it should work (based on the documentation), but broke in our case, because the gaps iterator can return empty ranges:

    #[test]
    fn adjacent_no_coalesce() {
        let mut range_map: RangeMap<u32, bool> = RangeMap::new();
        range_map.insert(2..5, false);
        range_map.insert(5..8, true);
        let outer_range = 2..8;
        let mut gaps = range_map.gaps(&outer_range);
        // Should yield an empty range.
        let gap = gaps.next();
        assert_eq!(gap, Some(5..5));
        assert_eq!(gap.unwrap().is_empty(), true);
    }

I think the current behavior is good, but the documentation could be improved to better indicate that gaps can return empty ranges.

Thoughts on coalescing and mutability

At the moment the only way to change the values in the map is to insert new ones. There is no way to update existing ones. I understand that this has to do with the coalescing behaviour; if you change something, you have to check afterwards if it can be merged with an adjacent range. But having no mutable operations also makes some tasks really cumbersome.

In my particular case, I want to make a specific change to all items within a range, filling in gaps and splitting existing ranges at either end if necessary. To do that currently, I have to clone the items from the specified range, store them somewhere else, change them, then insert them back in. That temporary storage also needs an allocation, which is something I'd rather avoid as this is performance sensitive code.

Range coalescing can be useful for speed, but in my case above with a mutation-heavy workload, that speed benefit may well be lost. It also has the drawback that values need to be Eq, and comparing values may be costly. I therefore propose an alternative:

  • No coalescing is done anymore. Two touching ranges can have the same value. Eq is no longer needed.
  • Mutable operations can now be added since there is no invariant to uphold.
  • Add a split method to split a range in two at a given key.
  • Add a merge method that does the coalescing manually. This method would have an Eq bound on the value.

This would give the user control over the coalescing behaviour, and they can decide to use it or not, or use it less frequently. It would also help with my case; I could now call split at the ends of the range, then call range_mut (to be written) to mutate the ranges in place.

I'm willing to implement these changes if wanted.

[Feature Request] allow optional "no coalescing"

I'm trying to use rangemap::RangeSet (exactly what I was looking for, thanks!).
But in my case, I don't want neighboring ("adjacent") ranges to coalesce into a single range.

From the docs

If the inserted range either overlaps or is immediately adjacent any existing range, then the ranges will be coalesced into a single contiguous range.

For me, it would help if I could "turn off coalescing" (or "turn off adjacent coalescing") for some rangemap instances (RangeMap, RangeSet, etc.).

Ideally, this would be set at initialization time.

Another option is to allow something like my_rangemap.coalesce = false but require it be done before any modifying operations like insert, otherwise panic!.

Thanks for creating this crate!

gaps return empty ranges

version = "1.0.1"

Gaps returns empty ranges for this simple case:

let mut ss = RangeInclusiveMap::new();
ss.insert(0u32..=0u32, 0u32);
ss.insert(1u32..=1u32, 1u32);

for v in ss.iter() {
    log::error!("in {:?}", v)
}

for v in ss.gaps(&(0u32..=u32::MAX)) {
    log::error!("gap {:?} empty: {}", v, v.is_empty())
}

output:

[2022-04-14T09:27:26Z ERROR shine_crde::operations] in (0..=0, 0)
[2022-04-14T09:27:26Z ERROR shine_crde::operations] in (1..=1, 1)
[2022-04-14T09:27:26Z ERROR shine_crde::operations] gap 1..=0 empty: true             <---- I'd not expect this response
[2022-04-14T09:27:26Z ERROR shine_crde::operations] gap 2..=4294967295 empty: false

`rangemap!` macro

Thank you for maintaining this library!

I tried to create a literal RangeMap using this example:

use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};

macro_rules! collection {
    // map-like
    ($($k:expr => $v:expr),* $(,)?) => {{
        core::convert::From::from([$(($k, $v),)*])
    }};
    // set-like
    ($($v:expr),* $(,)?) => {{
        core::convert::From::from([$($v,)*])
    }};
}

fn main() {
    let s: BTreeSet<_> = collection! { 1, 2, 3 };
    println!("{:?}", s);

    let s: BTreeMap<_, _> = collection! { 1 => 2, 3 => 4 };
    println!("{:?}", s);
}

Unfortunately, the same trick does not work for RangeMap or RangeSet:

54 |         core::convert::From::from([$(($k, $v),)*])
   |         ------------------------- ^^^^^^^^^^^^^^^ the trait `From<[(std::ops::Range<usize>, Color); 1]>` is not implemented for `RangeMap<usize, Color>`

If you decide that you want to implement a rangemap! macro similar to vec!, smallvec!, I reccommend that you:

  • Implement From<[Range<K>, V; N]> for Range* types
  • Implement and export corresponding macros

I understand that you may be busy, so I may be able to help with implementing this feature.

Alternatives for the end user

In the meanwhile, this code works for me:

macro_rules! rangemap_lit {
    ($($k:expr => $v:expr),* $(,)?) => {{
        RangeMap::from_iter([$(($k, $v),)*])
    }};
}

Let's have panics with explanatory messages

Firstly, thank you for this crate!

Secondly, a cosmetic programmer-experience improvement:
The documentation clearly states that inserting an empty range into a RangeMap results in panic.
And it is checked with assert!().

It would be nice if the asserts get some messages, that explain that inserting an empty range is forbidden.
This would make debugging faster and understanding panics easier.

License files should be included in the repository

The Cargo.toml lists this as licensed under both MIT and Apache-2.0 terms, but only LICENSE-MIT is included in the repository. Can either the Cargo.toml be updated, or else add a LICENSE-APACHE file?

RangeInclusiveMap does not reliably coalesce contiguous ranges

Here is a simple example of a test program that doesn't work for me, using rangemap 0.1.8

fn main() {
    let mut map: RangeInclusiveMap<u16, u16, _> = RangeInclusiveMap::new();

    map.insert(99..=200, 7);
    map.insert(190..=200, 6);
    map.insert(190..=200, 7);

    dbg!(map);
}

I expect the output to be:

map = {
      99..=200: 7,
}

But instead I get:

[src/main.rs:11] map = {
    99..=189: 7,
    190..=200: 7,
}

Breaking change from 1.0.1 to 1.0.2 in gaps()

Commit 9d0b125 may have introduced a bug in another project and is likely a breaking change.

Psst (a Spotify client) is a user of RangeSet and it has a bug jpochyla/psst#301 that is evoked by the changes made in version v1.0.2 of rangemap but not by v1.0.1. Bisecting lead to 9d0b125 as the commit that introduced the problem.

In order to see what behavior changed in rangemap, I copied (~30) tests from map to set as RangeSet is used in Psst.
Then I copied all set tests to the v1.0.1 commit too and found one test that fails in v1.0.1, but passes in v1.0.2:

    #[test]
    fn empty_outer_range_with_items_away_from_both_sides() {
        let mut range_set: RangeSet<u32> = RangeSet::new();
        // 0 1 2 3 4 5 6 7 8 9
        // ◌ ◆---◇ ◌ ◌ ◌ ◌ ◌ ◌
        range_set.insert(1..3);
        // 0 1 2 3 4 5 6 7 8 9
        // ◌ ◌ ◌ ◌ ◌ ◆---◇ ◌ ◌
        range_set.insert(5..7);
        // 0 1 2 3 4 5 6 7 8 9
        // ◌ ◌ ◌ ◌ ◆ ◌ ◌ ◌ ◌ ◌
        let outer_range = 4..4;
        let mut gaps = range_set.gaps(&outer_range);
        // Should yield a gap covering the zero-width outer range.
        assert_eq!(gaps.next(), Some(4..4));
        // Gaps iterator should be fused.
        assert_eq!(gaps.next(), None);
        assert_eq!(gaps.next(), None);
    }

Which fails like so:

    ---- set::tests::empty_outer_range_with_items_away_from_both_sides stdout ----
    thread 'set::tests::empty_outer_range_with_items_away_from_both_sides' panicked at 'assertion failed: `(left == right)`
      left: `None`,
     right: `Some(4..4)`', src/set.rs:789:9
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The 1.0.2 version says, yes the empty outer range can contain a gap, being the empty range itself.
However 1.0.1 disagrees.

Looking at the definition of ranges in std::ops::Range:

The range start..end contains all values with start <= x < end. It is empty if start >= end.

I would argue that 1.0.1 is correct as when the range's start is equal to end, it contains all values start <= x < start and the 'gap' x cannot be both smaller than start and greater than or equal to start.

intersection for RangeSet and RangeInclusiveSet

This is a really awesome crate, thank you very much for implementing it!

I have a use-case where I would like to get the intersection of two RangeInclusiveSet. I think this could be done somewhat efficiently.

Implement PartialEq and Eq for RangeMap

RangeMap currently does not implement Eq or PartialEq, even if the keys and values stored in it implement those traits.

It would be nice if these traits were implemented. One thing this allows would be to store RangeMaps inside RangeMaps.

My use case is the following:

struct SymbolFile {
    functions: RangeMap<u64, Function>,
}

struct Function {
    name: String,
    lines: RangeMap<u64, SourceLine>,
}

This currently doesn't work because you can't #[derive(Eq)] on Function due to the RangeMap inside it.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.