Code Monkey home page Code Monkey logo

min-max-pqueue's Introduction

min-max-pqueue

A min-max priority queue provides efficient access to both its least element and its greatest element. Also known as double-ended priority queue.

This library provides two variants of min-max priority queues:

A min-max priority queue can be configured with a maximum size. Each time an insertion causes the queue to grow beyond the size limit, the greatest element will be automatically removed (rather than rejecting the insertion).

Their implementations are backed by Map prio (NonEmpty a) and IntMap (NonEmpty a), respectively. This means that certain operations are asymptotically more expensive than implementations backed by mutable arrays, e.g., peekMin and peekMax is O(n log n) vs. O(n), fromList is also O(n log n) vs. O(n). In a pure language like Haskell, a mutable array based implementation would be impure and need to operate inside monads. And in many applications, regardless of language, the additional time complexity would be a small or negligible price to pay to avoid destructive updates anyway.

If you only access one end of the queue (i.e., you need a regular priority queue), an implementation based on a kind of heap that is more amenable to purely functional implementations, such as binomial heap and pairing heap, is potentially more efficient. But always benchmark if performance is important; in my experience Map always wins, even for regular priority queues.

Advantages over Using Maps Directly

  • size is O(1), vs. O(n) for maps. Note that Data.Map.size is O(1) but it returns the number of keys, which is not the same as the number of elements in the queue. Data.IntMap.size, on the other hand, is O(k) where k is the number of keys.
  • A queue can have a size limit, and it is guaranteed that its size does not grow beyond the limit.
  • Many useful operations, such as takeMin, dropMin, are non-trivial to implement with Map prio (NonEmpty a) and IntMap (NonEmpty a).
  • The queue's fold operations operate on individual elements, as opposed to NonEmpty a.

Alternative Implementation

In Haskell, an alternative to the mutable array based implementation is to use immutable, general purpose arrays such as Seq. This would achieve O(1) peekMin and peekMax, but since lookup and update for Seq cost O(n log n), the cost of insert, deleteMin and deleteMax would become O(n log2 n).

A Seq-based implementation is provided for benchmarking purposes, which, as shown below, is more than an order of magnitude slower than the Map-based implementation for enqueuing and dequeuing 200,000 elements, proving that the improved time complexity of peekMin and peekMax is not worth the cost. In fact, if you perform peekMin and peekMax much more often than enqueuing and dequeuing operations, which means you perform peekMin and peekMax many times on the same queue, you should simply memoize the results.

Benchmarks

Benchmarking was done on my laptop in which 200,000 elements (which are integers) are inserted into the queue and subsequently removed one after another.

  • pq, intpq and sq represents MinMaxQueue, IntMinMaxQueue and SeqQueue.
  • asc, desc and rand represents inserting the elements in ascending, descending and random order.
  • min and max represents removing elements from the min-end and max-end.

As seen in the following result, IntMinMaxQueue is twice as fast as MinMaxQueue for integer keys, whereas SeqQueue is more than an order of magnitude slower.

benchmarking intpq-asc-min          
time                 27.15 ms   (23.85 ms .. 29.31 ms)
                     0.972 R²   (0.927 R² .. 0.997 R²)
mean                 30.84 ms   (29.67 ms .. 35.07 ms)
std dev              4.308 ms   (1.160 ms .. 7.915 ms)
variance introduced by outliers: 57% (severely inflated)

benchmarking intpq-desc-max         
time                 29.70 ms   (29.23 ms .. 30.41 ms)
                     0.998 R²   (0.995 R² .. 1.000 R²)
mean                 30.62 ms   (30.29 ms .. 31.02 ms)
std dev              803.7 μs   (548.0 μs .. 1.190 ms)

benchmarking intpq-rand-min         
time                 31.00 ms   (29.10 ms .. 33.05 ms)
                     0.985 R²   (0.973 R² .. 0.994 R²)
mean                 28.39 ms   (27.43 ms .. 29.46 ms)
std dev              2.216 ms   (1.968 ms .. 2.591 ms)
variance introduced by outliers: 32% (moderately inflated)

benchmarking intpq-rand-max         
time                 30.96 ms   (28.98 ms .. 32.96 ms)
                     0.987 R²   (0.976 R² .. 0.996 R²)
mean                 33.66 ms   (32.71 ms .. 34.49 ms)
std dev              1.820 ms   (1.473 ms .. 2.388 ms)
variance introduced by outliers: 18% (moderately inflated)

benchmarking pq-asc-min             
time                 69.02 ms   (61.94 ms .. 72.95 ms)
                     0.988 R²   (0.968 R² .. 0.997 R²)
mean                 71.41 ms   (68.99 ms .. 74.35 ms)
std dev              4.799 ms   (3.401 ms .. 6.974 ms)
variance introduced by outliers: 17% (moderately inflated)

benchmarking pq-desc-max            
time                 80.90 ms   (78.68 ms .. 85.06 ms)
                     0.997 R²   (0.994 R² .. 0.999 R²)
mean                 83.20 ms   (80.91 ms .. 89.15 ms)
std dev              5.853 ms   (2.234 ms .. 9.957 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking pq-rand-min            
time                 65.80 ms   (60.01 ms .. 69.62 ms)
                     0.987 R²   (0.965 R² .. 0.996 R²)
mean                 74.17 ms   (70.93 ms .. 79.86 ms)
std dev              7.495 ms   (4.557 ms .. 12.39 ms)
variance introduced by outliers: 35% (moderately inflated)

benchmarking pq-rand-max            
time                 68.29 ms   (65.07 ms .. 70.84 ms)
                     0.997 R²   (0.995 R² .. 1.000 R²)
mean                 74.03 ms   (71.64 ms .. 77.51 ms)
std dev              5.016 ms   (3.110 ms .. 7.556 ms)
variance introduced by outliers: 17% (moderately inflated)

benchmarking sq-asc-min             
time                 1.954 s    (1.369 s .. 2.838 s)
                     0.971 R²   (NaN R² .. 1.000 R²)
mean                 1.733 s    (1.592 s .. 1.861 s)
std dev              160.1 ms   (28.78 ms .. 203.2 ms)
variance introduced by outliers: 22% (moderately inflated)

benchmarking sq-rand-min            
time                 2.889 s    (2.000 s .. 3.658 s)
                     0.989 R²   (0.959 R² .. 1.000 R²)
mean                 2.915 s    (2.828 s .. 3.054 s)
std dev              130.7 ms   (442.1 μs .. 159.9 ms)
variance introduced by outliers: 19% (moderately inflated)

min-max-pqueue's People

Contributors

bodigrim avatar zliu41 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

min-max-pqueue's Issues

Consider a finger tree-based implementation

For performance reasons, this should probably copy ideas and some (modified) code from the fingertree package, but not use its actual FingerTree type. Here's the general idea:

data MinMax k = MinMax
  { smallest :: !k
  , largest :: !k }

instance Ord k => Semigroup (MinMax k) where
  MinMax min1 max1 <> MinMax min2 max2
    = MinMax (min min1 min2) (max max1 max2)

data Entry k v = Entry !k v

data Digit a = One !a | Two !a !a | Three !a !a !a | Four !a !a !a !a
data Node k a
  = Node2 {-# UNPACK #-} !(MinMax k) !a !a
  | Node3 {-# UNPACK #-} !(MinMax k) !a !a !a
data FingerTree k a
  = Single !a
  | Deep {-# UNPACK #-} !(MinMax k) !(Digit a) (FingerTree k (Node k a)) !(Digit a)

-- Cache the values associated with the smallest
-- and largest keys so the smallest and largest entries
-- can be viewed in constant time. This is optional.
data DEPQ k v
  = Empty
  | DEPQ
      { valMin :: v
      , valMax :: v
      , FingerTree k (Entry k v) }

The code for meld can be copied from the code for >< in fingertree, but using <> instead of mappend and being stricter in the spine than what that package currently does (it will probably be fixed soon). The code for minView and maxView should be similar to the code for Data.Sequence.deleteAt, but using the search procedure in fingertree.

This implementation should offer:

  • insert: persistently amortized O(1); worst-case O(log n).
  • minView, maxView: O(log n)
  • meld: O(log n)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.