Code Monkey home page Code Monkey logo

pqdict's People

Contributors

bitdeli-chef avatar dependabot[bot] avatar dkavraal avatar espdev avatar nvictus avatar rocky avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

pqdict's Issues

pqdict.copy() causes slow/un-necessary heapsort of all the data.

Hey,

I noticed that when I use pqdict.copy() to copy over a pqdict, it's abnormally slow.

I was able to track it down to the fact that copy() calls __init__, and it does this:

        if data is not None:
            self.update(data)
        self.heapify()

Which re-heapify everything, causing it to go through the whole data structure. But we already have a heapified pqdict, so it's not necessary.

Instead you should just copy over the state (_heap and _position).

Here is some benchmarking/test script I created to show the problem:

import pqdict
import random
import timeit
import copy

def get_dict():
    a = pqdict.pqdict()
    for i in range(10_000):
        a[random.random()] = random.random()
    return a

def copy_dict(a):
    return a.copy()

def copy_dict_test(a):
    new_dict = pqdict.pqdict()
    new_dict._heap = a._heap[:]
    new_dict._position = a._position.copy()
    return new_dict

def consume_dict(a):
    while a:
        assert a.popitem()[1]

a = get_dict()

print("Time it takes to copy using pqdict.copy():")
print(timeit.timeit(lambda: copy_dict(a), number=100))
print("Time it takes to copy by copying _heap and _position:")
print(timeit.timeit(lambda: copy_dict_test(a), number=100))

print("Time it takes to create the pqdict:")
print(timeit.timeit(lambda: get_dict(), number=100))
print("Time it takes to consume the pqdict after pqdict.copy():")
print(timeit.timeit(lambda: consume_dict(copy_dict(a)), number=100))
print("Time it takes to consume the pqdict after copy of _heap and _position:")
print(timeit.timeit(lambda: consume_dict(copy_dict_test(a)), number=100))

Would it be possible to re-implement the copy() method? I can make a PR if you would like.

Position of element in queue

I'm curious if you have any ideas of the best way of finding the relative position of an element in the queue that you wrote here.

A naive way would be to use the function nlargest, iterate through that list and keep track of the position as you iterate through the list. This would be problematic if n items is big. Can you think of better approach?

Squelch a 3.7+ Decprecation warning (and breakage in 3.9)

I really like this package! When I use it from 3.7+ I get the Decprecation message:

 ...site-packages/pqdict/__init__.py:37: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
    from collections import MutableMapping as _MutableMapping

I'll put in a PR for this soon.

Add typing support?

Hi,

Thanks a lot for pqdict! It's very helpful!

I'm using mypy in strict mode, and created this stub file. Would you like to add it?

I guess that an even better exercise would be to add typing, but then you'd lose support for python 2.

Thanks!
Noam

from typing import TypeVar, Callable, Any, Iterable, Self, Iterator, Mapping
from collections.abc import MutableMapping

K = TypeVar("K")
V = TypeVar("V")

class pqdict(MutableMapping[K, V]):
    def __init__(
        self,
        data: Any = ...,
        key: Callable[[V], Any] | None = ...,
        reverse: bool = ...,
        precedes: Callable[[K, K], Any] = ...,
    ) -> None: ...
    @property
    def precedes(self) -> Callable[[K, K], Any]: ...
    @property
    def keyfn(self) -> Callable[[V], Any] | None: ...
    @classmethod
    def fromkeys(cls, iterable: Iterable[Any], value: Any, **kwargs: Any) -> Self: ...
    def __len__(self) -> int: ...
    def __contains__(self, key: object) -> bool: ...
    def __iter__(self) -> Iterator[K]: ...
    def __getitem__(self, key: K) -> V: ...
    def __setitem__(self, key: K, value: V) -> None: ...
    def __delitem__(self, key: K) -> None: ...
    def copy(self) -> Self: ...
    def top(self) -> K: ...
    def popitem(self) -> tuple[K, V]: ...
    def topitem(self) -> tuple[K, V]: ...
    def additem(self, key: K, value: V) -> None: ...
    def pushpopitem(self, key: K, value: V) -> tuple[K, V]: ...
    def updateitem(self, key: K, new_val: V) -> None: ...
    def replace_key(self, key: K, new_key: K) -> None: ...
    def swap_priority(self, key1: K, key2: K) -> None: ...
    def popkeys(self) -> Iterator[K]: ...
    def popvalues(self) -> Iterator[V]: ...
    def popitems(self) -> Iterator[tuple[K, V]]: ...
    def heapify(self, key: K = ...) -> None: ...
    def topvalue(self, default: V = ...) -> V: ...

PQDict = pqdict

def minpq(*args: Any, **kwargs: Any) -> pqdict[Any, Any]: ...
def maxpq(*args: Any, **kwargs: Any) -> pqdict[Any, Any]: ...
def nlargest(n: int, mapping: Mapping[K, V], key: Callable[[V], Any] | None = ...) -> list[K]: ...
def nsmallest(n: int, mapping: Mapping[K, V], key: Callable[[V], Any] | None = ...) -> list[K]: ...

Key as Priority Element Instead of Value

Is it possible that in the 'key' parameter, I specify the keys of the dictionary as the priority function?

Example:
a_dict = {'b': 1, 'a': 3 , 'c': 2}

Instead of:
my_pqdict = {'b': 1, 'c': 2, 'a': 3} ... ordered by value

I obtain this:
my_pqdict = {'a': 3, 'b': 1, 'c': 2} ... ordered by key

It would really be helpful if the priority queue dictionary can do so.

Adding a function topvalue() in pqdict

Hi,

Thanks for creating this simple yet useful library.

While using the lib, I find that sometimes we need to find the value of pq[pq.top()] or pq.topitem()[1] where pq
is non-empty maxpq.

In general, I think, It will be useful to add a function .topvalue() in pqdict API.

Let me know your thoughts, and if you are okay, I will create a PR for this.

Thanks

Tips for Thread Safety?

Hi, I'm considering using this data struct for a project. The exposed interfaces (heap queue + dictionary access) and implementation are appealing, but it looks like it may not be thread-safe.

Any suggestions on what it would take to implement thread-safe behavior for the pqdict object? I plan to have mostly multiple writers at any given time. I might have to introduce a lock variable and add it to the main methods in the class whenever the items are added or removed.

Massive test failures when used from Python 3.8

I am getting these failures when used from Python 3.8.3:

$ nosetests test
EE......E.FE.E.EE.E.EEEEEEEEEEEEE.EEEEEEEEEEEEEEEEEEEE.EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE.EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE.EEEEEEEEEEEEEEEEEE.....SSSSSSSSSSSSSSSS.....................................S..............  C-c C-cS
======================================================================
ERROR: test.support.testresult.get_test_runner_class
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/rocky/.pyenv/versions/3.8.3/lib/python3.8/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
TypeError: get_test_runner_class() missing 1 required positional argument: 'verbosity'

======================================================================
ERROR: test.support.testresult.get_test_runner
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/rocky/.pyenv/versions/3.8.3/lib/python3.8/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
TypeError: get_test_runner() missing 2 required positional arguments: 'stream' and 'verbosity'

======================================================================
ERROR: test.audit-tests.test_open
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/rocky/.pyenv/versions/3.8.3/lib/python3.8/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/home/rocky/.pyenv/versions/3.8.3/lib/python3.8/test/audit-tests.py", line 201, in test_open
    (open, sys.argv[2], "r"),
IndexError: list index out of range
-------------------- >> begin captured stdout << ---------------------
sys.addaudithook 139704741433248
... [hundreds of other lines]

I note however that when i use this package from 3.8.3 things work fine.

If you'd like me to investigate and suggest a fix/PR, let me know.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.