Code Monkey home page Code Monkey logo

Comments (6)

palkeo avatar palkeo commented on June 11, 2024

And here is the output of the script on my machine:

Time it takes to copy using pqdict.copy():
2.7107771890005097
Time it takes to copy by copying _heap and _position:
0.017738743001245894

from pqdict.

palkeo avatar palkeo commented on June 11, 2024

Hmm I see my proposal is wrong, as the copy has Node objects pointing back to the original object, so adding/deleting elements would work, but updating them still changes the original.
So you would need to create a copy of the Node objects. That would still be much faster than doing a heapify() of everything, I imagine.
I will send a PR implementing that.

from pqdict.

palkeo avatar palkeo commented on June 11, 2024

After my PR, here is the output of the script:

Time it takes to copy using pqdict.copy():
0.3347427129992866
Time it takes to copy by copying _heap and _position:
0.02496067000174662

So it's a much smaller speedup than I though, if I do it properly by copying the Node objects, but still nearly 10x speedup compared to the original.

from pqdict.

palkeo avatar palkeo commented on June 11, 2024

And I sent another PR, reducing copy time to what I think was originally possible, by making the internal nodes immutable, which I think is pretty neat.

from pqdict.

nvictus avatar nvictus commented on June 11, 2024

@palkeo, thank you for spotting this, benchmarking it and submitting two solutions!

As for the immutable nodes solution, pqdict was originally designed with very frequent priority update patterns in mind, so my one concern is whether switching to immutables adds any significant overhead caused by creating and garbage collecting immutable nodes when updating existing keys.

On the other hand, I see no objection to the first solution, so I can probably merge that one right away and leave the immutable strategy open for further exploration.

from pqdict.

palkeo avatar palkeo commented on June 11, 2024

Thanks for looking into it!

Here is a test script, for your question above:

import pqdict
import random
import timeit
import copy

random.seed(1298472)

def get_dict():
    a = pqdict.pqdict()
    for i in range(10_000):
        a[random.random()] = random.random()
    return a

def update_dict(a):
    for key, value in a.items():
        a[key] = value + 0.001
    return a

def copy_dict(a):
    return a.copy()

def consume_dict(a):
    while a:
        assert a.popitem()[1]

a = get_dict()

print("Time it takes to create the pqdict:")
print(timeit.timeit(lambda: get_dict(), number=100))
print("Time it takes to copy using pqdict.copy():")
print(timeit.timeit(lambda: copy_dict(a), number=100))
print("Time it takes to consume the pqdict after pqdict.copy():")
print(timeit.timeit(lambda: consume_dict(copy_dict(a)), number=100))
print("Time it takes to update all keys in the dict")
print(timeit.timeit(lambda: update_dict(a), number=100))

Because copy is much faster, copying + consuming all data is faster in the immutable_node version.

However, the test with updating all the keys in the dict is about 10% slower in the new version. I think this is negligible for the benefit it adds in term of simplicity and copy performance. But I'll leave you evaluate and decide. Feel free to close this issue if you want.

Thanks!

from pqdict.

Related Issues (11)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.