Comments (6)
And here is the output of the script on my machine:
Time it takes to copy using pqdict.copy():
2.7107771890005097
Time it takes to copy by copying _heap and _position:
0.017738743001245894
from pqdict.
Hmm I see my proposal is wrong, as the copy has Node objects pointing back to the original object, so adding/deleting elements would work, but updating them still changes the original.
So you would need to create a copy of the Node objects. That would still be much faster than doing a heapify() of everything, I imagine.
I will send a PR implementing that.
from pqdict.
After my PR, here is the output of the script:
Time it takes to copy using pqdict.copy():
0.3347427129992866
Time it takes to copy by copying _heap and _position:
0.02496067000174662
So it's a much smaller speedup than I though, if I do it properly by copying the Node objects, but still nearly 10x speedup compared to the original.
from pqdict.
And I sent another PR, reducing copy time to what I think was originally possible, by making the internal nodes immutable, which I think is pretty neat.
from pqdict.
@palkeo, thank you for spotting this, benchmarking it and submitting two solutions!
As for the immutable nodes solution, pqdict was originally designed with very frequent priority update patterns in mind, so my one concern is whether switching to immutables adds any significant overhead caused by creating and garbage collecting immutable nodes when updating existing keys.
On the other hand, I see no objection to the first solution, so I can probably merge that one right away and leave the immutable strategy open for further exploration.
from pqdict.
Thanks for looking into it!
Here is a test script, for your question above:
import pqdict
import random
import timeit
import copy
random.seed(1298472)
def get_dict():
a = pqdict.pqdict()
for i in range(10_000):
a[random.random()] = random.random()
return a
def update_dict(a):
for key, value in a.items():
a[key] = value + 0.001
return a
def copy_dict(a):
return a.copy()
def consume_dict(a):
while a:
assert a.popitem()[1]
a = get_dict()
print("Time it takes to create the pqdict:")
print(timeit.timeit(lambda: get_dict(), number=100))
print("Time it takes to copy using pqdict.copy():")
print(timeit.timeit(lambda: copy_dict(a), number=100))
print("Time it takes to consume the pqdict after pqdict.copy():")
print(timeit.timeit(lambda: consume_dict(copy_dict(a)), number=100))
print("Time it takes to update all keys in the dict")
print(timeit.timeit(lambda: update_dict(a), number=100))
Because copy is much faster, copying + consuming all data is faster in the immutable_node
version.
However, the test with updating all the keys in the dict is about 10% slower in the new version. I think this is negligible for the benefit it adds in term of simplicity and copy performance. But I'll leave you evaluate and decide. Feel free to close this issue if you want.
Thanks!
from pqdict.
Related Issues (11)
- Position of element in queue HOT 1
- Key as Priority Element Instead of Value HOT 1
- Squelch a 3.7+ Decprecation warning (and breakage in 3.9) HOT 1
- Massive test failures when used from Python 3.8 HOT 2
- Missing priority key function argument in nsmallest/nlargest functions HOT 2
- Tips for Thread Safety? HOT 1
- Getting error : ValueError: 'create_entry' in __slots__ conflicts with class variable. HOT 1
- Adding a function topvalue() in pqdict HOT 3
- Add typing support? HOT 2
- Suggestion: Drop support for python versions < 3.7 HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pqdict.