Comments (63)
Indeed I ship a dlmalloc-based allocator because the default allocator on Windows sometimes has very high memory usage with sparsepp's pattern of allocation. There is some explanation in the README.
in your benchmarks, shouldn't all maps use the same allocator to get a more "fair" comparison?
They do. The custom allocator is not used by default; you have to define `SPP_USE_SPP_ALLOC`.
from sparsepp.
@chondl , @AVJoKe , I looked into implementing the unordered_multimap, and unfortunately I don't think it can be implemented efficiently. One issue is that the equal_range() method needs to return two iterators for iterating over all elements with the same key.
Because sparsepp uses open addressing rather than chaining, it would be difficult to implement this without an additional container per entry (like a std::vector), which would be very inefficient both time- and space-wise.
Maybe you can use Google's cpp-btree instead? It is very memory efficient, although the lookup is slower than sparsepp. But it supports the multimap.
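For readers hitting the same limitation, one common workaround is to bucket values per key yourself. Below is a hypothetical sketch (std::unordered_map stands in for spp::sparse_hash_map, and `MiniMultimap` and its methods are illustrative names, not sparsepp API); note this is exactly the extra-container-per-entry approach described above, so it trades space for functionality:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Emulate a multimap by storing a vector of values per key.
// equal_range() becomes a plain find() returning the whole bucket.
template <class K, class V>
class MiniMultimap {
    std::unordered_map<K, std::vector<V>> buckets_;
public:
    void insert(const K& k, const V& v) { buckets_[k].push_back(v); }

    // All values stored under k (empty vector if the key is absent).
    const std::vector<V>& values(const K& k) const {
        static const std::vector<V> empty;
        auto it = buckets_.find(k);
        return it == buckets_.end() ? empty : it->second;
    }

    std::size_t count(const K& k) const { return values(k).size(); }
};
```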
Great sparsepp!!!
I had to put 6 billion 64-bit integers into a set container to filter out duplicate numbers; it was very fast and stable. Thank you so much. I love it!
I'm using sparsepp as a kind of sparse array for high-performance lookups and it is fantastic; it well outperformed my other attempts with a variety of other libraries. I love it!
Hum, maybe I am misunderstanding the graphs, but sparsepp seems to have about the same memory usage as khash.
One thing where sparsepp shines is not the memory usage itself, but the memory high water mark when the table resizes (when inserting more items). In my experience this is often what limits the number of items a table can hold, if you don't know the final size in advance.
In any case, I have found that there is not much point in considering a single specific benchmark. This particular case uses very small keys (one int). Does it measure how fast it is to iterate over the hash table, to delete an element, etc.?
The best thing to do is to actually try different hash tables with your own use case. You may be surprised by the results (which don't always match posted benchmarks).
I liked the reduced memory footprint and am switching my code to this where appropriate.
Thanks David!
@greg7mdp Is it possible to open source the benchmarking code? Could be part of this or separate repository...
@romange, just added the repository hash-table-shootout which is the benchmarking code I used (updated from https://github.com/aggregateknowledge).
@greg7mdp Thanks!
Awesome work. I made a small R package which makes it easy to use sparsepp with R packages and Rcpp - https://github.com/dselivanov/sparsepp.
Initially I evaluated sparsepp with another project of mine (https://github.com/dselivanov/text2vec) where the main data structure is `unordered_map<pair<uint32_t, uint32_t>, T>`, where `T` is `int` or `float`.
Memory improvement was 2x and speed up was 1.5x (lookup and insert operations).
Thank you for amazing work.
Thank you, @dselivanov, I really appreciate the feedback and I am glad you find sparsepp useful.
so, i evaluated sparsepp as a drop-in replacement for std::unordered_map in a personal library (header, more like) that i use in my work over the break. i have tuned the library for performance, and it seems at first blush that sparsepp can at best match the STL container in my benchmarking on a test suite that i consider representative. in past incarnations, my library used dense_hash_map with good results (0-20% improvement), but it seems like the STL implementation has bridged the gap with emplace/reserve/etc. at this point. although i cannot share the test suite, is there a good harness/logging approach that i can use to capture the trace and help move the sparsepp effort along? in my library, a user-defined class that i usually hash to a uint32_t is used as the key and the value is also a uint32_t (an array index). no deletions occur in the lifetime of a single run and i can happily reserve space for the max number of entries that i expect to see in a run (20 million give or take). cheers!
@kmohanram, thanks for the feedback!
dense_hash_map should be much faster than the STL's unordered_map; I am surprised that you see only a 0-20% improvement.
So I suspect that the time is spent somewhere other than in the hash table code itself. Maybe the hash implementation for your 'key' class is taking most of the time. Or maybe copying it takes a long time? Do you have a move constructor?
std::hash<uint32_t> is a pass-through and my hash simply adds three uint32_t fields in the user-defined class and returns that value. turns out that value is darn close (serendipitous, if i may confess) to a little over the number of entries in the hash table. so, it is almost like an insertion hint to an empty slot in the hash map, or a collision that more than likely is a hit (i.e., this is not a unique entry). my attempts to refine the hash further have fallen flat with measurable performance hits. so, i have eliminated the hash as a possible suspect already. the load factor confirms that the distribution is near uniform and that i am over-provisioning (which i am not worried about here and now).
re: copies and/or the move constructor, it is a trivial class with only these three uint32_t fields. ok, lest i seem like a joker, it is a packed class and the three fields collapse to a single uint64_t, but i am letting the compiler do the bit-twiddling for me. and also using all the compiler-defined constructors/etc. so unless i am wildly off, i cannot suspect anything here as well. but hey, life (and C++) is (are) full of surprises.
btw, since dense_hash_map does not quite support C++11, i hoped that sparsepp would fill in the blanks and improve performance by at least 10%. all i literally did was redefine the map and everything ran with a minor drag of 5% or so ...
i just ran the suite with the call to reserve commented out and i see a 20% performance degradation with sparsepp (on my mac, fwiw, since that can also affect these numbers). happy to work with you to help comprehend and resolve this, even if the use case is an anomaly and even if it is unique to the mac ...
interestingly enough, the map's load factor for the STL run is 0.80 while the load factor for the sparsepp run is 0.30 when i run with reserve commented out. with reserve in use, the load factors in both cases are 0.6ish.
@kmohanram, for your hash function, can you try the following:
struct Hasher {
    size_t operator()(const T &t) const
    {
        // FNV-1a style mix of the three fields
        const uint32_t base = 2166136261U;
        const uint32_t p = 16777619U;
        size_t res = base;
        res ^= t.int1;
        res *= p;
        res ^= t.int2;
        res *= p;
        res ^= t.int3;
        res *= p;
        return res;
    }
};
well, this hash function (with reserve still in place) is a drag to the extent of 40% on the STL implementation. and sparsepp seems to drag by a further 10% over the STL implementation at first blush. it took a 35+% beating in several instances btw, and never does better.
i just think my hash function is -- serendipitously, i must ack again -- way too good for my application. i recall trying other prime-based hashing approaches when i used an array instead of a hash_map to store my entries and nothing came even close, unfortunately/fortunately. btw, do you have any suggestions as to how i can capture a meaningful subset of map-related ops (inserts/lookups) within my app so that we can abstract to a testbench and you can debug offline? i'm thinking all emplace calls need to be logged with the respective inputs. though my application control path depends on the results of these ops, at least you will have something to work with, especially if the results are consistent (sparsepp underperforms std::unordered_map).
@kmohanram, please open a separate issue so we don't hijack this one.
From what you are writing, it looks like the hash is indeed not the issue.
What you could do is dump the inserts/lookups into a binary file, to be replayed later with a program like the one in issue #20 (closed now). If you do that I'll be happy to see if I can find a problem. However, the fact that there is no big difference between std::unordered_map and dense_hash_map makes me think that the time may be spent outside of the hash table code. This will be easy to check if you implement the binary file testing method suggested above.
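To make the dump concrete, here is a minimal sketch of such a record-and-replay trace. The record layout and all names (`Op`, `dump_trace`, `load_trace`) are hypothetical, not sparsepp's format; issue #20 has the actual replay program:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// One fixed-size record per map operation, so a workload can be
// captured in the application and replayed offline against any table.
struct Op {
    uint8_t  kind;   // 0 = insert, 1 = lookup
    uint32_t key;
    uint32_t value;  // unused for lookups
};

inline void dump_trace(const char* path, const std::vector<Op>& ops) {
    FILE* f = std::fopen(path, "wb");
    if (!f) return;
    std::fwrite(ops.data(), sizeof(Op), ops.size(), f);
    std::fclose(f);
}

inline std::vector<Op> load_trace(const char* path) {
    std::vector<Op> ops;
    FILE* f = std::fopen(path, "rb");
    if (!f) return ops;
    Op op;
    while (std::fread(&op, sizeof(Op), 1, f) == 1)
        ops.push_back(op);
    std::fclose(f);
    return ops;
}
```

The replay side then just walks the vector and issues the same emplace/find calls against the table under test.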
@greg7mdp @kmohanram I just opened #21, I wonder if this is related to the issue seen here.
Hi Greg!
I love your hash map and use it in my research where I have hundreds of millions and sometimes billions of entries with hundreds of thousands of accesses per second. Your map turns out to be the fastest among the ones I have tested and I am very pleased with it. Thanks for this!
Wow, thanks @Elypson, really glad you like sparsepp. It is good to know that people out there use it and appreciate it.
I'm looking forward to using this in our application to reduce memory usage. Currently we are using STL unordered_map in an application compiled with Emscripten to run in the browser. Happy to report that the library worked as advertised as a drop-in replacement and our automated test suite found no bugs after the change.
Two questions:
- Is there an implementation of unordered_multimap with similar memory usage? I didn't see one in the header files.
- Is there an easy way to get the amount of memory consumed by an instance of the map? We have some stats collection code built into our application and would like to be able to report the memory consumed by each instance.
@chondl , thanks for the feedback.
- I'm sorry, there is no implementation of unordered_multimap in sparsepp. I will put it on my todo list, but I don't expect to have it soon.
- There is currently no API returning the map's memory usage. I could add one; however, it would return only the memory actually allocated by the map, and would not take into account the malloc() overhead. But that is probably better than nothing, so I will do it.
No worries about unordered_multimap, we'll figure something out. Glad to have the space- and time-efficient hash. Even having the total memory allocated, excluding malloc overhead, would be helpful.
I would second the request for an unordered_multimap replacement, since some nested solutions (like <key, std::vector>) won't work for me. Nevertheless I used your sparse_hash_map where it was feasible and saw great performance improvements compared to the STL. Keep up the good work!
hi @greg7mdp,
i would like to thank you for sparsepp. i am using it for my research, where i have hundreds of millions of genomic entries.
I need some advice about searching in sparse_hash_map. Is there an efficient way to search in a serialized hash map?
@samedsaka , thank you, I'm glad you like sparsepp!
About your question: are you asking about searching the serialized file on disk, without loading it into memory? If so, I'm afraid it would not be easy; the serialization format is not designed for lookup, but only for doing a save/restore of the hash map to disk.
Hi - just a quick note to thank you for sparsepp. We now use it extensively in place of google’s sparse_hash_map and various unordered_maps after significant testing showed numerous real world performance gains across a set of highish memory (200GB or more) applications. No issues or other feedback, but thanks!
@jackstockdale, thank you so much for the feedback! I am really glad people use sparsepp... sometimes it is hard to tell. I very much appreciate you taking the time to let me know.
Thanks for the feedback @kwangswei and @OvermindDL1, I am delighted you find sparsepp useful!
In an earlier post, someone asked if you have a "memory byte count" API to obtain the number of memory bytes allocated by the map/set. You had said that you were going to add it. Did you? If yes, what is the API call to obtain the memory byte count?
Hi Greg, I am using your hash map in one of my projects and the memory improvement is 2x, but I cannot see any time improvement. I am using the following hash map: `sparse_hash_map<uint64_t, pair<uint8_t, uint32_t>> MAP`.
Thanks for the feedback @srbehera11 . sparsepp's main claim to fame is its low memory usage; there is no guarantee that you will see a speed improvement. It will depend on what you compare it with (there are different implementations of `unordered_map` in different C++ standard libraries).
It may also depend on the quality of your hash function. You could try `#define SPP_MIX_HASH 1` before including `spp.h` to see if it makes a difference.
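For anyone else trying this, the define must appear before the first include of the header (the include path shown assumes a typical sparsepp install layout):

```cpp
#define SPP_MIX_HASH 1       // extra mixing of the user-supplied hash
#include <sparsepp/spp.h>
```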
> someone asked if you have a "memory byte count" API to obtain the number of memory bytes allocated by the map/set.
@jmbnyc , it would not be easy to provide this functionality in a meaningful way, as it depends on the overhead of the memory allocator used. sparsepp allocates many small buffers, so the memory allocator padding can have a big influence on the amount of memory actually used.
Greg,
Thanks for the response. I would argue that returning what the map allocates, regardless of the allocator, would be meaningful. Why? IMO it allows one to understand the amount of memory allocated vs. the memory needed to hold a plain array of the objects contained in the map, i.e. it provides the overhead associated with the map storing the elements at any point in time.
Greg,
One more thing. I observed this from coverity (from sparse_growth_policy.h):
static constexpr bool is_power_of_two(std::size_t value) {
    return value != 0 && (value & (value - 1)) == 0;
}
Coverity flags CID 10488 (#1 of 1, INTEGER_OVERFLOW): the subtract operation `value - 1UL` can overflow, and the overflowed (or truncated) value in `value != 0UL && (value & value - 1UL) == 0UL` is used as the return value.
@jmbnyc If you want to see what sparsepp allocates, you could provide an allocator to the sparse_hash_map that keeps track of the memory allocated.
About the coverity report: this doesn't look like sparsepp code.
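A minimal sketch of such a tracking allocator (illustrative only: std::unordered_map stands in for spp::sparse_hash_map, which accepts an allocator parameter the same way, and the counter deliberately ignores malloc overhead, as discussed above):

```cpp
#include <cstddef>
#include <functional>
#include <new>
#include <unordered_map>
#include <utility>

// Global running total of bytes the container has requested.
static std::size_t g_bytes_allocated = 0;

template <class T>
struct CountingAlloc {
    using value_type = T;

    CountingAlloc() = default;
    template <class U> CountingAlloc(const CountingAlloc<U>&) {}

    T* allocate(std::size_t n) {
        g_bytes_allocated += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) {
        g_bytes_allocated -= n * sizeof(T);
        ::operator delete(p);
    }
};

template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// A map whose live allocation total can be read from g_bytes_allocated.
using CountedMap = std::unordered_map<int, int, std::hash<int>,
                                      std::equal_to<int>,
                                      CountingAlloc<std::pair<const int, int>>>;
```

Reading the counter before and after an operation gives the bytes the container requested for it, without any allocator padding.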
Greg,
Hmm. Let me check.
Greg,
Apologies. This was from a test setup where I was comparing multiple implementations.
Jeffrey, no problem, I really appreciate the feedback and comments.
@greg7mdp Thanks for your reply !! I tried `#define SPP_MIX_HASH 1` before including `spp.h`, but it did not make any difference.
I am now using a vector of maps, `vector<sparse_hash_map<uint32_t, uint32_t>> MAP(64)`, where the key is a hash value of an element generated by `MurmurHash3_x86_32`. In my program, I need to deallocate the memory of some maps based on conditions. Will `MAP[i].clear()` deallocate the memory of the ith map?
@srbehera11, `MAP[i].clear()` will not deallocate all the memory. I suggest you do instead: `sparse_hash_map<uint32_t, uint32_t>().swap(MAP[i])`.
You may want to use a typedef:
typedef sparse_hash_map<uint32_t, uint32_t> SMap;
SMap().swap(MAP[i]);
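The difference is easy to demonstrate with a stand-in container (std::unordered_map here; spp::sparse_hash_map behaves analogously, though exactly how much memory clear() retains is implementation-specific):

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

using Map = std::unordered_map<uint32_t, uint32_t>;

// clear() empties the table but keeps the bucket array allocated.
inline std::size_t buckets_after_clear() {
    Map m;
    for (uint32_t i = 0; i < 100000; ++i) m[i] = i;
    m.clear();               // size() == 0, bucket array retained
    return m.bucket_count(); // still large
}

// Swapping with a default-constructed temporary releases the storage:
// the temporary takes ownership of the buckets and frees them on destruction.
inline std::size_t buckets_after_swap() {
    Map m;
    for (uint32_t i = 0; i < 100000; ++i) m[i] = i;
    Map().swap(m);           // storage handed to a temporary and freed
    return m.bucket_count(); // back to the default (tiny)
}
```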
@greg7mdp Thanks !! What exactly does the `swap()` function do? Currently, my program takes 2 GB for 135M entries. I just noticed that you mentioned here to change `#define SPP_ALLOC_SZ 0` to `#define SPP_ALLOC_SZ 1` to increase memory efficiency. I need to change that in the `spp.h` file, right?
@srbehera11, swap() swaps the contents of the two maps, so the one in the vector is now empty, having been replaced by a newly created one (and the temporary is destroyed). You can try `-D SPP_ALLOC_SZ=1` on the compiler command line; no need to change `spp_config.h`.
@greg7mdp Is there any way to get the total memory usage from the map? Something like cpp-btree's bytes_used function (https://github.com/algorithm-ninja/cpp-btree/blob/master/btree.h#L1173)? Thanks!
> any way to get the total memory usage from the map?
@eglaser77, unfortunately there is no exact way to provide this information. sparsepp makes many small memory allocations, and while it would be possible to compute the memory requested, each memory allocation has an overhead which is unknown to sparsepp, so the returned result would not be accurate.
However, you could use the function GetProcessMemoryUsed() from spp_memory.h to measure memory usage before and after filling the sparsepp container, to get the total memory usage.
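As a rough illustration of the before/after technique: the sketch below is not sparsepp's GetProcessMemoryUsed() but a hypothetical stand-in of mine that reads the resident set size from /proc/self/statm, so it is Linux-only:

```cpp
#include <cstddef>
#include <cstdio>
#include <unistd.h>

// Resident set size of the current process, in bytes (Linux-only sketch).
inline std::size_t process_memory_used() {
    long pages = 0, resident = 0;
    FILE* f = std::fopen("/proc/self/statm", "r");
    if (!f) return 0;
    if (std::fscanf(f, "%ld %ld", &pages, &resident) != 2)
        resident = 0;
    std::fclose(f);
    return static_cast<std::size_t>(resident) *
           static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}
```

Measure once before and once after filling the container; the delta approximates the container's footprint, including allocator overhead.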
> However, you could use the function GetProcessMemoryUsed() from spp_memory.h to measure memory usage before and after filling the sparsepp container, to get the total memory usage.
@greg7mdp Thanks for the response. I'll take a look at GetProcessMemoryUsed but I assume it gets all system memory and so it would be conflated with whatever other allocations/deallocations are going on at the same time as the map is being used.
@eglaser77 I believe this will return the memory used by your process only, not the whole system.
Is there a way to use this from golang? thank you
Is there a possibility of updating your iterator class to remove the warning from VS2017? They are deprecating inheriting from the std:: iterator class.
> Is there a possibility of updating your iterator class to remove the warning from VS2017? They are deprecating inheriting from the std:: iterator class.
@anthony-royal-bisimulations , what warning do you get? I don't see it with VS2017 (version 15.9.4)
> Is there a way to use this from golang? thank you
Not that I know, sorry.
> what warning do you get? I don't see it with VS2017 (version 15.9.4)
Warning C4996 'std::iterator<iter_type,T,ptrdiff_t,_Ty *,_Ty &>::iterator_category': warning STL4015: The std::iterator class template (used as a base class to provide typedefs) is deprecated in C++17. (The header is NOT deprecated.) ...
It says to use a define to ignore the deprecation warnings, which we are doing for now since we compile with warnings as errors.
We are using version 1.22 with the CPP17 preprocessor define, which enables more C++17 features. We are using VS2017 15.6.4.
It's specifically pointing to `class Two_d_iterator : public std::iterator<iter_type, T>`.
I don't know how to reproduce the issue, but I should be able to fix it anyway - will try to get to it today. In the meantime you could add `#pragma warning(disable : 4996)` in your file.
@anthony-royal-bisimulations - warning should be fixed. Thanks for reporting the issue.
I'm interested on your thoughts on https://attractivechaos.wordpress.com/2018/01/13/revisiting-hash-table-performance/. This benchmark suggests that klib's khash is much faster than sparsepp with lower memory usage. If this benchmark isn't flawed, how could sparsepp be improved to match khash's performance?
I've noticed you are shipping your own version of dlmalloc with sparsepp. In the light of other malloc implementations that might be installed system wide (like tcmalloc from gperftools), this seems kind of redundant to me.
What influenced your decision to include dlmalloc? Did you encounter bad performance with the default system malloc?
And finally, in your benchmarks, shouldn't all maps use the same allocator to get a more "fair" comparison?
Really appreciate the great feedback. Closing the issue as my focus is more on phmap now.