martinus / map_benchmark Goto Github PK

View Code? Open in Web Editor NEW

290.0 13.0 27.0 20.16 MB

Comprehensive benchmarks of C++ maps

License: MIT License

C++ 61.88% C 0.06% Python 0.86% Ruby 29.79% CMake 4.76% Shell 0.13% Jinja 2.04% Gnuplot 0.48%

c-plus-plus hashmaps benchmark

map_benchmark's Introduction

map_benchmark

Comprehensive benchmarks of C++ maps

Results

Please see here.

building

Install dependencies for folly::F14, see https://github.com/facebook/folly/#ubuntu-1604-lts
Install ninja and cmake

git clone --recurse-submodules https://github.com/martinus/map_benchmark.git
./map_benchmark/tools/build.sh

Updating Submodules

git submodule update --force --remote

Sourcecode Layout

The implementation if the benchmark is open source, get it here: martinus/map_benchmark. It is split in several parts:

external: all map implementations available through github are added as git submodules here.
src/hashes: One directory for each hashing algorithm, each directory contains a Hash.h which basically contains a using instruction for the hash, e.g. like this:

template <class Key>
using Hash = robin_hood::hash<Key>;

src/maps: One directory for each unordered map implementation, each directory contains a Map.h which basically contains a using instruction for the map. It includes Hash.h. E.g. like this:

#include "Hash.h"
template <class Key, class Val>
using Map = robin_hood::unordered_flat_map<Key, Val, Hash<Key>>;

Add a new Hashmap

In external, add a submodule:

cd external
git submodule add -b master https://github.com/rigtorp/HashMap.git rigtorp__HashMap

Create a directory in src/map/ with a file Hash.h. See the others for example.

Maps I couldn't add

QHash: It's interface is too different to be easily includeable. e.g. iterator->first and iterator->second do not exist.
rigtorp::HashMap: Doesn't have a default constructor

Reliable Benchmarks

Run lscpu --extended to find out if you have hyperthreadding. E.g. for me it shows

$ lscpu --extended
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ    MINMHZ
0   0    0      0    0:0:0:0       yes    4600,0000 800,0000
1   0    0      1    1:1:1:0       yes    4600,0000 800,0000
2   0    0      2    2:2:2:0       yes    4600,0000 800,0000
3   0    0      3    3:3:3:0       yes    4600,0000 800,0000
4   0    0      4    4:4:4:0       yes    4600,0000 800,0000
5   0    0      5    5:5:5:0       yes    4600,0000 800,0000
6   0    0      0    0:0:0:0       yes    4600,0000 800,0000
7   0    0      1    1:1:1:0       yes    4600,0000 800,0000
8   0    0      2    2:2:2:0       yes    4600,0000 800,0000
9   0    0      3    3:3:3:0       yes    4600,0000 800,0000
10  0    0      4    4:4:4:0       yes    4600,0000 800,0000
11  0    0      5    5:5:5:0       yes    4600,0000 800,0000

Isolate a CPU with it's hyperthreading companion. I'm isolating CPU 5 and 11.
Edit /etc/default/grub and change GRUB_CMDLINE_LINUX_DEFAULT so it looks like this:
```
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=5,11 rcu_nocbs=5,11"
```
Run sudo update-grub
reboot
Edit bench.rb so the taskset -c ... prefix is correct.
Install Python module perf, see https://perf.readthedocs.io/en/latest/
Run sudo python3 -m perf system tune
Start the benchmarks: ../tools/bench.rb |tee ../data/all_new.txt

Sources:

map_benchmark's People

Contributors

Stargazers

Watchers

map_benchmark's Issues

Add a copy benchmark

See this discussion:
https://www.reddit.com/r/cpp/comments/bg4z2l/big_hashmap_comparison/eljd0ku/

Add benchmark to

copy map by sequentially iterating it
copy map by using copy ctor
copy almost empty map?
copy with high fill factor

Would please add also the copy-on-write hashmap from Asteria?

Source code is available at https://github.com/lhmouse/asteria/blob/master/rocket/cow_hashmap.hpp.

The interface of this container is very similar to std::unordered_map, except that:

begin(), find() etc. return const_iterator. In order to get mutable iterators, it is necessary to call mut_begin(), mut_find() etc.
equal_range() is not supported.
Iterators are bidirectional.
There are no bucket iterators.
erase() may invalidate iterators.
Incomplete types are allowed, even when the destructor is referenced.

Thanks in advance.

add find large

RandomFind_2000 is small dataset (load factor = 0.49)
and RandomFind_5000000 is medium dataset(load factor = 0.48)
I think change it to 600000 with load factor = 0.57 is better.

add a large dataset 12'000'0000(load factor = 0.71)


BENCHMARK(RandomFind_12000000) {

    static constexpr auto lower32bit = UINT64_C(0x00000000FFFFFFFF);
    static constexpr auto upper32bit = UINT64_C(0xFFFFFFFF00000000);
    static constexpr auto medium32bit = UINT64_C(0x0000FFFFFFFF0000);

    static constexpr size_t numInserts = 12'000'000;
    static constexpr size_t numFindsPerInsert = 10;

    bench.endMeasure(169388,    randomFindInternal(bench, 4, upper32bit, numInserts, numFindsPerInsert));
    bench.endMeasure(30126934,  randomFindInternal(bench, 3, upper32bit, numInserts, numFindsPerInsert));
    bench.endMeasure(60082938,  randomFindInternal(bench, 2, medium32bit, numInserts, numFindsPerInsert));
    bench.endMeasure(90041230,  randomFindInternal(bench, 1, lower32bit, numInserts, numFindsPerInsert));
    bench.endMeasure(119999853, randomFindInternal(bench, 0, lower32bit, numInserts, numFindsPerInsert));
}

compiler errors on Windows (VS)

For VS 2017 compiler have errors in sfc64.h (line 65, operator must have some return value)
Missing sstream include:
RandomFind.cpp
RandomFindString.cpp

shiwrevision numbers

E.g show the version of boost used, absl etc.

add descriptions to benchmarks

Somehow inline in the benchmark interface? Or in a separate file? Make somehow sure this does not go out of sync

benchmark sorted maps

obviously not hash tables, but just to see where they make or break, range of input where they may fare better. Including std::map/set, boost::container::flat_map/set for starters...

Exclude the time of inserts from the time of finds

The timing part of the current find has some issues to discuss. The time measured now includes the time used for inserts. The hash table is slowly expanded to its final size during the measurement. This brings two problems:

The time we measured is not the time for find function. It's the time for find and insert. And when the number of elements in the hash table is larger, the ratio of the number of insert operations to the number of find operations is larger. So this problem is especially noticeable when the number of elements is large (e.g. RandomFind_500000).
The time of RandomFind_N is not the time measured on a hash table of size N. It's the average time for all hash tables of size 1 to size N. This makes the results less informative for users who want to know the performance of a hash table of size N.

In summary, I think the method of measuring time in the find section should be improved.

Folly linking errors during building map_benchmark

Hello, I'm trying to build map_benchmark. I installed all of the dependencies and folly library, but the build fails at linking folly related files. I tried ninja and make but they are both giving the same errors.
This is ninja error during build:

Linking CXX executable bench_folly_F14NodeMap__folly_hasher
FAILED: bench_folly_F14NodeMap__folly_hasher 
: && /usr/bin/g++-8    -fuse-ld=gold   -rdynamic CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/app/getRSS.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/app/main.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/CtorDtor.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Insert.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Iterate.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomDistinct.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomFind.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomInsertErase.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomInsertEraseStrings.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/ShowHash.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Strings.cpp.o  -o bench_folly_F14NodeMap__folly_hasher  external/facebook__folly/libfolly.a external/facebook__folly/libfolly.a -pthread -lboost_context -lboost_chrono -lboost_date_time -lboost_filesystem -lboost_program_options -lboost_regex -lboost_system -lboost_thread -lboost_atomic -lpthread -ldouble-conversion /usr/lib/x86_64-linux-gnu/libgflags.so.2.2.1 -lpthread -lglog -levent -lssl -lcrypto -lz -llzma -llz4 -lzstd -lsnappy -ldwarf -Wl,-Bstatic -liberty -Wl,-Bdynamic -ldl -lunwind && :
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/CtorDtor.cpp.o:CtorDtor.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<int, int, folly::hasher<int, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Iterate.cpp.o:Iterate.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<unsigned long, unsigned long, folly::hasher<unsigned long, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomInsertEraseStrings.cpp.o:RandomInsertEraseStrings.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, folly::hasher<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
collect2: error: ld returned 1 exit status
[478/506] Building CXX object external/facebook__folly/folly/CMakeFiles/follybenchmark.dir/Benchmark.cpp.o
ninja: build stopped: subcommand failed.

This is make errors during build:

CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/CtorDtor.cpp.o:CtorDtor.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<int, int, folly::hasher<int, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Iterate.cpp.o:Iterate.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<unsigned long, unsigned long, folly::hasher<unsigned long, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomInsertEraseStrings.cpp.o:RandomInsertEraseStrings.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, folly::hasher<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
collect2: error: ld returned 1 exit status
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/build.make:380: recipe for target 'bench_folly_F14NodeMap__folly_hasher' failed
make[2]: *** [bench_folly_F14NodeMap__folly_hasher] Error 1
CMakeFiles/Makefile2:189: recipe for target 'CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/all' failed
make[1]: *** [CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

use gtl maps: https://github.com/greg7mdp/gtl

https://github.com/greg7mdp/gtl

Could you add the results please

Adding some results in the readme or on the wiki would be great. Perhaps be useful as a reference for machines with different specs.

Request for inclusion of Boost.MultiIndex

I'd very much like to see Boost.MultiIndex included in this benchmark. As it happens, the library provides a set-like interface rather than that of a map, but the latter can easily be emulated as follows:

#pragma once

#include "Hash.h"
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <utility>

static const char* MapName = "boost::multi_index::hashed_unique";

template <class T1, class T2>
struct MapElement
{
    using first_type = T1;
    using second_type = T2;
    
    template<class Q1>
    MapElement(Q1&& first):
        first{std::forward<Q1>(first)},
        second{}
        {}

    template<class Q1, class Q2>
    MapElement(Q1&& first, Q2&& second):
        first{std::forward<Q1>(first)},
        second{std::forward<Q2>(second)}
        {}
    
    T1         first;
    mutable T2 second;
};

template <class Key, class Val>
using MapBase = boost::multi_index_container<
    MapElement<Key, Val>,
    boost::multi_index::indexed_by<
        boost::multi_index::hashed_unique<
            boost::multi_index::member<
                MapElement<Key, Val>, Key, &MapElement<Key, Val>::first
            >,
            Hash<Key>
        >
    >
>;

template <class Key, class Val>
struct Map: MapBase<Key, Val>
{
  using super = MapBase<Key, Val>;
  using super::super;
  
  template <class K>
  Val& operator[](K&& key){
      return this->emplace(std::forward<K>(key)).first->second; 
  }
};

boost::hash removed

Hi, 1a6323c removes boost::hash because, allegedly, "it's exactly the same as std::hash". But I don't think is is the case at all, unless I'm missing something in context.

Thank you!

Adding hordi::hash_map into the benchmark

Please, add hash_map from this github.com/hordi/hash/blob/master/include/hash_set.h into your benchmark

"copy ctor" benchmark doesn't scale linearly

I was comparing my implementation against @ktprime's hash_map8, which uses almost the same copy implementation as my implementation (malloc + memcpy), but for some reason mine was >2 times slower.

hash_map8 allocated 32780176 bytes, while my implementation allocated 35651601 bytes, so there shouldn't be a 2x performance difference. (it's even worse when run on godbolt: https://godbolt.org/z/oqe4rY3qq)

After a bunch of testing and help from a friend, I figured out that the malloc mmap threshold (DEFAULT_MMAP_THRESHOLD_MAX) happens to be 33554432, which is exactly between the two sizes.

I don't think that this is a good idea to have this influence the benchmark that much, as the constants involved are arbitrary.

I'm not certain what the best way to fix this is, so I haven't written an PR yet.

We could bump the size of the elements to DEFAULT_MMAP_THRESHOLD_MAX, so every map is treated the same, or we could lower the threshold with mallopt.

a interesting benchmark

random Insert and erase begin is a quite interesting benchmark
I think you know why some hash maps maybe very slow and others are quite efficient.

#include "Map.h"
#include "bench.h"
#include "hex.h"
#include "sfc64.h"

#include <algorithm>
#include <sstream>

BENCHMARK(RandomInsertEraseBegin) {
    size_t max_n = 10000;
    using M = Map<uint64_t, uint32_t>;

    for (int i = 0; i < 6; ++i) {
#ifdef USE_POOL_ALLOCATOR
        Resource<uint64_t, uint32_t> resource;
        M map{0, M::hasher{}, M::key_equal{}, &resource};
#else
        M map;
#endif

        std::stringstream ss;
        ss << (max_n / 1000000.) << " M cycles";
        sfc64 rng(999 + i);

        // benchmark randomly inserting & erasing begin
        bench.beginMeasure(ss.str().c_str());
        for (size_t i = 0; i < max_n; ++i) 
            map.emplace(rng(), 0);

        for (size_t i = 0; i < 2 * max_n; ++i) {
	    map.erase(map.begin());
            map.emplace(rng(), 0);
        }

        bench.endMeasure(map.size(), map.size());
        max_n *= 3;
    }
}

don't split up benchmark results by hash
Maybe split up into open address hashing and chained hashing y or node based
- Split up into 2 categories: open address hashing (boost maps, std::unordered_map), and all of them.
Having a filter for the results would be nice
Disable zoom? at least make a wider view
Add one summary page with the geomean of all find & insert benchmarks (except the ctor benchmarks)
Add a conclusio page:
- Use a reasonable hash that spreads entrophy in upper bits to lower bits. std::hash or boost::hash's identity was and is a bad idea. Doesn't need to withstand randomness tests, mumx seems to be good enough)
- Use a pool allocator (boost or PoolAllocator). It's faster and uses much less RAM.
Create a sortable table: X axis benchmark, y axis map & hash. With one entry that's the GEOMEAN.
- See https://cpu.userbenchmark.com/
- See https://betterprogramming.pub/sort-and-filter-dynamic-data-in-table-with-javascript-e7a1d2025e3c