Code Monkey home page Code Monkey logo

map_benchmark's Introduction

map_benchmark

Comprehensive benchmarks of C++ maps

Results

Please see here.

building

  1. Install dependencies for folly::F14, see https://github.com/facebook/folly/#ubuntu-1604-lts
  2. Install ninja and cmake
git clone --recurse-submodules https://github.com/martinus/map_benchmark.git
./map_benchmark/tools/build.sh

Updating Submodules

git submodule update --force --remote                                                                                                                                                                                                                                    

Sourcecode Layout

The implementation if the benchmark is open source, get it here: martinus/map_benchmark. It is split in several parts:

  1. external: all map implementations available through github are added as git submodules here.
  2. src/hashes: One directory for each hashing algorithm, each directory contains a Hash.h which basically contains a using instruction for the hash, e.g. like this:
template <class Key>
using Hash = robin_hood::hash<Key>;
  1. src/maps: One directory for each unordered map implementation, each directory contains a Map.h which basically contains a using instruction for the map. It includes Hash.h. E.g. like this:
#include "Hash.h"
template <class Key, class Val>
using Map = robin_hood::unordered_flat_map<Key, Val, Hash<Key>>;

Add a new Hashmap

  1. In external, add a submodule:
    cd external
    git submodule add -b master https://github.com/rigtorp/HashMap.git rigtorp__HashMap
    
  2. Create a directory in src/map/ with a file Hash.h. See the others for example.

Maps I couldn't add

  • QHash: It's interface is too different to be easily includeable. e.g. iterator->first and iterator->second do not exist.
  • rigtorp::HashMap: Doesn't have a default constructor

Reliable Benchmarks

  1. Run lscpu --extended to find out if you have hyperthreadding. E.g. for me it shows
    $ lscpu --extended
    CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ    MINMHZ
    0   0    0      0    0:0:0:0       yes    4600,0000 800,0000
    1   0    0      1    1:1:1:0       yes    4600,0000 800,0000
    2   0    0      2    2:2:2:0       yes    4600,0000 800,0000
    3   0    0      3    3:3:3:0       yes    4600,0000 800,0000
    4   0    0      4    4:4:4:0       yes    4600,0000 800,0000
    5   0    0      5    5:5:5:0       yes    4600,0000 800,0000
    6   0    0      0    0:0:0:0       yes    4600,0000 800,0000
    7   0    0      1    1:1:1:0       yes    4600,0000 800,0000
    8   0    0      2    2:2:2:0       yes    4600,0000 800,0000
    9   0    0      3    3:3:3:0       yes    4600,0000 800,0000
    10  0    0      4    4:4:4:0       yes    4600,0000 800,0000
    11  0    0      5    5:5:5:0       yes    4600,0000 800,0000
    
  2. Isolate a CPU with it's hyperthreading companion. I'm isolating CPU 5 and 11.
  3. Edit /etc/default/grub and change GRUB_CMDLINE_LINUX_DEFAULT so it looks like this:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=5,11 rcu_nocbs=5,11"
    
  4. Run sudo update-grub
  5. reboot
  6. Edit bench.rb so the taskset -c ... prefix is correct.
  7. Install Python module perf, see https://perf.readthedocs.io/en/latest/
  8. Run sudo python3 -m perf system tune
  9. Start the benchmarks: ../tools/bench.rb |tee ../data/all_new.txt

Sources:

map_benchmark's People

Contributors

baryluk avatar camel-cdr avatar ktprime avatar lhmouse avatar martinus avatar renzibei avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

map_benchmark's Issues

Would please add also the copy-on-write hashmap from Asteria?

Source code is available at https://github.com/lhmouse/asteria/blob/master/rocket/cow_hashmap.hpp.

The interface of this container is very similar to std::unordered_map, except that:

  1. begin(), find() etc. return const_iterator. In order to get mutable iterators, it is necessary to call mut_begin(), mut_find() etc.
  2. equal_range() is not supported.
  3. Iterators are bidirectional.
  4. There are no bucket iterators.
  5. erase() may invalidate iterators.
  6. Incomplete types are allowed, even when the destructor is referenced.

Thanks in advance.

add find large

RandomFind_2000 is small dataset (load factor = 0.49)
and RandomFind_5000000 is medium dataset(load factor = 0.48)
I think change it to 600000 with load factor = 0.57 is better.

add a large dataset 12'000'0000(load factor = 0.71)


BENCHMARK(RandomFind_12000000) {

    static constexpr auto lower32bit = UINT64_C(0x00000000FFFFFFFF);
    static constexpr auto upper32bit = UINT64_C(0xFFFFFFFF00000000);
    static constexpr auto medium32bit = UINT64_C(0x0000FFFFFFFF0000);

    static constexpr size_t numInserts = 12'000'000;
    static constexpr size_t numFindsPerInsert = 10;

    bench.endMeasure(169388,    randomFindInternal(bench, 4, upper32bit, numInserts, numFindsPerInsert));
    bench.endMeasure(30126934,  randomFindInternal(bench, 3, upper32bit, numInserts, numFindsPerInsert));
    bench.endMeasure(60082938,  randomFindInternal(bench, 2, medium32bit, numInserts, numFindsPerInsert));
    bench.endMeasure(90041230,  randomFindInternal(bench, 1, lower32bit, numInserts, numFindsPerInsert));
    bench.endMeasure(119999853, randomFindInternal(bench, 0, lower32bit, numInserts, numFindsPerInsert));
}

compiler errors on Windows (VS)

For VS 2017 compiler have errors in sfc64.h (line 65, operator must have some return value)
Missing sstream include:
RandomFind.cpp
RandomFindString.cpp

benchmark sorted maps

obviously not hash tables, but just to see where they make or break, range of input where they may fare better. Including std::map/set, boost::container::flat_map/set for starters...

Exclude the time of inserts from the time of finds

The timing part of the current find has some issues to discuss. The time measured now includes the time used for inserts. The hash table is slowly expanded to its final size during the measurement. This brings two problems:

  1. The time we measured is not the time for find function. It's the time for find and insert. And when the number of elements in the hash table is larger, the ratio of the number of insert operations to the number of find operations is larger. So this problem is especially noticeable when the number of elements is large (e.g. RandomFind_500000).
  2. The time of RandomFind_N is not the time measured on a hash table of size N. It's the average time for all hash tables of size 1 to size N. This makes the results less informative for users who want to know the performance of a hash table of size N.

In summary, I think the method of measuring time in the find section should be improved.

Folly linking errors during building map_benchmark

Hello, I'm trying to build map_benchmark. I installed all of the dependencies and folly library, but the build fails at linking folly related files. I tried ninja and make but they are both giving the same errors.
This is ninja error during build:

Linking CXX executable bench_folly_F14NodeMap__folly_hasher
FAILED: bench_folly_F14NodeMap__folly_hasher 
: && /usr/bin/g++-8    -fuse-ld=gold   -rdynamic CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/app/getRSS.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/app/main.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/CtorDtor.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Insert.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Iterate.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomDistinct.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomFind.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomInsertErase.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomInsertEraseStrings.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/ShowHash.cpp.o CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Strings.cpp.o  -o bench_folly_F14NodeMap__folly_hasher  external/facebook__folly/libfolly.a external/facebook__folly/libfolly.a -pthread -lboost_context -lboost_chrono -lboost_date_time -lboost_filesystem -lboost_program_options -lboost_regex -lboost_system -lboost_thread -lboost_atomic -lpthread -ldouble-conversion /usr/lib/x86_64-linux-gnu/libgflags.so.2.2.1 -lpthread -lglog -levent -lssl -lcrypto -lz -llzma -llz4 -lzstd -lsnappy -ldwarf -Wl,-Bstatic -liberty -Wl,-Bdynamic -ldl -lunwind && :
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/CtorDtor.cpp.o:CtorDtor.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<int, int, folly::hasher<int, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Iterate.cpp.o:Iterate.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<unsigned long, unsigned long, folly::hasher<unsigned long, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomInsertEraseStrings.cpp.o:RandomInsertEraseStrings.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, folly::hasher<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
collect2: error: ld returned 1 exit status
[478/506] Building CXX object external/facebook__folly/folly/CMakeFiles/follybenchmark.dir/Benchmark.cpp.o
ninja: build stopped: subcommand failed.

This is make errors during build:

CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/CtorDtor.cpp.o:CtorDtor.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<int, int, folly::hasher<int, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/Iterate.cpp.o:Iterate.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<unsigned long, unsigned long, folly::hasher<unsigned long, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/src/benchmarks/RandomInsertEraseStrings.cpp.o:RandomInsertEraseStrings.cpp:function folly::f14::detail::F14Table<folly::f14::detail::NodeContainerPolicy<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, folly::hasher<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void>, void, void> >::rehashImpl(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)::{lambda()#1}::operator()() const: error: undefined reference to 'folly::f14::detail::F14LinkCheck<(folly::f14::detail::F14IntrinsicsMode)1>::check()'
collect2: error: ld returned 1 exit status
CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/build.make:380: recipe for target 'bench_folly_F14NodeMap__folly_hasher' failed
make[2]: *** [bench_folly_F14NodeMap__folly_hasher] Error 1
CMakeFiles/Makefile2:189: recipe for target 'CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/all' failed
make[1]: *** [CMakeFiles/bench_folly_F14NodeMap__folly_hasher.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

Could you add the results please

Adding some results in the readme or on the wiki would be great. Perhaps be useful as a reference for machines with different specs.

Request for inclusion of Boost.MultiIndex

I'd very much like to see Boost.MultiIndex included in this benchmark. As it happens, the library provides a set-like interface rather than that of a map, but the latter can easily be emulated as follows:

#pragma once

#include "Hash.h"
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <utility>

static const char* MapName = "boost::multi_index::hashed_unique";

template <class T1, class T2>
struct MapElement
{
    using first_type = T1;
    using second_type = T2;
    
    template<class Q1>
    MapElement(Q1&& first):
        first{std::forward<Q1>(first)},
        second{}
        {}

    template<class Q1, class Q2>
    MapElement(Q1&& first, Q2&& second):
        first{std::forward<Q1>(first)},
        second{std::forward<Q2>(second)}
        {}
    
    T1         first;
    mutable T2 second;
};

template <class Key, class Val>
using MapBase = boost::multi_index_container<
    MapElement<Key, Val>,
    boost::multi_index::indexed_by<
        boost::multi_index::hashed_unique<
            boost::multi_index::member<
                MapElement<Key, Val>, Key, &MapElement<Key, Val>::first
            >,
            Hash<Key>
        >
    >
>;

template <class Key, class Val>
struct Map: MapBase<Key, Val>
{
  using super = MapBase<Key, Val>;
  using super::super;
  
  template <class K>
  Val& operator[](K&& key){
      return this->emplace(std::forward<K>(key)).first->second; 
  }
};

boost::hash removed

Hi, 1a6323c removes boost::hash because, allegedly, "it's exactly the same as std::hash". But I don't think is is the case at all, unless I'm missing something in context.

Thank you!

"copy ctor" benchmark doesn't scale linearly

I was comparing my implementation against @ktprime's hash_map8, which uses almost the same copy implementation as my implementation (malloc + memcpy), but for some reason mine was >2 times slower.

hash_map8 allocated 32780176 bytes, while my implementation allocated 35651601 bytes, so there shouldn't be a 2x performance difference. (it's even worse when run on godbolt: https://godbolt.org/z/oqe4rY3qq)

After a bunch of testing and help from a friend, I figured out that the malloc mmap threshold (DEFAULT_MMAP_THRESHOLD_MAX) happens to be 33554432, which is exactly between the two sizes.

I don't think that this is a good idea to have this influence the benchmark that much, as the constants involved are arbitrary.

I'm not certain what the best way to fix this is, so I haven't written an PR yet.

We could bump the size of the elements to DEFAULT_MMAP_THRESHOLD_MAX, so every map is treated the same, or we could lower the threshold with mallopt.

a interesting benchmark

random Insert and erase begin is a quite interesting benchmark
I think you know why some hash maps maybe very slow and others are quite efficient.

#include "Map.h"
#include "bench.h"
#include "hex.h"
#include "sfc64.h"

#include <algorithm>
#include <sstream>

BENCHMARK(RandomInsertEraseBegin) {
    size_t max_n = 10000;
    using M = Map<uint64_t, uint32_t>;

    for (int i = 0; i < 6; ++i) {
#ifdef USE_POOL_ALLOCATOR
        Resource<uint64_t, uint32_t> resource;
        M map{0, M::hasher{}, M::key_equal{}, &resource};
#else
        M map;
#endif

        std::stringstream ss;
        ss << (max_n / 1000000.) << " M cycles";
        sfc64 rng(999 + i);

        // benchmark randomly inserting & erasing begin
        bench.beginMeasure(ss.str().c_str());
        for (size_t i = 0; i < max_n; ++i) 
            map.emplace(rng(), 0);

        for (size_t i = 0; i < 2 * max_n; ++i) {
	    map.erase(map.begin());
            map.emplace(rng(), 0);
        }

        bench.endMeasure(map.size(), map.size());
        max_n *= 3;
    }
}

Very old version of Boost required

Looks like src/maps/boost_unordered_map/dependencies.cmake is requesting version 1.51 of Boost, which is quite old (Aug 2012). You may want to update this to Boost 1.69.

Some notes for the big update

  • don't split up benchmark results by hash
  • Maybe split up into open address hashing and chained hashing y or node based
    • Split up into 2 categories: open address hashing (boost maps, std::unordered_map), and all of them.
  • Having a filter for the results would be nice
  • Disable zoom? at least make a wider view
  • Add one summary page with the geomean of all find & insert benchmarks (except the ctor benchmarks)
  • Add a conclusio page:
    • Use a reasonable hash that spreads entrophy in upper bits to lower bits. std::hash or boost::hash's identity was and is a bad idea. Doesn't need to withstand randomness tests, mumx seems to be good enough)
    • Use a pool allocator (boost or PoolAllocator). It's faster and uses much less RAM.
  • Create a sortable table: X axis benchmark, y axis map & hash. With one entry that's the GEOMEAN.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.