Code Monkey home page Code Monkey logo

verstable's Introduction

Verstable

Verstable is a versatile generic hash table intended to bring the speed and memory efficiency of state-of-the-art C++ hash tables such as Abseil/Swiss, Boost, and Bytell to C.

Its features include:

API:

  • Type safety.
  • Customizable hash, comparison, and destructor functions.
  • Single header.
  • C99 compatibility.
  • Generic API in C11 and later.

Performance:

  • High speed mostly impervious to load factor.
  • Only two bytes of overhead per bucket.
  • Tombstones-free deletion.

Benchmarks comparing Verstable to the aforementioned hash tables and several Robin Hood-based hash tables are available here.

Verstable is distributed under the MIT license.

Try it online here.

Installation

Just download verstable.h and place it in your project's directory or your common header directory.

Example

Using the generic macro API (C11 and later):
#include <stdio.h>

// Instantiating a set template.
#define NAME int_set
#define KEY_TY int
#include "verstable.h"

// Instantiating a map template.
#define NAME int_int_map
#define KEY_TY int
#define VAL_TY int
#include "verstable.h"

int main( void )
{
  // Set.

  int_set our_set;
  vt_init( &our_set );

  // Inserting keys.
  for( int i = 0; i < 10; ++i )
  {
    int_set_itr itr = vt_insert( &our_set, i );
    if( vt_is_end( itr ) )
      exit( 1 ); // Out of memory.
  }

  // Erasing keys.
  for( int i = 0; i < 10; i += 3 )
    vt_erase( &our_set, i );

  // Retrieving keys.
  for( int i = 0; i < 10; ++i )
  {
    int_set_itr itr = vt_get( &our_set, i );
    if( !vt_is_end( itr ) )
      printf( "%d ", itr.data->key );
  }
  // Printed: 1 2 4 5 7 8

  // Iteration.
  for(
    int_set_itr itr = vt_first( &our_set );
    !vt_is_end( itr );
    itr = vt_next( itr )
  )
    printf( "%d ", itr.data->key );
  // Printed: 4 5 2 8 1 7

  vt_cleanup( &our_set );

  // Map.

  int_int_map our_map;
  vt_init( &our_map );

  // Inserting keys and values.
  for( int i = 0; i < 10; ++i )
  {
    int_int_map_itr itr =
      vt_insert( &our_map, i, i + 1 );
    if( vt_is_end( itr ) )
      exit( 1 ); // Out of memory.
  }

  // Erasing keys and values.
  for( int i = 0; i < 10; i += 3 )
    vt_erase( &our_map, i );

  // Retrieving keys and values.
  for( int i = 0; i < 10; ++i )
  {
    int_int_map_itr itr = vt_get( &our_map, i );
    if( !vt_is_end( itr ) )
      printf(
        "%d:%d ",
        itr.data->key,
        itr.data->val
      );
  }
  // Printed: 1:2 2:3 4:5 5:6 7:8 8:9

  // Iteration.
  for(
    int_int_map_itr itr = vt_first( &our_map );
    !vt_is_end( itr );
    itr = vt_next( itr )
  )
    printf(
      "%d:%d ",
      itr.data->key,
      itr.data->val
    );
  // Printed: 4:5 5:6 2:3 8:9 1:2 7:8

  vt_cleanup( &our_map );
}
Using the prefixed functions API (C99 and later):
#include <stdio.h>

// Instantiating a set template.
#define NAME int_set
#define KEY_TY int
#define HASH_FN vt_hash_integer
#define CMPR_FN vt_cmpr_integer
#include "verstable.h"

// Instantiating a map template.
#define NAME int_int_map
#define KEY_TY int
#define VAL_TY int
#define HASH_FN vt_hash_integer
#define CMPR_FN vt_cmpr_integer
#include "verstable.h"

int main( void )
{
  // Set.

  int_set our_set;
  int_set_init( &our_set );

  // Inserting keys.
  for( int i = 0; i < 10; ++i )
  {
    int_set_itr itr =
      int_set_insert( &our_set, i );
    if( int_set_is_end( itr ) )
      exit( 1 ); // Out of memory.
  }

  // Erasing keys.
  for( int i = 0; i < 10; i += 3 )
    int_set_erase( &our_set, i );

  // Retrieving keys.
  for( int i = 0; i < 10; ++i )
  {
    int_set_itr itr = int_set_get( &our_set, i );
    if( !int_set_is_end( itr ) )
      printf( "%d ", itr.data->key );
  }
  // Printed: 1 2 4 5 7 8

  // Iteration.
  for(
    int_set_itr itr =
      int_set_first( &our_set );
    !int_set_is_end( itr );
    itr = int_set_next( itr )
  )
    printf( "%d ", itr.data->key );
  // Printed: 4 5 2 8 1 7

  int_set_cleanup( &our_set );

  // Map.

  int_int_map our_map;
  int_int_map_init( &our_map );

  // Inserting keys and values.
  for( int i = 0; i < 10; ++i )
  {
    int_int_map_itr itr =
      int_int_map_insert( &our_map, i, i + 1 );
    if( int_int_map_is_end( itr ) )
      exit( 1 ); // Out of memory.
  }

  // Erasing keys and values.
  for( int i = 0; i < 10; i += 3 )
    int_int_map_erase( &our_map, i );

  // Retrieving keys and values.
  for( int i = 0; i < 10; ++i )
  {
    int_int_map_itr itr =
      int_int_map_get( &our_map, i );
    if( !int_int_map_is_end( itr ) )
      printf(
      	"%d:%d ",
      	itr.data->key,
      	itr.data->val
    );
  }
  // Printed: 1:2 2:3 4:5 5:6 7:8 8:9

  // Iteration.
  for(
    int_int_map_itr itr =
      int_int_map_first( &our_map );
    !int_int_map_is_end( itr );
    itr = int_int_map_next( itr )
  )
    printf(
      "%d:%d ",
      itr.data->key,
      itr.data->val
    );
  // Printed: 4:5 5:6 2:3 8:9 1:2 7:8

  int_int_map_cleanup( &our_map );
}

API

Full API documentation is available here.

FAQ

How does it work?

Verstable is an open-addressing hash table using quadratic probing and the following additions:

  • All keys that hash (i.e. "belong") to the same bucket (their "home bucket") are linked together by an 11-bit integer specifying the quadratic displacement, relative to that bucket, of the next key in the chain.

  • If a chain of keys exists for a given bucket, then it always begins at that bucket. To maintain this policy, a 1-bit flag is used to mark whether the key occupying a bucket belongs there. When inserting a new key, if the bucket it belongs to is occupied by a key that does not belong there, then the occupying key is evicted and the new key takes the bucket.

  • A 4-bit fragment of each key's hash code is also stored.

  • The aforementioned metadata associated with each bucket (the 4-bit hash fragment, the 1-bit flag, and the 11-bit link to the next key in the chain) are stored together in a uint16_t array rather than in the bucket alongside the key and (optionally) the value.

One way to conceptualize this scheme is as a chained hash table in which overflowing keys are stored not in separate memory allocations but in otherwise unused buckets. In this regard, it shares similarities with Malte Skarupke’s Bytell hash table and traditional "coalesced hashing".

Advantages of this scheme include:

  • Fast lookups impervious to load factor: If the table contains any key belonging to the lookup key's home bucket, then that bucket contains the first in a traversable chain of all keys belonging to it. Hence, only the home bucket and other buckets containing keys belonging to it are ever probed. Moreover, the stored hash fragments allow skipping most non-matching keys in the chain without accessing the actual buckets array or calling the (potentially expensive) key comparison function.

  • Fast insertions: Insertions are faster than they are in other schemes that move keys around (e.g. Robin Hood) because they only move, at most, one existing key.

  • Fast, tombstone-free deletions: Deletions, which usually require tombstones in quadratic-probing hash tables, are tombstone-free and only move, at most, one existing key.

  • Fast iteration: The separate metadata array allows keys in sparsely populated tables to be found without incurring the frequent cache misses that would result from traversing the buckets array.

The generic macro API available in C11 is based on the extendible-_Generic mechanism detailed here.

How is it tested?

Verstable has been tested under GCC, Clang, MinGW, and MSVC. tests/unit_tests.c includes unit tests for sets and maps, with an emphasis on corner cases. tests/tests_against_stl.cpp includes randomized tests that perform the same operations on Verstable sets and maps, on one hand, and C++'s std::unordered_set and std::unordered_map, on the other, and then check that they remain in sync. Both test suites use a tracking and randomly failing memory allocator in order to detect memory leaks and test out-of-memory conditions.

Why the name?

The name is a contraction of "versatile table". Verstable handles various conditions that strain other hash table schemes—such as large keys or values that are expensive to move, high load factors, expensive hash or comparison functions, and frequent deletions, iteration, and unsuccessful lookups—without significant performance loss. In other words, it is designed to be a good default choice of hash table for most use cases.

verstable's People

Contributors

jacksonallan avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

verstable's Issues

Custom allocator context

Hi, thanks for sharing an awesome library. love the efficient design and clean api.

Regarding the topic of custom allocator context raised by @skeeto here https://old.reddit.com/r/C_Programming/comments/18gnxkd/verstable_a_versatile_highperformance_generic/kd25s9z/

Would you think the approach https://github.com/stclib/STC?tab=readme-ov-file#per-container-instance-customization fit Verstable well? It seems "zero cost" at runtime if not opted in, also less intrusive than what @skeeto's approach.

Regards!

Compile Error on gcc 4.8.5

#include <stdio.h>

#define NAME int_int_map
#define KEY_TY int 
#define VAL_TY int  
#include "verstable.h"

int main(int argc, char const *argv[])
{
    int_int_map htable;
    vt_init(&htable);

    return 0;
}
[root@DESKTOP-OAKHD5F src]# gcc -std=c11 main.c 
In file included from main.c:6:0:
verstable.h: In function ‘int_int_map_evict’:
verstable.h:894:3: warning: implicit declaration of function ‘_Generic’ [-Wimplicit-function-declaration]
   size_t home_bucket = HASH_FN( table->buckets[ bucket ].key ) & ( table->bucket_count - 1 );
   ^
verstable.h:763:44: error: expected expression before ‘char’
 #define HASH_FN _Generic( ( KEY_TY ){ 0 }, char *: vt_hash_string, default: vt_hash_integer )
                                            ^
verstable.h:894:24: note: in expansion of macro ‘HASH_FN’
   size_t home_bucket = HASH_FN( table->buckets[ bucket ].key ) & ( table->bucket_count - 1 );
                        ^
verstable.h: In function ‘int_int_map_insert_raw’:
verstable.h:763:44: error: expected expression before ‘char’
 #define HASH_FN _Generic( ( KEY_TY ){ 0 }, char *: vt_hash_string, default: vt_hash_integer )
                                            ^
verstable.h:966:19: note: in expansion of macro ‘HASH_FN’
   uint64_t hash = HASH_FN( key );
                   ^
verstable.h:772:44: error: expected expression before ‘char’
 #define CMPR_FN _Generic( ( KEY_TY ){ 0 }, char *: vt_cmpr_string, default: vt_cmpr_integer )
                                            ^
verstable.h:1007:9: note: in expansion of macro ‘CMPR_FN’
         CMPR_FN( table->buckets[ bucket ].key, key )
         ^
verstable.h: In function ‘int_int_map_get’:
verstable.h:763:44: error: expected expression before ‘char’
 #define HASH_FN _Generic( ( KEY_TY ){ 0 }, char *: vt_hash_string, default: vt_hash_integer )
                                            ^
verstable.h:1209:19: note: in expansion of macro ‘HASH_FN’
   uint64_t hash = HASH_FN( key );
                   ^
verstable.h:772:44: error: expected expression before ‘char’
 #define CMPR_FN _Generic( ( KEY_TY ){ 0 }, char *: vt_cmpr_string, default: vt_cmpr_integer )
                                            ^
verstable.h:1221:74: note: in expansion of macro ‘CMPR_FN’
     if( ( table->metadata[ bucket ] & VT_HASH_FRAG_MASK ) == hashfrag && CMPR_FN( table->buckets[ bucket ].key, key ) )
                                                                          ^
verstable.h: In function ‘int_int_map_erase_itr_raw’:
verstable.h:763:44: error: expected expression before ‘char’
 #define HASH_FN _Generic( ( KEY_TY ){ 0 }, char *: vt_hash_string, default: vt_hash_integer )
                                            ^
verstable.h:1277:25: note: in expansion of macro ‘HASH_FN’
       itr.home_bucket = HASH_FN( table->buckets[ itr_bucket ].key ) & ( table->bucket_count - 1 );
                         ^
In file included from main.c:6:0:
main.c: In function ‘main’:
verstable.h:579:65: error: expected expression before ‘vt_table_0000’
 #define vt_init( table ) _Generic( *( table ) VT_GENERIC_SLOTS( vt_table_, vt_init_ ) )( table )
                                                                 ^
verstable.h:405:25: note: in definition of macro ‘VT_CAT_’
 #define VT_CAT_( a, b ) a##b
                         ^
verstable.h:544:40: note: in expansion of macro ‘VT_CAT’
 #define VT_GENERIC_SLOT( ty, fn, n ) , VT_CAT( ty, n ): VT_CAT( fn, n )
                                        ^
verstable.h:546:35: note: in expansion of macro ‘VT_GENERIC_SLOT’
 #define VT_R1_1( ty, fn, d3, d2 ) VT_GENERIC_SLOT( ty, fn, VT_CAT_4( 0, d3, d2, 0 ) )
                                   ^
verstable.h:405:25: note: in expansion of macro ‘VT_R1_1’
 #define VT_CAT_( a, b ) a##b
                         ^
verstable.h:579:47: note: in expansion of macro ‘VT_GENERIC_SLOTS’
 #define vt_init( table ) _Generic( *( table ) VT_GENERIC_SLOTS( vt_table_, vt_init_ ) )( table )
                                               ^
main.c:13:5: note: in expansion of macro ‘vt_init’
     vt_init(&flow_table);
     ^
[root@DESKTOP-OAKHD5F src]# gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.