Code Monkey home page Code Monkey logo

buffet's Introduction

Buffet

All-inclusive Buffer for C

schema

Experimental string buffer featuring

  • SSO (small string optimization) inline short data into the type.
  • views (or 'slices') no-copy references to sub-strings.
  • refcount (reference counting) account for views before mutation or release.
  • automated allocations

Aims at security with decent speed.
Todo: thread safety

API

CI

Layout

Buffet is a tagged union with 4 modes.

// Hard values show 64-bit

union Buffet {
    struct ptr {
        char*   data
        size_t  len
        size_t  off:62, tag:2 // tag = OWN|SSV|VUE
    }
    struct sso {
        char    data[22]
        uint8_t refcnt
        uint8_t len:6, tag:2  // tag = SSO
    }
}

sizeof(Buffet) == 24

The tag sets a Buffet's mode :

  • OWN co-owning slice of a store
  • SSO embedded char array
  • SSV (small string view) view on an SSO
  • VUE non-owning view on any data

If OWN, Buffet.data points into an allocated heap store :

struct Store {
    size_t   cap    // store capacity
    size_t   len    // store length
    uint32_t refcnt // number of views on store
    uint32_t canary // invalidates store if modified
    char     data[] // buffer data, shared by owning views
}

Schema

schema

#include "../buffet.h"

int main() {
    
// SHARED OWN =================

    char large[] = "DATA STORE IS HEAP ALLOCATION.";

    Buffet own1 = bft_memcopy(large, sizeof(large)-1);
    // Now own1 owns a store housing a copy of `large`
    bft_dbg(&own1); 
    //-> OWN 30 "DATA STORE ..."
    // View "STORE" in own1 :
    Buffet own2 = bft_view(&own1, 5, 5);
    // Now own1 and own2 share the store, whose refcount is 2
    bft_dbg(&own2); 
    //-> OWN 5 "STORE"

// SSO & SSV =================

    char small[] = "SMALL STRING";

    Buffet sso1 = bft_memcopy(small, sizeof(small)-1);
    bft_dbg(&sso1); 
    //-> SSO 12 "SMALL STRING"
    // View "STRING" in sso1 :
    Buffet ssv1 = bft_view(&sso1, 6, 6);
    bft_dbg(&ssv1); 
    //-> SSV 6 "STRING"

// VUE =======================

    char any[] = "SOME BYTES";

    // View "BYTES" in `any` :
    Buffet vue1 = bft_memview(any+5, 5);
    bft_dbg(&vue1); 
    //-> VUE 5 "BYTES"

    return 0;
}

Build & check

make && make check

While extensive, unit tests may not yet cover all cases.

Security

Buffet aims at preventing memory faults, including from user.
(Except of course losing scope and such.)

// (pseudo code)

// overflow
buf = new(8)
append(buf, large_str) // Done

// invalid ref
buf = memcopy(short_str) // SSO
view = view(buf)
append(buf, large_str) // would mutate SSO to OWN
// => abort & warn "Append would invalidate views on SSO"

// double-free
bft_free(buf)
bft_free(buf) // OK

// use-after-free
bft_free(buf)
append(buf, "foo") // Done. Now buf is "foo".

// aliasing
alias = buf // should be `alias = bft_dup(buf)`
bft_free(buf)
bft_free(alias) // OK. Possible warning "Bad canary. Double free ?"

// Etc...

To this end, operations like view() or free() may check the store's header.
If wrong, the operation aborts and returns an empty buffet.

Checks are enabled by #define MEMCHECK or building with

MEMCHECK=1 make

Warnings are enabled by #define DEBUG or building with

DEBUG=1 make

NB: Even with checks, some aliasing can be fatal.

own = memcopy(large_str)
view = view(own)
alias = view
bft_free(view)
bft_free(own) // refcnt == 0, free(store) !
// alias now points into freed memory...

See src/check.c unit-tests and warnings output.

Bench

make && make bench (requires libbenchmark-dev)

NB: The lib is not much optimized and the bench maybe amateurish.

On a weak Core i3 :

MEMVIEW_cpp/8            0.609 ns
MEMVIEW_buffet/8          6.36 ns
MEMCOPY_c/8               16.7 ns
MEMCOPY_buffet/8          11.9 ns
MEMCOPY_c/32              15.3 ns
MEMCOPY_buffet/32         26.3 ns
MEMCOPY_c/128             16.8 ns
MEMCOPY_buffet/128        29.8 ns
MEMCOPY_c/512             24.9 ns
MEMCOPY_buffet/512        39.3 ns
MEMCOPY_c/2048            94.1 ns
MEMCOPY_buffet/2048        109 ns
MEMCOPY_c/8192             196 ns
MEMCOPY_buffet/8192        282 ns
APPEND_cpp/8/4            10.9 ns
APPEND_buffet/8/4         16.3 ns
APPEND_cpp/8/16           36.5 ns
APPEND_buffet/8/16        30.2 ns
APPEND_cpp/24/4           49.0 ns
APPEND_buffet/24/4        30.1 ns
APPEND_cpp/24/32          48.1 ns
APPEND_buffet/24/32       28.8 ns
SPLITJOIN_c               2782 ns
SPLITJOIN_cpp             3317 ns
SPLITJOIN_buffet          1397 ns

API

bft_new
bft_memcopy
bft_memview
bft_copy
bft_copyall
bft_view
bft_dup (don't alias buffets, use this)
bft_append
bft_split
bft_splitstr
bft_join
bft_free

bft_cmp
bft_cap
bft_len
bft_data
bft_cstr
bft_export

bft_print
bft_dbg

bft_new

Buffet bft_new (size_t cap)

Create a new empty Buffet of minimum capacity cap.

Buffet buf = bft_new(40);
bft_dbg(&buf); 
// OWN 0 ""

bft_memcopy

Buffet bft_memcopy (const char *src, size_t len)

Create a new Buffet by copying len bytes from src.

Buffet copy = bft_memcopy("Bonjour", 3);
// SSO 3 "Bon"

bft_memview

Buffet bft_memview (const char *src, size_t len)

Create a new Buffet viewing len bytes from src.
You get a window into src without copy or allocation.
NB: You shouldn't directly memview a Buffet's data. Use view()

char src[] = "Eat Buffet!";
Buffet view = bft_memview(src+4, 6);
// VUE 6 "Buffet"

bft_copy

Buffet bft_copy (const Buffet *src, ptrdiff_t off, size_t len)

Copy len bytes at offset off from Buffet src into a new Buffet.

Buffet src = bft_memcopy("Bonjour", 7);
Buffet cpy = bft_copy(&src, 3, 4);
// SSO 4 "jour"

bft_copyall

Buffet bft_copyall (const Buffet *src)

Copy all bytes from Buffet src into a new Buffet.

bft_view

Buffet bft_view (Buffet *src, ptrdiff_t off, size_t len)

View len bytes of Buffet src, starting at off.
You get a window into src without copy or allocation.

The return internal type depends on src type :

  • view(SSO) -> SSV (refcounted)
  • view(SSV) -> SSV on src's target
  • view(OWN) -> OWN (as refcounted store co-owner)
  • view(VUE) -> VUE on src's target

If the return is OWN, the target store won't be released before either

  • the return is discarded by bft_free
  • the return is detached by e.g. appending to it.
#include "../buffet.h"

int main() {

    char text[] = "Bonjour monsieur Buddy. Already speaks french!";

    // view sso
    Buffet sso = bft_memcopy(text, 16); // "Bonjour monsieur"
    Buffet ssv = bft_view(&sso, 0, 7);
    bft_dbg(&ssv);

    // view ssv
    Buffet Bon = bft_view(&ssv, 0, 3);
    bft_dbg(&Bon);

    // view own
    Buffet own = bft_memcopy(text, sizeof(text));
    Buffet ownview = bft_view(&own, 0, 7);
    bft_dbg(&ownview);

    // detach view
    bft_append (&ownview, "!", 1);
    // bft_free(&ownview); 
    bft_free(&own); // Done

    // view vue
    Buffet vue = bft_memview(text+8, 8); // "Good"
    Buffet mon = bft_view(&vue, 0, 3);
    bft_dbg(&mon);

    return 0;
}
$ cc view.c libbuffet.a -o view && ./view
SSV 7 data:"Bonjour"
SSV 3 data:"Bon"
OWN 7 data:"Bonjour"
VUE 3 data:"mon"

bft_dup

Buffet bft_dup (const Buffet *src)

Create a shallow copy of src.
Use this intead of aliasing a Buffet.

Buffet src = bft_memcopy("Hello", 5);
Buffet cpy = src; // BAD
Buffet cpy = bft_dup(&src); // GOOD
bft_dbg(&cpy);
// SSO 5 "Hello"

Rem: aliasing would mostly work but mess up refcounting (without crash if store protections are enabled) :

Buffet alias = sso; //ok if sso was not viewed
Buffet alias = own; //not refcounted
Buffet alias = vue; //ok

bft_free

void bft_free (Buffet *buf)

Discards buf.

  • aborts if buf is an SSO with views.
  • otherwise, buf is zeroed into an empty SSO.
  • if buf was a view, its target refcount is decremented.
  • if buf was the last view on a store, the store is released.

Security:

  • the zeroing makes double-free harmless.
  • the only problematic use-after-free would be of a OWN alias (not recommended), but the store management prevents stale memory access.
#include "../buffet.h"

int main() {

    char text[] = "Le grand orchestre de Patato Valdez";

    Buffet own = bft_memcopy(text, sizeof(text));
    Buffet ref = bft_view(&own, 9, 9); // "orchestre"
    bft_free(&own); // A bit soon but ok, --refcnt
    bft_dbg(&own);  // SSO 0 ""
    bft_free(&ref); // Was last co-owner, store is released

    Buffet sso = bft_memcopy(text, 8); // "Le grand"
    Buffet ref2 = bft_view(&sso, 3, 5); // "grand"
    bft_free(&sso); // WARN line:328 bft_free: SSO has views on it
    bft_free(&ref2);
    bft_free(&sso); // OK now
    bft_dbg(&sso);  // SSO 0 ""

    return 0;
}
$ valgrind  --leak-check=full ./bin/ex/free
All heap blocks were freed -- no leaks are possible

bft_cat

size_t bft_cat (Buffet *dst, const Buffet *buf, const char *src, size_t len)

Concatenates buf and len bytes of src into resulting dst.
Returns total length or 0 on error.

Buffet buf = bft_memcopy("abc", 3);
Buffet dst;
size_t totlen = bft_cat(&dst, &buf, "def", 3);
bft_dbg(&dst);
// SSO 6 "abcdef"

bft_append

size_t bft_append (Buffet *dst, const char *src, size_t len)

Appends len bytes from src to dst.
Returns new length or 0 on error.

Buffet buf = bft_memcopy("abc", 3); 
size_t newlen = bft_append(&buf, "def", 3);
bft_dbg(&buf);
// SSO 6 "abcdef"

NB: returns failure if buf has views and would mutate from SSO to OWN to increase capacity, invalidating the views :

Buffet foo = bft_memcopy("short foo ", 10);
Buffet view = bft_view(&foo, 0, 5);
// would mutate to OWN :
size_t rc = bft_append(&foo, "now too long for SSO");
assert(rc==0); // meaning aborted

To prevent this, release views before appending to a small buffet.

bft_split

Buffet* bft_split (const char* src, size_t srclen, const char* sep, size_t seplen, 
int *outcnt)

Splits src along separator sep into a Buffet Vue list of length *outcnt.

Being made of views, you can free(list) without leak provided no element was made an owner by e.g appending to it.

bft_splitstr

Buffet* bft_splitstr (const char *src, const char *sep, int *outcnt);

Convenient split using strlen internally.

int cnt;
Buffet *parts = bft_splitstr("Split me", " ", &cnt);
for (int i=0; i<cnt; ++i)
    bft_print(&parts[i]);
// VUE 5 "Split"
// VUE 2 "me"
free(parts);

bft_join

Buffet bft_join (Buffet *list, int cnt, const char* sep, size_t seplen);

Joins list on separator sep into a new Buffet.

int cnt;
Buffet *parts = bft_splitstr("Split me", " ", &cnt);
Buffet back = bft_join(parts, cnt, " ", 1);
bft_dbg(&back);
// SSO 8 'Split me'

bft_cmp

int bft_cmp (const Buffet *a, const Buffet *b)

Compare two buffets' data using memcmp.

bft_cap

size_t bft_cap (Buffet *buf)

Get current capacity.

bft_len

size_t bft_len (Buffet *buf)`

Get current length.

bft_data

const char* bft_data (const Buffet *buf)`

Get current data pointer.
To ensure null-termination at buf.len, use bft_cstr.

bft_cstr

const char* bft_cstr (const Buffet *buf, bool *mustfree)

Get current data as a null-terminated C string of max length buf.len.
If needed (when buf is a view), the data is copied into a new C string that must be freed if mustfree is set.

bft_export

char* bft_export (const Buffet *buf)

Copies data up to buf.len into a new C string that must be freed.

bft_print

void bft_print (const Buffet *buf)`

Prints data up to buf.len.

bft_dbg

void bft_dbg (Buffet *buf)

Prints buf state.

Buffet buf;
bft_memcopy(&buf, "foo", 3);
bft_dbg(&buf);
// SSO 3 "foo"

TODO

  • ! views : decide clearly if r/o + CoW or writable
  • store.len : not updated when tailing view is detached..
    We lose future in-place optimization.
    Maybe record second to last range ?
  • store checks : too many ?
    Decide user responsabilty on mis-handling.
    Add #define ENABLE_MEMCHECKS
  • SSO auto-release like own ?
  • test 32-bit and small RAM

API

  • write(), sprintf()
  • resize()

buffet's People

Contributors

alcover avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

buffet's Issues

Project might want a license.

Most other people are unable to lawfully use your code without a license. If your intent is to allow use, then you should add a license. This is mechanically trivial following the Github instructions.

Selecting a license is of course nontrivial. My personal choice when I intend unlimited use is the MIT license and as a consumer of licenses it is one that makes no barriers.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.