
twemcache's Introduction

Twemcache: Twitter Memcached

Status: retired

Twemcache is no longer actively maintained. See twitter/pelikan for our latest caching work.

Twemcache (pronounced "tw-em-cache") is the Twitter Memcached. Twemcache is based on a fork of Memcached v1.4.4 that has been heavily modified to make it suitable for the large-scale production environment at Twitter.

Build

To build twemcache from distribution tarball:

$ ./configure
$ make
$ sudo make install

To build twemcache from distribution tarball with a non-standard path to libevent install:

$ ./configure --with-libevent=<path>
$ make
$ sudo make install

To build twemcache from distribution tarball with a statically linked libevent:

$ ./configure --enable-static=libevent
$ make
$ sudo make install

To build twemcache from distribution tarball in debug mode with assertion panics enabled:

$ CFLAGS="-ggdb3 -O0" ./configure --enable-debug=full
$ make
$ sudo make install

To build twemcache from source with debug logs enabled and assertions disabled:

$ git clone git@github.com:twitter/twemcache.git
$ cd twemcache
$ autoreconf -fvi
$ ./configure --enable-debug=log
$ make V=1
$ src/twemcache -h

Help

Usage: twemcache [-?hVCELdkrDS] [-o output file] [-v verbosity level]
           [-A stats aggr interval]
           [-t threads] [-P pid file] [-u user]
           [-x command logging entry] [-X command logging file]
           [-R max requests] [-c max conns] [-b backlog] [-p port] [-U udp port]
           [-l interface] [-s unix path] [-a access mask] [-M eviction strategy]
           [-f factor] [-m max memory] [-n min item chunk size] [-I slab size]
           [-z slab profile]

Options:
  -h, --help                  : this help
  -V, --version               : show version and exit
  -E, --prealloc              : preallocate memory for all slabs
  -L, --use-large-pages       : use large pages if available
  -k, --lock-pages            : lock all pages and preallocate slab memory
  -d, --daemonize             : run as a daemon
  -r, --maximize-core-limit   : maximize core file limit
  -C, --disable-cas           : disable use of cas
  -D, --describe-stats        : print stats description and exit
  -S, --show-sizes            : print slab and item struct sizes and exit
  -o, --output=S              : set the logging file (default: stderr)
  -v, --verbosity=N           : set the logging level (default: 5, min: 0, max: 11)
  -A, --stats-aggr-interval=N : set the stats aggregation interval in usec (default: 100000 usec)
  -t, --threads=N             : set number of threads to use (default: 4)
  -P, --pidfile=S             : set the pid file (default: off)
  -u, --user=S                : set user identity when run as root (default: off)
  -x, --klog-entry=N          : set the command logging entry number per thread (default: 512)
  -X, --klog-file=S           : set the command logging file (default: off)
  -R, --max-requests=N        : set the maximum number of requests per event (default: 20)
  -c, --max-conns=N           : set the maximum simultaneous connections (default: 1024)
  -b, --backlog=N             : set the backlog queue limit (default 1024)
  -p, --port=N                : set the tcp port to listen on (default: 11211)
  -U, --udp-port=N            : set the udp port to listen on (default: 11211)
  -l, --interface=S           : set the interface to listen on (default: all)
  -s, --unix-path=S           : set the unix socket path to listen on (default: off)
  -a, --access-mask=O         : set the access mask for unix socket in octal (default: 0700)
  -M, --eviction-strategy=N   : set the eviction strategy on OOM (default: 2, random)
  -f, --factor=D              : set the growth factor of slab item sizes (default: 1.25)
  -m, --max-memory=N          : set the maximum memory to use for all items in MB (default: 64 MB)
  -n, --min-item-chunk-size=N : set the minimum item chunk size in bytes (default: 72 bytes)
  -I, --slab-size=N           : set slab size in bytes (default: 1048576 bytes)
  -z, --slab-profile=S        : set the profile of slab item chunk sizes (default: off)

Features

  • Supports the complete memcached ASCII protocol.
  • Supports tcp, udp and unix domain sockets.
  • Observability through lock-less stats collection and klogger.
  • Pluggable eviction strategies.
  • Easy debuggability through assertions and logging.

Slabs and Items

Memory in twemcache is organized into fixed-size slabs whose size is configured using the -I or --slab-size=N command-line argument. Every slab is carved into a collection of contiguous, equal-size items. All slabs that are carved into items of a given size belong to a given slabclass. The number of slabclasses and the size of items they serve can be configured either from a geometric sequence, with the initial item size set using the -n or --min-item-chunk-size=N argument and the growth ratio set using the -f or --factor=D argument, or from a profile string set using the -z or --slab-profile=S argument.
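As an illustration of how the geometric sequence determines the slab classes, the sketch below generates chunk sizes from the default minimum chunk size (72 bytes), growth factor (1.25), and slab size (1 MB). It ignores alignment and per-slab header overhead, which the real implementation accounts for, and the function name is ours, not twemcache's:

```python
def slabclass_sizes(min_chunk=72, factor=1.25, slab_size=1048576):
    """Illustrative only: grow the item chunk size geometrically,
    one slab class per size, until a chunk would fill a whole slab."""
    sizes = []
    size = min_chunk
    while size < slab_size:
        sizes.append(size)
        size = int(size * factor)
    return sizes

print(slabclass_sizes()[:5])  # the smallest slab classes
```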

Eviction

Eviction is triggered when a cache reaches full memory capacity. This happens when all cached items are unexpired and there is no space available to store newer items. Twemcache supports the following eviction strategies, configured using the -M or --eviction-strategy=N command-line argument:

  • No eviction (0) - don't evict, respond with server error reply.
  • Item LRU eviction (1) - evict only existing items in the same slab class, least recently updated first; essentially a per-slabclass LRU eviction.
  • Random eviction (2) - evict all items from a randomly chosen slab.
  • Slab LRA eviction (4) - choose the least recently accessed slab, and evict all items from it to reuse the slab.
  • Slab LRC eviction (8) - choose the least recently created slab, and evict all items from it to reuse the slab. Eviction ignores freeq & lruq to make sure the eviction follows the timestamp closely. Recommended if cache is updated on the write path.

Eviction strategies can be stacked, tried from the highest bit to the lowest. For example, -M 5 means that if slab LRA eviction fails, Twemcache will try item LRU eviction.
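As a sketch of how a stacked -M mask could be decoded (the dictionary and function below are illustrative, not twemcache's actual data structures):

```python
# bit values from the list above; the names are descriptive labels only
STRATEGIES = {8: "slab LRC", 4: "slab LRA", 2: "random", 1: "item LRU"}

def eviction_order(mask):
    """Expand an -M bitmask into the order in which strategies
    are tried: highest bit first."""
    return [name for bit, name in sorted(STRATEGIES.items(), reverse=True)
            if mask & bit]

print(eviction_order(5))  # -M 5: slab LRA first, then item LRU
```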

Observability

Stats

Stats are the primary form of observability in twemcache. Stats collection in twemcache is lock-less in the sense that each worker thread only updates its thread-local metrics, and a background aggregator thread collects metrics from all threads periodically, holding only one thread-local lock at a time. Once aggregated, stats polling comes for free. There is a slight trade-off between how up-to-date stats are and how much burden stats collection puts on the system, which can be controlled through the aggregation interval via the -A or --stats-aggr-interval=N command-line argument. By default, the aggregation interval is set to 100 msec. You can set the aggregation interval at run time using the config aggregate <num>\r\n command. Stats collection can be disabled at run time by passing a negative aggregation interval, or at build time through the --disable-stats configure option.
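The aggregation scheme can be pictured roughly as follows. This is a simplified sketch in Python, not the actual C implementation, and all names are made up:

```python
import threading

class ThreadLocalStats:
    """Sketch: each worker updates only its own counters under its own
    lock, so workers never contend with each other; the aggregator
    holds at most one thread-local lock at a time."""
    def __init__(self, nthreads):
        self.workers = [{"lock": threading.Lock(), "counters": {}}
                        for _ in range(nthreads)]

    def incr(self, tid, name, delta=1):
        w = self.workers[tid]
        with w["lock"]:
            w["counters"][name] = w["counters"].get(name, 0) + delta

    def aggregate(self):
        total = {}
        for w in self.workers:
            with w["lock"]:  # only one thread-local lock held at a time
                for name, value in w["counters"].items():
                    total[name] = total.get(name, 0) + value
        return total

stats = ThreadLocalStats(nthreads=4)
stats.incr(0, "get_hits")
stats.incr(1, "get_hits")
print(stats.aggregate())
```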

Metrics exposed by twemcache are of three types: timestamp, counter, and gauge. They are collected both at the global level and per slab class. You can print a description of all stats exposed by twemcache using the -D or --describe-stats command-line argument.

The following commands can be used to query stats from a running twemcache:

  • stats\r\n
  • stats settings\r\n
  • stats slabs\r\n
  • stats sizes\r\n
  • stats cachedump <id> <limit>\r\n
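Replies to these commands follow the memcached ASCII protocol: a sequence of STAT <name> <value> lines terminated by END. A minimal parser sketch (the reply text below is made up for illustration):

```python
def parse_stats(reply):
    """Parse an ASCII stats reply into a dict of string values."""
    stats = {}
    for line in reply.splitlines():
        if line == "END":
            break
        _, name, value = line.split(" ", 2)
        stats[name] = value
    return stats

reply = "STAT pid 1234\r\nSTAT threads 4\r\nEND\r\n"
print(parse_stats(reply))
```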

Klogger (Command Logger)

The command logger allows users to capture the details of every incoming request. Each line of the command log gives precise information on the client, the time when a request was received, the command header (including the command, key, flags and data length), a return code, and the reply message length. A few example klog lines look as follows:

172.25.135.205:55438 - [09/Jul/2012:18:15:45 -0700] "set foo 0 0 3" 1 6
172.25.135.205:55438 - [09/Jul/2012:18:15:46 -0700] "get foo" 0 14
172.25.135.205:55438 - [09/Jul/2012:18:15:57 -0700] "incr num 1" 3 9
172.25.135.205:55438 - [09/Jul/2012:18:16:05 -0700] "set num 0 0 1" 1 6
172.25.135.205:55438 - [09/Jul/2012:18:16:09 -0700] "incr num 1" 0 1
172.25.135.205:55438 - [09/Jul/2012:18:16:13 -0700] "get num" 0 12

The command logger supports lock-less reads and writes into ring buffers, whose size can be configured with the -x or --klog-entry=N command-line argument. Each worker thread logs to a thread-local buffer as it processes incoming queries, and a background thread asynchronously dumps buffer contents to a file configured with the -X or --klog-file=S command-line argument.
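Based on the sample lines above, a klog line can be split into its fields with a small parser like this (a sketch; the field names are ours, not twemcache's):

```python
import re

# layout: client - [timestamp] "command" return_code reply_length
KLOG_RE = re.compile(
    r'(?P<client>\S+) - \[(?P<time>[^\]]+)\] '
    r'"(?P<command>[^"]*)" (?P<rcode>\d+) (?P<rlen>\d+)')

def parse_klog_line(line):
    m = KLOG_RE.match(line)
    return m.groupdict() if m else None

line = '172.25.135.205:55438 - [09/Jul/2012:18:15:45 -0700] "set foo 0 0 3" 1 6'
print(parse_klog_line(line))
```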

Since this feature is capable of generating hundreds of MBs of data per minute, its use must be planned carefully. An enabled klog module can be started or stopped by sending config klog run start\r\n or config klog run stop\r\n, respectively. To control the speed of log generation, the command logger also supports sampling. The sample rate can be set with the config klog sampling <num>\r\n command, which samples one out of every num commands.
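The one-out-of-num sampling can be pictured as a simple per-thread counter; the class below is an illustrative sketch, not the actual implementation:

```python
class KlogSampler:
    """Illustrative: log_this() returns True for one out of every
    `rate` calls and False for the rest."""
    def __init__(self, rate):
        self.rate = rate
        self.count = 0

    def log_this(self):
        self.count += 1
        if self.count >= self.rate:
            self.count = 0
            return True
        return False

sampler = KlogSampler(rate=100)
logged = sum(sampler.log_this() for _ in range(1000))
print(logged)  # 10 of 1000 commands are logged
```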

Logging

Logging in twemcache is only available when it is built with logging enabled (--enable-debug=[full|yes|log]). By default logs are written to stderr. Twemcache can also be configured to write logs to a specific file through the -o or --output=S command-line argument.

On a running twemcache, we can turn log levels up and down by sending it SIGTTIN and SIGTTOU signals respectively, and reopen the log file by sending it a SIGHUP signal. The logging level can also be set to a specific value using the verbosity <num>\r\n command.

Issues and Support

Have a bug? Please create an issue here on GitHub!

https://github.com/twitter/twemcache/issues

Versioning

For transparency and insight into our release cycle, releases are numbered with the semantic versioning format <major>.<minor>.<patch> and constructed with the following guidelines:

  • Breaking backwards compatibility bumps the major
  • New additions without breaking backwards compatibility bumps the minor
  • Bug fixes and misc changes bump the patch
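The guidelines above can be expressed as a small helper (a sketch; the change labels are made up for illustration):

```python
def bump(version, change):
    """Apply the semantic-versioning guidelines to a version string.
    `change` is one of "breaking", "feature", or "fix" (our labels)."""
    major, minor, patch = map(int, version.split("."))
    if change == "breaking":
        return f"{major + 1}.0.0"   # breaking compatibility bumps the major
    if change == "feature":
        return f"{major}.{minor + 1}.0"  # new additions bump the minor
    return f"{major}.{minor}.{patch + 1}"  # fixes bump the patch

print(bump("2.5.3", "feature"))
```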

Other Work

  • twemproxy - a fast, light-weight proxy for memcached.
  • twemperf - a tool for measuring memcached server performance.
  • twctop.rb - a tool like top for monitoring a cluster of twemcache servers.

License

Copyright 2003, Danga Interactive, Inc.

Copyright 2012 Twitter, Inc.

Licensed under the New BSD License, see the LICENSE file.

twemcache's People

Contributors

alanbato, caniszczyk, monadbobo, soarpenguin, willnorris

twemcache's Issues

memory slab calcification problem.

Once objects have been allocated to slabs, if the distribution of item sizes later changes, some slab classes will run out of space while others are full of mostly empty pages.

Possible bug in the mc_strtoll implementation?

The original code is here: https://github.com/twitter/twemcache/blob/master/src/mc_util.c

bool
mc_strtoll(const char *str, int64_t *out)
{
    char *endptr;
    long long ll;

    errno = 0;
    *out = 0LL;

    ll = strtoll(str, &endptr, 10);

    if (errno == ERANGE) {
        return false;
    }

#if 0
    /* bug: a string like "123 abc" is accepted, because only the first
     * character after the digits is checked */
    if (isspace(*endptr) || (*endptr == '\0' && endptr != str)) {
        *out = ll;
        return true;
    }
#endif

    /* fixed: require that only whitespace follows the parsed digits */
    if (endptr == str) {
        return false;
    }
    while (isspace((unsigned char)*endptr)) {
        endptr++;
    }
    if (*endptr != '\0') {
        return false;
    }
    *out = ll;
    return true;
}

Be able to turn off UDP

In wake of recent memcached amplification problems
https://blog.cloudflare.com/memcrashed-major-amplification-attacks-from-port-11211/

it makes sense to disable UDP support.

The twemcache code seems to be written in a way that always requires a UDP binding:

    if (tcp_specified && !udp_specified) {
        settings.udpport = settings.port;
    } else if (udp_specified && !tcp_specified) {
        settings.port = settings.udpport;
    }

php user

Hi,

how can I use it in PHP?
Is there a PHP extension?

Scalability of Twemcache

Dear Twemcache team,
I would like to ask about the scalability of Twemcache. What is the maximum throughput you could achieve in terms of concurrent requests per second? And how many cores could you scale up to?

Based on my own experience, I could saturate only two (Xeon) cores with a single twemcache process achieving a maximum throughput of 320K requests per second. My request mix is 95% reads to 5% writes and each request reads/writes a 1KB record.

Is there any proof that Twemcache can be scaled up to a higher number of cores, or a higher number of maximum concurrent requests?
Could you please share your experience with twemcache scalability? Are there any known scalability bottlenecks that I should consider in my tuning process?

Regards,

Add metrics to track memory/heap consumption

Learning about the real memory consumption of Twemcache is important to correctly estimate overhead and avoid paging. To many people's surprise, slab memory doesn't account for the entire heap size in many cases, and it would be helpful to have metrics reflecting actual heap size and its composition.

Aside from slabs, large memcache instances usually allocate a lot of memory into hashtable(s); and for instances with a lot of connections, connection buffer is also a significant source of memory overhead. So it would be nice to have the following metrics for starters:

heap_curr /* total heap size, everything allocated through mc_*alloc */
heap_hashtable /* size of the current hashtable, and if in transition, hashtables */
heap_conn /* connection buffer related overhead */

There are others that could be added, such as total slab size (which can currently be computed from slab_curr and the slab size), the suffix buffer for reply messages, etc. It would be nice to come up with a more comprehensive component list, but they probably aren't as important as the ones above.

maximum memory of twemcache

Hi,
I found the service may exceed its maximum memory with "set" then "get":

twemcache -d -m 20

Test script:

import memcache

mc = memcache.Client(['localhost:11211'])
i = 0
while True:
    i += 1
    key = value = str(i)
    mc.set(key, value)
    mc.get(key)

The memory size keeps increasing past the maximum. Is it an issue with some internal data structure?

thanks

Typo in documentation

On documentation page https://github.com/twitter/twemcache it is written:

"No eviction (0) ...
Item LRU eviction (1) ...
Random eviction (2) ...
Slab LRA eviction (4) ...
Slab LRC eviction (8) ..."

and later:
"For example, -M 5 means that if slab LRU eviciton fails, Twemcache will try item LRU eviction".

But there is no slab LRU in a previous list. I suppose it means slab LRA.

Please change that abbreviation accordingly.

Best regards,
Maxim

Can't install ,please help!

I downloaded the zip file and unzipped it.

Then I tried to install it with the command ./configure,

but it says the configure file is not found.

memory leak

It seems that there is still a memory leak.
I ran Twemcache under Valgrind and submitted many small objects (a sequence of set/get operations on an object with key "i" and a random value, for i = 1, 2, 3, ...).
The memory keeps growing.
The output of Valgrind says that, when a suffix is created, the sequence of calls is
asc_respond_get (mc_ascii.c) -> asc_create_suffix (mc_ascii.c) -> cache_alloc (mc_cache.c)
but the allocated memory is not properly freed or managed (blocks are definitely lost).

Could you please check?

All threads spike to 100% CPU

Strace on one of the threads didn't give much information besides confirming that it was in some kind of loop.

epoll_wait(13, {{EPOLLIN, {u32=44, u64=44}}}, 32, 4294967295) = 1
epoll_wait(13, {{EPOLLIN, {u32=44, u64=44}}}, 32, 4294967295) = 1
epoll_wait(13, {{EPOLLIN, {u32=44, u64=44}}}, 32, 4294967295) = 1
epoll_wait(13, {{EPOLLIN, {u32=44, u64=44}}}, 32, 4294967295) = 1

Process is still responsive to requests and only way to fix has been to restart. It was running across a few hundred machines and was happening about twice a day.

Has anybody else experienced this issue?

memcached.org changes?

Hi,

I apologize if I'm grumpy at all, as this sort of thing is a recurring theme for us over at memcached central.

Have you folks seen the performance and slab balancing changes that went in six months ago? From what I can see, most of the other changes on your end are around stats and logging, which are pretty trivial.

In my own testing (with mc-crusher) I wasn't able to get the stats locks (as of 1.4.13) to cause any contention (most of the stats were split into per-thread locks ages ago). This is probably untrue with a NUMA machine, and I would be curious to know how you caused those issues.

Thanks!

mc_cache problem

mc_cache.c, line 61
Based on the other functions, I think it should be:
ptr = mc_calloc(initial_pool_size, sizeof(char *));

Add Garbage Collector

I see that you've invested a lot of time in stackable Eviction Strategies.
That's good for me, because it means you've already made the first step: admitting that the default LRU is not as good as it could be.

Perhaps you would like to incorporate this changeset:
https://groups.google.com/forum/?fromgroups#!topic/memcached/MdNPv0oxhO8
which makes sure that expired items are never in memory, and does so in O(1).
We have been using it at nk.pl for years and it works great (evictions dropped to 0, and monitoring memory consumption provides more information now -- also, slabs now have a chance to become empty and be disposed of).

The only drawback I can see with it is the additional O(1) memory per item for the doubly linked list pointers. I believe Twitter hires some tough hackers who could make this number smaller (how about the trick with XORed pointers?).
