A Raft Implementation for Erlang and Elixir

Ra is a Raft implementation by Team RabbitMQ. It is not tied to RabbitMQ and can be used in any Erlang or Elixir project. It is, however, heavily inspired by and geared towards RabbitMQ needs.

Ra (by virtue of being a Raft implementation) is a library that allows users to implement persistent, fault-tolerant and replicated state machines.

Project Maturity

This library has been extensively tested and is suitable for production use. This means the primary APIs (ra, ra_machine modules) and on-disk formats will remain backwards-compatible going forward, in line with Semantic Versioning. Care has been taken to version all on-disk data formats to enable frictionless future upgrades.

Status

The following Raft features are implemented:

  • Leader election
  • Log replication
  • Cluster membership changes: one server (member) at a time
  • Log compaction (with limitations and RabbitMQ-specific extensions)
  • Snapshot installation

Supported Erlang/OTP Versions

Ra requires Erlang/OTP 21.3+. Erlang 22+ is highly recommended because it supports fragmentation of distribution traffic.

Quick start

%% First we have to start the Ra application
ra:start(),

%% All servers in a Ra cluster are named processes.
%% Create some Server Ids to pass to the configuration
ErlangNodes = [ra@node1, ra@node2, ra@node3],
ServerIds = [{quick_start, N} || N <- ErlangNodes],

%% start a simple distributed addition state machine with an initial state of 0
ClusterName = quick_start,
{ok, ServersStarted, ServersNotStarted} = ra:start_cluster(ClusterName, {simple, fun erlang:'+'/2, 0}, ServerIds),

%% Add a number to the state machine
%% Simple state machines always return the full state after each operation
{ok, StateMachineResult, LeaderId} = ra:process_command(hd(ServersStarted), 5),

%% use the leader id from the last command result for the next
{ok, 12, LeaderId1} = ra:process_command(LeaderId, 7).

"Simple" state machines like the above can only take you so far. See Ra state machine tutorial for how to write a state machine by implementing the ra_machine behaviour.
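As a rough illustration of what a ra_machine implementation can look like, here is a minimal sketch of a counter machine. The module name counter_machine and the {add, N} command shape are made up for this example; the callback shapes (init/1 returning the initial state, apply/3 returning {NewState, Reply}) are assumed to match the ra_machine behaviour as described in the state machine tutorial, so check them against the version of Ra you use.

```erlang
%% Hypothetical example module: a replicated integer counter.
-module(counter_machine).
-behaviour(ra_machine).

-export([init/1, apply/3]).

%% Returns the initial machine state. The argument is the
%% configuration passed in when the machine is set up.
init(_Config) ->
    0.

%% Invoked for each committed command. It must be deterministic,
%% because every member applies the same commands in the same order.
%% Returns {NewState, Reply}; Reply is what the caller receives.
apply(_Meta, {add, N}, Count) when is_integer(N) ->
    NewCount = Count + N,
    {NewCount, NewCount};
apply(_Meta, _UnknownCommand, Count) ->
    %% Unknown commands leave the state untouched.
    {Count, {error, unknown_command}}.
```

Assuming the machine is registered as {module, counter_machine, #{}} in place of the {simple, ...} tuple in the quick start, commands would then be sent as ra:process_command(ServerId, {add, 5}).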

Design Goals

  • Low footprint: use as few resources as possible, avoid process tree explosion
  • Able to run thousands of ra clusters within an Erlang node
  • Provide adequate performance for use as a basis for a distributed data service

Use Cases

This library is primarily developed as the foundation of a replication layer for replicated queues in a future version of RabbitMQ. The design it aims to replace uses a variant of Chain Based Replication, which has two major shortcomings:

  • Replication algorithm is linear
  • Failure recovery procedure requires expensive topology changes

Documentation

Examples

A number of examples can be found in a separate repository.

Configuration

  • data_dir:

A directory name where ra will store its data.

  • wal_data_dir:

A directory name where ra will store its WAL (Write Ahead Log) data. If unspecified, data_dir is used.

  • wal_max_size_bytes:

The maximum size of the WAL in bytes. Default: 512 MB.

  • wal_compute_checksums:

Indicates whether the WAL should compute and validate checksums. Default: true.

  • wal_write_strategy:

    • default:

    The default. Actual write(2) system calls are delayed until a buffer is due to be flushed. All buffered data is then written in a single call, followed by an fsync. Fastest, but incurs some additional memory use.

    • o_sync:

    Like default, but tries to open the file with O_SYNC and thus won't need the additional fsync(2) system call. If the file cannot be opened with this flag, this mode falls back to default.

  • wal_sync_method:

    • datasync:

    The default. Uses the fdatasync system call after each batch. This avoids flushing file metadata after each write batch and thus may be slightly faster than sync on some systems. When datasync is configured, the WAL will try to preallocate the entire WAL file. NB: not all systems support fdatasync. Please consult your system documentation and configure sync instead if fdatasync is not supported.

    • sync:

    Uses the fsync system call after each batch.

  • wal_max_batch_size:

Controls the internal max batch size that the WAL will accept. Higher numbers may result in higher memory use. Default: 32768.

  • logger_module:

Allows the configuration of a custom logger module. The default is logger. The module must implement a function with the same signature as logger:log/4 (the variant that takes a format, not the variant that takes a fun).

  • metrics_key:

The key used to write metrics into the ra_metrics table.

  • low_priority_commands_flush_size:

When commands are pipelined using the low priority mode, Ra tries to hold them back in favour of normal priority commands. This setting determines the number of low priority commands that are added to the log in each flush cycle. Default: 25.

[{data_dir, "/tmp/ra-data"},
 {wal_max_size_bytes, 134217728},
 {wal_compute_checksums, true},
 {wal_write_strategy, default}
]
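One way to supply these settings, sketched here on the assumption that Ra reads them from the standard OTP application environment under the ra application key, is a sys.config file loaded at node start. The path and values are illustrative only, and the wal_sync_method entry is added to show one of the options described above.

```erlang
%% sys.config (sketch): Ra settings placed under the `ra` application key.
%% Assumption: Ra picks these up from the application environment.
[
 {ra, [
   {data_dir, "/tmp/ra-data"},
   {wal_max_size_bytes, 134217728},   %% 128 * 1024 * 1024 bytes
   {wal_compute_checksums, true},
   {wal_write_strategy, default},
   {wal_sync_method, datasync}        %% use sync where fdatasync is unsupported
 ]}
].
```

Note that Erlang terms do not allow a trailing comma before a closing bracket, and a sys.config file must end with a full stop.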

Copyright and License

(c) 2017-2020, VMware Inc or its affiliates.

Dual-licensed under the ASL2 and MPL1.1. See LICENSE for details.
