Code Monkey home page Code Monkey logo

Comments (6)

cole-miller avatar cole-miller commented on July 22, 2024 2

It seems pretty clear to me that the description of Raft in the dissertation envisions that

  1. only committed + applied entries are incorporated into a snapshot
  2. the InstallSnapshot RPC includes the "latest configuration as of lastIndex", which because of (1) can only be a committed configuration

So if we're storing uncommitted configurations in snapshots at all, that seems like a deviation from the algorithm as presented.

from raft.

as58545652 avatar as58545652 commented on July 22, 2024

One possible solution would be to take a snapshot only if there is no uncommitted configuration, and to restore the configuration from the snapshot if there is no configuration entry in the log.

from raft.

MathieuBordere avatar MathieuBordere commented on July 22, 2024

Thanks for this, will investigate.

from raft.

as58545652 avatar as58545652 commented on July 22, 2024

raft/src/start.c

Lines 16 to 87 in cbb9cd7

static int restoreMostRecentConfiguration(struct raft *r,
struct raft_entry *entry,
raft_index index)
{
struct raft_configuration configuration;
int rv;
raft_configuration_init(&configuration);
rv = configurationDecode(&entry->buf, &configuration);
if (rv != 0) {
raft_configuration_close(&configuration);
return rv;
}
configurationTrace(r, &configuration, "restore most recent configuration");
raft_configuration_close(&r->configuration);
r->configuration = configuration;
r->configuration_index = index;
return 0;
}
/* Restore the entries that were loaded from persistent storage. The most recent
* configuration entry will be restored as well, if any.
*
* Note that we don't care whether the most recent configuration entry was
* actually committed or not. We don't allow more than one pending uncommitted
* configuration change at a time, plus
*
* when adding or removing just a single server, it is safe to switch directly
* to the new configuration.
*
* and
*
* The new configuration takes effect on each server as soon as it is added to
* that server's log: the C_new entry is replicated to the C_new servers, and
* a majority of the new configuration is used to determine the C_new entry's
* commitment. This means that servers do notwait for configuration entries to
* be committed, and each server always uses the latest configuration found in
* its log.
*
* as explained in section 4.1.
*
* TODO: we should probably set configuration_uncommitted_index as well, since we
* can't be sure a configuration change has been committed and we need to be
* ready to roll back to the last committed configuration.
*/
static int restoreEntries(struct raft *r,
raft_index snapshot_index,
raft_term snapshot_term,
raft_index start_index,
struct raft_entry *entries,
size_t n)
{
struct raft_entry *conf = NULL;
raft_index conf_index = 0;
size_t i;
int rv;
logStart(&r->log, snapshot_index, snapshot_term, start_index);
r->last_stored = start_index - 1;
for (i = 0; i < n; i++) {
struct raft_entry *entry = &entries[i];
rv = logAppend(&r->log, entry->term, entry->type, &entry->buf,
entry->batch);
if (rv != 0) {
goto err;
}
r->last_stored++;
if (entry->type == RAFT_CHANGE) {
conf = entry;
conf_index = r->last_stored;
}
}
if (conf != NULL) {
rv = restoreMostRecentConfiguration(r, conf, conf_index);

The most recent configuration might not be a committed one, which would probably lead to another membership rollback issue.

from raft.

MathieuBordere avatar MathieuBordere commented on July 22, 2024

One possible solution would be to take a snapshot only if there is no uncommitted configuration, and to restore the configuration from the snapshot if there is no configuration entry in the log.

There are degenerate cases where not taking a snapshot while there's an uncommitted configuration could lead to never taking a snapshot anymore.

There's this fragment in the raft dissertation in Chapter 5

Raft also retains the latest configuration from the discarded log prefix in order to support cluster
membership changes.

In think the straightforward solution is to save the last known configuration when taking a snapshot so that it can be restored.

from raft.

as58545652 avatar as58545652 commented on July 22, 2024

One possible solution would be to take a snapshot only if there is no uncommitted configuration, and to restore the configuration from the snapshot if there is no configuration entry in the log.

There are degenerate cases where not taking a snapshot while there's an uncommitted configuration could lead to never taking a snapshot anymore.

There's this fragment in the raft dissertation in Chapter 5

Raft also retains the latest configuration from the discarded log prefix in order to support cluster
membership changes.

In think the straightforward solution is to save the last known configuration when taking a snapshot so that it can be restored.

Yeah, I think you might be right. We just need to save the previous configuration if there is an uncommitted one when taking a snapshot. But I have to say that an uncommitted configuration could still stand in the way because a snapshot can never be taken beyond any uncommitted index.

from raft.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.