mit-6.824's Issues

Concurrency inconsistency with unreliable RPC calls in MIT 6.824 Lab2 (Part-B)

Bug description

Consider the case where one client emits several concurrent Put/Append requests to the primary/backup pair (S1/S2). The primary and the backup each check whether the uuid of a request has already been processed locally.
The basic logic is as follows:

// Client
for {
  ok := RPC(primary, op)
  if ok {
    return
  } else {
    update(&primary)
  }
}

// Primary
mutex.Lock()
defer mutex.Unlock()
if op.uuid exists in seen {
  return value
} else {
  ok := B.Remote(op)
  if ok {
    Local(op)
    seen[op.uuid] = true
  } else {
    return err
  }
}

// Backup
mutex.Lock()
defer mutex.Unlock()
if op.uuid exists in seen {
  return value
} else {
  Local(op)
  seen[op.uuid] = true
}
Case 1:
  req1: Put <1,54>
  req2: Put <1,53>

  action                        | status after action
  ------------------------------+---------------------
  req1 at S2: succ, reply lost  | S1<1,nil>  S2<1,54>
  req2 at S2: succ              | S1<1,nil>  S2<1,53>
  req2 at S1: succ              | S1<1,53>   S2<1,53>
  req1 at S2: no action (seen)  | S1<1,53>   S2<1,53>
  req1 at S1: applies <1,54>    | S1<1,54>   S2<1,53>
Case 2:
  req1: Append <1, x>
  req2: Append <1, y>

  action                        | status after action
  ------------------------------+----------------------
  req1 at S2: succ, reply lost  | S1<1,nil>  S2<1,x>
  req2 at S2: succ              | S1<1,nil>  S2<1,xy>
  req2 at S1: succ              | S1<1,y>    S2<1,xy>
  req1 at S2: no action (seen)  | S1<1,y>    S2<1,xy>
  req1 at S1: applies <1,x>     | S1<1,yx>   S2<1,xy>

So in both cases the primary and the backup end up with different values for key 1. If the primary goes down and the backup is promoted to primary, GET requests issued before and after the failover return different values. The root cause is that a Put/Append op is not applied to the primary and the backup as one atomic unit, so the retried executions of one op can be interleaved with another Put/Append op.

Solutions research

  1. If only the PUT op is supported: since PUT is idempotent, the backup has no need to check whether an op was executed previously. It simply assigns the value to the key and records the uuid in the seen map, so that duplicate requests can be filtered out if it is promoted to primary.
    This solution breaks once APPEND ops are involved. We can instead translate each APPEND into a PUT on the primary side, although this costs more network throughput. Case 2 then becomes the following.
    req1: Append <1, x>
    req2: Append <1, y>

  action                        | translated op | status after action
  ------------------------------+---------------+----------------------
  req1 at S2: succ, reply lost  | Put<1, x>     | S1<1,nil>  S2<1,x>
  req2 at S2: succ              | Put<1, y>     | S1<1,nil>  S2<1,y>
  req2 at S1: succ              | Put<1, y>     | S1<1,y>    S2<1,y>
  req1 at S2: succ, reply succ  | Put<1, yx>    | S1<1,y>    S2<1,yx>
  req1 at S1: applies           | Put<1, yx>    | S1<1,yx>   S2<1,yx>
  2. Versions. We can use versions to keep the operations on the primary and the backup in lockstep, i.e. with the semantics that the backup always holds the same version as the primary. Like a CAS operation, an op forwarded to the backup with a version is applied only on top of that same version. We can maintain the invariant that the version of a key on the backup equals the version on the primary, or is exactly one ahead of it.
// Client
for {
  ok := RPC(primary, op)
  if ok {
    return
  } else {
    update(&primary)
  }
}

// Primary
seen map[string]bool
versions map[string]int64
last_dataset map[string]{Seq:int64, Value:string}

PutAppend(op) {
  mutex.Lock()
  defer mutex.Unlock()
  if op.uuid exists in seen {
    return value
  }

  key := op.key
  version := versions[key]
  if backupHost != "" {
    var reply ForwardReply
    ok := B.Remote(op, version, &reply)
    if ok {
      versions[key] = reply.version
      last_dataset[key] = dataset[key]
      Local(op)
      seen[op.uuid] = true
    } else {
      return err
    }
  } else {
    versions[key] = version + 1
    last_dataset[key] = dataset[key]
    Local(op)
    seen[op.uuid] = true
  }
}

// Backup
seen map[string]bool
versions map[string]int64
last_dataset map[string]{Seq:int64, Value:string}

BackupOp(op, version) {
  mutex.Lock()
  defer mutex.Unlock()
  key := op.key
  backup_version := versions[key]
  // assert(backup_version == version or backup_version == version + 1)
  if version == 0 {
    // reset to the null value
    dataset[key] = ""
  } else if backup_version == version + 1 {
    // the previous forwarded op was never acknowledged: roll back
    dataset[key] = last_dataset[key]
  }
  last_dataset[key] = dataset[key]
  versions[key] = version + 1
  LocalOp(op)
  seen[op.uuid] = true
}

If the primary goes down and the backup is promoted to primary, a previously failed request (one whose reply was lost) cannot be re-executed on the new primary (the former backup), because it has already been tagged as processed there. So we delete that tag when we find that the value was overridden on the backup. The new version of the backup part is as follows.

// Backup
seen map[string]bool
versions map[string]int64
last_dataset map[string]{Seq:int64, Value:string}

BackupOp(op, version) {
  mutex.Lock()
  defer mutex.Unlock()
  key := op.key
  backup_version := versions[key]
  // assert(backup_version == version or backup_version == version + 1)
  if version == 0 {
    // reset to the null value
    dataset[key] = ""
  } else if backup_version == version + 1 {
    // the previous forwarded op was never acknowledged: roll back and
    // allow that op's uuid to be executed again
    wait_ack_uuid_set.add(last_procede_uuid[key])
    dataset[key] = last_dataset[key].Value
    delete(seen, last_dataset[key].Seq)
  }
  last_dataset[key] = {Seq: op.uuid, Value: dataset[key]}
  versions[key] = version + 1
  LocalOp(op)
  seen[op.uuid] = true
}

This also has bugs. Note that if the connection P->B is interrupted but the backup still receives the RPC request, it does not follow that the primary observes the error only after the RPC call has finished. The primary may see the error at the moment the net.UnixConn.File() is closed by syscall.Shutdown(), so the forwarded call itself may execute at an arbitrary later moment. An old RPC request can thus be executed on the backup after a newer request, causing the newer value to be overwritten by the old one.

GET op consistency issue in Lab 2 (Part-B)

The discussion below keeps the assumption that neither the primary nor the backup crashes.
Suppose the local op on the primary is not executed until the remote op on the backup returns OK. If a GET op can be served by the backup, it can happen that a Put/Append op has already modified the backup while the lost RPC reply leaves the primary unmodified. The results of a GET from the two sides then differ temporarily, which violates strong consistency, although eventual consistency still holds.
If we want strong consistency, we probably need to keep more state and more operations, such as a write cache and a commit operation.

How can this be done in practice?
