
Comments (3)

tgross commented on August 20, 2024

I'll keep this in mind. Right now we're missing any kind of modeling that outlines what the consistency guarantees actually are and then tells us whether our implementation of those guarantees is correct. I've looked into semi-synchronous replication and it's not clear to me yet how it affects consistency guarantees; on a first pass it looks like it only gives you a bit more reliability on the (extremely minimal) consistency guarantees that async replication gives you, at the cost of some availability.

from mysql.

misterbisson commented on August 20, 2024

There's no expectation of immediate action, I just wanted to log the detail.

The consistency guarantee appears much better than async replication:

"The master waits only until at least one slave has received and logged the events. It does not wait for all slaves to acknowledge receipt, and it requires only receipt, not that the events have been fully executed and committed on the slave side."

This appears to promise that it requires a failure in both the primary and the semisync replica for data loss to occur. That's a lot better than async replication.

However, the failure mode favors write availability over multi-host consistency:

"If a timeout occurs without any slave having acknowledged the transaction, the master reverts to asynchronous replication. When at least one semisynchronous slave catches up, the master returns to semisynchronous replication."

Both quotes are from https://dev.mysql.com/doc/refman/5.7/en/replication-semisync.html
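
To make the two quoted behaviors concrete, here is a toy model in Python. It is only a sketch of the documented description, not MySQL's implementation: the `Primary` and `Replica` classes are hypothetical, and an unreachable replica stands in for the acknowledgement timeout expiring.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    reachable: bool = True
    relay_log: list = field(default_factory=list)

    def receive(self, binlog):
        """Copy any missing events into the relay log and acknowledge receipt.

        Returns False (no ack) when the replica is unreachable, which stands in
        for the primary's acknowledgement timeout expiring.
        """
        if not self.reachable:
            return False
        self.relay_log.extend(binlog[len(self.relay_log):])
        return True


@dataclass
class Primary:
    replicas: list
    binlog: list = field(default_factory=list)
    mode: str = "semisync"

    def commit(self, event):
        """Commit an event and return the replication mode in effect afterwards."""
        self.binlog.append(event)
        # Build a list (not a generator) so every reachable replica receives
        # the event, not just the first one that acknowledges.
        acks = [replica.receive(self.binlog) for replica in self.replicas]
        # One ack is enough to stay semisync; zero acks degrades to async.
        self.mode = "semisync" if any(acks) else "async"
        return self.mode
```

In this model a commit made while every replica is unreachable still succeeds on the primary, which is the availability-over-consistency trade-off: the client gets an acknowledgement for a write that exists on only one host.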


tgross commented on August 20, 2024

"This appears to promise that it requires a failure in both the primary and the semisync replica for data loss to occur. That's a lot better than async replication."

It is better in terms of data loss (i.e. client-acknowledged writes are less likely to have been lost) but I'm not as certain that it's better from the standpoint of consistency in the face of implicit non-fatal failures. My primary concern with this mode of operation comes from this section of the docs:

"It does not wait for all slaves to acknowledge receipt, and it requires only receipt, not that the events have been fully executed and committed on the slave side."

Replicas only acknowledge receipt, not a completed write. Hypothetically that isn't any worse than async replication (where we don't even ack receipt), but the docs don't explicitly describe what happens in this scenario, or how a replica catches up to the primary if it has dropped a write whose receipt it acknowledged. It could very well be, and I'd expect it to be, that a replica that has a temporary netsplit just catches back up using the last GTID. If not, it's possible to get not just lost data but inconsistent data, which is much worse. But this doesn't appear in the docs, so I want to make sure we genuinely understand the behavior.
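
As a sketch of the catch-up behavior I'd expect (an assumption to verify, not something the semisync page states), a replica that tracks the GTIDs it has recorded could ask for exactly the transactions it is missing. The function name and the `"uuid:N"` GTID strings below are hypothetical, for illustration only.

```python
def missing_transactions(primary_binlog, replica_gtids):
    """Return binlog entries whose GTID the replica has not recorded.

    primary_binlog: list of (gtid, payload) tuples in commit order.
    replica_gtids: iterable of GTIDs the replica already has.
    The result preserves the primary's commit order.
    """
    seen = set(replica_gtids)
    return [(gtid, payload) for gtid, payload in primary_binlog
            if gtid not in seen]
```

If catch-up works along these lines, a dropped write shows up as a gap in the replica's GTID set and gets re-sent, so the replica is merely behind rather than inconsistent. Whether that holds for a write whose receipt was already acknowledged is the part that needs confirming.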

"If a timeout occurs without any slave having acknowledged the transaction, the master reverts to asynchronous replication. When at least one semisynchronous slave catches up, the master returns to semisynchronous replication."

This implicit degradation of behavior seems potentially dangerous. My overall feeling about this feature is that semi-synchronous replication is going to encourage application developers to try to read their writes from the replicas, which is incorrect. Semi-synchronous seems like a bad compromise between async and sync.
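
To illustrate why reading your own writes from a replica is incorrect here, a minimal Python sketch follows. The `LaggingReplica` class and `semisync_write` function are hypothetical; the only behavior taken from the docs is that the replica acknowledges on receipt, before the event is executed and committed.

```python
class LaggingReplica:
    def __init__(self):
        self.relay_log = []  # events received and acknowledged, not yet applied
        self.data = {}       # state visible to readers

    def receive(self, key, value):
        """Log the event and acknowledge receipt without applying it."""
        self.relay_log.append((key, value))
        return True

    def apply_pending(self):
        """Apply logged events; in a real replica this happens asynchronously."""
        for key, value in self.relay_log:
            self.data[key] = value
        self.relay_log.clear()

    def read(self, key):
        return self.data.get(key)


def semisync_write(primary_data, replica, key, value):
    """Write on the primary and return once the replica has acked receipt."""
    primary_data[key] = value
    return replica.receive(key, value)
```

Between `semisync_write` returning and `apply_pending` running, `replica.read` returns the old value even though the client's write was acknowledged, so the ack says nothing about what a replica read will see.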
