Data Replication in Distributed Systems - Leaderless Replication

Data Replication in Distributed Systems - Leaderless Replication

·

2 min read

If you ever answer yes to any of the below, you might have ended on right article. Tired of TL;DR feeling going through huge articles with ton of information? Ever miss out important information due to verbose explanations? Want to focus on what matters and skip self elaborative info? Ugh, so much to read, can someone show me an illustration instead?

Note: this is continued from Multileader Replication

Leaderless Replication (Quorum)

Nodes in the leaderless setting are considered peers and all of them accept writes and reads from the client. Without a leader that handles all write requests, leaderless replication offers better availability. Any follower can become leader if leader goes down. scan 2022-07-17 19.10.19n_8.jpg

Quorum configuration is defining rules for complete operations. e.g. 2 out of 3 nodes should sent ack to client for read/write request to mark it complete W-write R-read T-total nodes

How do I define quorum configuration

This is the most challenging aspect of this type of replication but there are few ways to help out.

  1. Strict quorum
    • requires ack from number of nodes equal to W+R>T
    • system still works when min(T-W, T-R) nodes are down
    • this offers better consistency because it makes sure the data is written to more nodes
  2. Sloppy quorum and hinted handoff
    • Suppose N nodes serve different partitions of the database, each row of data replicated on T nodes. If too many T nodes are down leading to insufficient quorum, N-T nodes can accept write requests (handoff) temporarily until T nodes are back online
    • this offers higher availability but can result in stale reads as it doesn't enforce W+R>T

If this confused you rather than helping, you can try

  • if system has balanced read/write, we can set W/R=T/2+1
  • if system is read heavy, we can set lower R and higher W

This sounds better but does it have any drawbacks?

Yes, there is no one shoe fits all so everything has some drawbacks. Some of those are

  1. Illusion of strong consistency - even with strict quorum, system can’t guarantee strong consistency due to unstable network
  2. Tricky failure handling - suppose all nodes received request but acks are delayed, request will be marked failed but client can see dirty write

Did you find this article valuable?

Support Write what you know by becoming a sponsor. Any amount is appreciated!