Data Replication in Distributed Systems - Multileader Replication

If you answer yes to any of the questions below, you have landed on the right article. Tired of that TL;DR feeling when going through huge articles with tons of information? Ever missed important information because of verbose explanations? Want to focus on what matters and skip the self-explanatory parts? Ugh, so much to read; can someone show me an illustration instead?

Note: this continues from Single Leader Replication.

Multileader Replication

Basically, more than one node accepts writes. The replication process itself is still the same, but instead of followers getting updates from a single leader, each leader forwards its writes to every other leader (this is also called active/active or master-master replication). Each node therefore acts as a leader for its own writes and as a follower of the other leaders.

scan 2022-07-17 19.10.19n_6.jpg

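If it is so good, how can a conflict occur? Two leaders can each accept a write to the same record before either has seen the other's change. See the example below.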

scan 2022-07-17 19.10.19n_7.jpg
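
The same situation can be sketched in a few lines of Python. This is only a toy model, not how any real database does it: the `Leader` class, its `outbox`, and the `"title"` key are all made up for illustration, and replication is assumed to be asynchronous with no conflict handling at all.

```python
class Leader:
    """A toy leader that accepts local writes and queues them for its peer."""

    def __init__(self, name):
        self.name = name
        self.data = {}      # key -> value
        self.outbox = []    # writes not yet forwarded to the other leader

    def write(self, key, value):
        # A client write is applied locally right away and replicated later.
        self.data[key] = value
        self.outbox.append((key, value))

    def replicate_to(self, other):
        # Asynchronous replication: forward queued writes to the other leader.
        # Writing straight into other.data means replicated writes are not
        # forwarded again.
        for key, value in self.outbox:
            other.data[key] = value
        self.outbox.clear()


leader_a = Leader("A")
leader_b = Leader("B")

# Two clients update the same key on different leaders before any replication.
leader_a.write("title", "Draft v1")
leader_b.write("title", "Draft v2")

# Each leader now forwards its own write to the other one.
leader_a.replicate_to(leader_b)
leader_b.replicate_to(leader_a)

print(leader_a.data["title"])  # Draft v2
print(leader_b.data["title"])  # Draft v1
```

Both writes were accepted, yet after replication the two leaders simply swapped values and still disagree. That disagreement is the conflict.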

How do we resolve conflict?

Apologies, this part gets descriptive; there is no getting around it.

  1. Avoid (or try to avoid) conflicts altogether - send all writes for a specific account to the same leader every time, chosen by hashing the account. This can still fail when that leader goes down and writes have to be redirected to another leader.
  2. Converge the data to a consistent state - use this only when the data is not critical (unlike, say, financial transactions), because it usually ends in data loss. A few ways to do it:
    • Last Write wins
    • Leader precedence (always accept update from leader with precedence)
    • Merge writes (simple to write but will get messy)
  3. Record the conflicting writes somewhere else and apply custom conflict resolution, e.g.
    • On write - a write handler resolves the conflict using one of the strategies in 2
    • On read - prompt the user to choose or modify the conflicting values, like Git does
  4. Automatic conflict resolution (purposefully not elaborated)
    • Operational transformation
    • Conflict-free replicated data types (CRDTs)
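
To make options 1 and 2 a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the leader names, the `Write` record, and the idea that each write carries a wall-clock timestamp from the leader that accepted it. Real systems differ in the details.

```python
import hashlib
from dataclasses import dataclass

LEADERS = ["leader-a", "leader-b", "leader-c"]  # hypothetical leader names


def leader_for(account_id, leaders=LEADERS):
    """Conflict avoidance: always route one account to the same leader."""
    # Use a stable hash; Python's built-in hash() of a str changes per process.
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return leaders[int.from_bytes(digest[:8], "big") % len(leaders)]


@dataclass(frozen=True)
class Write:
    value: str
    timestamp: float  # assumed: wall-clock time set by the accepting leader
    leader: str


def last_write_wins(a, b):
    """Keep the write with the higher timestamp; the other one is dropped."""
    # Ties are broken by leader name so that every node picks the same winner.
    return max(a, b, key=lambda w: (w.timestamp, w.leader))


def leader_precedence(a, b, order=LEADERS):
    """Keep the write from the leader that appears earlier in `order`."""
    return min(a, b, key=lambda w: order.index(w.leader))


w1 = Write("Draft v1", timestamp=100.0, leader="leader-b")
w2 = Write("Draft v2", timestamp=101.5, leader="leader-c")

print(leader_for("account-42") == leader_for("account-42"))  # True
print(last_write_wins(w1, w2).value)    # Draft v2
print(leader_precedence(w1, w2).value)  # Draft v1
```

Notice that both convergence functions return exactly one of the two writes, so the other is silently lost, which is the data loss mentioned in 2. Also, `leader_for` depends on the list of leaders: once a leader drops out of that list, accounts get routed elsewhere, which is the failure case from 1.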

Are there topologies for nodes to lead/follow?

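Of the three topologies listed below, the first two have a big drawback: a single faulty node can interrupt the flow of replication (any node in a circle, the central node in a star). The third one avoids that single point of failure, but replicating to every node means some links will be slower than others, so writes can reach some nodes late or out of order.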

  1. Circular
  2. Star shaped
  3. All to all
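
The fault-tolerance difference between the three can be checked with a small Python sketch. The builder functions, the node names, and the `reachable` helper are all hypothetical; the sketch only models who forwards writes to whom and ignores latency entirely.

```python
def circular(nodes):
    """Each node forwards writes to the next node in the ring."""
    return {n: [nodes[(i + 1) % len(nodes)]] for i, n in enumerate(nodes)}


def star(nodes):
    """The first node is the hub; every other node talks only to the hub."""
    hub, others = nodes[0], nodes[1:]
    topology = {n: [hub] for n in others}
    topology[hub] = list(others)
    return topology


def all_to_all(nodes):
    """Every node forwards writes directly to every other node."""
    return {n: [m for m in nodes if m != n] for n in nodes}


def reachable(topology, origin, failed=frozenset()):
    """Nodes that eventually receive a write accepted at `origin`."""
    seen, frontier = {origin}, [origin]
    while frontier:
        node = frontier.pop()
        for peer in topology[node]:
            if peer not in seen and peer not in failed:
                seen.add(peer)
                frontier.append(peer)
    return seen


nodes = ["n1", "n2", "n3", "n4"]
for build in (circular, star, all_to_all):
    # A write is accepted at n4 while n1 (the hub, in the star) is down.
    print(build.__name__, sorted(reachable(build(nodes), "n4", failed={"n1"})))
# circular ['n4']
# star ['n4']
# all_to_all ['n2', 'n3', 'n4']
```

With `n1` down, the write never leaves `n4` in the circular and star layouts, while all-to-all still delivers it to every healthy node. In the star only the hub is this critical; losing any other node there cuts off just that node.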
