Leader Election and Term Numbers: Preventing Split-Brain

When a swarm of agents needs one voice to speak for all, Arbiter runs secure elections. But network partitions can cause two agents to both believe they're leader—a split-brain scenario that corrupts state.

Arbiter prevents this with term numbers: strictly increasing counters recorded on-chain, where the on-chain value is authoritative.

The Split-Brain Problem

Consider this scenario:

  1. Agent A is leader, communicating with most of the swarm
  2. Network partition separates Agent A from Agent B and the rest of the swarm
  3. Agent B can't see Agent A's heartbeats
  4. Agent B triggers an election and wins (it can see the other agents)
  5. Now both Agent A and Agent B think they're leader

Result: two leaders making conflicting decisions. State corruption. Chaos.

How Arbiter Prevents Split-Brain

Arbiter implements Raft-style leader election with term numbers:

  1. Any swarm member can trigger an election if the current leader's heartbeat times out
  2. Agents vote for candidates using EIP-712 signed messages
  3. First candidate to reach quorum becomes leader
  4. Leadership is recorded on-chain with a term number
  5. Leaders must periodically checkpoint on-chain to prove liveness
  6. Term numbers strictly increase, so no agent can claim leadership under an old term
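
The rules above can be sketched in a few lines of Python. This is only an illustration, not Arbiter's contract code: `LeadershipRecord` is a hypothetical in-memory stand-in for the on-chain record, and the simple-majority quorum is an assumption, since the quorum size isn't specified here.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class LeadershipRecord:
    """In-memory stand-in for the on-chain leadership record (hypothetical)."""
    term: int = 0
    leader: Optional[str] = None

    def record_leader(self, candidate: str, term: int,
                      voters: Set[str], swarm_size: int) -> bool:
        """Record a new leader only for a newer term backed by a quorum."""
        quorum = swarm_size // 2 + 1  # assumption: simple majority
        if term <= self.term:
            return False  # old or repeated term: claim rejected
        if len(voters) < quorum:
            return False  # not enough distinct voters
        self.term, self.leader = term, candidate
        return True


record = LeadershipRecord()
print(record.record_leader("A", 1, {"A", "C", "D"}, swarm_size=5))  # True
print(record.record_leader("B", 2, {"B", "C", "D"}, swarm_size=5))  # True
print(record.record_leader("A", 1, {"A", "C", "D"}, swarm_size=5))  # False
```

The last call fails because term 1 is no longer newer than the recorded term, which is the property that keeps an old leader from reclaiming its seat.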

Term Numbers Are Authoritative

The term number is critical. It's recorded on-chain via the ArbiterFinality contract. This makes it the source of truth.

If network partitions cause two agents to both believe they're leader:

  • Both agents compare their own term with the term recorded on-chain
  • The agent whose term matches the on-chain (highest) term is the legitimate leader
  • The agent holding a lower term must immediately step down

On-chain state resolves the dispute. No ambiguity.
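
A minimal sketch of that step-down rule, assuming each agent tracks its own term and a leader flag. The `Agent` class and `sync_with_chain` method are hypothetical names used for illustration, and the on-chain term is passed in as a plain integer rather than read from a contract.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    term: int
    is_leader: bool

    def sync_with_chain(self, onchain_term: int) -> None:
        """Adopt the on-chain term; step down if our own term is older."""
        if self.term < onchain_term:
            self.is_leader = False
            self.term = onchain_term


# Both agents believe they lead; the chain has recorded term 2.
onchain_term = 2
a = Agent("A", term=1, is_leader=True)
b = Agent("B", term=2, is_leader=True)
for agent in (a, b):
    agent.sync_with_chain(onchain_term)
print(a.is_leader, b.is_leader)  # False True
```

Neither agent needs to reach the other to settle the dispute. Each one only needs to read the chain.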

Example Flow

Term 1: Agent A is leader, term 1 recorded on-chain

Network partition: Agent A separated from rest of swarm

Term 2: Agent B triggers election, wins, term 2 recorded on-chain

Partition heals: Agent A sees term 2 on-chain

Agent A steps down: Term 1 < Term 2, so Agent A is no longer leader

Split-brain prevented. On-chain term number is authoritative.

Periodic Checkpoints

Leaders must periodically checkpoint on-chain to prove liveness:

  • If a leader fails to checkpoint, other agents can trigger an election
  • This prevents "zombie leaders" that are unresponsive but still think they're leader
  • Each checkpoint carries the current term number, so it proves liveness for that specific term
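
One way to picture the liveness rule is as a small tracker. This is a sketch under stated assumptions: `LivenessTracker`, the 30-second timeout, and the rejection of checkpoints that carry a different term are all illustrative choices, not documented Arbiter behavior.

```python
from dataclasses import dataclass


@dataclass
class LivenessTracker:
    """Tracks the leader's most recent checkpoint (hypothetical helper)."""
    current_term: int
    leader: str
    last_checkpoint: float   # timestamp in seconds
    timeout_s: float = 30.0  # assumption: the real timeout is not specified

    def submit_checkpoint(self, sender: str, term: int, now: float) -> bool:
        """Accept a checkpoint only from the current leader for the current term."""
        if sender != self.leader or term != self.current_term:
            return False
        self.last_checkpoint = now
        return True

    def election_allowed(self, now: float) -> bool:
        """An election may be triggered once the leader has been silent too long."""
        return now - self.last_checkpoint > self.timeout_s


tracker = LivenessTracker(current_term=2, leader="B", last_checkpoint=0.0)
print(tracker.submit_checkpoint("B", 2, now=20.0))  # True
print(tracker.submit_checkpoint("A", 1, now=25.0))  # False
print(tracker.election_allowed(now=45.0))           # False
print(tracker.election_allowed(now=51.0))           # True
```

Timestamps are passed in explicitly so the behavior is easy to follow. A real agent would use block timestamps or its own clock.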

EIP-712 Signed Votes

Agents vote for leaders using EIP-712 typed signatures:

  • Votes are cryptographically signed
  • Votes can be verified on-chain if disputes arise
  • Votes are tied to specific term numbers
  • A vote signed for the wrong term is detected and rejected

Signatures prevent vote tampering, and the term binding ensures each vote counts only in the election it was cast for.
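
To make the term binding concrete, here is a sketch of what a vote could look like as EIP-712 typed data. The overall layout (`types`, `primaryType`, `domain`, `message`) and the domain fields follow the EIP-712 standard, but the `Vote` struct, its field names, the domain name "Arbiter", and the placeholder addresses are assumptions for illustration. Arbiter's actual schema may differ. Signing is left out, since that is handled by a wallet or signing library.

```python
from typing import Any, Dict


def build_vote_typed_data(voter: str, candidate: str, term: int,
                          chain_id: int, verifying_contract: str) -> Dict[str, Any]:
    """Build an EIP-712 typed-data payload for a vote (hypothetical schema)."""
    return {
        "types": {
            "EIP712Domain": [
                {"name": "name", "type": "string"},
                {"name": "version", "type": "string"},
                {"name": "chainId", "type": "uint256"},
                {"name": "verifyingContract", "type": "address"},
            ],
            # Assumed struct: the vote names a candidate and a term.
            "Vote": [
                {"name": "voter", "type": "address"},
                {"name": "candidate", "type": "address"},
                {"name": "term", "type": "uint256"},
            ],
        },
        "primaryType": "Vote",
        "domain": {
            "name": "Arbiter",  # assumed domain name
            "version": "1",
            "chainId": chain_id,
            "verifyingContract": verifying_contract,
        },
        "message": {"voter": voter, "candidate": candidate, "term": term},
    }


def vote_is_for_term(typed_data: Dict[str, Any], election_term: int) -> bool:
    """Reject votes whose signed term differs from the election being tallied."""
    return typed_data["message"]["term"] == election_term


# Placeholder addresses, not real accounts or contracts.
VOTER = "0x" + "11" * 20
CANDIDATE = "0x" + "22" * 20
CONTRACT = "0x" + "33" * 20

vote = build_vote_typed_data(VOTER, CANDIDATE, term=2,
                             chain_id=1, verifying_contract=CONTRACT)
print(vote_is_for_term(vote, 2))  # True
print(vote_is_for_term(vote, 3))  # False
```

Because the term is part of the signed message, a vote collected in one election cannot be replayed in a later one without invalidating the signature.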

Why This Matters

Term numbers enable:

  • Split-brain prevention: On-chain state resolves disputes
  • Leader liveness: Checkpoints prove leaders are active
  • Verifiable elections: Votes are signed and can be verified
  • Automatic recovery: Failed leaders are replaced automatically

Without term numbers, a partition can leave two leaders with no way to decide between them. With them, distributed agent systems can safely elect leaders.


Part of the EchoRift infrastructure series. Learn more about Arbiter.