Optimistic Locking: Consistent State Without Locks

Agent A reads current_position = 100 and decides to buy 50 more. Meanwhile, Agent B reads the same value and decides the same thing. Both write 150, so one update is silently lost: current_position ends at 150 instead of the correct 200.

This is the classic read-modify-write race condition. Switchboard solves it with optimistic locking: version numbers that detect conflicts without blocking operations.
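The interleaving can be reproduced with a plain in-memory variable. This is a minimal sketch that does not call Switchboard; `simulateLostUpdate` is a hypothetical helper written only for this illustration. Both agents read before either writes, so the stored value accounts for only one of the two +50 updates:

```typescript
// Illustrative in-memory simulation; nothing here calls Switchboard.
function simulateLostUpdate(): number {
  const state = { current_position: 100 };

  // Both agents read before either one writes.
  const readByA = state.current_position; // 100
  const readByB = state.current_position; // 100

  // Each agent adds 50 to the value it read and writes the result back.
  state.current_position = readByA + 50; // A writes 150
  state.current_position = readByB + 50; // B also writes 150, overwriting A's write

  return state.current_position;
}

console.log(simulateLostUpdate()); // 150: only one of the two +50 updates is recorded
```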

The Problem

Traditional locking blocks operations:

  • Agent A acquires lock
  • Agent B waits (blocked)
  • Agent A completes, releases lock
  • Agent B proceeds

This works, but it's slow. Every agent pays the cost of waiting, even when no other agent is touching the same state. In high-concurrency systems, that serialization becomes a bottleneck.
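The blocking flow above can be sketched with a small promise-based mutex. `Mutex`, `runExclusive`, `sleep`, and `demoMutex` are hypothetical helpers written for this illustration, not part of Switchboard:

```typescript
// Minimal promise-chain mutex (illustrative; not a Switchboard API).
class Mutex {
  private tail: Promise<void> = Promise.resolve();

  // Runs `task` only after every previously queued task has finished.
  runExclusive<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Keep the chain alive even if `task` rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function demoMutex(): Promise<string[]> {
  const lock = new Mutex();
  const log: string[] = [];

  const agent = (name: string) =>
    lock.runExclusive(async () => {
      log.push(`${name} acquires lock`);
      await sleep(10); // simulated work while holding the lock
      log.push(`${name} releases lock`);
    });

  // Agent B is queued behind Agent A, whether or not their work conflicts.
  await Promise.all([agent('A'), agent('B')]);
  return log;
}

demoMutex().then((log) => console.log(log.join(' -> ')));
// A acquires lock -> A releases lock -> B acquires lock -> B releases lock
```

Agent B's work starts only after Agent A releases the lock, which is exactly the wait that optimistic locking avoids when there is no real conflict.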

Optimistic Locking

Optimistic locking assumes conflicts are rare. Instead of blocking, it detects conflicts and retries:

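  1. Agent reads the state together with its version number
  2. Agent modifies the state locally
  3. Agent writes the new value along with the version it expects to find
  4. If the stored version still matches, the write succeeds and the version advances
  5. If the stored version has changed, the write fails and the agent retries from step 1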

No blocking. No waiting. Just fast operations with conflict detection.
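The steps above can be sketched with a small in-memory store. This illustrates the technique only and is not Switchboard's implementation; `VersionedStore`, `init`, `get`, `compareAndSet`, and `demoVersionedStore` are hypothetical names:

```typescript
// Hypothetical in-memory store illustrating version-checked writes.
interface Versioned<T> {
  value: T;
  version: number;
}

class VersionedStore<T> {
  private entries = new Map<string, Versioned<T>>();

  // Seed a key at version 1.
  init(key: string, value: T): void {
    this.entries.set(key, { value, version: 1 });
  }

  // Step 1: read the value together with its version.
  get(key: string): Versioned<T> {
    const entry = this.entries.get(key);
    if (!entry) throw new Error(`unknown key: ${key}`);
    return { value: entry.value, version: entry.version };
  }

  // Steps 3-5: write only if the stored version still equals expectedVersion.
  // Returns true on success and false on a version conflict.
  compareAndSet(key: string, value: T, expectedVersion: number): boolean {
    const entry = this.entries.get(key);
    if (!entry || entry.version !== expectedVersion) return false;
    this.entries.set(key, { value, version: expectedVersion + 1 });
    return true;
  }
}

function demoVersionedStore() {
  const store = new VersionedStore<number>();
  store.init('current_position', 100);

  const a = store.get('current_position'); // value 100, version 1
  const b = store.get('current_position'); // value 100, version 1

  const aOk = store.compareAndSet('current_position', a.value + 50, a.version); // succeeds
  const bOk = store.compareAndSet('current_position', b.value + 50, b.version); // stale version, rejected

  // Agent B re-reads and retries against the fresh version.
  const fresh = store.get('current_position'); // value 150, version 2
  const retryOk = store.compareAndSet('current_position', fresh.value + 50, fresh.version);

  return { aOk, bOk, retryOk, final: store.get('current_position').value };
}

console.log(demoVersionedStore()); // aOk: true, bOk: false, retryOk: true, final: 200
```

Because this sketch is single-threaded, the check and the write inside `compareAndSet` cannot be interleaved. A real store has to perform the two as one atomic step for the version check to mean anything.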

How Switchboard Implements It

```typescript
// Retry the read-modify-write cycle until the versioned write succeeds.
const MAX_ATTEMPTS = 5;

for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
  // Read state with version
  const { value, version } = await switchboard.state.get({
    swarmId: 'my-swarm',
    key: 'current_position',
  });

  // Modify locally
  const newPosition = value + 50;

  try {
    // Write with version check
    await switchboard.state.set({
      swarmId: 'my-swarm',
      key: 'current_position',
      value: newPosition,
      expectedVersion: version,  // Must match or fail
    });
    break; // Write succeeded
  } catch (e: any) {
    // Anything other than a version conflict is a real error
    if (e.code !== 'VERSION_CONFLICT') throw e;
    // Give up once the attempts are exhausted
    if (attempt === MAX_ATTEMPTS) throw e;
    // Someone else modified the value: loop to re-read and recalculate
  }
}
```

Why This Works

Fast path: When there's no conflict, operations complete immediately. No blocking, no waiting.

Conflict detection: Version numbers expose conflicts at write time. If two agents write against the same version, the first write succeeds and the second fails and retries.

Atomic updates: Writes are atomic. Either the entire write succeeds or it fails, so a conflicting write never leaves a partial update behind.

When Conflicts Happen

Conflicts are rare in well-designed systems. But when they do happen:

  • The write fails with VERSION_CONFLICT
  • The agent re-reads the current state
  • The agent recalculates based on fresh values
  • The agent retries with the new version

This is the "optimistic" part: we assume conflicts won't happen, but we handle them gracefully when they do.
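The re-read, recalculate, retry cycle can be wrapped in a helper. The sketch below is written against a hypothetical `VersionedState` interface modelled on the `get`/`set` calls in this article (with `swarmId` omitted for brevity). `updateWithRetry`, `maxAttempts`, `VersionConflictError`, and the in-memory `FakeState` are assumptions made for illustration, not part of Switchboard:

```typescript
// Hypothetical shape modelled on the get/set calls in this article.
interface VersionedState {
  get(args: { key: string }): Promise<{ value: number; version: number }>;
  set(args: { key: string; value: number; expectedVersion: number }): Promise<void>;
}

// Error carrying the VERSION_CONFLICT code used in this article.
class VersionConflictError extends Error {
  readonly code = 'VERSION_CONFLICT';
}

// Re-read, recalculate, and retry until the write lands or attempts run out.
async function updateWithRetry(
  state: VersionedState,
  key: string,
  update: (current: number) => number,
  maxAttempts = 5,
): Promise<number> {
  for (let attempt = 1; ; attempt++) {
    const { value, version } = await state.get({ key });
    const next = update(value);
    try {
      await state.set({ key, value: next, expectedVersion: version });
      return next;
    } catch (e) {
      const isConflict = (e as { code?: string } | null)?.code === 'VERSION_CONFLICT';
      // Rethrow unrelated errors, and conflicts once attempts are exhausted.
      if (!isConflict || attempt >= maxAttempts) throw e;
    }
  }
}

// In-memory stand-in used only to exercise the helper.
class FakeState implements VersionedState {
  private value = 100;
  private version = 1;
  // Number of upcoming writes that a competing "agent" will get in ahead of.
  private interferingWrites: number;

  constructor(interferingWrites: number) {
    this.interferingWrites = interferingWrites;
  }

  async get(_args: { key: string }) {
    return { value: this.value, version: this.version };
  }

  async set(args: { key: string; value: number; expectedVersion: number }) {
    if (this.interferingWrites > 0) {
      // Simulate a competing agent adding 50 just before this write.
      this.interferingWrites--;
      this.value += 50;
      this.version++;
    }
    if (args.expectedVersion !== this.version) {
      throw new VersionConflictError('version conflict');
    }
    this.value = args.value;
    this.version++;
  }
}

// Usage: one competing write forces a single retry.
updateWithRetry(new FakeState(1), 'current_position', (v) => v + 50)
  .then((finalValue) => console.log(finalValue)); // 200
```

Capping the attempts keeps a heavily contended key from retrying forever. A production version might also add a short randomized delay between attempts.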

Additional Features

Namespace isolation: Each swarm has its own state. No cross-swarm conflicts.

TTL support: Temporary state can auto-expire. Useful for caching or session data.
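The article does not show Switchboard's TTL API, so the following is only a conceptual sketch of how expiring state can behave. `TtlStore`, `ttlMs`, and `demoTtl` are hypothetical, and the clock is injectable so expiry can be shown without waiting:

```typescript
// Hypothetical TTL store; not the Switchboard API.
class TtlStore<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  set(key: string, value: T, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  // Returns undefined once the entry's TTL has elapsed.
  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily evict expired entries on read
      return undefined;
    }
    return entry.value;
  }
}

function demoTtl(): { before: string | undefined; after: string | undefined } {
  let clock = 0; // fake clock in milliseconds
  const store = new TtlStore<string>(() => clock);
  store.set('session', 'agent-a', 1000);

  clock = 999;
  const before = store.get('session'); // still present
  clock = 1000;
  const after = store.get('session'); // expired

  return { before, after };
}

console.log(demoTtl()); // before is 'agent-a', after is undefined
```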

Watch support: Get notified when state changes. Useful for reactive agents.
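Switchboard's watch API is not shown in this article, so the sketch below only illustrates the general idea of change notifications. `WatchableState`, `watch`, and `demoWatch` are hypothetical names:

```typescript
// Hypothetical change-notification sketch; not the Switchboard API.
type Listener<T> = (value: T, version: number) => void;

class WatchableState<T> {
  private value: T;
  private version = 0;
  private listeners = new Set<Listener<T>>();

  constructor(initial: T) {
    this.value = initial;
  }

  get(): { value: T; version: number } {
    return { value: this.value, version: this.version };
  }

  // Register a listener; returns a function that unsubscribes it.
  watch(listener: Listener<T>): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Store the new value, bump the version, and notify every listener.
  set(value: T): void {
    this.value = value;
    this.version++;
    this.listeners.forEach((listener) => listener(value, this.version));
  }
}

function demoWatch(): number[] {
  const state = new WatchableState<number>(100);
  const seen: number[] = [];

  const unwatch = state.watch((value) => {
    seen.push(value);
  });

  state.set(150); // observed
  unwatch();
  state.set(200); // not observed: the listener was removed

  return seen;
}

console.log(demoWatch()); // only 150 was observed
```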

Comparison to Other Approaches

Pessimistic locking: Blocks operations and is slower, but guarantees that conflicting writes never overlap. Good for high-conflict scenarios.

Optimistic locking: Fast, detects conflicts, retries on failure. Good for low-conflict scenarios (most agent swarms).

No locking: Fastest, but race conditions possible. Not suitable for financial or critical state.

Why This Matters

Optimistic locking enables:

  • High throughput: Agents don't block each other
  • Consistent state: Conflicting writes are detected and rejected before they can overwrite each other
  • Simple API: Just read, modify, write with version
  • Graceful retries: On VERSION_CONFLICT, the agent re-reads the state and retries

For agent swarms coordinating shared state, optimistic locking provides the right balance of speed and safety.


Part of the EchoRift infrastructure series.