Fencing Tokens: Protecting Against Stale Locks
Distributed locks are essential for coordinating access to shared resources. But there's a dangerous edge case that most lock implementations miss: what happens when an agent thinks it has a lock, but the lock has actually expired?
Arbiter solves this with fencing tokens—globally monotonic counters that prevent stale locks from corrupting shared state.
The Stale Lock Problem
Consider this scenario:
- Agent A acquires a lock to modify a shared resource
- Agent A experiences a long garbage collection pause or network partition
- The lock expires while Agent A is paused
- Agent B acquires the lock and modifies the resource
- Agent A "wakes up" and thinks it still has the lock
- Agent A modifies the resource using stale information
Result: corrupted state. Agent A's modifications are based on outdated data and overwrite Agent B's legitimate changes.
This isn't theoretical. It happens in production systems with GC pauses, network partitions, or clock skew. A lock with an expiry cannot prevent it on its own: the lock service has no way to stop a paused holder from writing after its lease runs out.
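To make the failure concrete, here is a minimal sketch of the scenario with no fencing at all. `NaiveLock` and `UnfencedStore` are hypothetical names invented for this illustration (they are not part of Arbiter), and time is modeled as a plain number passed in by the caller rather than a real clock.

```typescript
// Hypothetical lease-based lock with no fencing. Time is a logical number
// supplied by the caller, not wall-clock time.
interface Lease {
  holder: string;
  expiresAt: number;
}

class NaiveLock {
  private lease: Lease | null = null;

  // Grants the lock if it is free or the previous lease has expired.
  acquire(holder: string, now: number, ttl: number): boolean {
    if (this.lease !== null && this.lease.expiresAt > now) {
      return false;
    }
    this.lease = { holder, expiresAt: now + ttl };
    return true;
  }
}

// A shared resource that accepts every write it receives.
class UnfencedStore {
  value = "initial";

  write(newValue: string): void {
    this.value = newValue;
  }
}

const naiveLock = new NaiveLock();
const unfencedStore = new UnfencedStore();

const aAcquired = naiveLock.acquire("agent-a", 0, 10); // A holds the lock until t=10
// ... Agent A pauses here and does not resume until after t=10 ...
const bAcquired = naiveLock.acquire("agent-b", 15, 10); // lease expired, so B gets the lock
unfencedStore.write("written by B");
// Agent A resumes, still believing it holds the lock, and writes anyway.
unfencedStore.write("written by A from stale data");

console.log(aAcquired, bAcquired, unfencedStore.value);
```

Both acquisitions succeed, and nothing stops Agent A's late write from replacing Agent B's value. The store has no way to tell a current holder from a stale one.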
How Fencing Tokens Work
Arbiter's distributed locks include a fencing token—a globally monotonic counter that always increases. The process:
- The agent requests the lock with a maximum duration
- If the lock is free (or expired), the agent receives the lock and a fencing token
- Each newly granted token is higher than every token issued before it
- Before any protected operation, the agent must present its fencing token
- Downstream services reject tokens lower than the highest they've seen
The key insight: even if Agent A thinks it has a lock, if its fencing token is stale (lower than what the resource has seen), the resource rejects the operation.
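The issuing side of this process can be sketched as a small in-memory lock service. This is an illustration under stated assumptions, not Arbiter's implementation: `LeaseLockService`, `LockGrant`, and the logical `now` parameter are hypothetical, and a real service would persist the counter rather than hold it in memory.

```typescript
// Hypothetical in-memory lock service. Names and fields are illustrative,
// not Arbiter's API. Time is a logical number passed in by the caller.
interface LockGrant {
  holder: string;
  fencingToken: number;
  expiresAt: number;
}

class LeaseLockService {
  private current: LockGrant | null = null;
  private lastToken = 0; // only ever incremented, never reset on release or expiry

  // Grants the lock if it is free or the previous lease has expired.
  acquire(holder: string, now: number, maxDuration: number): LockGrant | null {
    if (this.current !== null && this.current.expiresAt > now) {
      return null; // another holder still has a live lease
    }
    this.lastToken += 1;
    this.current = {
      holder,
      fencingToken: this.lastToken,
      expiresAt: now + maxDuration,
    };
    return this.current;
  }

  // Releasing frees the lock but leaves the counter untouched.
  release(holder: string): void {
    if (this.current !== null && this.current.holder === holder) {
      this.current = null;
    }
  }
}

const lockService = new LeaseLockService();
const first = lockService.acquire("agent-a", 0, 300); // granted, lease until t=300
const blocked = lockService.acquire("agent-b", 100, 300); // refused, lease still live
const second = lockService.acquire("agent-b", 301, 300); // granted after expiry
lockService.release("agent-b");
const third = lockService.acquire("agent-c", 302, 300); // granted after release

console.log(first, blocked, second, third);
```

The point of the sketch is that the counter moves in one direction only. Expiry and release both free the lock, but neither lets a later grant reuse or undercut an earlier token.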
Example Flow
Time T1: Agent A acquires lock, receives fencing token 42
Time T2: Agent A pauses (GC, network partition)
Time T3: Lock expires, Agent B acquires lock, receives fencing token 43
Time T4: Agent B modifies resource, resource records highest token: 43
Time T5: Agent A "wakes up", tries to modify resource with token 42
Result: Resource rejects token 42 (it's seen 43), Agent A's stale operation fails
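The flow above can be replayed against a small resource-side check. `FencedResource` is a hypothetical class written for this example, and the tokens 42 and 43 are supplied as literals to mirror the timeline rather than obtained from a real lock service.

```typescript
// Hypothetical downstream resource that remembers the highest token it has seen.
class FencedResource {
  private highestSeen = 0;
  private value = "initial";
  readonly log: string[] = [];

  // Applies the write only if the token is not lower than the highest seen.
  write(agent: string, token: number, newValue: string): boolean {
    if (token < this.highestSeen) {
      this.log.push(`${agent}: rejected token ${token} (highest seen ${this.highestSeen})`);
      return false;
    }
    this.highestSeen = token;
    this.value = newValue;
    this.log.push(`${agent}: accepted token ${token}`);
    return true;
  }

  read(): string {
    return this.value;
  }
}

const sharedResource = new FencedResource();
const tokenA = 42; // T1: issued to Agent A
const tokenB = 43; // T3: issued to Agent B after A's lock expired

const bAccepted = sharedResource.write("agent-b", tokenB, "B's change"); // T4
const aAccepted = sharedResource.write("agent-a", tokenA, "A's stale change"); // T5

console.log(sharedResource.log.join("\n"));
// agent-b: accepted token 43
// agent-a: rejected token 42 (highest seen 43)
```

A token equal to the highest seen is still accepted in this sketch, so the current holder can make several writes under one lock. Only strictly lower tokens are turned away, which matches the rule described above.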
On-Chain Fencing Token Registry
Arbiter maintains the fencing token registry on-chain via the ArbiterFinality contract on Base. This ensures:
- Global monotonicity: Tokens always increase, never decrease
- Verifiable state: Any service can query the current highest token
- Tamper-resistant: Recorded tokens live in contract state, so no individual agent or service can quietly rewrite them
- Consistent view: All services see the same token sequence
When an agent acquires a lock, the fencing token is recorded on-chain. When the agent releases the lock, the token remains in the registry as the highest seen value. This creates an immutable history that prevents stale tokens from being accepted.
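The invariants the registry is described as enforcing can be modeled in a few lines. This is an in-memory stand-in only: `TokenRegistryModel` and its methods are hypothetical and do not reflect the ArbiterFinality contract's actual interface, which is not shown in this article.

```typescript
// In-memory model of the registry's invariants. This is NOT the ArbiterFinality
// contract interface; every name here is hypothetical.
class TokenRegistryModel {
  private globalHighest = 0;
  private latestByResource: Record<string, number> = {};

  // Records a newly issued token. Anything that does not strictly exceed
  // the current global highest is refused, so the sequence never goes backwards.
  record(resourceId: string, token: number): void {
    if (token <= this.globalHighest) {
      throw new Error(`token ${token} does not exceed current highest ${this.globalHighest}`);
    }
    this.globalHighest = token;
    this.latestByResource[resourceId] = token;
  }

  // Any service can ask for the current highest token.
  highestToken(): number {
    return this.globalHighest;
  }

  // A token is stale if a higher one has since been recorded for the resource.
  isStale(resourceId: string, token: number): boolean {
    const latest = this.latestByResource[resourceId] ?? 0;
    return token < latest;
  }
}

const registry = new TokenRegistryModel();
registry.record("shared-database", 42);
registry.record("shared-database", 43);

const staleCheck = registry.isStale("shared-database", 42);
const currentCheck = registry.isStale("shared-database", 43);

console.log(registry.highestToken(), staleCheck, currentCheck);
```

There is deliberately no method that lowers or clears a recorded token. Releasing a lock leaves the highest value in place, which is what lets a later check recognize token 42 as stale.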
Integration Pattern
To use fencing tokens with your resources:
```typescript
// Acquire the lock; the response carries a fencing token
const lock = await arbiter.lock.acquire({
  swarmId: 'my-swarm',
  resourceId: 'shared-database',
  maxDuration: 300, // 5 minutes
});

// Keep the fencing token for every protected operation
const fencingToken = lock.fencingToken;

// Early check before modifying the resource: skip the work if the token
// is already stale. This does not replace the resource-side check, because
// the lock can still expire between this call and the write below.
const isValid = await arbiter.lock.validate({
  swarmId: 'my-swarm',
  resourceId: 'shared-database',
  fencingToken: fencingToken,
});

if (!isValid) {
  // Token is stale, don't proceed
  throw new Error('Fencing token invalid');
}

// Modify the resource, passing the fencing token with the request
await modifyResource({
  fencingToken: fencingToken,
  changes: {...},
});

// The resource enforces the rule: if token < highestSeen, reject
```
Why This Matters
Fencing tokens are essential for:
- Financial operations: Preventing double-spending or duplicate transactions
- State mutations: Ensuring modifications are based on current state
- Resource coordination: Preventing conflicts when multiple agents access shared resources
- Audit trails: Providing verifiable proof of lock ownership
Without fencing tokens, distributed locks provide a false sense of security. With them, locks become truly safe for autonomous agent systems.
Part of the EchoRift infrastructure series. Learn more about Arbiter.