Webhook Reliability: The 5-Second Rule
Webhooks must be acknowledged within 5 seconds. This isn't a limitation; it's a feature. It forces agents to decouple receipt from processing.
EchoRift services (BlockWire, CronSynth, Switchboard) use this pattern to ensure reliable delivery without blocking webhook endpoints.
The Problem
Traditional webhook delivery has two failure modes:
- Slow processing: The agent takes 30 seconds to process a webhook, the request times out, the sender retries, and the agent processes the same event twice.
- Blocking: The agent processes synchronously, so the endpoint responds slowly, the sender waits on every delivery, and its queue backs up.
Both create unreliable delivery and unpredictable behavior.
The 5-Second Rule
EchoRift services require webhook acknowledgment within 5 seconds:
- Receive webhook
- Validate HMAC signature
- Acknowledge immediately (return 200 OK)
- Queue for async processing
The webhook endpoint is a fast receiver, not a slow processor.
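The signature check in the second step can be sketched with Node's built-in `crypto` module. This is a minimal sketch under assumptions the text does not state: that the signature is an HMAC-SHA256 of the raw request body, hex-encoded. The name `verifySignature` and its parameters are hypothetical, not an EchoRift API, so check the service documentation for the actual algorithm, encoding, and header.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical helper. Assumes HMAC-SHA256 over the raw body, hex-encoded.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so reject those up front.
  if (received.length !== expected.length) {
    return false;
  }
  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(received, expected);
}
```

The check should run against the raw bytes the sender signed. Verifying a re-serialized JSON body can fail even for a genuine request, because key order and whitespace may differ.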
Why This Works
Fast acknowledgment: The sender gets confirmation quickly, which avoids timeouts and the retries and duplicate deliveries they trigger.
Async processing: The agent processes in the background, so slow operations don't block webhook delivery.
Predictable behavior: Delivery stays fast and reliable however long processing takes, because the two happen separately.
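One way to picture the split between receipt and processing is a small in-process queue. The sketch below is illustrative only: `InMemoryQueue`, `WebhookEvent`, and `Handler` are hypothetical names, and an in-memory queue loses pending events if the process crashes, so a production agent would more likely use a durable queue.

```typescript
type WebhookEvent = { id: string; payload: unknown };
type Handler = (event: WebhookEvent) => Promise<void>;

class InMemoryQueue {
  private readonly pending: WebhookEvent[] = [];
  private draining = false;

  constructor(private readonly handler: Handler) {}

  // Called from the webhook endpoint. Only records the event and schedules
  // a drain; the handler never runs inside this call.
  enqueue(event: WebhookEvent): void {
    this.pending.push(event);
    if (!this.draining) {
      this.draining = true;
      setTimeout(() => void this.drain(), 0);
    }
  }

  // Processes events one at a time, in arrival order. A failing event is
  // logged and skipped so it cannot stall the events behind it.
  private async drain(): Promise<void> {
    try {
      let event: WebhookEvent | undefined;
      while ((event = this.pending.shift()) !== undefined) {
        try {
          await this.handler(event);
        } catch (err) {
          console.error(`Processing failed for event ${event.id}`, err);
        }
      }
    } finally {
      this.draining = false;
    }
  }
}
```

With this shape, the endpoint's cost per request is one array push, so its response time does not depend on how slow the handler is.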
Exponential Backoff
When webhook delivery fails (timeout, 4xx, 5xx), EchoRift services retry with exponential backoff:
- First retry: 1 second
- Second retry: 2 seconds
- Third retry: 4 seconds
- Fourth retry: 8 seconds
- Fifth retry: 16 seconds
- Maximum: 5 retries
This avoids overwhelming a failing endpoint while still attempting delivery. The delay doubles on each retry, for 31 seconds of backoff in total across all five.
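The schedule above follows a simple doubling formula. Here is a small sketch of it. The names `retryDelayMs`, `BASE_DELAY_MS`, and `MAX_RETRIES` are illustrative, and the sketch only reproduces the listed delays, not how EchoRift services implement retries internally.

```typescript
const BASE_DELAY_MS = 1000; // first retry waits 1 second
const MAX_RETRIES = 5;      // no further attempts after the fifth retry

// Returns the delay before the given retry (1-based), or null when the
// retry number is outside the allowed range.
function retryDelayMs(retryNumber: number): number | null {
  if (!Number.isInteger(retryNumber) || retryNumber < 1 || retryNumber > MAX_RETRIES) {
    return null;
  }
  return BASE_DELAY_MS * 2 ** (retryNumber - 1);
}

// Delays in milliseconds for retries 1 through 5: 1000, 2000, 4000, 8000, 16000
const schedule = Array.from({ length: MAX_RETRIES }, (_, i) => retryDelayMs(i + 1));
console.log(schedule);
```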
Delivery Receipts
For critical operations, agents can confirm receipt:
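```typescript
// Webhook handler. `app` is an Express app; verifyWebhook, processWebhook,
// and sendDeliveryReceipt are application-defined helpers.
app.post('/webhook', async (req, res) => {
  // Validate signature
  const isValid = verifyWebhook(req);
  if (!isValid) {
    return res.status(401).send('Invalid signature');
  }

  // Acknowledge immediately
  res.status(200).send('OK');

  // Process async. The response has already been sent, so errors must be
  // caught here rather than surfacing as unhandled rejections.
  try {
    await processWebhook(req.body);
    // Optional: send delivery receipt
    await sendDeliveryReceipt(req.body.id);
  } catch (err) {
    console.error('Webhook processing failed', err);
  }
});
```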
What Happens If Agent Is Down?
If the agent is down during webhook delivery:
- Delivery fails once all retries are exhausted
- The event is still recorded (BlockWire) or the schedule continues (CronSynth)
- The agent can replay missed events via API
- Nothing is lost, only delayed
For BlockWire, the replay API retrieves events with full attestation. For CronSynth, schedules continue and agents catch up on next trigger.
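A catch-up step after downtime might look like the sketch below. Everything in it is an assumption made for illustration: the text does not describe the replay API, so the fetch call is passed in as a hypothetical `fetchEventsAfter` callback, and the `sequence` field stands in for whatever ordering key the real API exposes.

```typescript
// Hypothetical shapes. The real replay API and its fields may differ.
type ReplayedEvent = { id: string; sequence: number };
type FetchEventsAfter = (afterSequence: number) => Promise<ReplayedEvent[]>;
type HandleEvent = (event: ReplayedEvent) => Promise<void>;

// Replays events newer than the last one processed and returns the new
// cursor, which the caller would persist for the next catch-up.
async function catchUp(
  lastProcessed: number,
  fetchEventsAfter: FetchEventsAfter,
  handle: HandleEvent,
): Promise<number> {
  let cursor = lastProcessed;
  const events = await fetchEventsAfter(cursor);
  const ordered = [...events].sort((a, b) => a.sequence - b.sequence);
  for (const event of ordered) {
    // Skip anything at or before the cursor so replays stay idempotent.
    if (event.sequence <= cursor) continue;
    await handle(event);
    cursor = event.sequence;
  }
  return cursor;
}
```

Tracking a cursor keeps replay safe to run repeatedly: an event delivered by webhook and then returned again by replay is processed only once.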
Why This Matters
The 5-second rule enables:
- Reliable delivery: Fast acknowledgment prevents timeouts and retries
- Scalable processing: Async processing doesn't block webhook endpoints
- Predictable behavior: Clear separation between delivery and processing
- Fault tolerance: Failed deliveries retry with backoff, missed events can be replayed
Without this pattern, slow handlers make webhook delivery unreliable. With it, agents can depend on timely, accurate event delivery.
Part of the EchoRift infrastructure series. Learn more about CronSynth and BlockWire.