ADR-005: Durable Objects with alarm-based proactive token refresh

Status: Accepted
Date: 2026-04-06
Deciders: Mishaal Murawala

Context

OAuth tokens expire. The gateway needs fresh tokens available in KV for every request. Options: (a) refresh on-demand when a request finds an expired token, (b) use KV TTL to trigger refresh, (c) use Durable Objects with alarm-based proactive refresh.

Decision

We will use Cloudflare Durable Objects (DOs) whose alarm() handler proactively refreshes each token 10 minutes before it expires and writes the new token to KV. The request path never contacts a Durable Object.
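
The refresh cycle in this decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the gateway's actual code: AlarmStorage and TokenStore are hypothetical structural stand-ins for the setAlarm and put methods of the Durable Object storage and KV bindings, so the logic can run outside the Workers runtime, and ProviderToken, fetchToken, and refreshAndReschedule are illustrative names.

```typescript
// Hypothetical structural stand-ins for the runtime bindings (assumptions):
// only the one method each that this sketch needs.
interface AlarmStorage {
  setAlarm(scheduledTime: number): Promise<void>;
}
interface TokenStore {
  put(key: string, value: string): Promise<void>;
}

// Assumed token shape; expiresAt is in epoch milliseconds.
interface ProviderToken {
  accessToken: string;
  expiresAt: number;
}

const REFRESH_LEAD_MS = 10 * 60 * 1000; // refresh 10 minutes before expiry

// Time at which the next alarm should fire; never earlier than `now`.
function nextAlarmTime(expiresAt: number, now: number): number {
  return Math.max(now, expiresAt - REFRESH_LEAD_MS);
}

// Possible body of an alarm() handler: fetch a new token, publish it to KV,
// then schedule the next refresh relative to the new expiry.
async function refreshAndReschedule(
  key: string,
  fetchToken: () => Promise<ProviderToken>, // hypothetical provider call
  kv: TokenStore,
  storage: AlarmStorage,
  now: number = Date.now(),
): Promise<number> {
  const token = await fetchToken();
  await kv.put(key, JSON.stringify(token));
  const alarmAt = nextAlarmTime(token.expiresAt, now);
  await storage.setAlarm(alarmAt);
  return alarmAt;
}
```

Inside a Durable Object, alarm() would call refreshAndReschedule with the object's own storage and the KV binding. Clamping the alarm time to `now` means a token that is already inside its last 10 minutes is refreshed immediately instead of being scheduled in the past.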

Consequences

Positive

  • Request path only reads KV (sub-ms) — never waits for a token refresh
  • Single-threaded DO concurrency eliminates race conditions in token refresh
  • alarm() fires even if no requests are active — tokens stay fresh 24/7
  • One DO per {tenant}:{provider}:{account_id} triplet — clean isolation
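
As an illustration of the KV-only request path and the per-triplet naming, here is a minimal sketch. The names tokenKey, readToken, and TokenReader are hypothetical, and rejecting ":" inside a key part is an added safeguard assumed here, not something the ADR specifies.

```typescript
// Hypothetical structural subset of a KV binding: get() resolves to null
// when the key is absent.
interface TokenReader {
  get(key: string): Promise<string | null>;
}

// Builds the {tenant}:{provider}:{account_id} name. Rejecting empty parts and
// parts containing ":" (an assumption) keeps two different triplets from
// producing the same key.
function tokenKey(tenant: string, provider: string, accountId: string): string {
  for (const part of [tenant, provider, accountId]) {
    if (part.length === 0 || part.indexOf(":") !== -1) {
      throw new Error('invalid key part: "' + part + '"');
    }
  }
  return tenant + ":" + provider + ":" + accountId;
}

// Request path: a single KV read, with no Durable Object call.
async function readToken(
  kv: TokenReader,
  tenant: string,
  provider: string,
  accountId: string,
): Promise<string | null> {
  return kv.get(tokenKey(tenant, provider, accountId));
}
```

The same string could also name the Durable Object through the namespace's idFromName method, so the KV key and the DO that maintains it stay in one-to-one correspondence.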

Negative

  • DOs add cost (~$0.15/million requests + $0.15/GB-month storage)
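  • KV is eventually consistent, so a refreshed token may take up to 60s to propagate to all locations. Mitigated by the 10-minute early refresh window: the previous token remains valid for roughly 9 minutes after propagation completes, assuming the provider does not revoke it on refresh.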
  • Debugging DO state requires the admin API — can’t query DOs directly from Wrangler

Risks

  • If a DO alarm fails repeatedly (e.g., the provider’s token endpoint is down), the token in KV will expire. Mitigated by retries with exponential backoff in the alarm handler and Slack alerts on repeated failures.
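
The retry mitigation could look like the sketch below. The base delay, cap, and alert threshold are assumed values chosen for illustration, since the ADR does not state them, and retryDelayMs and shouldAlert are hypothetical helper names.

```typescript
const BASE_DELAY_MS = 30 * 1000;    // assumed delay before the first retry
const MAX_DELAY_MS = 5 * 60 * 1000; // assumed cap on any single delay
const ALERT_AFTER_FAILURES = 3;     // assumed threshold for a Slack alert

// Delay before the next attempt after `failures` consecutive failures
// (1 = first failure): BASE_DELAY_MS * 2^(failures - 1), capped at MAX_DELAY_MS.
function retryDelayMs(failures: number): number {
  const exponent = Math.max(0, failures - 1);
  return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, exponent));
}

// Whether the failure count has reached the alert threshold.
function shouldAlert(failures: number): boolean {
  return failures >= ALERT_AFTER_FAILURES;
}
```

On failure, the alarm handler would set the next alarm to the current time plus retryDelayMs(failures) and raise an alert once shouldAlert(failures) is true. With these assumed values the first four delays are 30, 60, 120, and 240 seconds, 450 seconds in total, so four retries fit inside the 10-minute window before the old token expires, ignoring the time each attempt takes.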

Alternatives Considered

On-demand refresh in the request path

  • Rejected because: adds 200-500ms latency to requests that hit an expired token. Violates the ≤10ms gateway overhead invariant.

KV TTL-based refresh

  • Rejected because: KV doesn’t support TTL callbacks. The key just disappears, and there’s no mechanism to trigger a refresh when it does.

External cron (n8n / Lambda)

  • Rejected because: adds an external dependency on a critical path. The entire point of V5 is to run on the Cloudflare edge with no external vendors in the token path.