Decompose `executor.ts` into Composed Middleware

ADR-049: Decompose `executor.ts` into Composed Middleware

Status: Accepted
Date: 2026-05-13
Author: Claude Code (session d7958259)
Trigger: Tech-debt register row “executor.ts complexity — 7 concerns in 437 lines” (added 2026-05-09). Immediate reversal trigger: any significant change to the circuit-breaker or anomaly logic.

Context

src/core/executor.ts owns a single exported function, executeRequest, that handles all outbound API requests through the gateway. As of 2026-05-13, the function is 437 lines and contains seven logically distinct concerns:

#	Concern	Location (approx.)	Change risk
1	Response cache check + HIT return	Lines 51–77	Medium (TTL logic, cache-key generation)
2	Circuit breaker enforcement	Lines 79–94	High (opens/closes per tenant+domain, affects SLA)
3	401 KV propagation race detection	Lines 146–215	High (dual-check: `refreshed_at:` marker key + embedded token field; race window is 90–120s)
4	3xx redirect blocking	Lines 113–143	Low (pure guard, no state)
5	Response body truncation	Lines 280–309	Low (streaming reader, 10MB cap)
6	Cache write + mutation invalidation	Lines 323–346	Medium (async, non-blocking)
7	Async metrics + anomaly detection	Lines 349–360	Medium (recordCall + trackUsage + checkAnomaly + fireAnomalyAlert, all via waitUntil)

The seventh concern (async D1 logging via ctx.waitUntil) is woven throughout all error paths; it is a pattern, not a discrete module, and is left in-place.

Why this matters: Any production incident touching executeRequest — a changed circuit-breaker threshold, a new propagation race condition, an anomaly false positive — requires a developer to read and reason about the full 437-line function. The 401 dual-check logic (concern 3) is the highest-risk block: it makes two KV reads in the request path at a combined cost of ~2ms. It is buried at line 146 inside a 90-line error-handling block. Misreading it has caused incorrect token-refresh decisions in the past.

Invariant constraint (invariant #2): KV is the sole hot-path data source. The 401 check currently makes two KV reads in the request path. This is permitted today because both reads (refreshedAt marker + token data) are fast KV lookups. Any decomposition must preserve this behavior exactly — extracting to a middleware function does not eliminate the reads.

Decision

Decompose executeRequest into a composed middleware pipeline under src/core/middleware/. Each module exports a single async function. executor.ts becomes a thin orchestrator calling each middleware in order.

Module map

src/core/middleware/
├── cache.ts            # checkCacheHit() → CacheHitResult | null
│                       # writeCacheAsync() → void (via waitUntil)
│                       # invalidateCacheOnMutationAsync() → void (via waitUntil)
├── circuit.ts          # enforceCircuit() → ProxyResponse | null
│                       # recordCircuitOutcomeAsync() → void (via waitUntil)
├── auth-recovery.ts    # detect401Race() → ProxyResponse | null
│                       # (the dual-check KV propagation logic, extracted verbatim)
├── redirect.ts         # blockRedirect() → ProxyResponse | null
│                       # (pure guard, no state)
└── truncation.ts       # readBodyWithLimit() → { body: string, truncated: boolean }

The anomaly + usage tracking concern (concern 7) stays in executor.ts as three ctx.waitUntil calls — splitting these into a named function recordMetricsAsync(ctx) within executor.ts itself is sufficient (no new file needed).

Orchestrator shape (after decomposition)

// src/core/executor.ts (after)
export async function executeRequest(url, options, ctx): Promise<ProxyResponse> {
  const method = (options.method ?? 'GET').toUpperCase();
  const urlPath = safePathname(url);
  const cacheable = shouldCache(method, ctx.domain, ctx.action, urlPath);

  // 1. Cache HIT
  const cacheHit = cacheable ? await checkCacheHit(ctx, url, method, options.body) : null;
  if (cacheHit) return cacheHit;

  // 2. Circuit breaker
  const circuitResponse = await enforceCircuit(ctx);
  if (circuitResponse) return circuitResponse;

  // 3. Fetch
  const { response, fetchStart } = await upstreamFetch(url, options, ctx);
  // (timeout + AbortController live here — not in middleware)

  // 4. Redirect block
  const redirectResponse = blockRedirect(response, ctx, method, urlPath);
  if (redirectResponse) return redirectResponse;

  // 5. Error path
  if (!response.ok) {
    // 5a. 401 race detection
    if (response.status === 401) {
      const raceResponse = await detect401Race(ctx, urlPath, method);
      if (raceResponse) return raceResponse;
    }
    return await handleErrorResponse(response, ctx, method, urlPath);
  }

  // 6. Body truncation
  const { body, truncated } = await readBodyWithLimit(response);

  // 7. Cache write + invalidation (async)
  if (cacheable && cacheKey) writeCacheAsync(ctx, cacheKey, response, body);
  else if (!cacheable && method !== 'GET') invalidateCacheOnMutationAsync(ctx);

  // 8. Circuit success + metrics (async)
  recordCircuitOutcomeAsync(ctx, 'success');
  recordMetricsAsync(ctx);

  return buildSuccessResponse(response, body, truncated, cacheable);
}

What does NOT change

The exported signature of executeRequest — callers are unaffected.
Behavior: no logic changes, verbatim extraction only. Behavioral changes require a separate PR with explicit test coverage.
The waitUntil pattern — async D1 logging calls stay as-is; they are not further abstracted.
The 401 dual KV read — extracted to auth-recovery.ts but both reads preserved exactly.

Implementation Plan

Phased to reduce blast radius. Each phase is a standalone PR that passes all existing tests before merging.

Phase	Extracts	Files created	Risk	PR target
1	`detect401Race` → `auth-recovery.ts`	1 new file	High (but isolated)	`executor/phase-1-auth-recovery`
2	`blockRedirect` + `readBodyWithLimit` → `redirect.ts`, `truncation.ts`	2 new files	Low	`executor/phase-2-pure-extractors`
3	Cache middleware → `cache.ts`	1 new file	Medium	`executor/phase-3-cache-middleware`
4	Circuit middleware → `circuit.ts`	1 new file	Medium	`executor/phase-4-circuit-middleware`
5	Thin orchestrator + `recordMetricsAsync` inline	executor.ts rewrite	Medium	`executor/phase-5-orchestrator`

Phase 1 is the immediate next action. It extracts the highest-risk logic into its own testable unit first, before touching the cache or circuit paths.

Test discipline

Each phase must not change the test count in test/executor.test.ts (all existing test cases must pass).
Each new middleware module gets its own test file: test/core/middleware/{name}.test.ts.
The detect401Race extraction (Phase 1) must include:
- Test: marker key present → returns TOKEN_REFRESHING response
- Test: marker key absent, embedded refreshed_at < 120s → returns TOKEN_PROPAGATING response
- Test: marker key absent, embedded refreshed_at ≥ 120s → returns null (caller handles as normal 401)
- Test: no token data in KV → returns null

Consequences

Positive

Incident response: The 401 dual-check (highest-risk logic) becomes a 60-line file with its own tests. A 2am read takes 60 seconds, not 10 minutes.
Testability: Each concern is independently testable without mocking the full executor pipeline.
Change isolation: Modifying the circuit-breaker threshold no longer requires touching the same file as the truncation logic.
Onboarding: A new contributor can understand the request pipeline by reading the 50-line orchestrator, then drilling into the middleware they care about.

Negative / risks

Refactor blast radius: Five PRs across src/core/, each touching the hot path. Any behavioral bug introduced during extraction would not be caught by existing tests if the test case doesn’t exercise that branch.
Module proliferation: Five new files under src/core/middleware/. Acceptable given each has a clear single responsibility.
Import graph depth: executor.ts will now import from five middleware modules. TypeScript strict mode and the existing typecheck gate mitigate this.

Neutral

Runtime performance: No new allocations. Extraction into functions adds one call-frame per middleware. CF Workers V8 JIT inlines trivial call chains at runtime — overhead is immeasurable (<0.1ms).
Bundle size: Negligible. All modules are in-process (same Worker bundle).

Reversal

The decomposition is structural, not behavioral. Reversal (collapsing back to monolithic executor) would require reverting all five phases and is not warranted by any single trigger. The decomposition is considered complete and stable after Phase 5 merges. Individual middleware modules can be re-merged back into executor.ts if a future architectural change (e.g., a new request path that bypasses the pipeline) makes the separation counterproductive.

References

Tech-debt register: docs/projects/LEDGER.md → “executor.ts complexity — 7 concerns in 437 lines”
V5 invariant #2: KV-only hot path — relevant to the dual KV read in concern 3
V5 invariant #3: Fail-fast, no retries — must be preserved in the orchestrator
V5 invariant #11: 30s AbortController timeout — lives in the orchestrator, not extracted