ADR-048 — CF Workflows vs Vercel Workflows for HITL Approval Gates
- Status: Accepted
- Date: 2026-05-08
- Decider: Mishaal Murawala (delegated engineering judgment to Claude Code as engineering lead)
- Supersedes: none
- Confirms: ADR-047, ADR-026
- Related: V5 Invariant #1 (two-plane architecture), V5 Invariant #9 (CF Cron + CF Workflows)
Context
ADR-047 (OAuth Re-Auth Escalation Path) selected CF Workflows for HITL approval gates without a documented head-to-head comparison against Vercel Workflows — a gap flagged in the Q1 gap table. This ADR closes that gap by recording the evaluation, the decisive factors, and the re-eval trigger.
The specific HITL requirement driving the evaluation:
- Long-running approval windows (up to 72 hours) for OAuth re-auth and ad-spend approval flows
- Auto-deny after N hours if no response (requires a guaranteed timeout mechanism in the workflow runtime itself)
- D1 audit trail write on every approval event (requires access to CF D1 binding)
- No third compute plane (V5 Invariant #1: only Execution Worker + Context Worker)
Evaluation Criteria (9)
| # | Criterion | CF Workflows | Vercel Workflows | Winner |
|---|---|---|---|---|
| 1 | Suspension / resume with timeout | step.sleep() + step.waitForEvent() documented; 30-day max suspension; workflow timeout field in trigger config enforces wall-clock ceiling | waitForEvent (hook API) — timeout param not present in current docs; auto-deny-after-N-hours pattern unverifiable without it | CF |
| 2 | Platform lock-in / Invariant #1 | Same platform as gateway — no new compute plane | Vercel would be a third compute plane, violating Invariant #1 without an ADR amending it | CF |
| 3 | Native binding access | Direct service bindings to D1, KV, DO, R2 — zero round-trips | HTTP calls to CF REST API required for every D1 audit write; adds latency + a secret to manage | CF |
| 4 | Latency (cold start) | V8 isolate, ~5ms cold start | Node.js runtime, ~200–800ms cold start | CF |
| 5 | Cost | Included in Workers Paid ($5/mo); Workflows billed per step execution | Vercel Pro + Workflow add-on; separate billing plane | CF |
| 6 | Developer experience (DX) | WorkflowEntrypoint.run() with step.do() / step.sleep() — explicit but verbose | Directive syntax (sequence, parallel, waitForSignal) — less boilerplate; hook API intuitive | Vercel |
| 7 | HITL approval UI | No hosted UI — must build approval endpoint in the Worker and surface link to Slack/email | No hosted UI either — same pattern required | Tie |
| 8 | Observability | CF Logpush → R2/D1; Workflow instance status queryable via API | Vercel dashboard, Vercel Log Drains to external | Tie |
| 9 | Replay / retry on failure | Step-level idempotency via name-keyed step.do() — replay is safe | Step replay supported; similar semantics | Tie |
Result: CF wins 5/9. Vercel wins 1/9. 3 ties.
Decisive Factors
Three factors are independently sufficient to reject Vercel Workflows; together they make the decision unambiguous.
1. Vercel hook timeout absent from current docs
The auto-deny-after-N-hours requirement is non-negotiable for the OAuth re-auth and ad-spend HITL flows. If no human acts within 72 hours, the workflow must auto-resolve (deny/expire) without a separate cleanup job.
CF Workflows satisfies this natively: the timeout field in the Workflow trigger config sets a wall-clock ceiling; step.sleep() combines with step.waitForEvent() to implement the pattern idiomatically.
Vercel Workflows waitForEvent / waitForSignal — as of 2026-05-08 docs — does not document a timeout parameter on the hook. This means the only way to implement auto-deny is a separate scheduled job that polls for stale workflow instances and signals them. That is a second coordination surface, adds operational complexity, and is exactly the kind of external cron service prohibited by Invariant #9.
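To make the cost of that workaround concrete, the following is a minimal sketch of the stale-instance sweep such a scheduled job would have to run. The `PendingApproval` shape and the `findStaleApprovals` name are hypothetical, and the sketch calls no Vercel API; it only illustrates the polling logic that would live outside the workflow runtime.

```typescript
// Hypothetical record of a workflow instance awaiting a human decision.
interface PendingApproval {
  instanceId: string;
  createdAtMs: number; // epoch milliseconds when the approval was requested
}

// Return the instances whose approval window has elapsed; a scheduled job
// would then have to signal each of them with an auto-deny.
function findStaleApprovals(
  pending: PendingApproval[],
  nowMs: number,
  timeoutHours: number,
): PendingApproval[] {
  const windowMs = timeoutHours * 60 * 60 * 1000;
  return pending.filter((p) => nowMs - p.createdAtMs >= windowMs);
}
```

Even in this reduced form, the sweep needs its own schedule, its own view of pending instances, and its own signalling path, which is the second coordination surface described above.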
This alone blocks Vercel.
2. Third compute plane violates Invariant #1
V5 Invariant #1: “Two-plane architecture. Execution (gateway) + Context (context-worker). A third plane is forbidden without an ADR.”
CF Workflows runs inside the CF Workers runtime — the same execution plane as the gateway. No new plane.
Vercel Workflows runs on Vercel’s Node.js runtime — a separate compute plane managed by a separate vendor. Adopting it would require amending Invariant #1 and writing a separate ADR justifying the third plane. Given factors 1 and 3, there is no basis to justify the invariant change.
3. Native binding access required for D1 audit trail
Every approval event (token re-auth granted, ad-spend approved, auto-denied) must write to decision_log in D1. With CF Workflows, this is a direct env.DB.prepare().bind().run() call — same pattern used everywhere else in the codebase, zero extra secrets, no extra latency.
With Vercel, every D1 write is an HTTP call to the CF REST API (https://api.cloudflare.com/client/v4/accounts/{id}/d1/database/{id}/query). This adds 30–100ms per audit write, requires CLOUDFLARE_API_TOKEN to be set as a Vercel secret, and couples the Vercel compute plane to CF credentials management.
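For comparison, here is a rough sketch of the request a non-Cloudflare runtime would need to assemble for each audit write. The `buildD1AuditRequest` helper and `D1QueryRequest` type are hypothetical names, and the `{ sql, params }` body shape is an assumption based on the D1 query endpoint quoted above; verify it against the Cloudflare API reference before relying on it.

```typescript
// Hypothetical description of one HTTP request per audit write.
interface D1QueryRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) the REST call that replaces a direct D1 binding call.
function buildD1AuditRequest(
  accountId: string,
  databaseId: string,
  apiToken: string, // would be stored as CLOUDFLARE_API_TOKEN on the Vercel side
  sql: string,
  params: string[],
): D1QueryRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/d1/database/${databaseId}/query`,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiToken}`,
      'Content-Type': 'application/json',
    },
    // Assumed body shape: a SQL string plus positional parameters.
    body: JSON.stringify({ sql, params }),
  };
}
```

Every approval event would pay for one such request and its network round-trip, whereas the binding call stays inside the Workers runtime and needs no token.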
Decision
CF Workflows confirmed. ADR-047 and ADR-026 stand unchanged.
All HITL approval gates (OAuth re-auth, ad-spend approval) are implemented as CF Workflow instances inside the gateway’s execution plane. The timeout field in the Workflow trigger, combined with step.waitForEvent() and step.sleep(), implements the auto-deny-after-N-hours requirement natively.
CF Workflows HITL Pattern (canonical implementation)
    import { WorkflowEntrypoint, WorkflowEvent, WorkflowStep } from 'cloudflare:workers';

    export interface HitlApprovalParams {
      tenantId: string;
      provider: string;
      reason: 'oauth_reauth' | 'ad_spend_approval';
      context: Record<string, unknown>;
      timeoutHours: number; // typically 72
    }

    // Payload carried by the 'hitl-decision' event
    type HitlDecision = { action: 'approve' | 'deny'; actor: string };

    export class HitlApprovalWorkflow extends WorkflowEntrypoint<Env, HitlApprovalParams> {
      async run(event: WorkflowEvent<HitlApprovalParams>, step: WorkflowStep) {
        const { tenantId, provider, reason, context, timeoutHours } = event.payload;

        // Step 1: Send notification (Slack/email with approval link)
        const notificationResult = await step.do('send-notification', async () => {
          const approvalToken = crypto.randomUUID();

          // Write pending approval to D1 decision_log
          await this.env.DB.prepare(
            `INSERT INTO decision_log (id, tenant_id, provider, reason, status, context, created_at)
             VALUES (?, ?, ?, ?, 'pending', ?, datetime('now'))`
          ).bind(approvalToken, tenantId, provider, reason, JSON.stringify(context)).run();

          // Send Slack alert with approval link
          await fetch(this.env.SLACK_WEBHOOK_URL, {
            method: 'POST',
            body: JSON.stringify({
              text: `HITL Required: ${reason} for ${tenantId}/${provider}`,
              blocks: [
                {
                  type: 'actions',
                  elements: [
                    {
                      type: 'button',
                      text: { type: 'plain_text', text: 'Approve' },
                      url: `${this.env.GATEWAY_URL}/admin/hitl/${approvalToken}/approve`,
                    },
                    {
                      type: 'button',
                      text: { type: 'plain_text', text: 'Deny' },
                      url: `${this.env.GATEWAY_URL}/admin/hitl/${approvalToken}/deny`,
                      style: 'danger',
                    },
                  ],
                },
              ],
            }),
          });

          return { approvalToken };
        });

        // Step 2: Wait for human decision or timeout.
        // The wait resolves to an event envelope, so the decision is read from its payload.
        // An expired wait is treated as "no decision", whether the runtime surfaces it as a
        // thrown error or an empty result. Note that this catch also maps any other
        // failure of the wait to an auto-deny.
        let decision: HitlDecision | null = null;
        try {
          const received = await step.waitForEvent<HitlDecision>('hitl-decision', {
            type: 'hitl-decision',
            timeout: `${timeoutHours} hours`,
          });
          decision = received?.payload ?? null;
        } catch {
          decision = null;
        }

        // Step 3: Record outcome (whether approved, denied, or timed out)
        await step.do('record-outcome', async () => {
          const status = decision === null ? 'auto_denied_timeout' : decision.action;
          const actor = decision === null ? 'system' : decision.actor;

          await this.env.DB.prepare(
            `UPDATE decision_log SET status = ?, resolved_by = ?, resolved_at = datetime('now') WHERE id = ?`
          )
            .bind(status, actor, notificationResult.approvalToken)
            .run();

          // If approved and oauth_reauth: trigger re-auth flow
          if (status === 'approve' && reason === 'oauth_reauth') {
            await this.env.TOKEN_MANAGER.get(
              this.env.TOKEN_MANAGER.idFromName(`${tenantId}:${provider}:default`)
            ).fetch('https://internal/force-reauth');
          }
        });
      }
    }

Wiring in gateway (src/index.ts):
    // Trigger from admin handler or token health cron
    const instance = await env.HITL_APPROVAL_WORKFLOW.create({
      params: {
        tenantId,
        provider,
        reason: 'oauth_reauth',
        context: { error, lastRefresh },
        timeoutHours: 72,
      },
    });

    // Approval/denial endpoint in admin handler dispatches event to resume workflow.
    // get() is asynchronous, so resolve the instance handle before calling sendEvent().
    const pending = await env.HITL_APPROVAL_WORKFLOW.get(instanceId);
    await pending.sendEvent({
      type: 'hitl-decision',
      payload: { action: 'approve', actor: adminUserId },
    });

wrangler.toml binding:
    [[workflows]]
    name = "hitl-approval"
    binding = "HITL_APPROVAL_WORKFLOW"
    class_name = "HitlApprovalWorkflow"

Re-Eval Trigger
Re-evaluate Vercel Workflows when all three of the following are true:
- Vercel ships a CF Workers Runtime target (runs V8 isolate, not Node.js — eliminates Invariant #1 violation)
- Vercel waitForSignal / waitForEvent documents a built-in timeout parameter (eliminates the auto-deny gap)
- Vercel provides native CF binding access without HTTP round-trips to CF REST API (eliminates the D1 audit write round-trip)
DX alone (Criterion 6) is not sufficient to re-open this decision.
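The trigger can be read as a simple conjunction. As an illustrative sketch only (the `ReEvalSignals` type and its field names are hypothetical, not part of any existing codebase), the rule could be encoded like this:

```typescript
// Hypothetical snapshot of the re-evaluation conditions listed above.
interface ReEvalSignals {
  vercelTargetsWorkersRuntime: boolean; // removes the Invariant #1 violation
  hookTimeoutDocumented: boolean; // removes the auto-deny gap
  nativeCfBindings: boolean; // removes the D1 audit write round-trip
  betterDx?: boolean; // Criterion 6; deliberately ignored by the rule
}

// Re-open the decision only when all three blocking factors are resolved.
function shouldReEvaluate(signals: ReEvalSignals): boolean {
  return (
    signals.vercelTargetsWorkersRuntime &&
    signals.hookTimeoutDocumented &&
    signals.nativeCfBindings
  );
}
```

A DX improvement with any one of the three conditions still unmet leaves the result at `false`.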
Consequences
- ADR-047 HITL design proceeds as specified; no changes to ADR-026 Workflows-for-Ingestion pattern
- Vercel Workflows is formally classified as “deferred — re-eval trigger documented above”
- The item “CF Workflows vs Vercel Workflows evaluated?” in the 5-item Q1 gap table is now closed