Cloudflare Platform Audit — State of the Art for V5 Gateway

Date: 2026-04-24
Target system: Ascend GTM V5 Gateway (Hono + Cloudflare Workers, D1, KV, R2, Vectorize, Workers AI, Durable Objects, CF Queues, CF Cron)
Method: Live fetch of developers.cloudflare.com and blog.cloudflare.com. No training-data facts. Every claim has a source URL. Docs last-updated dates embedded where shown on the page.


1. Workers Secrets Store

(a) Current state

  • Status: Open beta as of 2026-04-16 (the Secrets Store overview page carries an “Available in open beta” banner).
  • Scope: account-level secret, referenced from one or more Workers via a binding; not worker-scoped like wrangler secret put.
  • Integration surfaces today: Workers and AI Gateway (BYOK). More surfaces planned.
  • Worker access pattern is async: you get the plaintext value via await env.BINDING.get() on every request (no sync env.SECRET_NAME usage).
  • wrangler.toml shape (wrangler.jsonc takes the same fields as entries in a secrets_store_secrets array):
    [[secrets_store_secrets]]
    binding = "MY_BINDING"
    store_id = "<STORE_ID>"
    secret_name = "MY_SECRET"
  • Local dev requires wrangler secrets-store secret … without --remote (prod secrets are not readable from wrangler dev --local).

How Secrets Store differs from wrangler secret put

| Attribute | wrangler secret put | Secrets Store |
|---|---|---|
| Scope | Per-Worker | Per-account (one store, many bindings, many Workers) |
| Rotation | Manual per Worker, redeploy required per script | Rotate once in the store; all bound Workers pick up the new value without a redeploy |
| Access in code | env.SECRET_NAME (synchronous) | await env.BINDING.get() (asynchronous) |
| Visibility in dashboard | Secret is attached to the Worker | Central audit + listing across the account |
| Reuse across Workers | Duplicated per Worker | Single source of truth |
| Used by AI Gateway BYOK | No | Yes (dedicated integration) |

(b) Application to V5

V5 today uses per-Worker wrangler secret put for 25+ credentials (HubSpot, Salesforce, Google, Slack, Gong, OpenAI, Anthropic, etc.). Adding the scheduler Worker, a dev Worker, preview Workers, etc. multiplies rotation pain linearly. Secrets Store solves that exactly.

Async .get() matters for hot paths: on a 100k req/day gateway, every secret read adds an await to the request, so reads need to be scoped carefully. Best practice: call .get() once inside the provider adapter and reuse the value for the rest of the request. Cache it in module scope only with a short expiry, because Secrets Store rotation is eventual, not instantaneous, and a long-lived cached copy would keep serving the old value.
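
The caching advice above can be sketched as a small TTL memoizer. This is a minimal sketch, assuming only that the binding exposes an async get() as described in (a); memoizeSecret, ttlMs, and the injectable now clock are hypothetical names, not part of any Cloudflare API.

```typescript
// Structural stand-in for a Secrets Store binding (assumption: only get() is used).
interface SecretBinding {
  get(): Promise<string>;
}

// Returns a getter that keeps the secret in module scope for at most ttlMs milliseconds.
// The TTL bounds how long a rotated secret can be served stale by this isolate.
function memoizeSecret(
  binding: SecretBinding,
  ttlMs: number,
  now: () => number = () => Date.now(),
): () => Promise<string> {
  let cached: { value: string; expiresAt: number } | undefined;
  return async () => {
    const t = now();
    if (cached !== undefined && t < cached.expiresAt) {
      return cached.value; // still fresh: no extra await on the binding
    }
    const value = await binding.get();
    cached = { value, expiresAt: t + ttlMs };
    return value;
  };
}
```

An adapter would create the getter once at module scope and call it per request. Concurrent cold requests may each call get() once; that is harmless here and keeps the sketch short.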

(c) Recommendation: ADOPT in Phase 3, not now

Reasons to wait:

  • Still open beta (not GA) as of 2026-04-16. Production-critical credential plane on a beta is a real risk. No SLA.
  • The async-on-read pattern is a non-trivial refactor across 53+ provider adapters (resolveTokenData() style). The migration has to be boring and mechanical — worth doing in one sweep, not drive-by.

Do now: start an ADR (docs/architecture/decisions/) capturing the migration plan so the refactor lands the day after GA.

(d) Migration cost if adopted today

  • 1 store creation + ~25 secrets import (1 hour).
  • Adapter refactor: replace env.OPENAI_API_KEY with await env.OPENAI_API_KEY.get() in every provider adapter + memoize per request context. Estimate 0.5–1 day at current adapter count.
  • CI/deploy pipeline change: wrangler.jsonc uses secrets_store_secrets blocks; remove wrangler secret put automation scripts.

2. Cloudflare AI Gateway

(a) Current state

  • Status: GA, available on all plans. Overview page last updated 2026-04-20.
  • Unified API (OpenAI-compat) supports: Workers AI, Bedrock, Anthropic, Azure OpenAI, Baseten, Cartesia, Cerebras, Cohere, Deepgram, DeepSeek, ElevenLabs, Fal AI, Google AI Studio, Google Vertex AI, Groq, HuggingFace, Ideogram, Mistral, OpenAI, OpenRouter, Parallel, Perplexity, Replicate, xAI. Source: same overview page.
  • Feature matrix (per overview + features sub-pages):
    • Caching — exact-request cache, per-request control via cf-aig-cache-ttl / cf-aig-skip-cache / cf-aig-cache-key. cf-aig-cache-status: HIT|MISS. Semantic caching is on the roadmap but not shipped. Source: https://developers.cloudflare.com/ai-gateway/features/caching/
    • Rate limiting — per-gateway and per-user.
    • Dynamic routing (Beta) — visual or JSON-configured route graph: Start → Conditional / Percentage / RateLimit / BudgetLimit → Model nodes → End. Supports per-user-plan branching, A/B, gradual rollouts, budget caps with fallback. Source: https://developers.cloudflare.com/ai-gateway/features/dynamic-routing/
    • Guardrails (Beta) — input/output policy enforcement.
    • DLP (Beta) — data-loss prevention on prompts/responses.
    • BYOK / Store Keys (Beta) — upstream provider keys live in the gateway (optionally Secrets-Store-backed), your Worker carries only the gateway key.
    • Custom providers (Beta) — bring-your-own OpenAI-compatible endpoint.
    • Custom costs — override per-model cost values for accurate spend tracking when providers aren’t priced yet by Cloudflare.
    • OpenTelemetry export — traces/metrics to any OTel collector.
    • Logpush — request/response logs to R2/S3/Datadog/etc.
    • WebSockets API (Beta) — realtime + non-realtime.

(b) Application to V5

V5 calls OpenAI, Anthropic, Google Gemini, OpenRouter, DeepSeek, Groq, plus Workers AI directly. That means:

  • No unified observability (each provider’s dashboard is a silo).
  • No cross-provider cost visibility (custom Analytics-Engine plumbing covers part of it but is incomplete).
  • No cache layer → we pay for duplicate extraction prompts on Gong transcripts, ICP scoring, etc.
  • No fallback chain → one provider 429 = user-visible failure.
  • No prompt/response archive for training-data + incident forensics.

Integration pattern for a Worker already doing fetch('https://api.openai.com/v1/chat/completions', ...):

  • Change base URL to https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_slug}/openai.
  • Same request body; the gateway is transparent.
  • For multi-provider: use the Unified API path /v1/{account_id}/{gateway_slug}/compat/chat/completions with model: "openai/gpt-4o", model: "anthropic/claude-sonnet-…" etc. Code becomes one client, many providers.
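
The two URL shapes above can be sketched as request builders. This is a hedged sketch: the helper names (providerBaseUrl, unifiedChatRequest) are hypothetical, the Authorization header is assumed to carry whichever key the deployment uses (provider key, or gateway key under BYOK), and cf-aig-cache-ttl is the per-request cache header listed in (a).

```typescript
const GATEWAY_ORIGIN = "https://gateway.ai.cloudflare.com";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface GatewayRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Provider-native path: the request body stays the same, only the base URL changes.
function providerBaseUrl(accountId: string, gatewaySlug: string, provider: string): string {
  return `${GATEWAY_ORIGIN}/v1/${accountId}/${gatewaySlug}/${provider}`;
}

// Unified (OpenAI-compatible) path: one endpoint, provider selected by the model prefix.
function unifiedChatRequest(
  accountId: string,
  gatewaySlug: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  cacheTtlSeconds?: number,
): GatewayRequest {
  const headers: Record<string, string> = {
    "content-type": "application/json",
    authorization: `Bearer ${apiKey}`,
  };
  if (cacheTtlSeconds !== undefined) {
    headers["cf-aig-cache-ttl"] = String(cacheTtlSeconds); // opt this request into caching
  }
  return {
    url: `${GATEWAY_ORIGIN}/v1/${accountId}/${gatewaySlug}/compat/chat/completions`,
    headers,
    body: JSON.stringify({ model, messages }),
  };
}
```

An adapter would pass the result to fetch(req.url, { method: "POST", headers: req.headers, body: req.body }). Keeping the builder pure makes the per-provider flip easy to unit test behind a feature flag.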

(c) Recommendation: ADOPT NOW (top priority)

This is the single highest-leverage change available. Caching alone likely recovers 10–25% of LLM spend; dynamic routing replaces hand-rolled fallback logic; Logpush replaces homebrew prompt logging; OpenTelemetry integrates cleanly with Workers observability.

(d) Migration cost

  • ~2 hours for the core gateway flip (one feature flag + base URL change across the 5–7 LLM adapters).
  • ~1 day for dynamic-routing migration (move the current model-router logic into a dynamic route, so it’s declarative + versioned instead of code).
  • ~0.5 day for Logpush → R2 (archive) + optional Datadog/Axiom.
  • ~0.5 day for BYOK — requires keys to be uploaded to gateway; store them in Secrets Store (gated on §1 adoption) or inline in the gateway.

No functional risk if we flip one provider at a time. Start with OpenRouter (highest call volume, easiest rollback) → DeepSeek → Groq → Anthropic → OpenAI → Gemini.


3. Workers Pipelines

(a) Current state

  • Status: Open beta, Workers Paid only. Docs last updated 2026-04-21.
  • Architecture: Streams → Pipelines (SQL transform) → Sinks.
    • Streams: durable buffered queues, ingested via HTTP endpoint or Worker binding.
    • Pipelines: SQL (filter / project / transform / enrich) applied as events flow through.
    • Sinks: R2 as Apache Iceberg tables (via R2 Data Catalog) or Parquet/JSON files.
  • Guarantees: durable ingestion + exactly-once delivery to R2.
  • Pricing: outside R2 storage/ops costs, not currently billed.
  • Setup: npx wrangler pipelines setup.
  • Source: overview + https://developers.cloudflare.com/pipelines/pipelines/ + pipelines setup command listed in wrangler reference.

Pipelines vs Queues — decision matrix

| Dimension | Queues | Pipelines |
|---|---|---|
| Primary purpose | Async work dispatch | Analytical ingest to lakehouse |
| Delivery target | Worker consumer (JS code) | R2 (Iceberg/Parquet/JSON) |
| Ordering | Per-queue FIFO (best effort) | Stream order preserved |
| Delivery semantics | At-least-once | Exactly-once to R2 |
| Transformations | In your Worker consumer | SQL at ingest time |
| Backpressure | Queue depth + retries | Buffered streams |
| Best for | "do something when X happens" | "store every X event for analytics" |
| Max payload | 128 KB message | TBD per stream |

(b) Application to V5

Phase 2 ingestion (Gong, Salesforce, HubSpot event streams) has two distinct patterns:

  1. “Act on this event now” — new call recorded → run transcript extraction → upsert signal. This is a Queues job, clearly.
  2. “Archive every event for replay + analytics” — raw webhook payload, every field, forever. Currently written to D1 or KV by the ingest Worker. Pipelines + Iceberg in R2 is strictly better: columnar, queryable by external tools, no D1 row-count bloat.

(c) Recommendation: ADOPT for archival ingest in Phase 2; KEEP Queues for job dispatch

They are complementary, not a replacement. Use Pipelines to write an append-only “raw events” lake in R2 (one Iceberg table per source), Queues to drive business-logic reactions.
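
The split can be sketched as a fan-out in the ingest Worker. This is a minimal sketch under stated assumptions: the binding names RAW_EVENTS and EXTRACT_JOBS are hypothetical, and both bindings are modeled structurally as objects with a send() method (an array of records for the stream, a single message for the queue).

```typescript
// Structural stand-ins for the two bindings; only send() is modeled (assumption).
interface RawEventStream {
  send(records: object[]): Promise<void>;
}
interface JobQueue<T> {
  send(message: T): Promise<void>;
}

interface ExtractJob {
  source: string;
  eventId: string;
}

interface IngestEnv {
  RAW_EVENTS: RawEventStream; // hypothetical Pipelines stream binding
  EXTRACT_JOBS: JobQueue<ExtractJob>; // hypothetical Queues binding
}

async function ingestWebhook(
  env: IngestEnv,
  source: string,
  eventId: string,
  payload: Record<string, unknown>,
  receivedAt: string = new Date().toISOString(),
): Promise<void> {
  // Archive path: the full raw payload goes to the lake, untouched.
  await env.RAW_EVENTS.send([
    { source, event_id: eventId, received_at: receivedAt, payload },
  ]);
  // Action path: the queue carries only a small pointer for the consumer to act on.
  await env.EXTRACT_JOBS.send({ source, eventId });
}
```

Archiving first means a failed queue send can be replayed from the lake. Because Queues delivery is at-least-once, the consumer should still be idempotent on eventId.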

(d) Migration cost

  • Pilot with Gong webhook stream (~0.5 day to set up stream + pipeline + R2 Iceberg sink + SQL validator).
  • Second pilot with Salesforce CDC (~0.5 day, same pattern).
  • Caveat: still beta, so do not deprecate existing D1 event logs until Pipelines goes GA.

4. Workers Workflows

(a) Current state

Workflows vs Queues + DO-alarms for multi-step pipelines

| Concern | Queues + DO alarms | Workflows |
|---|---|---|
| Durable step retry | Hand-rolled | Built-in, per-step |
| State across steps | DO storage writes | Implicit (step return values) |
| Pause for human/webhook | DO alarm + polling or webhook callback that writes DO state | step.waitForEvent() native |
| Long sleep (hours/days) | DO alarm | step.sleepUntil() native |
| Debugging / observability | Logs + manual DO state dump | Built-in step-by-step UI |
| Code complexity | High (state machine in app code) | Low (linear code with await step.do()) |
| Failure blast radius | You own retry correctness | CF owns retry correctness |

(b) Application to V5

Concrete example from the brief: “Gong transcript → keyword pre-filter → LLM extraction → D1 insert → Vectorize upsert → signal evaluation”. Today this is (presumably) a queue consumer that runs all steps in one invocation and retries the whole thing on failure. That has two problems:

  1. An LLM rate-limit blip re-runs the keyword pre-filter (wasted compute, duplicate D1 writes if not perfectly idempotent).
  2. A CPU-time hit on one step kills the whole invocation.

Workflow form:

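    import { WorkflowEntrypoint } from "cloudflare:workers";

    // gong, preFilter, llm, d1, vectorize and signals are V5 helpers assumed to be in scope.
    export class GongIngestWorkflow extends WorkflowEntrypoint {
      async run(event, step) {
        // Each step.do() result is persisted, so a retry resumes after the last completed step.
        const text = await step.do("fetch-transcript", async () => gong.fetch(event.payload.callId));
        const keep = await step.do("keyword-prefilter", async () => preFilter(text));
        if (!keep) return { skipped: true };
        // Only the LLM step gets a raised retry limit; the other steps use the defaults.
        const extract = await step.do("llm-extract", { retries: { limit: 5 } }, async () => llm.extract(text));
        await step.do("d1-insert", async () => d1.insertSignal(extract));
        await step.do("vectorize-upsert", async () => vectorize.upsert(extract.embedding));
        await step.do("signal-eval", async () => signals.evaluate(extract));
      }
    }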

Each step retries independently with its own backoff. A D1 transient failure only re-runs the D1 step, not the LLM call.
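
Why only the failed step re-runs can be illustrated with a toy model. This is an in-memory illustration of the replay idea only, not how Workflows is implemented; ToyStepRunner and runPipeline are hypothetical names, and the real engine persists step results durably and applies its own backoff.

```typescript
// Toy in-memory model of durable steps. NOT the Workflows implementation.
class ToyStepRunner {
  private completed = new Map<string, unknown>();
  public executions: string[] = [];

  async do<T>(name: string, fn: () => Promise<T>): Promise<T> {
    if (this.completed.has(name)) {
      // Replay: a step that already succeeded returns its stored result.
      return this.completed.get(name) as T;
    }
    this.executions.push(name);
    const result = await fn(); // a throw here leaves the step unrecorded
    this.completed.set(name, result);
    return result;
  }
}

// Two-step pipeline: the second step depends on the first step's output.
async function runPipeline(
  step: ToyStepRunner,
  extract: (text: string) => Promise<string>,
): Promise<string> {
  const text = await step.do("fetch-transcript", async () => "transcript text");
  return step.do("llm-extract", () => extract(text));
}
```

Running runPipeline twice against the same runner, with an extract function that fails once, executes fetch-transcript a single time and llm-extract twice, which is the behavior the paragraph above describes.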

(c) Recommendation: ADOPT NOW for the Gong + Salesforce multi-step pipelines; keep Queues for single-step dispatch

Human-in-the-Loop ad-spend approval (in current roadmap) is the textbook step.waitForEvent() use case — replace whatever cron/alarm polling exists today.
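
The approval flow could look roughly like the sketch below. It assumes a step object shaped like the one used in the Workflow above; the HitlStep interface is a simplified structural stand-in, and the event type "ad-spend-decision", the timeout string, and the notify/apply callbacks are hypothetical.

```typescript
// Simplified stand-in for the Workflows step object (assumption: shapes reduced to what is used).
interface ApprovalEvent {
  payload: { approved: boolean; approver: string };
}
interface HitlStep {
  do<T>(name: string, fn: () => Promise<T>): Promise<T>;
  waitForEvent(name: string, opts: { type: string; timeout: string }): Promise<ApprovalEvent>;
}

async function approveAdSpend(
  step: HitlStep,
  campaignId: string,
  amountUsd: number,
  notify: (message: string) => Promise<void>,
  apply: (campaignId: string, amountUsd: number) => Promise<void>,
): Promise<"applied" | "rejected"> {
  await step.do("notify-approver", () =>
    notify(`Approve USD ${amountUsd} for campaign ${campaignId}?`),
  );
  // The run parks here with no polling until a matching event is delivered.
  const decision = await step.waitForEvent("await-approval", {
    type: "ad-spend-decision",
    timeout: "24 hours",
  });
  if (!decision.payload.approved) return "rejected";
  await step.do("apply-budget", () => apply(campaignId, amountUsd));
  return "applied";
}
```

The approval UI (or a Slack action handler) would deliver the event to the running instance. How a timeout surfaces should be confirmed against the Workflows docs before relying on it.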

(d) Migration cost

  • 1 day to port the Gong pipeline. Lock in the pattern.
  • 0.5 day per additional multi-step job (SFDC enrichment, ad-spend approval, client-report generation).
  • Zero runtime risk — Workflows run alongside existing Workers, no platform change needed.

5. Workers Browser Rendering / Browser Run

(a) Current state

  • Renamed to “Browser Run” (formerly Browser Rendering). Still the same product with expanded surface.
  • Two integration tiers:
    • Quick Actions (stateless, no code): HTTP endpoints /content, /screenshot, /pdf, /markdown, /snapshot, /scrape, /json (AI-structured extraction), /links, /crawl (beta).
    • Browser Sessions (stateful): Puppeteer, Playwright, Chrome DevTools Protocol, Stagehand (AI-driven, finds elements by intent), Playwright MCP (for LLM agents).
  • New features relative to earlier versions: Live View (beta), Human-in-the-Loop (beta), Session recording, WebMCP (beta), Custom fonts, Session reuse.
  • Pricing (Workers Paid): 10 browser-hours included/mo, then $0.09/hr. 10 concurrent browsers averaged/mo included, then $2.00/extra. 120 concurrent max on paid plan.
  • Browser timeout 60s default.

Does it replace Playwright for scrape/screenshot/PDF?

  • For scraping / screenshots / PDFs without behavioral complexity: yes, Quick Actions are strictly better (zero code, pay-per-use, edge-local, no infra).
  • For full Playwright scripts (form flows, auth, multi-page agentic work): no replacement needed — Browser Run runs Playwright via CDP. You write the same Playwright code and connect to CF’s browsers instead of local Chromium.
  • AI agent browsing: Stagehand + Playwright MCP is state-of-the-art here (as of docs). Current rule ~/.claude/rules/print-pdf.md still applies if you are generating PDFs with image fidelity — the 2× viewport trick is a Chromium print-renderer fact, not a CF fact.

(b) Application to V5

Use cases that fit:

  • Competitor site snapshots for the GTM researcher subagent (/screenshot, /markdown).
  • Client-report PDF generation (replace any local wkhtmltopdf/Playwright infra).
  • AI-structured extraction from customer websites for ICP scoring (/json endpoint with a schema).
  • Crawl endpoint for competitor full-site ingestion at Phase 3.
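
A Quick Action call is a single authenticated POST. The sketch below builds that request; it assumes the REST path used under the Browser Rendering name (/client/v4/accounts/{account_id}/browser-rendering/{action}) still applies after the rename, and quickActionRequest is a hypothetical helper name. Verify the path against the current docs before use.

```typescript
type QuickAction = "content" | "screenshot" | "pdf" | "markdown" | "links";

interface QuickActionRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the HTTP request for a stateless Quick Action against a target page.
function quickActionRequest(
  accountId: string,
  apiToken: string,
  action: QuickAction,
  targetUrl: string,
): QuickActionRequest {
  return {
    // Assumed base path (pre-rename Browser Rendering REST API).
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/browser-rendering/${action}`,
    method: "POST",
    headers: {
      authorization: `Bearer ${apiToken}`,
      "content-type": "application/json",
    },
    body: JSON.stringify({ url: targetUrl }),
  };
}
```

For the competitor-snapshot use case, the researcher subagent would build a "markdown" or "screenshot" request per URL and pass it to fetch. The /json endpoint takes additional schema fields that are not modeled here.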

(c) Recommendation: ADOPT NOW for Quick Actions (screenshot, PDF, markdown, json); ADOPT Stagehand when AI agents go autonomous in Phase 3

Explicit reject: do not move the existing Playwright-based print-PDF pipeline unless the output matches the 2× viewport + html{zoom:2} + scale:0.5 trick (see ~/.claude/rules/print-pdf.md). Quick Actions /pdf uses the same Chromium print renderer, so the same fidelity rules apply — test before switching.

(d) Migration cost

  • Quick Actions: zero code, just HTTP calls. Hours, not days.
  • Stagehand integration: ~1 day to wrap in an ascend-gateway tool.

6. Workers RPC + Service Bindings (WorkerEntrypoint)

(a) Current state

(b) Application to V5

Current V5 is a single Worker (Hono). As we split (scheduler Worker, long-running pipeline Worker, admin Worker, webhook dispatcher), RPC is the right boundary:

  • No HTTP overhead between Workers.
  • Type-safe contracts (your IDE catches cross-worker breakage).
  • Works as the back-end for the ADR-013 Headless 360 split without re-plumbing.

Explicit best practice as of 2026: export class X extends WorkerEntrypoint {} not export default { fetch } when the Worker is meant to be called from another Worker.

(c) Recommendation: ADOPT NOW for any new multi-Worker surface

Active items that benefit:

  • Scheduler V5 (separate Worker per the spec in memory) → expose as SchedulerEntrypoint RPC, main gateway calls it via binding.
  • Admin Worker (see §12): behind Access, RPCs into the main gateway for read-only data.
  • Any future “heavy compute” worker (bulk embedding, batch PDF rendering) that should not share the gateway’s CPU budget per request.

(d) Migration cost

  • Zero for net-new workers — start them as WorkerEntrypoint.
  • If an existing internal HTTP Worker call gets converted, it’s ~1 hour per call site (change fetch(url) to await env.BINDING.method() + remove JSON boilerplate).
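
The call-site change can be sketched as follows. This is a minimal sketch: SchedulerRpc, the SCHEDULER binding name, and nextRun are hypothetical, and the callee side (a class extending WorkerEntrypoint that implements the same methods) is omitted so the example stays runnable outside the Workers runtime.

```typescript
// Contract shared by caller and callee. In the Scheduler Worker, a class extending
// WorkerEntrypoint would implement these methods (not shown here).
interface SchedulerRpc {
  nextRun(tenantId: string): Promise<{ tenantId: string; runAt: string }>;
}

interface GatewayEnv {
  SCHEDULER: SchedulerRpc; // hypothetical service binding name
}

// Before (internal HTTP): build a URL, fetch it, check status, parse JSON.
// After (RPC): one typed method call on the binding.
async function getNextRun(env: GatewayEnv, tenantId: string): Promise<string> {
  const { runAt } = await env.SCHEDULER.nextRun(tenantId);
  return runAt;
}
```

Because the contract is a plain interface, a rename of nextRun in the Scheduler Worker fails type-checking in the gateway instead of surfacing as a runtime 404.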

7. Durable Objects — SQLite vs KV storage, current patterns

(a) Current state

  • SQLite-backed DOs are GA and recommended for all new namespaces. KV-backed storage is marked “(Legacy)” in the sidebar and “for backwards compatibility” in best-practices docs.
  • SQLite DOs offer:
    • SQL (sql.exec) + KV API on the same storage.
    • Structured tables (multi-column), indexes.
    • Point-in-time recovery up to 30 days.
    • Free tier eligibility.
    • Storage billing enabled Jan 2026 (per overview).
  • WebSocket hibernation — keep stateful WebSocket connections without holding DO in memory. Still best practice.
  • RPC methods on DOs — define methods on the DO class, call via stub: await stub.myMethod(args). Preferred over fetch for internal calls.
  • Alarms — schedule future DO execution at a per-object timestamp. Good for per-tenant scheduled work.
  • Key-value-only DOs can migrate to SQLite in future (migration path announced, not shipped as of 2026-04).

(b) Application to V5

V5 uses DOs for OAuth (token cache + refresh lock). That’s a small-schema, high-read, low-write use case, so it is cheap to run on whichever storage backend it uses today.

However:

  • If the current V5 OAuth DO is KV-backed → migrate to SQLite when the migration path ships, or recreate the DO class for new tenants immediately.
  • Rate-limiting (if we add it per-tenant) is a textbook SQLite-DO use case (time-window counters in a tiny indexed table).
  • Scheduler V5 per-tenant state → SQLite DO with alarms is the correct primitive.
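
The per-tenant rate-limiting idea could be sketched as a fixed-window counter over the DO's SQL storage. This is a sketch under assumptions: SqlLike models only exec() returning a cursor with one(), the table and column names are hypothetical, and the upsert relies on SQLite's ON CONFLICT ... RETURNING support.

```typescript
// Minimal structural stand-in for the DO SQL API (assumption: only exec().one() is used).
interface SqlCursorLike {
  one(): Record<string, unknown>;
}
interface SqlLike {
  exec(query: string, ...bindings: unknown[]): SqlCursorLike;
}

const CREATE_RATE_TABLE = `CREATE TABLE IF NOT EXISTS rate_windows (
  tenant_id TEXT NOT NULL,
  window_start INTEGER NOT NULL,
  hits INTEGER NOT NULL,
  PRIMARY KEY (tenant_id, window_start)
)`;

const BUMP_WINDOW = `INSERT INTO rate_windows (tenant_id, window_start, hits)
VALUES (?, ?, 1)
ON CONFLICT (tenant_id, window_start) DO UPDATE SET hits = hits + 1
RETURNING hits`;

// Run once, for example in the DO constructor.
function initRateLimitSchema(sql: SqlLike): void {
  sql.exec(CREATE_RATE_TABLE);
}

// Counts this request in the current window and reports whether it is within the limit.
function allowRequest(
  sql: SqlLike,
  tenantId: string,
  limit: number,
  windowMs: number,
  nowMs: number,
): boolean {
  const windowStart = Math.floor(nowMs / windowMs) * windowMs;
  const row = sql.exec(BUMP_WINDOW, tenantId, windowStart).one();
  return Number(row["hits"]) <= limit;
}
```

A DO alarm could periodically delete rows whose window_start is older than a few windows, which keeps the table small.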

(c) Recommendation: ADOPT SQLite DO for all new DO classes. Plan migration for OAuth DO.

Explicit: use RPC methods on the DO class, not fetch over the stub. Hibernate WebSockets. Use alarms for scheduled per-object work.

(d) Migration cost

  • New DO classes: zero extra cost, just use SQLite config (new_sqlite_classes in wrangler.toml).
  • Existing KV-backed OAuth DO: blocked on CF-provided migration path. Track with an open tech-debt row; do not do manual migration.

8. Cloudflare Containers on Workers

(a) Current state

  • Status: GA, Workers Paid only. Docs last updated 2026-04-21.
  • Model: a Container class extends Durable Object; each instance runs your OCI image with defaultPort, sleepAfter, max_instances. Routed via a DO namespace.
  • Pattern: getContainer(env.MY_CONTAINER, sessionId).fetch(request) from a Worker.
  • Use cases (per docs):
    • Resource-intensive workloads (multiple CPU cores, lots of memory/disk).
    • Full filesystem / specific runtime / Linux-only libs.
    • Pre-existing images (e.g. headless apps shipped as containers).

(b) Application to V5

V5 is 100% Workers today. Evaluated containers for:

  • LLM extraction: stays on Workers (bounded CPU per request).
  • Large file processing (Gong transcript > 10MB, bulk embedding): Workers 30s CPU limit + 128MB memory can be a ceiling. Containers would help, but Workflows + chunking is the first-line fix.
  • Pandoc / custom report rendering with heavy native deps: possible fit.
  • Third-party binaries we don’t control: possible fit.

(c) Recommendation: NOT NOW. Re-evaluate at Phase 3 when specific CPU/memory ceilings are breached.

Moving to containers means giving up a big part of the Workers edge (cold-start advantage, zero infra, request-level isolation) and adding new ops surface (image building, size discipline, SSH). Today V5 has no CPU/memory pressure that Workflows-chunking can’t solve.

Explicit reject (at current phase): no V5 workload meets the “resource-intensive, needs full filesystem, requires specific runtime” bar.

(d) Migration cost if adopted

  • Per container: Dockerfile + wrangler [[containers]] block + DO class binding (~0.5 day for first one).
  • Ongoing: image size management, patching, scanning. Non-trivial ops overhead.

9. Hyperdrive

(a) Current state

  • Status: GA, Free + Paid. Supports Postgres + MySQL (MySQL is labeled Beta).
  • Features:
    • Global connection pool — a hot pool maintained close to the Worker, removes cold TCP handshake + auth handshake per request.
    • Query caching — idempotent queries cached at the edge.
    • Private database support via Tunnel (beta) — reach DBs inside a VPC without public exposure.
    • Credential rotation — supported.
  • Works with standard drivers: pg, postgres.js, mysql2, drizzle, prisma.

(b) Application to V5

V5 has no Postgres or MySQL today (D1 is the source of truth; KV/R2/Vectorize cover the rest). Hyperdrive is only relevant if we add:

  • A client-owned Postgres (some PE clients will have their own warehouse we need to read).
  • A shared Postgres for analytics we don’t want in D1 (D1 has row/size limits that Pipelines+R2 or external Postgres can beat for wide analytics).

(c) Recommendation: NOT NOW. ADOPT immediately if we ever connect a Worker to an external Postgres/MySQL.

Direct connection pools from Workers do not work well at scale — each invocation is isolated, TCP handshakes are sequential, and credential rotation is painful. Hyperdrive is the only sane pattern.

(d) Migration cost

  • Near-zero: [[hyperdrive]] binding + connection-string swap in the driver. ~30 minutes the day we actually need it.

10. Workers Logs + Logpush

(a) Current state

  • Workers Logs — built-in structured log store with query UI in the dashboard.
    • Source: https://developers.cloudflare.com/workers/observability/logs/workers-logs/
    • Retention: Free 3 days, Paid 7 days.
    • Limits: 5B logs/account/day (overflow auto-sampled at 1% head). 256 KB max single log.
    • Pricing (Paid): 20M log events/month included, then $0.60 per additional million.
    • Head-based sampling: set head_sampling_rate in wrangler.toml observability block (e.g. 0.1 for 10%).
  • Workers Logpush — streams logs out of CF to external destinations (no 7-day retention cap).
    • Source: https://developers.cloudflare.com/logs/get-started/enable-destinations/
    • Supported destinations: R2, Cloudflare Pipelines (to R2/Iceberg/Parquet), generic HTTP, Amazon S3, S3-compatible endpoints, Datadog, Elastic, Google Cloud Storage, BigQuery, Azure, New Relic, SentinelOne, Splunk, Sumo Logic, Amazon Kinesis, IBM QRadar, IBM Cloud Logs. Plus third-party integrations: Axiom, Taegis, Exabeam, Sekoia.
    • Dedicated egress IPs available.
  • Tail Workers — a Worker that receives log events from other Workers in real time (for custom processing before export).

(b) Application to V5

Today V5 relies on dashboard logs + Analytics Engine telemetry. Two concrete problems this solves:

  1. Retention > 7 days: every incident post-mortem that reaches back more than a week is blind. Logpush to R2 (the cheapest path) removes the retention cap and keeps the logs in a Parquet-friendly format.
  2. Structured query: Workers Logs UI supports filtering on JSON fields — if we log {event: "tool_call", tenant: "...", tool: "...", duration_ms: ...} we get a free tool-telemetry view without standing up Grafana/Datadog.

(c) Recommendation: ADOPT NOW

Specifically:

  • Enable Workers Logs on every Worker (it’s a one-line wrangler.toml change); set head_sampling_rate to 1.0 (100%) for now since traffic is below the 20M log events/month included on the Paid plan.
  • Enable Logpush → R2 with Parquet format for long-term forensics. Cost is R2 storage + ops, trivial at current volume.
  • Convert all console.log(...) to console.log(JSON.stringify({...})) — a structured logging sweep across adapters. ~0.5 day.
  • Defer Datadog/Grafana integration until traffic or incident frequency warrants paying for it.
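
The structured-logging sweep can lean on one small helper. This is a minimal sketch: formatLog and LogFields are hypothetical names, and the field names follow the tool_call example given in (b).

```typescript
type LogFields = Record<string, string | number | boolean | null>;

// Serializes one log line as a flat JSON object so the Workers Logs UI can filter on its fields.
function formatLog(event: string, fields: LogFields = {}): string {
  // event is spread last so a stray "event" key in fields cannot overwrite it.
  return JSON.stringify({ ...fields, event });
}

// Usage inside an adapter:
//   console.log(formatLog("tool_call", { tenant: "acme", tool: "hubspot.search", duration_ms: 42 }));
```

Keeping values flat and primitive avoids nested objects that are harder to filter on and helps stay under the 256 KB per-log limit.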

(d) Migration cost

  • ~2 hours for the Workers Logs and Logpush configuration; the structured-logging sweep in (c) adds ~0.5 day. This is the lowest-effort high-value item in the audit.

11. Workers Analytics Engine

(a) Current state

  • Status: GA. Unlimited-cardinality custom analytics with SQL API.
  • Write path: env.MY_AE.writeDataPoint({ blobs, doubles, indexes }) inside a Worker.
  • Query path:
    • SQL API (REST) — SELECT blob1, SUM(double1) FROM ... GROUP BY ....
    • Grafana — native Cloudflare plugin for AE.
    • Worker-based query — read AE via SQL API from a Worker and expose a JSON endpoint.
  • Sampling: AE samples when cardinality/volume is high; aggregation queries are sample-adjusted via _sample_interval.

(b) Application to V5

Already used for tool telemetry (per project CLAUDE.md). 2026 best practice from docs:

  • One AE dataset per logical concern (tool_telemetry, llm_spend, tenant_usage), not one giant dataset.
  • Always set indexes: [tenant_id] on every write so billing aggregation is cheap.
  • Query from a /admin/analytics/* route (behind Access — see §12) in a Worker; don’t point Grafana at it directly from the public internet.
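
The write and query rules above can be sketched together. This is a hedged sketch: the dataset interface is a structural stand-in that models only writeDataPoint, recordToolCall and toolUsageQuery are hypothetical helpers, and the blob/double positions are an illustrative layout rather than V5's actual schema.

```typescript
interface DataPoint {
  blobs?: string[];
  doubles?: number[];
  indexes?: string[];
}
interface AnalyticsDataset {
  writeDataPoint(point: DataPoint): void;
}

// Every write carries the tenant as the index so per-tenant aggregation stays cheap.
function recordToolCall(
  dataset: AnalyticsDataset,
  tenantId: string,
  tool: string,
  durationMs: number,
  ok: boolean,
): void {
  dataset.writeDataPoint({
    indexes: [tenantId],
    blobs: [tool, ok ? "ok" : "error"], // blob1 = tool, blob2 = outcome
    doubles: [durationMs], // double1 = duration in ms
  });
}

// Sample-adjusted aggregation: weight each row by _sample_interval.
// datasetName must be a trusted constant; it is interpolated into the SQL text.
function toolUsageQuery(datasetName: string): string {
  return [
    "SELECT index1 AS tenant_id, blob1 AS tool,",
    "  SUM(_sample_interval) AS calls,",
    "  SUM(_sample_interval * double1) / SUM(_sample_interval) AS avg_ms",
    `FROM ${datasetName}`,
    "WHERE timestamp > NOW() - INTERVAL '1' DAY",
    "GROUP BY tenant_id, tool",
  ].join("\n");
}
```

The /admin/analytics/* route would send the query text to the SQL API and return the rows as JSON.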

(c) Recommendation: AUGMENT in-place

  • Audit existing writeDataPoint calls for missing indexes (tenant).
  • Stand up a Grafana Cloud org + AE plugin for non-engineering dashboards (spend/usage per client). The AE plugin is free; Grafana Cloud free tier is enough for now.
  • Build /admin/analytics/summary.json (Worker endpoint behind Access) as the canonical source, rather than having every client build their own AE query.

(d) Migration cost

  • ~0.5 day for the index audit + Grafana setup.

12. Cloudflare Access / Zero Trust — admin endpoint protection

(a) Current state

(b) Application to V5

V5’s admin path today is an API-key header. That’s:

  • Shared across users (no per-admin audit trail).
  • No session management.
  • Rotation is painful.
  • Not revokable without deploying a new key.

Access + passkey:

  • Per-admin identity.
  • WebAuthn = phishing-resistant, no password.
  • Instant revocation from dashboard.
  • Workers validate the Cf-Access-Jwt-Assertion and derive admin identity.
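
Once the Cf-Access-Jwt-Assertion signature has been verified, the claims still need checking. The sketch below covers only that second half and must not be used on an unverified token; signature verification against the team's public keys (for example with a JWT library) is assumed to have happened already. checkAccessClaims is a hypothetical helper, and the common_name claim for service tokens is an assumption to confirm against the Access docs.

```typescript
interface AccessClaims {
  aud: string[];
  iss: string;
  exp: number; // seconds since epoch
  email?: string; // present for human logins
  common_name?: string; // assumed to identify service tokens
}

type AccessCheck =
  | { ok: true; identity: string }
  | { ok: false; reason: string };

// Validates claims from an ALREADY signature-verified Access JWT.
function checkAccessClaims(
  claims: AccessClaims,
  expectedAud: string,
  teamDomain: string, // e.g. "example.cloudflareaccess.com"
  nowSeconds: number,
): AccessCheck {
  if (claims.iss !== `https://${teamDomain}`) {
    return { ok: false, reason: "issuer mismatch" };
  }
  if (claims.aud.indexOf(expectedAud) === -1) {
    return { ok: false, reason: "audience mismatch" };
  }
  if (claims.exp <= nowSeconds) {
    return { ok: false, reason: "expired" };
  }
  const identity = claims.email !== undefined ? claims.email : claims.common_name;
  if (identity === undefined) {
    return { ok: false, reason: "no identity claim" };
  }
  return { ok: true, identity };
}
```

In a Hono middleware, a failed check would return 403, and the identity from a successful check would be attached to the request context for the per-admin audit trail.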

(c) Recommendation: ADOPT NOW for /admin/* paths on V5

  • Create a Self-hosted Access Application for ascend-gateway-v5.ascendgtm.workers.dev/admin/* (and any custom domain).
  • Policy: Require → Email → mishaal@ascendgtm.net + Require → Authentication method: WebAuthn.
  • Add a service token for programmatic admin (Claude Code / automation) with its own policy row.
  • Keep the API-key check as defense-in-depth for the near term (belt + suspenders).

(d) Migration cost

  • ~1 hour for the Access config, ~2 hours to add JWT validation middleware in the Worker, ~30 min per automation client to switch to service tokens.

13. Email Workers

(a) Current state

(b) Application to V5

Real use cases:

  • Inbound HubSpot/SFDC webhook-via-email — some legacy systems only alert via email. An Email Worker can parse the payload and trigger a Queue/Workflow.
  • Customer BCC to a dedicated address for Gong-like transcript ingestion without external SaaS.
  • Agent “drop a transcript into inbox → pipeline runs” UX — Mishaal-only ingest path.

(c) Recommendation: NOT NOW (no current triggering use case). Keep as a tool in the toolbox for Phase 3.

Explicit reject reason: V5 has no inbound-email dependency today. Building this speculatively violates the Karpathy simplicity-first rule.

(d) Migration cost

  • ~1 hour per Email Worker the day we actually need one.

14. Cloudflare Tunnel / Zero Trust Network

(a) Current state

(b) Application to V5

Historical context: the old VPS (Hetzner, Bridge API 8888) is decommissioned per project CLAUDE.md. V5 is 100% on Cloudflare Workers edge. There is no private origin to front.

Possible future uses:

  • Client-owned Postgres reachable only on their VPC — Tunnel is the sanctioned way for Hyperdrive to reach it.
  • A self-hosted Cal.com or SES relay if Scheduler V5 goes partly off-CF.

(c) Recommendation: NOT NOW. ADOPT when we first need to reach a client’s private network for Hyperdrive or a self-hosted service.

Explicit reject for current architecture: there is no on-prem or VPC-only service to tunnel to. Adding Tunnel to a 100% edge stack is pure overhead.

(d) Migration cost

  • ~0.5 day per private origin when the need arises (install cloudflared in their VPC, define hostname + policy).

Prioritized adoption roadmap

Top 5 — adopt immediately (this sprint)

  1. Workers Logs + Logpush to R2 (§10) — lowest-effort, highest-value. 2 hours. Unblocks >7-day forensics and structured incident review.
  2. AI Gateway — unified API, caching, dynamic routing, Logpush (§2) — single biggest LLM cost + resilience lever. 1–2 days total, per-provider rollout.
  3. Cloudflare Access on /admin/* (§12) — phishing-resistant admin auth. 3 hours.
  4. Workflows for the Gong/SFDC multi-step pipelines (§4) — retries, waitForEvent for HITL ad-spend. 1 day for first pipeline, 0.5 day each after.
  5. SQLite DOs for all new DO classes (§7) — free upgrade for any new stateful surface (scheduler, per-tenant rate limiting, session objects). Zero cost if done net-new.

Next 5: adopt in Phase 2 or Phase 3, or when the trigger hits

  1. Secrets Store (§1) — wait for GA. ADR now, rollout the day it GAs.
  2. Workers Pipelines for analytics/archive ingest (§3) — pilot with Gong webhook stream; beta-acceptable for archive-only path.
  3. Browser Run Quick Actions + Stagehand (§5) — fold in as the GTM researcher subagent + client report PDF renderer.
  4. Workers RPC + multi-worker split (§6) — triggered by ADR-013 Headless 360 / Scheduler V5 split. Use WorkerEntrypoint from day one on the new Workers.
  5. Analytics Engine augmentation + Grafana (§11) — tenant-indexed AE + Grafana Cloud dashboards for client usage/spend visibility. 0.5 day.

Explicit rejects (do NOT adopt at this phase)

  • Containers on Workers (§8) — no workload today meets the bar; retry at Phase 3 if CPU/memory ceilings bite.
  • Hyperdrive (§9) — no Postgres/MySQL in scope. Ready to adopt the day we add one.
  • Email Workers (§13) — no inbound-email dependency. Speculative build violates simplicity-first.
  • Cloudflare Tunnel (§14) — no private origin to front. V5 is 100% edge. Adopt when client VPCs enter scope.

Cross-cutting rules that should land alongside the roadmap

  • Every new Worker: observability.enabled = true, head_sampling_rate = 1.0, Logpush ruleset attached.
  • Every new DO namespace: SQLite backend.
  • Every new inter-Worker call: WorkerEntrypoint RPC, not internal fetch.
  • Every new admin endpoint: behind Access.
  • Every new LLM call: through AI Gateway (so caching, fallback, and observability are free).