ADR-016: V5 evolves into a two-plane platform (Execution + Context) to deliver the “always-on GTM brain”

Status: Accepted (2026-04-24, Phase 2 kickoff; see docs/plans/V5-PHASE-2-CONTEXT-PLANE.md)
Date: 2026-04-22 (proposed), 2026-04-24 (accepted)
Deciders: Mishaal Murawala
Supersedes (partial): invariant #1 “ONE Worker only”. Scoped violation documented below.
Relates to: docs/daily-ai-review/2026-04-22.md (Octave competitive analysis), PRD Gateway V5 §7 (Decision Records), v5 v2.1 §14 Change 2 (dual-store sync nightmare to avoid).

Context

Ascend’s consulting output today (ICP models, motion definitions, messaging frameworks) is produced during a time-boxed engagement, delivered as artifacts (Notion docs, SFDC fields, Slack messages), and then decays as the market moves. Octave HQ is selling a pure-product version of the same category as “your always-on GTM brain”, with a 4.7/5 rating on G2 and customers that are real infrastructure companies (Airbyte, Kong, Grafana Labs).

The strategic move: evolve Ascend from time-boxed consulting to service-as-infrastructure — retain the strategic engagement upfront, then operationalize the artifacts as a persistent, auto-updating context layer hosted on V5 that continuously learns from the client’s Gong/SFDC/HubSpot and pushes updates back to those tools.

This ADR records the product architecture to deliver that. A prior draft proposed compressing everything into the existing V5 Worker with KV as the graph store. A multi-model council review (GPT-5.4, Gemini 3.1 Pro, Nemotron 3) rejected that draft on two structural grounds and three tactical ones. This ADR incorporates the corrections.

Decision

V5 becomes a two-plane platform:

Execution Plane — the existing V5 gateway (ascend-gateway-v5). Unchanged in scope: stateless credential injection, MCP tool surface, ≤10ms overhead, KV-only hot path, fail-fast. Remains invariant-pure.
Context Plane — new Cloudflare Worker ascend-context-worker. Hosts the tenant context graph, ingestion pipelines, signal detection, and action emission. Communicates with the Execution Plane via Cloudflare Service Bindings (zero-latency RPC).

When signals fire actions (update SFDC, send Slack alert, push Outreach sequence), the Context Plane calls the Execution Plane as a regular MCP client — the gateway does not need to know the Context Plane exists.
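
As a rough illustration of what "regular MCP client" means here: MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below only builds that envelope. The tool name `slack_post_message` and its arguments are hypothetical placeholders, not confirmed gateway tools, and transport and auth are left out.

```typescript
// Minimal sketch: the JSON-RPC 2.0 envelope an MCP client sends for a tool call.
// The tool name and arguments used at the bottom are hypothetical placeholders.
interface McpToolCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): McpToolCall {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example: the payload a fired signal might send to the gateway.
const exampleCall = buildToolCall(1, "slack_post_message", {
  channel: "#gtm-signals",
  text: "Signal fired: competitor mention at acme.com",
});
```

Because the payload is an ordinary MCP request, the gateway would treat it like any other client call, which is what keeps the Execution Plane unaware of the Context Plane.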

Non-goals (explicit, no later debate)

No tenant-facing UI with user management. Kahuna is tenant 1. If PFP demand hits, revisit.
No billing system. Invoice off AWS + CF usage reports in arrears for the first N tenants.
No custom ML model training. LLM-as-graph-updater is sufficient.
No voice/audio ingestion. Gong transcripts (text) are the signal-dense source.
No “complete market intelligence” positioning in Phase 1. Phase 1 is “always-on internal GTM context.” External signals (competitor feeds, public content) are a Phase 4+ expansion.
No external signal detection or competitive web scraping in Phase 1. Ground truth first.

Invariants — what holds, what flexes

Invariant	Status	Rationale
#1 ONE Worker only	FLEXED (scoped)	Adding `ascend-context-worker` as a 4th Worker. Workload class is materially different (LLM inference, long-running ingestion, Vectorize queries). Deployment coupling would make an ingestion bug take down the interactive proxy. Service Bindings preserve zero-latency RPC. Documented boundary.
#2 No D1 in the interactive request path	HOLDS.	Context Plane’s D1 reads happen behind explicit MCP tool calls (`context_query`, `context_explain`). Not the request proxy hot path. 30ms D1 reads are acceptable at that tier.
#3 Fail-fast, no retries	HOLDS on read paths.	Ingestion write paths get dead-letter + CF Queue retry semantics — correct tradeoff for durable learning, not a contract violation for callers.
#4 No external vendors in the token path	HOLDS.	Context Plane uses existing Execution Plane token reads. Does not introduce new OAuth flows.
#6 KV is the sole config store	HOLDS.	Context Plane config (signal rules, ICP schema, ingestion sources) lives in KV under new key families (`signals:{tenant}:`, `icp:{tenant}:`). KV is NOT used for graph data (see below).
#10 Gateway overhead ≤10ms	HOLDS for Execution Plane.	Context Plane has its own latency budget (sub-100ms for `context_query`, sub-30s for ingestion ticks).

Explicit rejection of the prior draft’s KV-for-graph-data idea. V5 v2.1 §14 Change 2 explicitly purged dual-store sync. Putting graph entities and facts in KV — then syncing writes from D1 — re-introduces exactly that nightmare. D1 is the single source of truth for the graph. No KV materialized views in Phase 1.

Architecture

Data plane

D1 schema (source of truth, Context Worker only):

-- Entities with deterministic IDs to prevent fragmentation from LLM extraction
CREATE TABLE entities (
  tenant_id TEXT NOT NULL,
  entity_id TEXT NOT NULL,         -- deterministic: domain for accounts, lowercase email for people
  entity_type TEXT NOT NULL,       -- 'account' | 'person' | 'deal' | 'motion' | 'icp_dimension' | 'signal'
  canonical_name TEXT,             -- resolved display name
  attributes JSON,                 -- structured attributes (industry, stage, etc.)
  first_seen_at TEXT DEFAULT (datetime('now')),
  updated_at TEXT DEFAULT (datetime('now')),
  PRIMARY KEY (tenant_id, entity_id)
);

-- Facts carry source attribution + confidence + authority for explainability trail
CREATE TABLE facts (
  fact_id TEXT PRIMARY KEY,
  tenant_id TEXT NOT NULL,
  subject TEXT NOT NULL,           -- entity_id
  predicate TEXT NOT NULL,         -- 'matches_icp' | 'engaged_on' | 'mentioned_competitor' | ...
  object TEXT NOT NULL,            -- entity_id | literal value
  source_type TEXT NOT NULL,       -- 'gong' | 'salesforce' | 'gmail' | 'hubspot' | 'manual'
  source_id TEXT NOT NULL,         -- foreign ID in source system
  source_quote TEXT,               -- verbatim quote for explainability (nullable for structured facts)
  source_authority INTEGER NOT NULL, -- 1=inferred/derived .. 5=ground-truth (Gong verbatim quote); see hierarchy below
  confidence REAL NOT NULL,        -- 0-1 extraction confidence
  extraction_model TEXT,           -- e.g. 'glm-4.7'
  extraction_prompt_id TEXT,       -- for reproducibility if we re-extract with better models
  learned_at TEXT DEFAULT (datetime('now')),
  superseded_by TEXT               -- fact_id of newer fact that replaces this
);
CREATE INDEX idx_facts_tenant_subject ON facts(tenant_id, subject);
CREATE INDEX idx_facts_tenant_predicate ON facts(tenant_id, predicate);
CREATE INDEX idx_facts_learned_at ON facts(tenant_id, learned_at);

-- Signal evaluations (cold path, for debugging + dashboard)
CREATE TABLE signal_evaluations (
  id TEXT PRIMARY KEY,
  tenant_id TEXT NOT NULL,
  signal_id TEXT NOT NULL,
  evaluated_at TEXT DEFAULT (datetime('now')),
  triggered BOOLEAN NOT NULL,
  matched_facts JSON,
  action_taken TEXT,
  action_payload JSON
);

Deterministic entity ID rules (hardcoded, not LLM-resolved):
- Account: lowercased root domain (acme.com from any URL/email/text mention)
- Person: lowercased email address (primary) OR {first_lower}.{last_lower}@{account_domain} (inferred)
- Deal: {source_type}:{source_id} (e.g. salesforce:006ABC...)
- Motion/ICP dimension: manually-curated, Ascend-assigned slug
- LLM extraction outputs are mapped to canonical entities via these rules BEFORE insertion. No free-form entity creation.
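
The ID rules above could be sketched as pure functions along these lines. The function names are illustrative, not part of the design. One loud assumption: "root domain" is approximated as the last two DNS labels, which is wrong for suffixes such as `.co.uk`, so a real implementation would likely need a public-suffix list.

```typescript
// Sketch of the deterministic entity ID rules. All function names are hypothetical.
// ASSUMPTION: root domain = last two DNS labels (breaks on suffixes like .co.uk).
function accountId(mention: string): string | null {
  const lowered = mention.trim().toLowerCase();
  // Strip a URL scheme such as "https://", then any path, query, or fragment.
  const afterScheme = lowered.replace(/^[a-z][a-z0-9+.-]*:\/\//, "");
  const authority = afterScheme.split(/[\/?#]/)[0] ?? "";
  // Drop everything up to "@" (email local part or URL userinfo), then any port.
  const hostAndPort = authority.slice(authority.lastIndexOf("@") + 1);
  const host = hostAndPort.split(":")[0] ?? "";
  const labels = host.split(".").filter((label) => label.length > 0);
  if (labels.length < 2) return null; // no domain found: reject, never invent an ID
  return labels.slice(-2).join(".");
}

// Person (primary rule): the lowercased email address.
function personId(email: string): string {
  return email.trim().toLowerCase();
}

// Person (inferred rule): {first_lower}.{last_lower}@{account_domain}.
function inferredPersonId(first: string, last: string, accountDomain: string): string {
  return `${first.trim()}.${last.trim()}@${accountDomain.trim()}`.toLowerCase();
}

// Deal: {source_type}:{source_id}. The source ID is kept verbatim because
// source-system IDs can be case-sensitive.
function dealId(sourceType: string, sourceId: string): string {
  return `${sourceType.toLowerCase()}:${sourceId}`;
}
```

Returning `null` for an unresolvable mention, rather than a best-guess ID, matches the "no free-form entity creation" rule: the caller has to drop or queue the extraction instead of inserting a new entity.
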
Source authority hierarchy (for conflict resolution):
- 5 = Gong verbatim quote (ground truth — what buyers actually said)
- 4 = Email body content (buyer-authored)
- 3 = SFDC opportunity stage / close date (system-tracked)
- 2 = SFDC custom fields (rep-entered, potentially stale)
- 1 = Inferred / derived
When facts conflict, higher authority wins; lower-authority fact is marked superseded_by but retained for audit.
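
A minimal sketch of that conflict rule over in-memory records follows. Field names mirror the `facts` table. The source does not say what happens when two facts share the same authority, so the tie-break used here (the more recently learned fact wins) is an assumption.

```typescript
// Sketch of authority-based conflict resolution. Field names mirror the facts table.
interface Fact {
  fact_id: string;
  source_authority: number; // 1 (inferred) .. 5 (Gong verbatim quote)
  learned_at: string; // timestamp in one consistent, lexicographically sortable format
  superseded_by: string | null;
}

// Returns the winning fact and a copy of the losing fact marked as superseded.
// The loser is retained (not deleted) so the audit trail survives.
function resolveConflict(a: Fact, b: Fact): { winner: Fact; loser: Fact } {
  const aWins =
    a.source_authority !== b.source_authority
      ? a.source_authority > b.source_authority
      : a.learned_at >= b.learned_at; // ASSUMED tie-break: newer fact wins
  const winner = aWins ? a : b;
  const loser = aWins ? b : a;
  return { winner, loser: { ...loser, superseded_by: winner.fact_id } };
}
```

Comparing `learned_at` as strings only works if every timestamp uses the same format, as the `datetime('now')` default in the schema does.
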
Vectorize binding in Context Worker: one index per tenant (ctx_{tenant}_facts), 384-dim embeddings of fact source_quote + predicate + object text. Cheap to query, supports semantic retrieval in context_query.

Ingestion plane

Discovery crons (CF Cron Triggers on Context Worker): every 15 minutes, iterate active tenants, enqueue deltas to CF Queue ctx-ingest-{source}.
CF Queue consumers: batched processing, built-in retry, dead-letter queue. One queue per source type (ctx-ingest-gong, ctx-ingest-salesforce, etc.).
Keyword pre-filter (deterministic, cheap) before LLM extraction:
- Per-tenant keyword list: ingest_keywords:{tenant} in KV — ICP terms, competitor names, motion triggers, industry vocabulary.
- A Gong transcript or email body that matches zero keywords is stored as a low-confidence “engagement happened” fact WITHOUT full LLM extraction. Saves the bulk of token costs.
- Transcripts matching keywords go to LLM extraction (via llm_invoke economy tier, zai-org/glm-4.7 default) for entity + predicate + object + source_quote + confidence extraction.
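
The routing decision made by the pre-filter could look roughly like this. The source does not specify matching semantics, so case-insensitive substring matching is an assumption, and the names `routeDocument` and `IngestRoute` are illustrative.

```typescript
// Sketch of the deterministic keyword pre-filter that gates LLM extraction.
type IngestRoute = "llm_extraction" | "engagement_only";

// ASSUMPTION: a "match" is a case-insensitive substring hit. Word-boundary
// matching may be preferable in practice to avoid hits inside longer words.
function routeDocument(text: string, keywords: string[]): IngestRoute {
  const haystack = text.toLowerCase();
  const hit = keywords
    .map((keyword) => keyword.trim().toLowerCase())
    .filter((keyword) => keyword.length > 0) // ignore empty entries in the list
    .some((keyword) => haystack.includes(keyword));
  return hit ? "llm_extraction" : "engagement_only";
}
```

A document routed to `engagement_only` would be stored as the low-confidence "engagement happened" fact described above, with no LLM call.
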
CF Workflows for historical backfill (Phase 3 initialization wizard): pull 24 months of SFDC opps + 24 months of Gong recordings, durable multi-step execution with automatic resume on failure.

Signal plane

Signal rules in KV: signals:{tenant}:{signal_id} → JSON {query_template, threshold, priority, action_tier}.
Evaluation cron (every 15 min): ordered by priority, evaluates D1 query, emits action if threshold crossed.
Action tiers (policy-gated, configurable per signal):
- notify — Slack message to Ascend + tenant ops channel, no SFDC write
- draft — create SFDC task or draft email assigned to rep, requires human approval
- auto — direct write to SFDC / Outreach sequence without human review
- Default for new signals is notify. Promotion to draft or auto is an explicit per-signal configuration change. This prevents false-positive cascades from becoming automation liability.
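
One evaluation tick, including the default-to-`notify` rule, might be sketched as below. The D1 query is replaced by a plain map of observed values per signal. Two assumptions are made because the source leaves them open: a lower `priority` number is evaluated first, and "threshold crossed" means observed value >= threshold.

```typescript
// Sketch of one signal evaluation tick. Rule fields follow the KV JSON shape above.
type ActionTier = "notify" | "draft" | "auto";

interface SignalRule {
  signal_id: string;
  query_template: string;
  threshold: number;
  priority: number; // ASSUMPTION: lower number = evaluated first
  action_tier?: ActionTier; // absent for new signals
}

interface Emission {
  signal_id: string;
  tier: ActionTier;
}

// New signals default to notify; draft or auto must be set explicitly on the rule.
function effectiveTier(rule: SignalRule): ActionTier {
  return rule.action_tier ?? "notify";
}

// `observed` stands in for the result of running each rule's D1 query.
// ASSUMPTION: "threshold crossed" means observed value >= threshold.
function evaluateSignals(rules: SignalRule[], observed: Record<string, number>): Emission[] {
  return [...rules] // copy so the caller's array is not reordered
    .sort((x, y) => x.priority - y.priority)
    .filter((rule) => (observed[rule.signal_id] ?? 0) >= rule.threshold)
    .map((rule) => ({ signal_id: rule.signal_id, tier: effectiveTier(rule) }));
}
```

Keeping the default inside `effectiveTier` means a rule written without an `action_tier` can never reach an SFDC or Outreach write path by accident.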

Tool surface (new MCP tools in Context Worker, exposed via Service Binding through Execution Plane’s `/mcp`)

context_query — semantic + structural query. Inputs: {tenant, query_text?, entity_filter?, predicate_filter?, time_range?}. Returns top-K facts with source quotes, confidences, learned_at.
context_explain — trace a specific claim back to its source facts. “Why does the model think Acme matches ICP-v3?” → returns the supporting facts + verbatim quotes + confidences.
context_ingest_trigger — manually trigger ingestion for a tenant/source combo (admin use).
signal_list / signal_fire — list configured signals, manually test-fire a signal.
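
To make the `context_query` input shape concrete, here is a sketch of its structural half only, run over an in-memory array. The semantic half (`query_text` against Vectorize) is omitted. The source names the inputs but not their types, so the filter types, the `time_range` shape, and ranking by confidence are all assumptions.

```typescript
// Sketch of the structural part of context_query over in-memory facts.
interface StoredFact {
  tenant_id: string;
  subject: string;
  predicate: string;
  object: string;
  source_quote: string | null;
  confidence: number;
  learned_at: string;
}

interface ContextQuery {
  tenant: string;
  query_text?: string; // semantic part, ignored in this sketch
  entity_filter?: string; // ASSUMED: an entity_id matched against subject
  predicate_filter?: string; // ASSUMED: exact predicate match
  time_range?: { from: string; to: string }; // ASSUMED shape, inclusive bounds
}

function structuralQuery(facts: StoredFact[], q: ContextQuery, topK: number): StoredFact[] {
  return facts
    .filter((f) => f.tenant_id === q.tenant) // tenant isolation always applies
    .filter((f) => q.entity_filter === undefined || f.subject === q.entity_filter)
    .filter((f) => q.predicate_filter === undefined || f.predicate === q.predicate_filter)
    .filter(
      (f) =>
        q.time_range === undefined ||
        (f.learned_at >= q.time_range.from && f.learned_at <= q.time_range.to),
    )
    .sort((x, y) => y.confidence - x.confidence) // ASSUMED ranking: highest confidence first
    .slice(0, topK);
}
```

The tenant filter is applied unconditionally and first, so an omitted optional filter can widen results within a tenant but never across tenants.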

Phased delivery

Phase 1 — Foundation (2-3 weeks, ~60-80 hrs engineering)

New Cloudflare Worker ascend-context-worker scaffolded.
D1 schema + deterministic entity ID rules + source authority.
Vectorize binding + embedding pipeline.
context_query + context_explain MCP tools.
Gong ingestion queue + consumer + keyword pre-filter.
SFDC ingestion queue + consumer (structured — easier).
Ship as internal Ascend capability: used by Mishaal during Kahuna consulting engagements.
Success criterion: query context_explain("Acme matches ICP-v3") returns 3+ source quotes from Gong + SFDC with traceable learned_at timestamps.

Phase 2 — Signals + Actions (3-4 weeks)

Signal rules KV schema + evaluation cron.
Action emission via Service Binding → V5 Gateway tools.
Three production signals for Kahuna (ICP match + engagement spike + competitor mention), each at notify tier initially.
Success criterion: Kahuna receives Slack notification within 20 min of a signal condition being met in their SFDC/Gong data, with an explainability trail attached.
Gradual promotion of vetted signals from notify → draft → auto over the phase.

Phase 3 — Productization (4-6 weeks)

CF Workflows for historical backfill (pull 24-month SFDC + 24-month Gong for new tenant init).
Minimal tenant initialization wizard (admin UI or CLI script).
Tenant data export endpoint (compliance/portability).
Success criterion: a net-new PFP portfolio co can be onboarded without Mishaal writing custom ingestion code per client.

Phase 4+ — Expansion (as demand arrives)

HubSpot ingestion, Gmail ingestion.
External signal sources (competitor content feeds, public buying signals).
Tenant-facing dashboard (when tenant #3 needs it).
Per-tenant KV namespace isolation (if SOC 2 becomes relevant).

Business model this unlocks

Strategy engagement (upfront): $50-150K, 4-6 weeks. Deliverables: seeded context graph + signal rules + motion definitions + policy-tier decisions.
Infrastructure retainer (recurring): $3-10K/mo. Covers LLM inference (bulk on economy tier), CF platform pass-through with margin, SRE, monthly tuning.
Margin profile: strategy ≈ 50-70% (consulting). Retainer ≈ 80%+ once scale hits. The retainer is the compounding revenue; strategy is the wedge.

At 5 PFP portfolio-company tenants × $5K/mo × 12 months = $300K ARR as a recurring baseline, before strategy fees.

Risks + explicit mitigations

Risk	Mitigation
Entity fragmentation from LLM extraction	Deterministic entity IDs + source authority hierarchy (designed into D1 schema day one).
Extraction cost balloons on high-Gong-volume clients	Keyword pre-filter gates the expensive LLM path; default to `glm-4.7` (10× cheaper than Sonnet).
False-positive signal cascade	Default action tier is `notify`; promotion to `draft`/`auto` is an explicit per-signal config change.
Context Worker deployment takes down ingestion	Separate Worker contains the blast radius: a bad Context Worker deploy can stall ingestion, but it cannot take down the interactive proxy.
Octave eats our lunch while we build	Our differentiator is strategy + explainability trail + hybrid service model, none of which is their product. We aren’t racing them on feature parity.

Open questions (parking, not blocking)

Vectorize cost ceiling — need to model at 10 tenants × 100K facts × 384-dim embeddings. Likely fine; verify.
Service Binding latency in practice — CF docs say zero-latency RPC; confirm via a Phase 1 benchmark.
Historical backfill time-window negotiation with tenants — 24 months is arbitrary; may need to be configurable per tenant.
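
For the Vectorize sizing question above, the raw volumes at the stated scale work out as follows. This is only the arithmetic. No pricing is applied, and the 4-bytes-per-dimension figure assumes float32 storage, which the source does not state.

```typescript
// Back-of-envelope sizing for the Vectorize open question. No pricing applied.
const tenants = 10;
const factsPerTenant = 100000;
const dimensions = 384;

const totalVectors = tenants * factsPerTenant; // 1,000,000 vectors
const storedDimensions = totalVectors * dimensions; // 384,000,000 stored dimensions

// ASSUMPTION: float32 storage, i.e. 4 bytes per dimension.
const rawBytes = storedDimensions * 4; // 1,536,000,000 bytes
const rawGigabytes = rawBytes / 1e9; // about 1.5 GB of raw embedding data
```

If billing scales with stored and queried dimensions, the 384,000,000 figure is the number to multiply against current Vectorize rates when verifying the ceiling.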

References

Council review (GPT-5.4 + Gemini 3.1 Pro + Nemotron 3), 2026-04-22 (in session log).
Octave HQ positioning analysis: docs/daily-ai-review/2026-04-22.md §Octave.
V5 v2.1 PRD §14 Change 2: dual-store sync explicitly purged.
CF Queues: https://developers.cloudflare.com/queues/
CF Workflows: https://developers.cloudflare.com/workflows/
CF Vectorize: https://developers.cloudflare.com/vectorize/