ADR-016: V5 evolves into a two-plane platform (Execution + Context) to deliver the “always-on GTM brain”
Status: Accepted (2026-04-24 — Phase 2 kickoff; see docs/plans/V5-PHASE-2-CONTEXT-PLANE.md)
Date: 2026-04-22 (proposed), 2026-04-24 (accepted)
Deciders: Mishaal Murawala
Supersedes (partial): invariant #1 “ONE Worker only” — scoped violation documented below.
Relates to: docs/daily-ai-review/2026-04-22.md (Octave competitive analysis), PRD Gateway V5 §7 (Decision Records), v5 v2.1 §14 Change 2 (dual-store sync nightmare to avoid).
Context
Ascend’s consulting output today — ICP models, motion definitions, messaging frameworks — is produced during a time-boxed engagement, delivered as artifacts (Notion docs, SFDC fields, Slack messages), and then decays as the market moves. Octave HQ is selling a pure-product version of the same category as “your always-on GTM brain” with G2 4.7/5 and real-infra customers (Airbyte, Kong, Grafana Labs).
The strategic move: evolve Ascend from time-boxed consulting to service-as-infrastructure — retain the strategic engagement upfront, then operationalize the artifacts as a persistent, auto-updating context layer hosted on V5 that continuously learns from the client’s Gong/SFDC/HubSpot and pushes updates back to those tools.
This ADR records the product architecture to deliver that. A prior draft proposed compressing everything into the existing V5 Worker with KV as the graph store. A multi-model council review (GPT-5.4, Gemini 3.1 Pro, Nemotron 3) rejected that draft on two structural grounds and three tactical ones. This ADR incorporates the corrections.
Decision
V5 becomes a two-plane platform:
- Execution Plane: the existing V5 gateway (`ascend-gateway-v5`). Unchanged in scope: stateless credential injection, MCP tool surface, ≤10ms overhead, KV-only hot path, fail-fast. Remains invariant-pure.
- Context Plane: new Cloudflare Worker `ascend-context-worker`. Hosts the tenant context graph, ingestion pipelines, signal detection, and action emission. Communicates with the Execution Plane via Cloudflare Service Bindings (zero-latency RPC).
When signals fire actions (update SFDC, send Slack alert, push Outreach sequence), the Context Plane calls the Execution Plane as a regular MCP client — the gateway does not need to know the Context Plane exists.
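The "regular MCP client" relationship can be sketched as follows. This is a minimal illustration, assuming the gateway accepts an MCP JSON-RPC 2.0 `tools/call` request on `/mcp` and that the Service Binding is visible to the Worker as an object with a `fetch` method. The `GatewayBinding` type, the host `gateway.internal`, and the tool name `slack_post_message` are hypothetical placeholders, and any MCP session handshake is omitted.

```typescript
// Minimal structural type for a Service Binding (hypothetical name; only `fetch` is assumed).
interface GatewayBinding {
  fetch(
    url: string,
    init: { method: string; headers: Record<string, string>; body: string },
  ): Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build an MCP JSON-RPC 2.0 `tools/call` payload.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Call a gateway tool through the binding, the same way any other MCP client would.
async function callGatewayTool(
  gateway: GatewayBinding,
  name: string,
  args: Record<string, unknown>,
): Promise<unknown> {
  // The hostname is a placeholder; the binding decides where the request goes.
  const res = await gateway.fetch("https://gateway.internal/mcp", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(buildToolCall(1, name, args)),
  });
  // No retry here: errors surface to the caller.
  if (!res.ok) throw new Error(`gateway returned HTTP ${res.status}`);
  return res.json();
}
```

Because the Context Plane only builds a standard tool call, the gateway needs no Context-Plane-specific code path.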
Non-goals (explicit, no later debate)
- No tenant-facing UI with user management. Kahuna is tenant 1. If PFP demand hits, revisit.
- No billing system. Invoice off AWS + CF usage reports in arrears for the first N tenants.
- No custom ML model training. LLM-as-graph-updater is sufficient.
- No voice/audio ingestion. Gong transcripts (text) are the signal-dense source.
- No “complete market intelligence” positioning in Phase 1. Phase 1 is “always-on internal GTM context.” External signals (competitor feeds, public content) are a Phase 4+ expansion.
- No external signal detection or competitive web scraping in Phase 1. Ground truth first.
Invariants — what holds, what flexes
| Invariant | Status | Rationale |
|---|---|---|
| #1 ONE Worker only | FLEXED (scoped) | Adding ascend-context-worker as a 4th Worker. Workload class is materially different (LLM inference, long-running ingestion, Vectorize queries). Deployment coupling would make an ingestion bug take down the interactive proxy. Service Bindings preserve zero-latency RPC. Documented boundary. |
| #2 No D1 in the interactive request path | HOLDS. | Context Plane’s D1 reads happen behind explicit MCP tool calls (context_query, context_explain). Not the request proxy hot path. 30ms D1 reads are acceptable at that tier. |
| #3 Fail-fast, no retries | HOLDS on read paths. | Ingestion write paths get dead-letter + CF Queue retry semantics — correct tradeoff for durable learning, not a contract violation for callers. |
| #4 No external vendors in the token path | HOLDS. | Context Plane uses existing Execution Plane token reads. Does not introduce new OAuth flows. |
| #6 KV is the sole config store | HOLDS. | Context Plane config (signal rules, ICP schema, ingestion sources) lives in KV under new key families (signals:{tenant}:*, icp:{tenant}:*). KV is NOT used for graph data (see below). |
| #10 Gateway overhead ≤10ms | HOLDS for Execution Plane. | Context Plane has its own latency budget (sub-100ms for context_query, sub-30s for ingestion ticks). |
Explicit rejection of the prior draft’s KV-for-graph-data idea. V5 v2.1 §14 Change 2 explicitly purged dual-store sync. Putting graph entities and facts in KV — then syncing writes from D1 — re-introduces exactly that nightmare. D1 is the single source of truth for the graph. No KV materialized views in Phase 1.
Architecture
Data plane
- D1 schema (source of truth, Context Worker only):

  ```sql
  -- Entities with deterministic IDs to prevent fragmentation from LLM extraction
  CREATE TABLE entities (
    tenant_id TEXT NOT NULL,
    entity_id TEXT NOT NULL,            -- deterministic: domain for accounts, lowercase email for people
    entity_type TEXT NOT NULL,          -- 'account' | 'person' | 'deal' | 'motion' | 'icp_dimension' | 'signal'
    canonical_name TEXT,                -- resolved display name
    attributes JSON,                    -- structured attributes (industry, stage, etc.)
    first_seen_at TEXT DEFAULT (datetime('now')),
    updated_at TEXT DEFAULT (datetime('now')),
    PRIMARY KEY (tenant_id, entity_id)
  );

  -- Facts carry source attribution + confidence + authority for explainability trail
  CREATE TABLE facts (
    fact_id TEXT PRIMARY KEY,
    tenant_id TEXT NOT NULL,
    subject TEXT NOT NULL,              -- entity_id
    predicate TEXT NOT NULL,            -- 'matches_icp' | 'engaged_on' | 'mentioned_competitor' | ...
    object TEXT NOT NULL,               -- entity_id | literal value
    source_type TEXT NOT NULL,          -- 'gong' | 'salesforce' | 'gmail' | 'hubspot' | 'manual'
    source_id TEXT NOT NULL,            -- foreign ID in source system
    source_quote TEXT,                  -- verbatim quote for explainability (nullable for structured facts)
    source_authority INTEGER NOT NULL,  -- 1=inferred/derived .. 5=ground-truth (Gong verbatim quote)
    confidence REAL NOT NULL,           -- 0-1 extraction confidence
    extraction_model TEXT,              -- e.g. 'glm-4.7'
    extraction_prompt_id TEXT,          -- for reproducibility if we re-extract with better models
    learned_at TEXT DEFAULT (datetime('now')),
    superseded_by TEXT                  -- fact_id of newer fact that replaces this
  );
  CREATE INDEX idx_facts_tenant_subject ON facts(tenant_id, subject);
  CREATE INDEX idx_facts_tenant_predicate ON facts(tenant_id, predicate);
  CREATE INDEX idx_facts_learned_at ON facts(tenant_id, learned_at);

  -- Signal evaluations (cold path, for debugging + dashboard)
  CREATE TABLE signal_evaluations (
    id TEXT PRIMARY KEY,
    tenant_id TEXT NOT NULL,
    signal_id TEXT NOT NULL,
    evaluated_at TEXT DEFAULT (datetime('now')),
    triggered BOOLEAN NOT NULL,
    matched_facts JSON,
    action_taken TEXT,
    action_payload JSON
  );
  ```

- Deterministic entity ID rules (hardcoded, not LLM-resolved):
  - Account: lowercased root domain (`acme.com` from any URL/email/text mention)
  - Person: lowercased email address (primary) OR `{first_lower}.{last_lower}@{account_domain}` (inferred)
  - Deal: `{source_type}:{source_id}` (e.g. `salesforce:006ABC...`)
  - Motion/ICP dimension: manually-curated, Ascend-assigned slug
  - LLM extraction outputs are mapped to canonical entities via these rules BEFORE insertion. No free-form entity creation.
- Source authority hierarchy (for conflict resolution):
  - 5 = Gong verbatim quote (ground truth: what buyers actually said)
  - 4 = Email body content (buyer-authored)
  - 3 = SFDC opportunity stage / close date (system-tracked)
  - 2 = SFDC custom fields (rep-entered, potentially stale)
  - 1 = Inferred / derived

  When facts conflict, higher authority wins; the lower-authority fact has its `superseded_by` set to the winning fact's ID but is retained for audit.
- Vectorize binding in Context Worker: one index per tenant (`ctx_{tenant}_facts`), 384-dim embeddings of fact source_quote + predicate + object text. Cheap to query, supports semantic retrieval in `context_query`.
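The deterministic ID rules and the authority-based conflict rule above might look roughly like this in the Context Worker. It is a sketch under stated assumptions: function names are illustrative, the root domain is naively taken as the last two labels (multi-part suffixes such as `co.uk` would need a public-suffix list), and the tie-break on equal authority (incoming fact wins) is an assumption the ADR does not specify.

```typescript
// Account: lowercased root domain from a URL, email address, or bare domain.
// Naive assumption: root domain = last two labels ("app.acme.com" -> "acme.com").
function accountId(mention: string): string {
  const host = mention
    .trim()
    .toLowerCase()
    .replace(/^[a-z]+:\/\//, "") // strip scheme
    .replace(/^[^@\/]*@/, "")    // strip email local part or URL userinfo
    .split(/[\/:?#]/)[0] ?? "";  // drop path, port, query, fragment
  return host.split(".").slice(-2).join(".");
}

// Person (primary): lowercased email address.
function personId(email: string): string {
  return email.trim().toLowerCase();
}

// Person (inferred): {first_lower}.{last_lower}@{account_domain}.
function inferredPersonId(first: string, last: string, accountDomain: string): string {
  return `${first.toLowerCase()}.${last.toLowerCase()}@${accountDomain}`;
}

// Deal: {source_type}:{source_id}.
function dealId(sourceType: string, sourceId: string): string {
  return `${sourceType}:${sourceId}`;
}

// Only the fields needed for conflict resolution.
interface FactRef {
  fact_id: string;
  source_authority: number; // 1..5
  superseded_by: string | null;
}

// Higher authority wins; the loser is retained and pointed at the winner.
// Assumption: on equal authority the incoming (newer) fact wins.
function resolveConflict(existing: FactRef, incoming: FactRef): { winner: FactRef; loser: FactRef } {
  const incomingWins = incoming.source_authority >= existing.source_authority;
  const winner = incomingWins ? incoming : existing;
  const loser = incomingWins ? existing : incoming;
  return { winner, loser: { ...loser, superseded_by: winner.fact_id } };
}
```

Keeping these as pure functions means the mapping from LLM output to canonical IDs can be unit-tested without D1 or an LLM in the loop.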
Ingestion plane
- Discovery crons (CF Cron Triggers on Context Worker): every 15 minutes, iterate active tenants, enqueue deltas to CF Queue `ctx-ingest-{source}`.
- CF Queue consumers: batched processing, built-in retry, dead-letter queue. One queue per source type (`ctx-ingest-gong`, `ctx-ingest-salesforce`, etc.).
- Keyword pre-filter (deterministic, cheap) before LLM extraction:
  - Per-tenant keyword list: `ingest_keywords:{tenant}` in KV (ICP terms, competitor names, motion triggers, industry vocabulary).
  - A Gong transcript or email body that matches zero keywords is stored as a low-confidence “engagement happened” fact WITHOUT full LLM extraction. Saves the bulk of token costs.
  - Transcripts matching keywords go to LLM extraction (via `llm_invoke` economy tier, `zai-org/glm-4.7` default) for entity + predicate + object + source_quote + confidence extraction.
- CF Workflows for historical backfill (Phase 3 initialization wizard): pull 24 months of SFDC opps + 24 months of Gong recordings, durable multi-step execution with automatic resume on failure.
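The keyword pre-filter described above is simple enough to sketch as a pure routing function. This is illustrative only: the `Route` type, the function name, and the `0.2` default confidence for "engagement happened" facts are assumptions, and matching here is plain case-insensitive substring search, which a real keyword list might replace with word-boundary matching.

```typescript
type Route =
  | { kind: "llm_extraction"; matched: string[] }
  | { kind: "engagement_only"; confidence: number };

// Decide whether a transcript or email body goes to the LLM extraction path.
// `keywords` would be loaded from the tenant's `ingest_keywords:{tenant}` KV entry.
function routeDocument(text: string, keywords: string[], lowConfidence = 0.2): Route {
  const haystack = text.toLowerCase();
  const matched = keywords.filter((k) => {
    const needle = k.trim().toLowerCase();
    return needle !== "" && haystack.indexOf(needle) !== -1;
  });
  return matched.length > 0
    ? { kind: "llm_extraction", matched }
    : { kind: "engagement_only", confidence: lowConfidence };
}
```

Returning the matched keywords alongside the routing decision leaves room to log why a document was sent to the expensive path.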
Signal plane
- Signal rules in KV: `signals:{tenant}:{signal_id}` → JSON `{query_template, threshold, priority, action_tier}`.
- Evaluation cron (every 15 min): ordered by priority, evaluates D1 query, emits action if threshold crossed.
- Action tiers (policy-gated, configurable per signal):
  - `notify`: Slack message to Ascend + tenant ops channel, no SFDC write
  - `draft`: create SFDC task or draft email assigned to rep, requires human approval
  - `auto`: direct write to SFDC / Outreach sequence without human review
  - Default for new signals is `notify`. Promotion to `draft` or `auto` is an explicit per-signal configuration change. This prevents false-positive cascades from becoming automation liability.
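A rough sketch of how a rule record and its tier default could be handled follows. The field names come from the JSON shape above; everything else is an assumption made for illustration: `action_tier` being optional on new rules, "threshold crossed" meaning `>=`, and a lower `priority` number meaning "evaluate first".

```typescript
type ActionTier = "notify" | "draft" | "auto";

interface SignalRule {
  query_template: string;
  threshold: number;
  priority: number;
  action_tier?: ActionTier; // assumed to be absent on newly created signals
}

// KV key for a rule, following the `signals:{tenant}:{signal_id}` family.
function signalKey(tenant: string, signalId: string): string {
  return `signals:${tenant}:${signalId}`;
}

// New signals default to `notify`; a stronger tier must be set explicitly on the rule.
function effectiveTier(rule: SignalRule): ActionTier {
  return rule.action_tier ?? "notify";
}

// Given the count returned by the rule's D1 query, decide whether to emit an action.
function evaluate(rule: SignalRule, matchCount: number): { triggered: boolean; tier: ActionTier | null } {
  const triggered = matchCount >= rule.threshold; // `>=` is an assumption
  return { triggered, tier: triggered ? effectiveTier(rule) : null };
}

// Order rules for the evaluation cron without mutating the input array.
function byPriority(rules: SignalRule[]): SignalRule[] {
  return rules.slice().sort((a, b) => a.priority - b.priority);
}
```

Since the default lives in one function, a rule can never reach `draft` or `auto` unless its stored configuration says so.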
Tool surface (new MCP tools in Context Worker, exposed via Service Binding through Execution Plane’s /mcp)
- `context_query`: semantic + structural query. Inputs: `{tenant, query_text?, entity_filter?, predicate_filter?, time_range?}`. Returns top-K facts with source quotes, confidences, learned_at.
- `context_explain`: trace a specific claim back to its source facts. “Why does the model think Acme matches ICP-v3?” → returns the supporting facts + verbatim quotes + confidences.
- `context_ingest_trigger`: manually trigger ingestion for a tenant/source combo (admin use).
- `signal_list` / `signal_fire`: list configured signals, manually test-fire a signal.
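The structural half of `context_query` could translate its inputs into a parameterized query over the `facts` table along these lines. This is a sketch, not the tool's actual implementation: the `time_range` shape (`from`/`to` strings), the default `topK`, and the choice to exclude superseded facts are all assumptions, and the semantic half (Vectorize lookup on `query_text`) is left out.

```typescript
interface ContextQueryInput {
  tenant: string;
  query_text?: string;
  entity_filter?: string;
  predicate_filter?: string;
  time_range?: { from: string; to: string }; // shape is an assumption
}

// Build parameterized SQL for the structural filters; values never get spliced into the SQL text.
function buildFactsQuery(
  input: ContextQueryInput,
  topK = 10,
): { sql: string; params: (string | number)[] } {
  // Assumption: superseded facts are hidden from normal queries.
  const where = ["tenant_id = ?", "superseded_by IS NULL"];
  const params: (string | number)[] = [input.tenant];
  if (input.entity_filter) {
    where.push("subject = ?");
    params.push(input.entity_filter);
  }
  if (input.predicate_filter) {
    where.push("predicate = ?");
    params.push(input.predicate_filter);
  }
  if (input.time_range) {
    where.push("learned_at BETWEEN ? AND ?");
    params.push(input.time_range.from, input.time_range.to);
  }
  params.push(topK);
  const sql =
    "SELECT fact_id, subject, predicate, object, source_quote, confidence, learned_at " +
    `FROM facts WHERE ${where.join(" AND ")} ORDER BY learned_at DESC LIMIT ?`;
  return { sql, params };
}
```

With a D1 binding, the result would typically be executed as `db.prepare(sql).bind(...params).all()`. The leading `tenant_id` and `subject` filters line up with the `idx_facts_tenant_subject` index in the schema.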
Phased delivery
Phase 1 — Foundation (2-3 weeks, ~60-80 hrs engineering)
- New Cloudflare Worker `ascend-context-worker` scaffolded.
- D1 schema + deterministic entity ID rules + source authority.
- Vectorize binding + embedding pipeline.
- `context_query` + `context_explain` MCP tools.
- Gong ingestion queue + consumer + keyword pre-filter.
- SFDC ingestion queue + consumer (structured, so easier).
- Ship as internal Ascend capability: used by Mishaal during Kahuna consulting engagements.
- Success criterion: query `context_explain("Acme matches ICP-v3")` returns 3+ source quotes from Gong + SFDC with traceable learned_at timestamps.
Phase 2 — Signals + Actions (3-4 weeks)
- Signal rules KV schema + evaluation cron.
- Action emission via Service Binding → V5 Gateway tools.
- Three production signals for Kahuna (ICP match + engagement spike + competitor mention), each at `notify` tier initially.
- Success criterion: Kahuna receives Slack notification within 20 min of a signal condition being met in their SFDC/Gong data, with an explainability trail attached.
- Gradual promotion of vetted signals from `notify` → `draft` → `auto` over the phase.
Phase 3 — Productization (4-6 weeks)
- CF Workflows for historical backfill (pull 24-month SFDC + 24-month Gong for new tenant init).
- Minimal tenant initialization wizard (admin UI or CLI script).
- Tenant data export endpoint (compliance/portability).
- Success criterion: a net-new PFP portfolio co can be onboarded without Mishaal writing custom ingestion code per client.
Phase 4+ — Expansion (as demand arrives)
- HubSpot ingestion, Gmail ingestion.
- External signal sources (competitor content feeds, public buying signals).
- Tenant-facing dashboard (when tenant #3 needs it).
- Per-tenant KV namespace isolation (if SOC 2 becomes relevant).
Business model this unlocks
- Strategy engagement (upfront): $50-150K, 4-6 weeks. Deliverables: seeded context graph + signal rules + motion definitions + policy-tier decisions.
- Infrastructure retainer (recurring): $3-10K/mo. Covers LLM inference (bulk on economy tier), CF platform pass-through with margin, SRE, monthly tuning.
- Margin profile: strategy ≈ 50-70% (consulting). Retainer ≈ 80%+ once scale hits. The retainer is the compounding revenue; strategy is the wedge.
At 5 PFP portfolio co tenants × $5K/mo × 12 months = $300K ARR baseline, before strategy fees.
Risks + explicit mitigations
| Risk | Mitigation |
|---|---|
| Entity fragmentation from LLM extraction | Deterministic entity IDs + source authority hierarchy (designed into D1 schema day one). |
| Extraction cost balloons on high-Gong-volume clients | Keyword pre-filter gates the expensive LLM path; default to glm-4.7 (10× cheaper than Sonnet). |
| False-positive signal cascade | Default action tier is notify; promotion to draft/auto is an explicit per-signal config change. |
| A bad Context Worker deployment takes down ingestion | Blast radius stays within the Context Plane: the separate Worker means ingestion bugs don’t affect the interactive proxy. |
| Octave eats our lunch while we build | Our differentiator is strategy + explainability trail + hybrid service model, none of which is their product. We aren’t racing them on feature parity. |
Open questions (parking, not blocking)
- Vectorize cost ceiling — need to model at 10 tenants × 100K facts × 384-dim embeddings. Likely fine; verify.
- Service Binding latency in practice — CF docs say zero-latency RPC; confirm via a Phase 1 benchmark.
- Historical backfill time-window negotiation with tenants — 24 months is arbitrary; may need to be configurable per tenant.
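As a starting point for the Vectorize cost question, the raw sizing follows directly from the figures in the first open question. The sketch assumes that cost tracks stored vector dimensions and that each dimension is a 4-byte float with no index overhead; both assumptions should be checked against Cloudflare's current Vectorize pricing and limits. The function name is illustrative.

```typescript
// Stored vector dimensions = tenants x facts per tenant x embedding dimensions.
function storedDimensions(tenants: number, factsPerTenant: number, dims: number): number {
  return tenants * factsPerTenant * dims;
}

// 10 tenants x 100K facts x 384 dims = 384,000,000 stored dimensions.
const totalDimensions = storedDimensions(10, 100000, 384);

// Raw float32 footprint (assumption: 4 bytes per dimension): 1,536,000,000 bytes, about 1.5 GB.
const rawBytes = totalDimensions * 4;
```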
References
- Council review (GPT-5.4 + Gemini 3.1 Pro + Nemotron 3), 2026-04-22 (in session log).
- Octave HQ positioning analysis: `docs/daily-ai-review/2026-04-22.md` §Octave.
- V5 v2.1 PRD §14 Change 2: dual-store sync explicitly purged.
- CF Queues: https://developers.cloudflare.com/queues/
- CF Workflows: https://developers.cloudflare.com/workflows/
- CF Vectorize: https://developers.cloudflare.com/vectorize/