Operator OS Q1 = Five Parallel Tracks
ADR-040 — Operator OS Q1 = Five Parallel Tracks
- Status: Accepted
- Date: 2026-05-07
- Decider: Mishaal Murawala (delegated engineering sequencing to Claude Code as engineering lead)
- Supersedes: none
- Related: docs/architecture/ASCEND_OPERATOR_OS_VISION.md, docs/plans/OPERATOR-OS-Q1-FOUNDATION.md
- Pre-listed follow-up ADRs: ADR-041 (hot-path budget 10ms→30ms), ADR-042 (capability-index replaces static MCP registration), ADR-043 (memory tiers expand beyond KV-only), ADR-044 (composites deferred)
Context
The Ascend Operator OS Vision (ADOPTED 2026-05-07, see vision doc Appendix B) commits us to a multi-tenant agent platform with a capability index, four memory tiers, an inference router, a per-tenant eval system, and a 19-agent topology. The vision retires several V5 invariants (10ms hot-path budget, 35-tool ceiling, KV-only memory, static MCP registration, composite-tools roadmap), each of which needs its own follow-up ADR.
Mishaal’s directive 2026-05-07: “Sequence is up to you. You are the engineering lead. You figure out the best way to do it. Ideally, parallel process as many things as you can that are non-overlapping, but ultimately how you implement is up to you.”
Q1 needs to ship the foundation in 12 weeks while:
- not breaking the existing V5 substrate (production traffic continues),
- not violating the existing 15 invariants until each retiring ADR ships,
- maximizing parallelism across non-overlapping subsystems,
- giving Mishaal at least one demoable per-tenant agent within 6 weeks.
Decision
Five parallel tracks, with explicit dependency edges. Each track has an independent owner, an independent acceptance test, and an independent merge cadence.
Track A (weeks 1–6): per-tenant agent DO + 1 SDR Agent end-to-end
│
├── depends on none (uses today's static MCP tools)
└── unblocks fund agents in Track C

Track B (weeks 1–8): capability index (Vectorize + retrieval helper)
│
├── depends on none (pure additive layer)
└── unblocks Track A migration to dynamic retrieval (week 7+)

Track C (weeks 4–12): per-fund agent DO + cross-portco D1 view + Op Partner dashboard
│
├── depends on Track A (DO base class) by week 4
└── unblocks fund-level land motion (Q2)

Track D (weeks 6–12): semantic + procedural memory tiers
│
├── depends on Track A (agent runtime to call into) by week 6
└── unblocks compounding loops 3+5 (procedural / pattern bank)

Track E (weeks 8–12): per-tenant evals + A/B routing + Cerebras triage tier
│
├── depends on Track A live traffic (need real runs to grade) by week 8
└── unblocks outcome-pricing commit (Q2)

Tracks A and B are fully parallel from week 1. Tracks C/D/E stagger in as their dependency edges land.
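The dependency edges above can be captured as data so that a planning script can check which tracks are open in a given week. The sketch below is illustrative only: the `Track` type, the `TRACKS` table, and the `openTracks` and `dependencyViolations` helpers are hypothetical names, not part of the codebase. It assumes a dependency is satisfied as long as the upstream track starts no later than the dependent track.

```typescript
// Hypothetical encoding of the Q1 track plan; names are illustrative only.
type TrackId = "A" | "B" | "C" | "D" | "E";

interface Track {
  id: TrackId;
  startWeek: number;     // first week the track is open
  endWeek: number;       // last week of the track
  dependsOn: TrackId[];  // upstream tracks that must have started
}

const TRACKS: Track[] = [
  { id: "A", startWeek: 1, endWeek: 6, dependsOn: [] },
  { id: "B", startWeek: 1, endWeek: 8, dependsOn: [] },
  { id: "C", startWeek: 4, endWeek: 12, dependsOn: ["A"] },
  { id: "D", startWeek: 6, endWeek: 12, dependsOn: ["A"] },
  { id: "E", startWeek: 8, endWeek: 12, dependsOn: ["A"] },
];

// Tracks that are active in the given week (inclusive bounds).
function openTracks(week: number): TrackId[] {
  return TRACKS.filter((t) => week >= t.startWeek && week <= t.endWeek).map((t) => t.id);
}

// Returns "dependent -> dependency" strings for every edge where the
// upstream track is missing or starts after the dependent track.
function dependencyViolations(): string[] {
  const byId = new Map<TrackId, Track>();
  for (const t of TRACKS) byId.set(t.id, t);
  const violations: string[] = [];
  for (const t of TRACKS) {
    for (const d of t.dependsOn) {
      const dep = byId.get(d);
      if (!dep || dep.startWeek > t.startWeek) violations.push(`${t.id} -> ${d}`);
    }
  }
  return violations;
}
```

For example, `openTracks(1)` yields only A and B, matching the statement that those two tracks are fully parallel from week 1.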
Track scopes (acceptance tests)
Track A — Per-tenant agent DO + 1 SDR Agent (weeks 1–6)
Goal: ship one production agent end-to-end. SDR Agent is the wedge: it has the clearest outcome metric (qualified meeting) and Mishaal can dogfood it against Ascend’s own outbound.
Scope:
- New SQLite-backed DO class AgentRuntime keyed by (tenant_id, agent_type, agent_instance_id).
- wrangler.toml migration adds AgentRuntime to new_sqlite_classes.
- src/do/agent-runtime.ts: turn loop, working memory (KV TTL’d), episodic memory write to D1 memory_episodes (new table, migration 0008).
- src/agents/sdr/: system prompt, ICP scorer call, draft email + send (gated), reply triage, meeting-booked detection.
- Langfuse trace export via existing AI Gateway.
- Eval harness shadow mode: every SDR Agent run produces (input, output) tuple in D1 agent_runs (new table, migration 0008).
- Surface: POST /v1/agents/:tenant/sdr/run admin route (CF-Access gated).
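One way to turn the (tenant_id, agent_type, agent_instance_id) key into a stable Durable Object name is sketched below. The ADR does not specify the encoding, so the `agentRuntimeName` and `parseAgentRuntimeName` helpers, the `:` separator, and the character whitelist are all assumptions made for illustration. In a Worker, a string like this would typically be passed to the Durable Object namespace's `idFromName`.

```typescript
// Hypothetical key encoding for the AgentRuntime DO; not taken from the codebase.
interface AgentRuntimeKey {
  tenant_id: string;
  agent_type: string;
  agent_instance_id: string;
}

// Assumed whitelist: lowercase letters, digits, underscore, hyphen.
// Excluding ":" keeps the joined name unambiguous to parse.
const SAFE_PART = /^[a-z0-9_-]+$/;

function agentRuntimeName(key: AgentRuntimeKey): string {
  const parts = [key.tenant_id, key.agent_type, key.agent_instance_id];
  for (const part of parts) {
    if (!SAFE_PART.test(part)) {
      throw new Error(`invalid key part: ${JSON.stringify(part)}`);
    }
  }
  return parts.join(":");
}

function parseAgentRuntimeName(name: string): AgentRuntimeKey {
  const parts = name.split(":");
  if (parts.length !== 3) {
    throw new Error(`expected 3 key parts, got ${parts.length}`);
  }
  const [tenant_id, agent_type, agent_instance_id] = parts as [string, string, string];
  return { tenant_id, agent_type, agent_instance_id };
}
```

Validating each part before joining means two different keys can never collapse into the same name, which matters because the name is the tenant boundary for the object's storage.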
Acceptance:
- One SDR Agent instance for tenant ascend (dogfood) live in production.
- 50+ runs in agent_runs table within 7 days of go-live.
- A/B vs. generic-baseline reply rate logged to D1; report after 4 weeks.
- Hot-path budget for non-agent gateway requests stays ≤10ms p99 (no regression to existing invariant 10).
Track B — Capability index (weeks 1–8, parallel to A)
Goal: every tool retrievable by embedding similarity, with cost/latency/success priors. No agent code uses it yet — Track A still calls today’s static tools — but the index is queryable and ready.
Scope:
- New Vectorize namespace capability_index (separate from existing tenant_* namespaces).
- scripts/embed-tool-catalog.ts: reads src/config/providers.ts + src/handlers/mcp.ts, embeds (description + scope + provider) per tool, writes vectors + metadata to Vectorize.
- KV mirror capability_index:{tool_name} with the priors block (schema in vision doc §3.2).
- src/lib/capability-retrieval.ts: retrieveCapabilities(intent: string, opts: {top_k, max_cost_usd, max_latency_ms_p99}): ToolCandidate[].
- Priors writer: nightly CF Cron job reads last 24h of agent_runs + gateway audit logs, updates per-tool success_rate / latency / cost.
- Admin endpoint POST /admin/capabilities/reindex (CF-Access gated).
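The retrieval step described in the scope can be sketched in memory: filter candidates by the cost and latency priors, then rank the rest by cosine similarity. This is a stand-in for the Vectorize query and KV priors lookup, not the planned implementation. The `rankCapabilities` helper, the `IndexedTool` shape, and the fields on `Priors` and `ToolCandidate` are assumptions; the real priors schema lives in the vision doc §3.2. The sketch also takes a precomputed query vector, whereas `retrieveCapabilities` takes an intent string that would be embedded first.

```typescript
// Assumed shapes; the real priors schema is defined in the vision doc.
interface Priors {
  success_rate: number;
  latency_ms_p99: number;
  cost_usd: number;
}

interface IndexedTool {
  tool_name: string;
  vector: number[];
  priors: Priors;
}

interface ToolCandidate {
  tool_name: string;
  similarity: number;
  priors: Priors;
}

interface RetrievalOpts {
  top_k: number;
  max_cost_usd: number;
  max_latency_ms_p99: number;
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Hypothetical in-memory equivalent of the retrieval helper:
// drop tools over budget, then return the top_k most similar.
function rankCapabilities(query: number[], index: IndexedTool[], opts: RetrievalOpts): ToolCandidate[] {
  return index
    .filter((t) => t.priors.cost_usd <= opts.max_cost_usd && t.priors.latency_ms_p99 <= opts.max_latency_ms_p99)
    .map((t) => ({ tool_name: t.tool_name, similarity: cosine(query, t.vector), priors: t.priors }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, opts.top_k);
}
```

Filtering on priors before ranking keeps an over-budget tool from occupying a top-k slot that a cheaper, slightly less similar tool could have used.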
Acceptance:
- All current 33 tools indexed.
- retrieveCapabilities("score this account against ICP") returns icp_scorer-class tools in top-3 with non-trivial similarity score.
- Priors update visibly within 24h of an agent run.
- Track A migration to dynamic retrieval is a follow-up after week 7 (not Q1 scope to flip the switch).
Track C — Per-fund agent DO + cross-portco view + Op Partner dashboard (weeks 4–12)
Goal: prove fund-level land motion is technically real. Operating Partner can ask “how is Portco X tracking against the peer set” and get a useful answer.
Scope:
- New DO class FundRuntime keyed by (fund_id, agent_type), using the same SQLite-backed pattern as AgentRuntime.
- D1 view (cold path) cross_portco_metrics_v: aggregates anonymized last-90d metrics from each portco’s agent_runs table by tenant_id × metric type.
- Tenant-isolation contract: portco tenants never see other portcos. Fund tenant sees aggregated metrics only.
- src/agents/fund/operating-partner-brief/: agent that produces a 1-page Op Partner brief on demand for a given fund × portco.
- Surface: minimal Cloudflare Pages dashboard (fund-dashboard/), single-page, list-of-portcos + drill-into-brief. CF-Access gated.
Acceptance:
- Fund tenant pointfield-demo configured with 2 mock portcos.
- Op Partner brief renders for each.
- Cross-portco D1 view query returns in <1s for ≤10 portcos.
- Tenant-isolation test: portco tenant kahuna cannot read fund pointfield-demo’s view (403 verified by integration test).
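The tenant-isolation contract and the shape of the cross-portco aggregation can be illustrated with a small in-memory sketch. Everything here is hypothetical: the `Caller` union, `crossPortcoViewStatus`, and `aggregateByTenantAndMetric` are illustrative names, and the real view is a D1 view rather than application code. The sketch assumes only a fund caller whose fund_id matches may read the view, and that the aggregate exposes counts and means rather than raw rows.

```typescript
// Hypothetical caller identities; the real auth model is not specified here.
type Caller =
  | { kind: "portco"; tenant_id: string }
  | { kind: "fund"; fund_id: string };

// Portco tenants always get 403; a fund only sees its own view.
function crossPortcoViewStatus(caller: Caller, fund_id: string): 200 | 403 {
  return caller.kind === "fund" && caller.fund_id === fund_id ? 200 : 403;
}

interface MetricRow {
  tenant_id: string;
  metric_type: string;
  value: number;
}

interface AggregateRow {
  tenant_id: string;
  metric_type: string;
  count: number;
  mean: number;
}

// Groups rows by tenant_id x metric_type and reports count and mean only.
function aggregateByTenantAndMetric(rows: MetricRow[]): AggregateRow[] {
  const acc = new Map<string, { tenant_id: string; metric_type: string; sum: number; count: number }>();
  for (const r of rows) {
    const key = `${r.tenant_id}|${r.metric_type}`;
    const cur = acc.get(key) ?? { tenant_id: r.tenant_id, metric_type: r.metric_type, sum: 0, count: 0 };
    cur.sum += r.value;
    cur.count += 1;
    acc.set(key, cur);
  }
  return Array.from(acc.values()).map((a) => ({
    tenant_id: a.tenant_id,
    metric_type: a.metric_type,
    count: a.count,
    mean: a.sum / a.count,
  }));
}
```

With these definitions, a portco caller such as kahuna requesting the pointfield-demo view resolves to 403, which is the behavior the integration test is meant to verify.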
Track D — Semantic + procedural memory tiers (weeks 6–12)
Goal: every agent has access to per-tenant facts (semantic) and learned-workflow priors (procedural). Compounding loops 3+5 begin accruing.
Scope:
- Per-tenant Vectorize namespace pattern tenant_{tenant_id} (already exists; Track D operationalizes it for agents).
- src/lib/memory-semantic.ts: recall(tenant_id, query, opts): Fact[] and learn(tenant_id, fact).
- DO SQLite procedural store inside AgentRuntime: table procedural_workflows with bandit weights.
- src/lib/memory-procedural.ts: selectWorkflow(tenant_id, agent_type, task_type) (Thompson-sampling over stored workflows) and recordOutcome(workflow_id, outcome_score).
- Migration to wire SDR Agent (Track A) to use both: semantic recall before drafting; procedural workflow selection before sending.
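Thompson sampling over stored workflows can be sketched as follows. This is a minimal illustration under stated assumptions, not the planned `selectWorkflow` / `recordOutcome` implementation: it works on an in-memory list instead of the procedural_workflows table, it treats each outcome as binary by thresholding outcome_score at an assumed 0.5, and it models each workflow with a Beta(successes + 1, failures + 1) posterior. The names `WorkflowStats`, `sampleBetaInt`, `sampleWorkflow`, and `applyOutcome` are hypothetical.

```typescript
// Hypothetical per-workflow bandit state.
interface WorkflowStats {
  workflow_id: string;
  successes: number;
  failures: number;
}

type Rng = () => number; // returns a uniform value in [0, 1)

// Samples Beta(alpha, beta) for positive integer parameters using the fact that
// the alpha-th smallest of (alpha + beta - 1) uniform draws is Beta(alpha, beta).
// Cost grows with the number of recorded outcomes; fine for a sketch.
function sampleBetaInt(alpha: number, beta: number, rng: Rng): number {
  const n = alpha + beta - 1;
  const draws: number[] = [];
  for (let i = 0; i < n; i++) draws.push(rng());
  draws.sort((x, y) => x - y);
  return draws[alpha - 1] ?? 0;
}

// Draws one posterior sample per workflow and picks the highest.
function sampleWorkflow(workflows: WorkflowStats[], rng: Rng = Math.random): WorkflowStats | undefined {
  let best: WorkflowStats | undefined;
  let bestDraw = -1;
  for (const w of workflows) {
    const draw = sampleBetaInt(w.successes + 1, w.failures + 1, rng);
    if (draw > bestDraw) {
      best = w;
      bestDraw = draw;
    }
  }
  return best;
}

// Assumed binary reward: scores at or above the threshold count as a success.
function applyOutcome(w: WorkflowStats, outcome_score: number, threshold = 0.5): WorkflowStats {
  return outcome_score >= threshold
    ? { ...w, successes: w.successes + 1 }
    : { ...w, failures: w.failures + 1 };
}
```

Because selection is a random draw from each posterior, workflows with few recorded outcomes still get explored occasionally, while workflows with a strong record are chosen most of the time.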
Acceptance:
- SDR Agent for ascend writes ≥10 semantic facts in week 1 of going live.
- Procedural workflow store has ≥3 distinct workflows recorded after 4 weeks.
- Bandit selection demonstrably weights toward higher-outcome workflows over time (chart in weekly summary).
Track E — Per-tenant evals + A/B routing + Cerebras triage (weeks 8–12)
Goal: the eval moat starts compounding, and the inference router can run sub-200ms triage when reasoning isn’t needed.
Scope:
- D1 table eval_datasets per-tenant (migration 0009): graded (input, expected_output, actual_output, score, grader_id, graded_at).
- Grading pipeline: human + tri-judge (carryover from V5 Quality Harness Phase 4) writes scores back into eval_datasets.
- A/B router at orchestration: per (tenant, agent_type, task_type) bandit choosing between (model, prompt_variant) tuples.
- Cerebras provider added to api_config:{provider}.
- inference-router.ts wraps existing llm_invoke with a triage tier that hits Cerebras for low-reasoning tasks (intent classification, format extraction) and falls through to Anthropic/Gemini for reasoning.
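The triage-tier routing described in the scope might look roughly like the sketch below. It is not the planned inference-router.ts: the `TaskType` and `Provider` unions, `providerChain`, `routeInference`, and the injected `Invoke` callback (standing in for llm_invoke) are hypothetical. Retrying the next provider when one throws is also an assumption added for illustration; the ADR only states that reasoning tasks go to Anthropic/Gemini.

```typescript
// Hypothetical task and provider labels for illustration.
type TaskType = "intent_classification" | "format_extraction" | "reasoning";
type Provider = "cerebras" | "anthropic" | "gemini";

// Stand-in for the existing llm_invoke; injected so the sketch stays self-contained.
type Invoke = (provider: Provider, prompt: string) => Promise<string>;

const TRIAGE_TASKS: ReadonlySet<TaskType> = new Set<TaskType>([
  "intent_classification",
  "format_extraction",
]);

// Low-reasoning tasks try the triage tier first; reasoning tasks skip it.
function providerChain(task: TaskType): Provider[] {
  return TRIAGE_TASKS.has(task)
    ? ["cerebras", "anthropic", "gemini"]
    : ["anthropic", "gemini"];
}

// Tries each provider in order and returns the first successful result.
// Falling back on error is an assumption, not something the ADR specifies.
async function routeInference(
  task: TaskType,
  prompt: string,
  invoke: Invoke,
): Promise<{ provider: Provider; output: string }> {
  let lastError: unknown;
  for (const provider of providerChain(task)) {
    try {
      return { provider, output: await invoke(provider, prompt) };
    } catch (err) {
      lastError = err;
    }
  }
  throw new Error(`all providers failed for ${task}: ${String(lastError)}`);
}
```

Returning the provider alongside the output makes it straightforward to log which tier served each call, which the routing-share acceptance metric depends on.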
Acceptance:
- Tenant ascend has ≥100 graded examples in eval_datasets for SDR Agent within 30 days of Track E start.
- A/B router visibly switches between two SDR Agent prompt variants based on bandit weights (logged).
- Cerebras tier handles ≥30% of agent calls measured over a 7-day window.
- p50 inference latency on triage-tier calls <200ms.
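The two numeric acceptance criteria above (triage share and p50 latency) can be computed from logged calls along these lines. This is a sketch under assumptions: the `InferenceCall` record shape and the helper names are hypothetical, the caller is expected to pass only calls from the 7-day measurement window, and p50 uses the nearest-rank percentile definition, which the ADR does not specify.

```typescript
// Hypothetical log record for one agent inference call.
interface InferenceCall {
  tier: "triage" | "reasoning";
  latency_ms: number;
}

// Fraction of calls served by the triage tier (0 when there are no calls).
function triageShare(calls: InferenceCall[]): number {
  if (calls.length === 0) return 0;
  return calls.filter((c) => c.tier === "triage").length / calls.length;
}

// Nearest-rank percentile: the value at rank ceil(p/100 * n) in sorted order.
function percentileNearestRank(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1] ?? NaN;
}

interface TriageStatus {
  share: number;
  p50_ms: number;
  meetsAcceptance: boolean;    // share >= 30% and triage p50 < 200ms
  hitsReversalTrigger: boolean; // share < 10%
}

function triageStatus(calls: InferenceCall[]): TriageStatus {
  const share = triageShare(calls);
  const triageLatencies = calls.filter((c) => c.tier === "triage").map((c) => c.latency_ms);
  const p50_ms = percentileNearestRank(triageLatencies, 50);
  return {
    share,
    p50_ms,
    meetsAcceptance: share >= 0.3 && p50_ms < 200,
    hitsReversalTrigger: share < 0.1,
  };
}
```

Note that p50 is computed over triage-tier calls only, matching the wording of the latency criterion.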
Consequences
Positive:
- 5 tracks running in parallel = 12-week Q1 instead of ~30 weeks serial.
- Track A delivers a demoable agent within 6 weeks (Mishaal’s “first useful output” SLA from §7).
- Track B is pure additive — no risk of regressing the existing gateway.
- Track C unblocks fund-level commercial motion in Q2 without waiting for portco motion to mature.
Negative / accepted risk:
- Tracks A and D both touch AgentRuntime DO. Risk of merge conflicts. Mitigation: Track D blocks on Track A skeleton (week 1) before opening any code.
- Track B’s priors writer depends on agent_runs D1 schema landing in Track A first. Mitigation: Track A merges migration 0008 by week 2; Track B’s priors job is a no-op until then.
- Tracks C and D both add D1 migrations. Mitigation: migrations are ordered (0008 Track A, 0009 Track E, 0010 Track C, 0011 Track D).
- Five tracks may exceed single-engineer (Mishaal+Claude) bandwidth. Mitigation: each track ships independently; if a track slips, vision is not invalidated.
- Hot-path budget regression risk when capability-index retrieval lands in agent runtime (post-Q1). Mitigation: ADR-041 lands before flip.
Reversal triggers:
- If Track A’s SDR Agent fails A/B vs. generic baseline at week 6 → pause Tracks C/D/E, debug agent quality before scaling topology.
- If capability-index priors are too noisy at week 8 to be useful → defer Track B integration to Q2; agents stay on static MCP for Q1 demo.
- If Cerebras tier shows <10% routing share at week 12 → defer Track E inference-router work; eval system still ships.
Out of scope for Q1 (deferred to Q2+)
- Agents 11–19 (only SDR Agent + Operating Partner Brief Agent in Q1)
- Outcome-billing infrastructure (Stripe metered usage)
- Public capability-index API (closed for Q1, open in Q3 if Bet 5 plays out)
- SOC 2 Type 2 (Q3 question per vision §14)
- Composites (vision §9 — deferred indefinitely; ADR-044)
Implementation kickoff
After this ADR + the Q1 plan doc + the LEDGER row land on main:
- Track A: branch claude/operator-os-track-a-sdr-agent. First commit: wrangler.toml migration + AgentRuntime skeleton.
- Track B: branch claude/operator-os-track-b-capability-index. First commit: Vectorize namespace creation script.
Tracks C/D/E branches open at their dependency-unblock weeks per the diagram above.