Lean stack: Composio + Mem0 + Anthropic-via-AI-Gateway
ADR-053 — Lean stack: Composio + Mem0 + Anthropic-via-AI-Gateway
Status: Accepted 2026-05-16
Supersedes (in part): ADR-042 (capability index), parts of ADR-038 (Nango token write path), ADR-050 (hermes-slack-listener sibling — preserved but rewired)
Amends: Invariants #1, #2, #5, #6, #7, #11 in .claude/rules/v5-invariants.md
Context
V5 Gateway was conceived when no single vendor owned the OAuth-SaaS layer well. Composio now does. As of 2026-05-16, 15 active connections are verified across Ascend (apollo, gamma, googlesuper, linkedin, quickbooks, slack ×2) and Kahuna (gamma, gong, googlesuper, hubspot, linkedin, linkedin_ads, salesforce, semrush). Composio provides 993+ tools across these providers and handles OAuth token lifecycle externally.
Mem0 Cloud became canonical memory on 2026-05-16 (4,256 memories migrated from Hindsight). Hindsight is mirror-only and slated for decommission.
Cloudflare AI Gateway is a routing slug, not a Worker. Any process — including non-Worker code — can invoke https://gateway.ai.cloudflare.com/v1/{acct}/ascend-anthropic/anthropic/... and satisfy invariant #12 (observability + fallback chain + budget cap) without a proxy plane.
The combination means V5’s three jobs — (a) OAuth/SaaS proxy, (b) LLM routing, (c) RAG over client data — are all dischargeable without an Ascend-owned Worker in the data path. The first-principles test (“does Kahuna need this to do GTM work tomorrow?”) rules out:
- A “side-channel” Worker (
ascend-side-channel) — prebuilt, never deployed, exists to wrap Anthropic + AWS + web_fetch. With crons dead and context-worker rewired, it has no consumer. - A context-worker —
gong-ingest+salesforce-ingestworkflows + Vectorizeclient-knowledgeindex were built for a cross-conversation RAG use case that has not materialized. Kahuna GTM work today is “fetch fresh + reason,” not “search 6 months of transcripts.” Build RAG fresh against Mem0 if/when the use case proves out. - All 44 V5 crons. None produced sufficient value to justify a new home.
Decision
Adopt the lean stack:
- Composio — canonical OAuth SaaS layer for both tenants. All HubSpot / Salesforce / Google / Slack / Gong / Apollo / SEMrush / QBO / LinkedIn ops route via
mcp__composio__*. New SaaS capability = Composio capability request first; we add a typed wrapper Worker only if Composio refuses and the need is <30d. - Mem0 — canonical long-term memory.
mcp__mem0__*is the only write path. - Anthropic direct via CF AI Gateway slug
ascend-anthropic— any non-Claude-Code LLM call uses this URL pattern. No proxy Worker required. Invariant #12 satisfied at the slug level. - Hermes (rebuilt) — thin Slack bot in
hermes-slack-listener/. Calls Composio MCP + Anthropic direct + Mem0 SDK. No V5, no side-channel, no context-worker dependency. - Claude Code — interactive surface; Mishaal’s daily driver.
Removed from the topology:
- V5 Worker (all 31 tools, 44 crons, DOs, token KV namespace, D1, R2 backup bucket)
ascend-side-channel(never deployed; deleted from repo)ascend-context-worker(ingest workflows, D1 entities/facts/signal_evaluations, Vectorizeclient-knowledge)- Token DO + Nango write path (Composio owns OAuth lifecycle)
- Capability index Vectorize namespace (
capability_index)
Invariant amendments
| # | Before | After |
|---|---|---|
| 1 | Two-plane (gateway + context-worker), Service Bindings between the two only. | Zero-plane: no Ascend-owned Worker in the data path. SaaS via Composio MCP; LLM via Anthropic-direct + AI Gateway slug; memory via Mem0. |
| 2 | KV-only hot path on the gateway. | N/A — no gateway. Composio handles its own latency budget. |
| 5 | No external vendors in the token read path (Nango exception). | Composio owns OAuth end-to-end. No KV tokens:* keys, no DO refresh, no Nango. |
| 6 | Request path never touches a DO. | N/A — no DOs. |
| 7 | Capability index, not static registration. | N/A — Composio tool discovery via COMPOSIO_SEARCH_TOOLS; capability index Vectorize namespace retired. |
| 11 | 30s AbortController on every outbound fetch. | Applies only to Hermes outbound calls (which is a tiny surface). |
| 12 | Every LLM call goes through AI Gateway. | Unchanged. Satisfied by AI Gateway slug, not by a Worker proxy. |
Invariants #3, #4, #8, #9, #10, #13, #14, #15 are either preserved (where they apply to Hermes/Composio/Mem0) or N/A (where they applied only to V5 internals).
Consequences
Positive:
- ~95% of V5’s surface area deleted; maintenance burden collapses.
- No proxy latency on SaaS calls (Composio direct).
- No token custody risk (Composio owns it).
- No cron sprawl (all 44 retired; future schedules earn their place fresh).
- Hermes becomes ~200 lines of Slack-bot code calling three external services. Trivially maintainable.
Negative / Risks:
- Lock-in to Composio. If Composio fails or pivots pricing, we’re exposed. Mitigation: escape-hatch pattern is documented — if Composio refuses a capability and need is urgent, we add a single-purpose CF Worker for that one endpoint.
- No RAG today. If Kahuna agents start needing cross-conversation reasoning over Gong transcripts, we have to build it (against Mem0). Mitigation: this is a forward problem, not a today problem. We don’t pay infrastructure cost for a use case that hasn’t shipped.
- No “fallback” for LLM routing. Anthropic-direct via AI Gateway slug is the path. AI Gateway provides fallback chain config — that’s the resilience layer.
- Loss of
call_apiescape hatch in any Ascend-owned Worker. Replacement: Composio capability requests + single-purpose Workers if needed.
Reversal triggers
Re-introduce an Ascend-owned execution plane only if:
- Composio outage/breakage causes >2 production incidents in a 30-day window AND there’s no Composio SLA upgrade path, OR
- A Kahuna (or future client) workflow proves a quantifiable RAG-over-historical-data need that Composio + Mem0 cannot serve at acceptable latency/cost, OR
- A regulatory/compliance requirement mandates token custody on infrastructure we control.
Any such reversal requires a new ADR documenting which of the above triggered.
Implementation
See .claude/memory/2026-05-16-session-end.md for the day-by-day execution plan. ADR-054 covers retirement criteria (zero consumer calls for ≥7 days → archive → delete).