ADR-037: Phase 7 Wiki Tools — One Polymorphic `client_wiki` Tool, Not Three

Status: Accepted
Date: 2026-05-02
Author: Claude Code (architect-reviewer agent)
Business owner: Mishaal Murawala
Related:

docs/plans/V5-QUALITY-HARNESS-AND-KNOWLEDGE-LAYER.md (Phase 7, lines 216–248)
.claude/rules/v5-invariants.md — invariant #7 (tool ceiling)
ADR-018 — claude_* family consolidation precedent
ADR-033 — discriminated-union operation pattern (BDA/Transcribe)

Context

Phase 7 of the V5 Quality Harness + Knowledge Layer plan introduces a per-tenant wiki — Karpathy/Shann LLM-Wikid pattern, productized as the V5 Pro tier. The plan calls for three new MCP tools:

client_wiki_ingest — accept a source (URL / raw text / R2 file ID), classify, summarize, write a sources/ page, update index.md + relevant entities/ and concepts/ pages, append log.md
client_wiki_query — read index.md, pick relevant pages, synthesize a cited answer, file output back into wiki/{tenant}/wiki/outputs/
client_wiki_lint — scan for contradictions, orphan pages, stale claims, missing cross-references, undocumented concepts; output lint-report.md

The conflict. The MCP tool registry currently has 33 tools registered in src/handlers/mcp.ts (lines 115–147) — 25 curated + 8 platform. Invariant #7 sets a hard ceiling at 35. Adding three new tools brings the count to 36, which exceeds the ceiling and blocks Phase 7 from shipping.

Note on doc drift: at time of writing, .claude/rules/v5-invariants.md still says “31 tools registered” and docs/requirements/TOOLS.md says “Total: 31 tools” while the actual register-call count in mcp.ts is 33 (added aws_bedrock_converse and aws_nova_canvas 2026-05-01). The drift is pre-existing tech debt, not part of this decision. This ADR’s PR fixes the count in both files.
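
A drift check of this kind could be automated. The sketch below is only an illustration: `documentedCount` and `findDrift` are hypothetical helpers, the regex is an assumption about how the docs phrase the count, and the actual register-call count is passed in rather than parsed from mcp.ts.

```typescript
// Hypothetical drift check: compare the tool count each doc claims with the
// number of tools actually registered.
function documentedCount(docText: string): number | null {
  // Matches phrasings such as "31 tools registered", "31 MCP tools registered",
  // or "Total: 31 tools".
  const match = docText.match(/(\d+)\s+(?:MCP\s+)?tools\b/i);
  return match ? Number(match[1]) : null;
}

function findDrift(actual: number, docs: Record<string, string>): string[] {
  const stale: string[] = [];
  for (const [path, text] of Object.entries(docs)) {
    const claimed = documentedCount(text);
    if (claimed !== null && claimed !== actual) {
      stale.push(`${path}: says ${claimed}, registry has ${actual}`);
    }
  }
  return stale;
}

// The two stale lines quoted in this ADR, checked against the real count of 33.
const drift = findDrift(33, {
  ".claude/rules/v5-invariants.md": "7. **31 MCP tools registered** (23 curated + 8 platform).",
  "docs/requirements/TOOLS.md": "Total: 31 tools",
});
```

Run in CI, a non-empty `drift` array would fail the build before the docs and the registry diverge again.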

The ceiling exists for two reasons:

Agent decision quality degrades with tool count. Anthropic’s MCP design guidance and Cloudflare Agents SDK telemetry both indicate routing accuracy drops sharply beyond ~30 candidate tools when an LLM is presented with the full tool list per turn. The 35 ceiling buys roughly one quarter of additional headroom from the current state.
Consumer surface area. Every registered tool is exposed to every connected client (Cursor, Codex CLI, ChatGPT, claude.ai/code, Claude iOS, GitHub Copilot). Adding tools is cheap; removing them is a breaking change for those consumers (see ADR-023 — we already retain low-use tools for backwards-compat reasons).

Four options were evaluated:

Option	What	Wiki tools	Total registered	Verdict
A	Raise ceiling to 40	All 3 separate	36	Rejected — gives up the discipline; the ceiling is the safeguard, not the symptom
B	Merge ingest+query into `client_wiki`, keep lint separate	2 wiki tools	35 (exactly at limit)	Rejected — boxes in Phase 8/9; partial consolidation has no ergonomic benefit
C	Merge all 3 into one polymorphic `client_wiki`	1 wiki tool	34	Accepted — matches V5 precedent, preserves headroom
D	Trim 1+ low-use tools to make room	All 3 separate	33 + 3 - n	Rejected — multi-week deprecation cycle, breaking for live clients
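
The totals in the table follow from simple arithmetic, sketched below. The names `WikiOption`, `resultingCount`, and `headroom` are illustrative only; the figures (33 registered today, ceiling of 35) come from this ADR. Option D is left out because its total depends on an unknown number of trimmed tools.

```typescript
// Illustrative only: recompute the resulting tool count for options A, B, and C.
const REGISTERED_TODAY = 33; // 25 curated + 8 platform
const CEILING = 35;          // invariant #7

interface WikiOption {
  id: "A" | "B" | "C";
  newWikiTools: number; // tools the option adds to the registry
}

const options: WikiOption[] = [
  { id: "A", newWikiTools: 3 }, // three separate tools
  { id: "B", newWikiTools: 2 }, // client_wiki plus a separate lint tool
  { id: "C", newWikiTools: 1 }, // one polymorphic client_wiki
];

function resultingCount(option: WikiOption): number {
  return REGISTERED_TODAY + option.newWikiTools;
}

// Negative headroom means the option breaches the ceiling.
function headroom(option: WikiOption): number {
  return CEILING - resultingCount(option);
}

const summary = options.map((o) => ({
  id: o.id,
  total: resultingCount(o),
  headroom: headroom(o),
}));
// summary: A -> 36 (headroom -1), B -> 35 (headroom 0), C -> 34 (headroom 1)
```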

Decision

Adopt Option C: implement Phase 7 as a single polymorphic client_wiki MCP tool with a discriminated-union action field selecting between ingest, query, and lint.

This:

Stays at 34 tools registered (33 + 1), preserving 1 slot of ceiling headroom
Matches the established V5 polymorphic-tool pattern (claude, agent_state, aws_bda_analyze, aws_transcribe)
Keeps the per-tenant R2 wiki layout cohesive in a single tool — all three actions operate on the same wiki/{tenant}/ prefix with the same auth + tenant scoping
Avoids invariant #7 amendment — the existing 35 ceiling is preserved

Polymorphic input schema (Zod)

import { z } from 'zod';

// ═══ INGEST ═══════════════════════════════════════════════════════════════

const IngestSourceUrl = z.object({
  type: z.literal('url'),
  url: z.string().url().describe('Public https URL to fetch and ingest'),
}).strict();

const IngestSourceText = z.object({
  type: z.literal('text'),
  text: z.string().min(1).max(500_000).describe('Raw text body to ingest'),
  source_label: z.string().min(1).max(200).describe('Human-readable label (e.g. "Q3 board memo")'),
}).strict();

const IngestSourceR2 = z.object({
  type: z.literal('r2'),
  r2_key: z.string().describe('R2 object key under wiki/{tenant}/raw/'),
}).strict();

const IngestSchema = z.object({
  action: z.literal('ingest'),
  source: z.discriminatedUnion('type', [IngestSourceUrl, IngestSourceText, IngestSourceR2]),
  doc_type: z.enum(['article', 'tweet', 'transcript', 'paper', 'note', 'auto'])
    .default('auto')
    .describe('Document classification hint; "auto" lets the model classify'),
  cost_cap_usd: z.number().min(0.01).max(2.00).default(0.15)
    .describe('Hard cost ceiling for this ingest call; aborts if exceeded'),
  model: z.string().default('claude-sonnet-4-6-20250514')
    .describe('Anthropic model used for summarization + page synthesis'),
}).strict();

// ═══ QUERY ════════════════════════════════════════════════════════════════

const QuerySchema = z.object({
  action: z.literal('query'),
  question: z.string().min(1).max(2_000).describe('Natural-language question to answer from the wiki'),
  output_format: z.enum(['markdown', 'comparison_table', 'summary_with_sources'])
    .default('markdown'),
  max_pages: z.number().int().min(1).max(20).default(8)
    .describe('Max wiki pages to synthesize over'),
  persist_output: z.boolean().default(true)
    .describe('When true, files the answer back to wiki/{tenant}/wiki/outputs/ for compounding'),
  model: z.string().default('claude-sonnet-4-6-20250514'),
}).strict();

// ═══ LINT ═════════════════════════════════════════════════════════════════

const LintSchema = z.object({
  action: z.literal('lint'),
  scope: z.enum(['full', 'concepts', 'entities', 'sources', 'syntheses'])
    .default('full')
    .describe('Restrict lint to a wiki sub-section; "full" walks the entire tenant wiki'),
  publish_report: z.boolean().default(true)
    .describe('When true, writes wiki/{tenant}/wiki/lint-report.md and posts material findings to Slack'),
  staleness_days: z.number().int().min(1).max(365).default(90)
    .describe('Claims older than this many days vs newer raw sources are flagged as stale'),
}).strict();

// ═══ DISCRIMINATED UNION ══════════════════════════════════════════════════

export const ClientWikiInput = z.discriminatedUnion('action', [
  IngestSchema,
  QuerySchema,
  LintSchema,
]);
export type ClientWikiInput = z.infer<typeof ClientWikiInput>;
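
To make the defaults concrete, here is a dependency-free sketch of what parsing a minimal `query` call would be expected to yield. `QueryCall` and `withQueryDefaults` are hand-written stand-ins for the Zod schema, not part of the implementation, and the question text is invented; the default values are copied from QuerySchema above.

```typescript
// Input side of the query action: fields that carry a Zod default are optional
// because callers may omit them.
interface QueryCall {
  action: "query";
  question: string;
  output_format?: "markdown" | "comparison_table" | "summary_with_sources";
  max_pages?: number;
  persist_output?: boolean;
  model?: string;
}

// Stand-in for the default-filling that schema parsing performs.
function withQueryDefaults(call: QueryCall): Required<QueryCall> {
  return {
    action: "query",
    question: call.question,
    output_format: call.output_format ?? "markdown",
    max_pages: call.max_pages ?? 8,
    persist_output: call.persist_output ?? true,
    model: call.model ?? "claude-sonnet-4-6-20250514",
  };
}

// Invented example: the caller supplies the required fields plus one override.
const effective = withQueryDefaults({
  action: "query",
  question: "What changed in Q3 pricing?",
  max_pages: 5,
});
```

Because every variant is declared `.strict()`, the real schema would additionally reject a payload carrying keys that belong to a different action, which this stand-in does not model.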

Tool registration

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import type { ToolContext } from '../lib/types';
import { ClientWikiInput } from './client-wiki.schema';
import { runIngest } from '../wiki/ingest';
import { runQuery } from '../wiki/query';
import { runLint } from '../wiki/lint';

export function register(server: McpServer, getContext: () => ToolContext): void {
  server.registerTool(
    'client_wiki',
    {
      description:
        'Per-tenant compounding knowledge wiki (V5 Pro). Three actions: ' +
        '`ingest` adds a source (URL/text/R2 file) and updates entities, concepts, sources pages; ' +
        '`query` synthesizes a cited answer over wiki pages and files the output back for compounding; ' +
        '`lint` scans for contradictions, orphans, stale claims, missing cross-refs. ' +
        'All actions scoped to the calling tenant\'s wiki/{tenant}/ R2 prefix.',
      inputSchema: ClientWikiInput,
    },
    async (rawArgs: unknown) => {
      const args = ClientWikiInput.parse(rawArgs);
      const ctx = getContext();
      switch (args.action) {
        case 'ingest': return runIngest(ctx, args);
        case 'query':  return runQuery(ctx, args);
        case 'lint':   return runLint(ctx, args);
      }
    },
  );
}

This mirrors the dispatch shape used by src/tools/claude.ts (lines 212+) and src/tools/agent-state.ts (lines 14–71). The discriminator name action (vs operation) matches the claude and agent_state precedent; aws_bda_analyze and aws_transcribe use operation because they have a nested-action collision that wiki does not.
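
One optional hardening of this dispatch shape, not part of the registration code above, is an exhaustiveness guard. In the sketch below, `assertNever` and `route` are hypothetical names, and the returned strings stand in for the real runIngest / runQuery / runLint calls.

```typescript
// Sketch: a `never`-typed guard turns a forgotten action into a compile error
// and an unexpected runtime value into a thrown error.
type Action = "ingest" | "query" | "lint";

function assertNever(value: never): never {
  throw new Error(`Unhandled client_wiki action: ${String(value)}`);
}

function route(action: Action): string {
  switch (action) {
    case "ingest": return "runIngest";
    case "query":  return "runQuery";
    case "lint":   return "runLint";
    // If a fourth action is added to `Action` without a case here,
    // this line stops compiling.
    default:       return assertNever(action);
  }
}
```

With this guard, adding a new action to the union without a matching case is caught at compile time rather than surfacing as an `undefined` tool result.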

Tool count after Phase 7 ships

Bucket	Count
Curated (current)	25
Platform (current)	8
Subtotal (today)	33
`client_wiki` (Phase 7, this ADR)	+1 (curated)
Total after Phase 7	34 / 35

Headroom: 1 tool. Next addition triggers either consolidation or invariant amendment via a fresh ADR.

Consequences

Positive

Phase 7 unblocks immediately. No invariant amendment required, no consumer-facing breaking change, no deprecation cycle.
Pattern consistency. client_wiki becomes the fifth tool to use the discriminated-union dispatch, after claude, agent_state, aws_bda_analyze, and aws_transcribe. This lowers cognitive load for the next session: one shape to learn, not three.
Cohesion matches the domain. All three actions share the R2 prefix wiki/{tenant}/, share the same per-tenant scoping, share the same content model. A separate client_wiki_lint would have re-implemented the same R2 traversal as client_wiki_ingest.
Cost cap surfacing. cost_cap_usd lives in the ingest schema as a first-class field — visible in tool docs, enforceable per call, not buried in tenant config.
Headroom preserved. 1 slot remains under the 35 ceiling for opportunistic use during Phase 8 or a hot-fix tool.
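
As a sketch of how a per-call cap such as cost_cap_usd might be enforced, assume each model call reports its cost after the fact. `CostBudget` is a hypothetical helper, not the gateway's actual accounting.

```typescript
// Hypothetical per-call budget tracker for one ingest invocation.
class CostBudget {
  private spentUsd = 0;
  private readonly capUsd: number;

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  // Record the cost of one model call; throw once the running total passes the cap.
  charge(usd: number): void {
    this.spentUsd += usd;
    if (this.spentUsd > this.capUsd) {
      throw new Error(
        `cost cap exceeded: spent $${this.spentUsd.toFixed(4)} of $${this.capUsd.toFixed(2)}`,
      );
    }
  }

  // Budget left for further model calls, never negative.
  get remainingUsd(): number {
    return Math.max(0, this.capUsd - this.spentUsd);
  }
}
```

Because this version checks after each charge, a single expensive call can overshoot the cap before the abort fires; a stricter variant would estimate the cost before issuing the call.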

Negative

Larger single tool. client_wiki will be ~600–900 LOC across src/tools/client-wiki.ts + src/wiki/{ingest,query,lint}.ts. Comparable to the existing claude tool (~1,200 LOC) — large but not unprecedented.
Return shape varies by action. Ingest returns page-update receipts, query returns synthesized markdown + citation list, lint returns a finding-list with severities. Consumers must branch on the action they sent. This is the same trade-off the claude tool makes (invoke vs files vs batch return very different shapes); in practice MCP clients handle this fine because they always know which action they invoked.
Test surface concentrated. All three actions live in the same tool; one schema regression can break all three. Mitigated by keeping runIngest / runQuery / runLint as separately-tested modules under src/wiki/ with their own unit tests in test/wiki/.
Doc drift in the registry catalogue. Connected-app catalogues (Discover endpoint, ChatGPT app surface) will show one entry “client_wiki” instead of three. Acceptable trade-off for the ceiling discipline; the tool description enumerates all three actions so discoverability is preserved.
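
The varying return shape can be made explicit with a tagged union on the result side. The field names below are illustrative, not the shipped contract, and the sketch assumes each result echoes the action that produced it, which this ADR does not specify.

```typescript
// Hypothetical result shapes, one per action.
type WikiResult =
  | { action: "ingest"; pagesUpdated: string[] }
  | { action: "query"; markdown: string; citations: string[] }
  | {
      action: "lint";
      findings: { severity: "info" | "warn" | "error"; message: string }[];
    };

// A consumer branches once on `action` and gets type-safe access to each shape.
function headline(result: WikiResult): string {
  switch (result.action) {
    case "ingest":
      return `${result.pagesUpdated.length} page(s) updated`;
    case "query":
      return `answer with ${result.citations.length} citation(s)`;
    case "lint":
      return `${result.findings.filter((f) => f.severity === "error").length} error(s)`;
  }
}
```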

Invariant amendments needed

None. Invariant #7’s hard ceiling of 35 is preserved. The “31 MCP tools registered” line in .claude/rules/v5-invariants.md is updated as a count refresh (drift fix), not an invariant amendment:

-7. **31 MCP tools registered** (23 curated + 8 platform). Adding a tool requires an ADR or `TOOLS.md` row documenting scope + owner. Hard ceiling: 35.
+7. **34 MCP tools registered** (26 curated + 8 platform). Adding a tool requires an ADR or `TOOLS.md` row documenting scope + owner. Hard ceiling: 35.

(The 26 curated includes client_wiki once Phase 7 lands. Until then, the count is 33 — see “Implementation order” below.)

The same one-line change must land in docs/architecture/ASCEND-CLOUD-NATIVE-V2-ENGINEERING-PLAN.md Part II in the same PR per the canonical-source rule at the bottom of v5-invariants.md.

Implementation order

1. This ADR lands first (current PR). It updates the v5-invariants.md count from 31 → 33 (drift fix only; it doesn’t claim the new tool yet).
2. Phase 7.1 lands src/wiki/r2-store.ts (no tool registration, no count change).
3. Phase 7.2 + 7.3 + 7.4 land in a single PR that registers client_wiki in mcp.ts, bumps v5-invariants.md from 33 → 34, adds the client_wiki row to docs/requirements/TOOLS.md, and updates ASCEND-CLOUD-NATIVE-V2-ENGINEERING-PLAN.md in the same commit, per the drift-prevention rule.

Alternatives considered

Option A — Raise the ceiling (rejected)

Simplest in pure engineering terms — bump invariant #7 to 36 (or 40 for headroom), keep three separate tools. Rejected because:

The ceiling exists precisely to prevent the slow tool-count creep that degrades agent routing. Raising it on the first conflict teaches the codebase that the ceiling is a soft suggestion. Two more such amendments and we’re at 40, then 50.
Consumer surface area: every connected client (Cursor / ChatGPT / Codex / claude.ai / Claude iOS) sees the larger tool list. The cost is borne by every user every turn, not just by the gateway.
Almost no engineering cost saved vs Option C: the wiki helper modules under src/wiki/ are needed regardless, so the only work avoided is writing one Zod discriminator.

Option B — Partial merge (ingest + query consolidated, lint separate) (rejected)

Stays at exactly 35 (33 + 2). Rejected because:

Lands at the ceiling with zero headroom — Phase 8 (skills + cache defaults) and any future tool addition immediately re-trigger this exact ADR.
Has no ergonomic benefit: ingest+query share the same R2 helpers as lint. Splitting lint off doesn’t simplify either side.
“Two-tool” precedent doesn’t exist in V5. Every previous polymorphic consolidation went all-in (claude covers invoke/batch/files/agents in one tool; agent_state covers store/retrieve/delete/list in one tool).

Option D — Trim 1+ low-use tools (rejected)

Forces audit of tool usage via D1 tool_traces aggregation. Rejected because:

Multi-week deprecation cycle. Removing a tool is a breaking change for every connected client. Cursor, Codex CLI, ChatGPT, claude.ai/code, and Claude iOS each have to be updated; we’d need a deprecation announcement, a deprecation header on the removed tool’s responses for ≥30 days, and a migration path for any callers.
Live customer dependency. dealcloud was added specifically for Point Field Partners (PE-backed client; see clients/pointfield/ config). microsoft_calendar is also Point Field. linkedin_ads and meta_ads are paid-tier features for Kahuna. There are no obviously low-use tools to trim — ADR-023 already considered and rejected this direction in April 2026.
Wrong tool for this problem. Phase 7 needs three new wiki capabilities; trimming an unrelated tool to make room is a non-sequitur.

If usage data later shows a tool is truly unused (zero tool_traces invocations across all tenants for ≥60 days), a future ADR can revisit trimming as a separate decision — but it’s not on the critical path for Phase 7.
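
The "truly unused" test described above could be expressed as a small filter. The `ToolUsage` shape is a hypothetical aggregate over tool_traces; the real D1 schema may differ, and the tool names in any call would be supplied by that aggregation.

```typescript
// Hypothetical aggregated usage row, one per registered tool across all tenants.
interface ToolUsage {
  tool: string;
  lastInvokedAt: Date | null; // null means never invoked by any tenant
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns tools with no invocation in the last `idleDays` days (60 by default).
function trimCandidates(usage: ToolUsage[], now: Date, idleDays = 60): string[] {
  const cutoff = now.getTime() - idleDays * DAY_MS;
  return usage
    .filter((u) => u.lastInvokedAt === null || u.lastInvokedAt.getTime() <= cutoff)
    .map((u) => u.tool);
}
```

A non-empty result would only be the starting point for the separate ADR mentioned above, not an automatic removal.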

Open questions for Mishaal

None blocking. Two product-side questions surfaced during this review that don’t gate the decision:

Pricing model for the client_wiki ingest action. The plan calls for a $0.15 cost cap per ingest. Should V5 Pro include unlimited ingest (gateway eats the LLM cost) or pass the cost through with a per-ingest fee? This doesn’t gate the technical decision; the cost_cap_usd parameter is in the schema either way.
Slack notification routing for lint. Plan says material findings post to Slack. Which channel? #kahuna-internal per-tenant, or a single #v5-wiki-lint ops channel? Doesn’t gate the decision — admin endpoint can configure this per-tenant in tenant_config:{tenant}.wiki.lint_slack_channel.

References

src/tools/claude.ts lines 212+ — discriminated-union dispatch precedent
src/tools/agent-state.ts lines 5–71 — simpler enum-action precedent
docs/decisions/ADR-018-adr-016-signoff-and-phase-1-commit.md — claude_* consolidation rationale (rejected the 4-tool split, kept one polymorphic tool)
docs/decisions/ADR-023-retain-low-use-tools.md — established that trimming tools is a last resort
docs/decisions/ADR-033-async-aws-job-orchestration.md — codified the discriminated-union operation pattern for async tools
.claude/rules/v5-invariants.md invariant #7: the tool ceiling this ADR preserves (count refresh only, no amendment)