Vectorize — Complete Implementation Guide

Updated: 2026-05-10. Canonical reference for all Vectorize usage in the Ascend GTM Platform. When this file conflicts with VECTORIZE_NAMESPACE_REGISTRY.md, the registry wins for index definitions. This file covers usage, scoring, and the discover_apis semantic workflow.

Why Vectorize?

Vectorize is the semantic layer of the platform. It lets agents find the right tool for any task using natural language — without needing to know the exact tool name. Every tool in the catalog has an embedding in CAPABILITY_INDEX. When an agent calls discover_apis({query: "..."}), the gateway embeds the query with @cf/baai/bge-m3, runs a cosine similarity search, blends in historical usage priors from KV, and returns a ranked list with composite scores.

The 4 Vectorize Indexes

Binding	Index name	Dim	Purpose	Who writes	Who reads
`CAPABILITY_INDEX`	`capability_index`	1024	Semantic tool discovery	`scripts/embed-tool-catalog.ts`	`discover_apis`, `retrieveCapabilities()`
`MEMORY_INDEX`	`memory-index`	1024	Per-tenant episodic memory	`learnSemanticMemory()`	`recallSemanticMemory()`
`PATTERN_INDEX`	`pattern-bank`	1024	Eval quality patterns	`seed-pattern-bank.ts` cron	`harness-investigate.ts`, `harness-autofix.ts`
`VECTORIZE_INDEX`	`client-knowledge`	1024	Tenant knowledge base (RAG)	Ingestion pipeline / `/admin/knowledge`	`search_knowledge` tool

Full schema per index: docs/architecture/VECTORIZE_NAMESPACE_REGISTRY.md.

CAPABILITY_INDEX — Tool Discovery (Track B, ADR-042)

This is the most important index. It powers discover_apis’s semantic mode and makes the tool catalog unbounded — new tools can be added without hitting a registration ceiling.

How it works end-to-end

Agent: discover_apis({query: "query CRM contacts"})
                ↓
Gateway: retrieveCapabilities("query CRM contacts", env, { topK: 5 })
                ↓
 1. AI.run("@cf/baai/bge-m3", { text: ["query CRM contacts"] })
    → 1024-dim embedding vector
                ↓
 2. CAPABILITY_INDEX.query(vector, { topK: 5, returnMetadata: "all" })
    → vector matches with cosine scores + metadata
                ↓
 3. For each match: ASCEND_KV.get("capability_index:{tool_name}")
    → CapabilityPriors { usage_count_30d, success_rate, avg_latency_ms_p50, ... }
                ↓
 4. score_composite = score_vector × (1 + log(1 + usage_count_30d) × success_rate)
    (cold tools: score_composite ≈ score_vector; proven tools get sub-linear popularity boost)
                ↓
 5. Sort by score_composite DESC
                ↓
Agent receives: ranked tools with scores, connection status, and priors
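
The five steps above can be sketched as a single function. This is a minimal illustration, not the code in src/lib/capability-retrieval.ts: the Deps interface, the rankTools name, and the assumption that a vector match id equals the tool name are all hypothetical, and the three dependencies stand in for the AI, CAPABILITY_INDEX, and ASCEND_KV bindings so the flow can be run without Cloudflare.

```typescript
// Hypothetical minimal shapes; the real types live in src/lib/capability-retrieval.ts.
interface Priors { usage_count_30d: number; success_rate: number }
interface VectorMatch { id: string; score: number }
interface RankedTool {
  tool_name: string;
  score_vector: number;
  score_composite: number;
  priors: Priors | null;
}

// Injected stand-ins for the AI, CAPABILITY_INDEX, and ASCEND_KV bindings.
interface Deps {
  embed(text: string): Promise<number[]>;                             // step 1
  queryIndex(vector: number[], topK: number): Promise<VectorMatch[]>; // step 2
  getPriors(toolName: string): Promise<Priors | null>;                // step 3
}

async function rankTools(query: string, deps: Deps, topK = 5): Promise<RankedTool[]> {
  const vector = await deps.embed(query);
  const matches = await deps.queryIndex(vector, topK);
  const ranked = await Promise.all(
    matches.map(async (m): Promise<RankedTool> => {
      // Assumption: the vector id is the tool name used in the KV key.
      const priors = await deps.getPriors(m.id);
      // Step 4: natural-log usage boost scaled by success rate; cold tools get no boost.
      const boost = Math.log(1 + (priors?.usage_count_30d ?? 0)) * (priors?.success_rate ?? 1);
      return {
        tool_name: m.id,
        score_vector: m.score,
        score_composite: m.score * (1 + boost),
        priors,
      };
    }),
  );
  // Step 5: highest composite score first.
  return ranked.sort((a, b) => b.score_composite - a.score_composite);
}
```

Injecting the three lookups keeps the ranking logic testable in isolation; in the gateway they would presumably wrap the real bindings.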

Composite scoring formula

// From src/lib/capability-retrieval.ts
score_composite = score_vector * (1 + Math.log(1 + (priors?.usage_count_30d ?? 0)) * (priors?.success_rate ?? 1));

Cold tool (no priors): score_composite = score_vector × (1 + log(1) × 1) = score_vector × 1.0
Proven tool (150 uses, 97% success): score_composite = score_vector × (1 + log(151) × 0.97) ≈ score_vector × 5.9
Boost is sub-linear (natural log): at a 97% success rate, 1000 uses give a multiplier of about 7.7 versus about 5.9 at 150 uses, so extra usage yields diminishing returns and usage volume alone does not let a tool run away with the ranking
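
To make the arithmetic concrete, the multiplier applied to score_vector can be computed on its own. The boostMultiplier helper below is an illustrative name, not an export of capability-retrieval.ts; it only restates the documented formula.

```typescript
// Multiplier applied to score_vector, restating the documented formula.
// Math.log is the natural logarithm.
function boostMultiplier(usageCount30d = 0, successRate = 1): number {
  return 1 + Math.log(1 + usageCount30d) * successRate;
}

// Multipliers at a fixed 97% success rate for increasing usage.
const usageLevels = [0, 150, 1000];
const multipliers = usageLevels.map((n) => boostMultiplier(n, 0.97));
// multipliers ≈ [1.00, 5.87, 7.70]
```

Under this formula a proven tool can still outrank a cold tool with a higher vector score: 0.5 × 5.87 ≈ 2.93 exceeds a cold 0.9.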

KV priors format

Written by recompute-capability-priors.ts cron (daily, 3am UTC):

KV key: capability_index:{tool_name}

{
  "tool_name": "hubspot_crm",
  "window_days": 30,
  "usage_count_30d": 150,
  "success_rate": 0.97,
  "error_count_30d": 5,
  "avg_latency_ms_p50": 280,
  "avg_latency_ms_p99": 950,
  "last_used_at": "2026-05-10T00:00:00.000Z",
  "last_updated": "2026-05-10T04:00:00.000Z"
}
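
A typed reader for this record might look like the sketch below. The field names come from the JSON above; the parsePriors and priorsKey helpers, and the choice to treat malformed JSON the same as a missing key, are assumptions rather than documented behavior.

```typescript
interface CapabilityPriors {
  tool_name: string;
  window_days: number;
  usage_count_30d: number;
  success_rate: number;        // fraction in [0, 1]
  error_count_30d: number;
  avg_latency_ms_p50: number;
  avg_latency_ms_p99: number;
  last_used_at: string;        // ISO 8601 timestamp
  last_updated: string;        // ISO 8601 timestamp
}

// KV key layout as documented: capability_index:{tool_name}
const priorsKey = (toolName: string): string => `capability_index:${toolName}`;

// Returns null for a missing key or malformed value so callers fall back to cold scoring.
// (Assumption: the real library may handle malformed records differently.)
function parsePriors(raw: string | null): CapabilityPriors | null {
  if (raw === null) return null;
  try {
    const v = JSON.parse(raw) as Partial<CapabilityPriors>;
    const ok =
      typeof v.tool_name === "string" &&
      typeof v.usage_count_30d === "number" && v.usage_count_30d >= 0 &&
      typeof v.success_rate === "number" && v.success_rate >= 0 && v.success_rate <= 1;
    return ok ? (v as CapabilityPriors) : null;
  } catch {
    return null;
  }
}
```

Only the two fields that feed the composite score are validated here; a stricter reader could check the latency and timestamp fields as well.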

Re-seeding the index

Run after adding a new tool to src/lib/tool-catalog.ts:

npm run typecheck                    # confirm types clean
tsx scripts/embed-tool-catalog.ts    # re-embed all 30 tools

The script reads TOOL_CATALOG from src/lib/tool-catalog.ts, calls AI.run("@cf/baai/bge-m3") for each entry, and upserts into CAPABILITY_INDEX via wrangler vectorize upsert. It is idempotent: re-upserting the same tool_slug overwrites the previous vector. It requires CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID in the environment.
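
As a rough sketch of the upsert payload, wrangler vectorize upsert reads a newline-delimited JSON file with one { id, values, metadata } object per line. The CatalogEntry shape and the toNdjson helper below are hypothetical; the real entry type and metadata fields are defined in src/lib/tool-catalog.ts and the registry.

```typescript
// Hypothetical catalog entry shape, for illustration only.
interface CatalogEntry { tool_slug: string; purpose: string; provider: string }

// Builds one NDJSON line per tool in the { id, values, metadata } shape.
// Using tool_slug as the id is what makes a repeated upsert overwrite in place.
function toNdjson(entries: CatalogEntry[], vectors: number[][], dim = 1024): string {
  if (entries.length !== vectors.length) {
    throw new Error("entries/vectors length mismatch");
  }
  return entries
    .map((entry, i) => {
      const values = vectors[i];
      if (values === undefined || values.length !== dim) {
        throw new Error(`${entry.tool_slug}: expected ${dim} dimensions, got ${values?.length}`);
      }
      return JSON.stringify({
        id: entry.tool_slug,
        values,
        metadata: { provider: entry.provider, purpose: entry.purpose },
      });
    })
    .join("\n");
}
```

Checking the dimension before writing the file catches a model mismatch early, rather than at upsert time.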

discover_apis — Two-Mode Interface

Semantic mode (preferred when unsure which tool to call)

// Pass a natural-language query
discover_apis({ query: "query CRM contacts" })
discover_apis({ query: "send transactional email" })
discover_apis({ query: "analyze website traffic and conversions" })
discover_apis({ query: "get google ads campaign performance" })
discover_apis({ query: "generate a presentation" })
discover_apis({ query: "what tool should I use to..." })  // meta-queries work too

// Optional: control result count (default 5, max 25)
discover_apis({ query: "crm contacts", topK: 10 })

Response shape (semantic mode):

{
  "tenant_id": "kahuna",
  "mode": "semantic",
  "query": "query CRM contacts",
  "result_count": 2,
  "recommended_tools": [
    {
      "tool_name": "hubspot_crm",
      "score_vector": 0.8812,
      "score_composite": 5.1698,
      "category": "curated",
      "provider": "hubspot",
      "connected": true,
      "accounts": [{ "id": "hs-1", "label": "HubSpot Prod" }],
      "usage_hint": "Call `hubspot_crm` with the appropriate parameters. Use discover_apis({domain: \"hubspot\"}) to see endpoints.",
      "priors": {
        "usage_count_30d": 150,
        "success_rate": 0.97,
        "avg_latency_ms_p50": 280
      }
    },
    {
      "tool_name": "salesforce_query",
      "score_vector": 0.7943,
      "score_composite": 0.7943,
      "category": "curated",
      "provider": "salesforce",
      "connected": false,
      "accounts": [],
      "usage_hint": "Call `salesforce_query` with the appropriate parameters. Use discover_apis({domain: \"salesforce\"}) to see endpoints.",
      "priors": null
    }
  ],
  "note": "Pick the tool with the highest score_composite. Check `connected: true` before calling. If connected, call the tool directly. If not connected, the provider must be authorized first."
}

Decision rule:

Pick the tool with the highest score_composite
Check connected: true — if false, the provider isn’t authorized for this tenant
Call the tool directly (no intermediate steps needed)
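
On the agent side, the decision rule could be applied as in the sketch below. The field names follow the response shape above; the Decision type and the decide function are illustrative, and the rule is applied literally (no fallback to a lower-ranked connected tool, since the source does not describe one).

```typescript
interface RecommendedTool {
  tool_name: string;
  score_composite: number;
  connected: boolean;
}

type Decision =
  | { action: "call"; tool_name: string }
  | { action: "authorize"; tool_name: string }
  | { action: "none" };

function decide(tools: RecommendedTool[]): Decision {
  if (tools.length === 0) return { action: "none" };
  // Step 1: highest score_composite wins (does not rely on the response being pre-sorted).
  const best = tools.reduce((a, b) => (b.score_composite > a.score_composite ? b : a));
  // Step 2: a disconnected provider has to be authorized before the tool can be called.
  return best.connected
    ? { action: "call", tool_name: best.tool_name }      // Step 3: call directly
    : { action: "authorize", tool_name: best.tool_name };
}
```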

Catalog mode (when you know the provider or category)

discover_apis()                          // all providers
discover_apis({ domain: "hubspot" })     // exact match
discover_apis({ domain: "google_*" })    // all Google APIs
discover_apis({ domain: "crm" })         // CRM category
discover_apis({ category: "crm" })       // same, explicit category filter
discover_apis({ domain: "*" })           // all providers (explicit wildcard)

Backward compat rule: If domain or category is set, query is silently ignored. This ensures existing callers are unaffected by the new semantic mode.
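
The precedence between the two modes can be expressed as a small resolver. This is a sketch of the rule as stated, not the gateway's handler: the resolveMode name and the ResolvedMode type are hypothetical, and treating an empty or whitespace-only query as absent is an assumption.

```typescript
interface DiscoverArgs { query?: string; domain?: string; category?: string; topK?: number }

type ResolvedMode =
  | { mode: "catalog"; domain: string | null; category: string | null }
  | { mode: "semantic"; query: string };

function resolveMode(args: DiscoverArgs = {}): ResolvedMode {
  const { query, domain, category } = args;
  // Catalog filters take precedence; any query passed alongside them is ignored.
  if (domain !== undefined || category !== undefined) {
    return { mode: "catalog", domain: domain ?? null, category: category ?? null };
  }
  // Assumption: a blank query does not trigger semantic mode.
  if (query !== undefined && query.trim() !== "") {
    return { mode: "semantic", query };
  }
  // No arguments: list all providers, as discover_apis() does.
  return { mode: "catalog", domain: null, category: null };
}
```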

capability-retrieval.ts — The Core Library

src/lib/capability-retrieval.ts is the single source of truth for all capability lookups.

Public API:

import { retrieveCapabilities, EMBEDDING_DIM } from '../lib/capability-retrieval';

const result = await retrieveCapabilities(
  "query CRM contacts",  // natural-language query
  env,                   // Env (needs AI + CAPABILITY_INDEX + ASCEND_KV bindings)
  { topK: 5 }            // options (default topK=5, max topK=25)
);

if (!result.success) {
  // result.code: 'CONFIG_MISSING' | 'EMBED_FAILED' | 'QUERY_FAILED'
  // result.error: human-readable message
  // result.hint: what to do instead
  return;
}

// result.matches: CapabilityMatch[] sorted by score_composite DESC
for (const match of result.matches) {
  console.log(match.tool_name, match.score_composite, match.metadata.provider);
  // match.priors is CapabilityPriors | null (null when no KV priors exist)
}

Error codes from retrieveCapabilities:

CONFIG_MISSING: AI binding or CAPABILITY_INDEX binding absent from env
EMBED_FAILED: AI.run("@cf/baai/bge-m3") threw or returned empty data
QUERY_FAILED: CAPABILITY_INDEX.query() threw

Constants:

export const EMBEDDING_DIM = 1024;                    // bge-m3 output dimension
export const EMBEDDING_MODEL = '@cf/baai/bge-m3';    // model identifier
export const DEFAULT_TOP_K = 5;
export const MAX_TOP_K = 25;
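
One plausible way to apply the two topK constants is to clamp the caller's value. The source only states the default and the maximum; whether out-of-range values are clamped or rejected is not documented, so the resolveTopK helper below is an assumption.

```typescript
const DEFAULT_TOP_K = 5;
const MAX_TOP_K = 25;

// Assumption: out-of-range values are clamped to [1, MAX_TOP_K] rather than rejected.
function resolveTopK(requested?: number): number {
  if (requested === undefined || !Number.isFinite(requested)) return DEFAULT_TOP_K;
  return Math.min(MAX_TOP_K, Math.max(1, Math.floor(requested)));
}
```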

Admin Endpoints

Query the capability index directly

# Debug: query the capability index as an agent would
curl -X POST https://ascend-gateway-v5.ascendgtm.workers.dev/admin/capabilities/query \
  -H "Authorization: Bearer $ASCEND_GATEWAY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "send email", "topK": 5}'

Re-index all tools

curl -X POST https://ascend-gateway-v5.ascendgtm.workers.dev/admin/capabilities/reindex \
  -H "Authorization: Bearer $ASCEND_GATEWAY_TOKEN"

This calls embed-tool-catalog.ts server-side and re-seeds CAPABILITY_INDEX from src/lib/tool-catalog.ts. Use after adding a new tool row to the catalog.

MEMORY_INDEX — Per-Tenant Semantic Memory

Critical isolation rule: Every MEMORY_INDEX.query() MUST include filter: { tenant_id: tenantId }. Never call the raw binding — always go through recallSemanticMemory(tenantId, query, topK) in src/lib/memory-patterns.ts.
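
The isolation rule can be enforced structurally by making the tenant filter a required part of the query options. The sketch below is not the code in src/lib/memory-patterns.ts: IndexLike is a hypothetical stand-in for the Vectorize binding, and queryTenantMemory takes an already-computed vector, whereas the real recallSemanticMemory takes a text query.

```typescript
// Minimal stand-in for the index binding; the tenant filter is mandatory in the type.
interface TenantQueryOptions { topK: number; filter: { tenant_id: string } }
interface MemoryMatch { id: string; score: number }
interface IndexLike {
  query(vector: number[], options: TenantQueryOptions): Promise<{ matches: MemoryMatch[] }>;
}

async function queryTenantMemory(
  index: IndexLike,
  tenantId: string,
  vector: number[],
  topK = 5,
): Promise<MemoryMatch[]> {
  // Fail closed: an empty tenant id must never turn into an unfiltered query.
  if (tenantId.trim() === "") {
    throw new Error("tenantId is required for MEMORY_INDEX queries");
  }
  const result = await index.query(vector, { topK, filter: { tenant_id: tenantId } });
  return result.matches;
}
```

Because callers never build the options object themselves, there is no code path that omits the filter.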

Write path: Non-blocking — always via ctx.waitUntil(learnSemanticMemory(...)). Never await in the hot path.
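
A minimal sketch of the non-blocking write pattern follows. ExecutionContextLike mirrors only the waitUntil method of the Workers execution context, and respondAndLearn, its learn callback, and the onError hook are hypothetical names used for illustration.

```typescript
interface ExecutionContextLike { waitUntil(promise: Promise<unknown>): void }

function respondAndLearn(
  ctx: ExecutionContextLike,
  learn: () => Promise<void>,  // e.g. a closure around learnSemanticMemory(...)
  body: string,
  onError: (err: unknown) => void = (err) => console.error("memory write failed", err),
): string {
  // Hand the write to the runtime; the response does not wait for it.
  // Failures are routed to onError so they never surface as unhandled rejections.
  ctx.waitUntil(learn().catch((err: unknown) => onError(err)));
  return body;
}
```

The response is returned immediately, while the runtime keeps the request alive until the promise passed to waitUntil settles.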

VECTORIZE_INDEX — Client Knowledge (RAG)

Powers the search_knowledge MCP tool. Same isolation model as MEMORY_INDEX — every query includes filter: { tenant_id: ctx.tenantId }. The search_knowledge tool enforces this internally (tenant from context, Invariant 3).

Adding a New Tool to the Capability Index

Add to TOOLS.md: New row in docs/requirements/TOOLS.md
Add to src/lib/tool-catalog.ts: New entry(...) call (update EXPECTED_TOOL_COUNT)
Register in gateway: src/handlers/mcp.ts + src/handlers/internal-tool.ts
Re-run embed script: tsx scripts/embed-tool-catalog.ts
Verify: npm test — the capabilities test asserts count parity

The Vectorize CI check (scripts/verify-capability-registry.mjs) runs after every test suite and will fail if the embedded count doesn’t match EXPECTED_TOOL_COUNT.
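
The parity assertion amounts to comparing three numbers. The sketch below is not the contents of scripts/verify-capability-registry.mjs; checkParity and its inputs are hypothetical, and the per-tool "missing embedding" report is an added convenience beyond the count check the source describes.

```typescript
interface ParityResult { ok: boolean; problems: string[] }

function checkParity(
  catalogSlugs: string[],   // tool names declared in the catalog
  embeddedIds: string[],    // vector ids present in the index
  expectedCount: number,    // EXPECTED_TOOL_COUNT
): ParityResult {
  const problems: string[] = [];
  if (catalogSlugs.length !== expectedCount) {
    problems.push(`catalog has ${catalogSlugs.length} tools, expected ${expectedCount}`);
  }
  if (embeddedIds.length !== expectedCount) {
    problems.push(`index has ${embeddedIds.length} vectors, expected ${expectedCount}`);
  }
  // Name the tools that still need embedding, to make the failure actionable.
  const embedded = new Set(embeddedIds);
  for (const slug of catalogSlugs) {
    if (!embedded.has(slug)) problems.push(`missing embedding: ${slug}`);
  }
  return { ok: problems.length === 0, problems };
}
```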

Debugging Vectorize

Check if CAPABILITY_INDEX has vectors:

wrangler vectorize info capability_index --name ascend-gateway-v5

Check KV priors for a specific tool:

wrangler kv:key get "capability_index:hubspot_crm" \
  --namespace-id $(wrangler kv:namespace list | jq -r '.[] | select(.title == "ASCEND_KV") | .id') \
  --name ascend-gateway-v5

Test semantic query from the gateway:

curl -X POST https://ascend-gateway-v5.ascendgtm.workers.dev/admin/capabilities/query \
  -H "Authorization: Bearer $ASCEND_GATEWAY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "google ads report", "topK": 3}'

Why a tool scores low:

Its embed text is too short or generic (edit the purpose string in src/lib/tool-catalog.ts)
Its priors KV key is missing, so it gets no usage boost and score_composite equals score_vector (run the recompute-capability-priors.ts cron manually)
Its embedding is stale (re-run embed-tool-catalog.ts)