docs(routing): document Mode 2 routing pod + env vars
Add routing pod to README architecture diagram and env vars table. Add routing MCP endpoint to .context/PROJECT.md. Regenerate derived context adapters via task context:sync. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -227,6 +227,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
|
||||
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
|
||||
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
|
||||
migration.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
the same four cost-routable skills as the supervisor (`review`, `debug`,
|
||||
`retrospective`, `trainer`) but per-call decides whether to use a local
|
||||
model or Claude based on the brain's `/pass-rate` response. Bearer auth
|
||||
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
|
||||
endpoint; Mode 1 and Mode 3 do not.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
|
||||
|
||||
@@ -56,6 +56,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
|
||||
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
|
||||
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
|
||||
migration.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
the same four cost-routable skills as the supervisor (`review`, `debug`,
|
||||
`retrospective`, `trainer`) but per-call decides whether to use a local
|
||||
model or Claude based on the brain's `/pass-rate` response. Bearer auth
|
||||
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
|
||||
endpoint; Mode 1 and Mode 3 do not.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
|
||||
|
||||
@@ -232,6 +232,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
|
||||
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
|
||||
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
|
||||
migration.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
the same four cost-routable skills as the supervisor (`review`, `debug`,
|
||||
`retrospective`, `trainer`) but per-call decides whether to use a local
|
||||
model or Claude based on the brain's `/pass-rate` response. Bearer auth
|
||||
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
|
||||
endpoint; Mode 1 and Mode 3 do not.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
|
||||
|
||||
@@ -230,6 +230,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
|
||||
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
|
||||
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
|
||||
migration.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
the same four cost-routable skills as the supervisor (`review`, `debug`,
|
||||
`retrospective`, `trainer`) but per-call decides whether to use a local
|
||||
model or Claude based on the brain's `/pass-rate` response. Bearer auth
|
||||
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
|
||||
endpoint; Mode 1 and Mode 3 do not.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
|
||||
|
||||
@@ -227,6 +227,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
|
||||
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
|
||||
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
|
||||
migration.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
the same four cost-routable skills as the supervisor (`review`, `debug`,
|
||||
`retrospective`, `trainer`) but per-call decides whether to use a local
|
||||
model or Claude based on the brain's `/pass-rate` response. Bearer auth
|
||||
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
|
||||
endpoint; Mode 1 and Mode 3 do not.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
|
||||
|
||||
@@ -56,6 +56,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
|
||||
`tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
|
||||
`trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
|
||||
migration.
|
||||
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
|
||||
the same four cost-routable skills as the supervisor (`review`, `debug`,
|
||||
`retrospective`, `trainer`) but per-call decides whether to use a local
|
||||
model or Claude based on the brain's `/pass-rate` response. Bearer auth
|
||||
via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
|
||||
endpoint; Mode 1 and Mode 3 do not.
|
||||
|
||||
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
|
||||
`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
|
||||
|
||||
@@ -12,6 +12,7 @@ Your Claude Code session (in any project)
|
||||
│
|
||||
│ MCP over HTTP (Tailscale)
|
||||
├──▶ supervisor :3200 (NodePort 30320 on koala) — skill workers: tdd, debug, spec, …
|
||||
├──▶ routing :3210 (NodePort 30310 on koala) — Mode 2 only: review, debug, retrospective, trainer
|
||||
└──▶ brain :3300 (NodePort 30330 on koala) — brain_query, brain_write, brain_ingest, session_log
|
||||
│
|
||||
└─ also serves the legacy REST endpoints (/query, /write, /ingest, …)
|
||||
@@ -112,6 +113,14 @@ The supervisor probes connectivity at call time:
|
||||
| `INGEST_BASE_URL` | `http://localhost:3300` | Supervisor → ingestion |
|
||||
| `LITELLM_BASE_URL` | — | LiteLLM proxy for Tier 2 model routing |
|
||||
| `SUPERVISOR_MCP_TOKEN` | — | Optional bearer token for the supervisor MCP HTTP endpoint; when empty, no auth is enforced |
|
||||
| `ROUTING_PORT` | `3210` | Routing pod's listen port |
|
||||
| `ROUTING_MCP_TOKEN` | — | Optional bearer token for the routing MCP HTTP endpoint |
|
||||
| `BRAIN_URL` | `http://ingestion.supervisor:3300` | Routing pod → brain (in-cluster) |
|
||||
| `HYPERGUILD_LOCAL_MODEL` | `qwen35` | Local model for routed-to-local skill calls |
|
||||
| `HYPERGUILD_CLAUDE_MODEL` | `claude-sonnet-4-6` | Claude model for routed-to-Claude skill calls |
|
||||
| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | At/above pass rate, route to local |
|
||||
| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Below pass rate, route to Claude. Between CEIL and FLOOR is the sample band. |
|
||||
| `HYPERGUILD_PASS_RATE_TTL_SECONDS` | `60` | Per-skill pass-rate cache TTL |
|
||||
|
||||
## Phase 2 (planned)
|
||||
|
||||
|
||||
Reference in New Issue
Block a user