diff --git a/.aider.conventions.md b/.aider.conventions.md
index b65b8b0..c51c98a 100644
--- a/.aider.conventions.md
+++ b/.aider.conventions.md
@@ -221,31 +221,28 @@ Key skills:
 
 ## MCP endpoints
 
-Two MCP servers expose this project's tooling, both reachable over Tailscale:
+Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
 
-- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
-  `brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
-  by the ingestion service directly.
-- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
-  `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
-  `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
-  migration.
+- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
+  `brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
+  `brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
+  service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
 - **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
-  the same four cost-routable skills as the supervisor (`review`, `debug`,
-  `retrospective`, `trainer`) but per-call decides whether to use a local
-  model or Claude based on the brain's `/pass-rate` response. Bearer auth
-  via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
-  endpoint; Mode 1 and Mode 3 do not.
+  `review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
+  or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
+  (opt-in). Only `mode client-local` registers this endpoint.
+
+The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
+skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
+the routing pod; brain tools moved to the brain MCP.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
-`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
-shell scripts and non-MCP clients.
+`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
+for shell scripts and non-MCP clients.
 
-The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
-endpoint that aggregates `final_status` counts from session logs and returns
-`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
-reads this to decide whether to route skill calls to local models. Pass rate
-is `null` when no logged invocations are in the window.
+`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
+gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
+and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
 
 ## Agent instructions
 
diff --git a/.context/PROJECT.md b/.context/PROJECT.md
index 8e1b4e5..a0ad2b7 100644
--- a/.context/PROJECT.md
+++ b/.context/PROJECT.md
@@ -47,31 +47,28 @@
 
 ## MCP endpoints
 
-Two MCP servers expose this project's tooling, both reachable over Tailscale:
+Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
 
-- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
-  `brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
-  by the ingestion service directly.
-- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
-  `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
-  `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
-  migration.
+- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
+  `brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
+  `brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
+  service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
 - **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
-  the same four cost-routable skills as the supervisor (`review`, `debug`,
-  `retrospective`, `trainer`) but per-call decides whether to use a local
-  model or Claude based on the brain's `/pass-rate` response. Bearer auth
-  via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
-  endpoint; Mode 1 and Mode 3 do not.
+  `review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
+  or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
+  (opt-in). Only `mode client-local` registers this endpoint.
+
+The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
+skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
+the routing pod; brain tools moved to the brain MCP.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
-`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
-shell scripts and non-MCP clients.
+`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
+for shell scripts and non-MCP clients.
 
-The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
-endpoint that aggregates `final_status` counts from session logs and returns
-`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
-reads this to decide whether to route skill calls to local models. Pass rate
-is `null` when no logged invocations are in the window.
+`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
+gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
+and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
 
 ## Agent instructions
 
diff --git a/.context/system-prompt.txt b/.context/system-prompt.txt
index 1990200..26ed0bd 100644
--- a/.context/system-prompt.txt
+++ b/.context/system-prompt.txt
@@ -226,31 +226,28 @@ Key skills:
 
 ## MCP endpoints
 
-Two MCP servers expose this project's tooling, both reachable over Tailscale:
+Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
 
-- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
-  `brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
-  by the ingestion service directly.
-- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
-  `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
-  `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
-  migration.
+- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
+  `brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
+  `brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
+  service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
 - **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
-  the same four cost-routable skills as the supervisor (`review`, `debug`,
-  `retrospective`, `trainer`) but per-call decides whether to use a local
-  model or Claude based on the brain's `/pass-rate` response. Bearer auth
-  via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
-  endpoint; Mode 1 and Mode 3 do not.
+  `review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
+  or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
+  (opt-in). Only `mode client-local` registers this endpoint.
+
+The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
+skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
+the routing pod; brain tools moved to the brain MCP.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
-`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
-shell scripts and non-MCP clients.
+`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
+for shell scripts and non-MCP clients.
 
-The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
-endpoint that aggregates `final_status` counts from session logs and returns
-`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
-reads this to decide whether to route skill calls to local models. Pass rate
-is `null` when no logged invocations are in the window.
+`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
+gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
+and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
 
 ## Agent instructions
 
diff --git a/.cursorrules b/.cursorrules
index 7f335ff..b05c0f3 100644
--- a/.cursorrules
+++ b/.cursorrules
@@ -224,31 +224,28 @@ Key skills:
 
 ## MCP endpoints
 
-Two MCP servers expose this project's tooling, both reachable over Tailscale:
+Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
 
-- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
-  `brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
-  by the ingestion service directly.
-- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
-  `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
-  `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
-  migration.
+- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
+  `brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
+  `brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
+  service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
 - **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
-  the same four cost-routable skills as the supervisor (`review`, `debug`,
-  `retrospective`, `trainer`) but per-call decides whether to use a local
-  model or Claude based on the brain's `/pass-rate` response. Bearer auth
-  via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
-  endpoint; Mode 1 and Mode 3 do not.
+  `review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
+  or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
+  (opt-in). Only `mode client-local` registers this endpoint.
+
+The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
+skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
+the routing pod; brain tools moved to the brain MCP.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
-`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
-shell scripts and non-MCP clients.
+`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
+for shell scripts and non-MCP clients.
 
-The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
-endpoint that aggregates `final_status` counts from session logs and returns
-`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
-reads this to decide whether to route skill calls to local models. Pass rate
-is `null` when no logged invocations are in the window.
+`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
+gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
+and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
 
 ## Agent instructions
 
diff --git a/AGENTS.md b/AGENTS.md
index b65b8b0..c51c98a 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -221,31 +221,28 @@ Key skills:
 
 ## MCP endpoints
 
-Two MCP servers expose this project's tooling, both reachable over Tailscale:
+Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
 
-- **`brain`** at `http://koala:30330/mcp` — preferred path for `brain_query`,
-  `brain_write`, `brain_ingest`, `brain_ingest_raw`, and `session_log`. Hosted
-  by the ingestion service directly.
-- **`supervisor`** at `http://koala:30320/mcp` — skill workers (`tdd_red`,
-  `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
-  `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
-  migration.
+- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
+  `brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
+  `brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
+  service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
 - **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
-  the same four cost-routable skills as the supervisor (`review`, `debug`,
-  `retrospective`, `trainer`) but per-call decides whether to use a local
-  model or Claude based on the brain's `/pass-rate` response. Bearer auth
-  via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
-  endpoint; Mode 1 and Mode 3 do not.
+  `review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
+  or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
+  (opt-in). Only `mode client-local` registers this endpoint.
+
+The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
+skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
+the routing pod; brain tools moved to the brain MCP.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
-`/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
-shell scripts and non-MCP clients.
+`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
+for shell scripts and non-MCP clients.
 
-The brain HTTP REST API also serves a read-only `GET /pass-rate?skill=X&window=Y`
-endpoint that aggregates `final_status` counts from session logs and returns
-`{skill, window, pass, fail, skip, total, pass_rate}`. Plan 6 (routing pod)
-reads this to decide whether to route skill calls to local models. Pass rate
-is `null` when no logged invocations are in the window.
+`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
+gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
+and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
 
 ## Agent instructions