Compare commits
17 Commits
b6bcc93048
...
bee4bb3c1f
| Author | SHA1 | Date |
|---|---|---|
| | bee4bb3c1f | |
| | d72454d929 | |
| | cf94d14922 | |
| | 78a43d6a42 | |
| | ca933eef46 | |
| | 88782de07c | |
| | 083c2d7db9 | |
| | 751f410ca6 | |
| | 3a99d5e20e | |
| | 9a258ca32a | |
| | 2a5a74f7c0 | |
| | d40a5ac890 | |
| | b77820534a | |
| | db64ecb1d9 | |
| | ea29e5ebb8 | |
| | ccf080db59 | |
| | 69c038478b | |
@@ -227,6 +227,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
 `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
 `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
 migration.
+- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
+the same four cost-routable skills as the supervisor (`review`, `debug`,
+`retrospective`, `trainer`) but per-call decides whether to use a local
+model or Claude based on the brain's `/pass-rate` response. Bearer auth
+via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
+endpoint; Mode 1 and Mode 3 do not.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
 `/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
@@ -56,6 +56,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
 `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
 `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
 migration.
+- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
+the same four cost-routable skills as the supervisor (`review`, `debug`,
+`retrospective`, `trainer`) but per-call decides whether to use a local
+model or Claude based on the brain's `/pass-rate` response. Bearer auth
+via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
+endpoint; Mode 1 and Mode 3 do not.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
 `/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
@@ -232,6 +232,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
 `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
 `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
 migration.
+- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
+the same four cost-routable skills as the supervisor (`review`, `debug`,
+`retrospective`, `trainer`) but per-call decides whether to use a local
+model or Claude based on the brain's `/pass-rate` response. Bearer auth
+via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
+endpoint; Mode 1 and Mode 3 do not.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
 `/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
@@ -230,6 +230,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
 `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
 `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
 migration.
+- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
+the same four cost-routable skills as the supervisor (`review`, `debug`,
+`retrospective`, `trainer`) but per-call decides whether to use a local
+model or Claude based on the brain's `/pass-rate` response. Bearer auth
+via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
+endpoint; Mode 1 and Mode 3 do not.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
 `/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
@@ -15,6 +15,7 @@ jobs:
 SERVICE: supervisor
 IMAGE: gitea.d-ma.be/mathias/supervisor
 INGESTION_IMAGE: gitea.d-ma.be/mathias/ingestion
+ROUTING_IMAGE: gitea.d-ma.be/mathias/routing
 INFRA_REPO: git@gitea.d-ma.be:mathias/infra.git
 BUILDKIT_HOST: unix:///run/buildkit/buildkitd.sock
 steps:
@@ -62,6 +63,28 @@ jobs:
 
 echo "Built and pushed ${INGESTION_IMAGE}:${IMAGE_TAG}"
 
+- name: Build and push routing image
+run: |
+set -e
+trap 'rm -f /tmp/routing-image.tar' EXIT
+IMAGE_TAG="${{ github.sha }}"
+echo "Building ${ROUTING_IMAGE}:${IMAGE_TAG}"
+
+buildctl --addr "${BUILDKIT_HOST}" build \
+--frontend dockerfile.v0 \
+--local context=. \
+--local dockerfile=. \
+--opt filename=Dockerfile.routing \
+--opt build-arg:VERSION="${IMAGE_TAG}" \
+--output type=oci,dest=/tmp/routing-image.tar
+
+skopeo copy \
+oci-archive:/tmp/routing-image.tar \
+docker://${ROUTING_IMAGE}:${IMAGE_TAG} \
+--dest-creds "${{ secrets.REGISTRY_CREDS }}"
+
+echo "Built and pushed ${ROUTING_IMAGE}:${IMAGE_TAG}"
+
 - name: Update infra repo
 run: |
 set -e
@@ -83,10 +106,15 @@ jobs:
 sed -i "s|gitea.d-ma.be/mathias/ingestion:.*|gitea.d-ma.be/mathias/ingestion:${IMAGE_TAG}|" \
 "k3s/apps/${SERVICE}/ingestion-deployment.yaml"
 
+sed -i "s|gitea.d-ma.be/mathias/routing:.*|gitea.d-ma.be/mathias/routing:${IMAGE_TAG}|" \
+"k3s/apps/routing/deployment.yaml"
+
 git config user.email "cd-bot@d-ma.be"
 git config user.name "CD Bot"
-git add "k3s/apps/${SERVICE}/deployment.yaml" "k3s/apps/${SERVICE}/ingestion-deployment.yaml"
-git commit -m "chore(deploy): ${SERVICE}+ingestion → ${IMAGE_TAG}"
+git add "k3s/apps/${SERVICE}/deployment.yaml" \
+"k3s/apps/${SERVICE}/ingestion-deployment.yaml" \
+"k3s/apps/routing/deployment.yaml"
+git commit -m "chore(deploy): supervisor+ingestion+routing → ${IMAGE_TAG}"
 GIT_SSH_COMMAND="ssh -i ~/.ssh/infra_deploy_key -o IdentitiesOnly=yes" \
 git push
 
@@ -227,6 +227,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
 `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
 `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
 migration.
+- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
+the same four cost-routable skills as the supervisor (`review`, `debug`,
+`retrospective`, `trainer`) but per-call decides whether to use a local
+model or Claude based on the brain's `/pass-rate` response. Bearer auth
+via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
+endpoint; Mode 1 and Mode 3 do not.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
 `/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
@@ -56,6 +56,12 @@ Two MCP servers expose this project's tooling, both reachable over Tailscale:
 `tdd_green`, `tdd_refactor`, `review`, `debug`, `spec`, `retrospective`,
 `trainer`, `tier`). Will shrink as skill workers move to SKILL.md in a later
 migration.
+- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
+the same four cost-routable skills as the supervisor (`review`, `debug`,
+`retrospective`, `trainer`) but per-call decides whether to use a local
+model or Claude based on the brain's `/pass-rate` response. Bearer auth
+via `ROUTING_MCP_TOKEN` (opt-in). Only `mode client-local` registers this
+endpoint; Mode 1 and Mode 3 do not.
 
 The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
 `/ingest-path`, `/backfill-refs`) remains available on the same port (3300) for
25
DECISIONS.md
@@ -67,6 +67,31 @@ Record *why* things are the way they are. Future-you will thank present-you.
 
 ---
 
+## Plan 6: routing pod reuses internal/skills/{review,debug,retrospective,trainer}
+
+Plan 6 (Mode 2 routing pod, 2026-05-04) introduces a second consumer of
+the four cost-routable skill packages. The routing pod constructs each
+skill via `<pkg>.New(Config{...})` and hands it `routing.Router.Run` as
+the `CompleteFunc`. Plan 7 (supervisor retirement) MUST NOT delete the
+four packages.
+
+**Plan 7's allowed deletions:**
+- `internal/skills/{tdd,spec,tier}/` (not consumed by the routing pod)
+- `cmd/supervisor/` (binary)
+- `Dockerfile` (supervisor's, at repo root — distinct from `Dockerfile.routing`)
+- supervisor manifests in the infra repo
+- NodePort `:30320`
+
+**Plan 7's preserved code:**
+- `internal/skills/{review,debug,retrospective,trainer}/`
+- `internal/registry`
+- `internal/mcp`
+- `internal/exec/litellm.go`
+- `internal/routing/` (entirely new in Plan 6)
+- `cmd/routing/`
+
+---
+
 ## 2026-04-08 — Mistral Vibe gets its own adapter
 
 **Context**: Vibe doesn't read `AGENTS.md` — it uses `~/.vibe/prompts/` and `~/.vibe/agents/` with TOML config.
30
Dockerfile.routing
Normal file
@@ -0,0 +1,30 @@
+# syntax=docker/dockerfile:1
+
+# ── Build stage ───────────────────────────────────────────────────────────────
+FROM golang:1.26-bookworm AS builder
+
+ARG VERSION=dev
+WORKDIR /src
+
+COPY go.mod go.sum ./
+RUN go mod download
+
+COPY . .
+RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
+    go build -trimpath -ldflags="-s -w -X main.version=${VERSION}" \
+    -o /out/routing ./cmd/routing
+
+# ── Runtime stage ─────────────────────────────────────────────────────────────
+FROM gcr.io/distroless/base-debian12
+
+COPY --from=builder /out/routing /usr/local/bin/routing
+COPY config/ /app/config/
+
+ENV SUPERVISOR_CONFIG_DIR=/app/config/supervisor
+ENV ROUTING_PORT=3210
+
+EXPOSE 3210
+
+USER 65532:65532
+
+ENTRYPOINT ["/usr/local/bin/routing"]
11
README.md
@@ -12,6 +12,7 @@ Your Claude Code session (in any project)
 │
 │ MCP over HTTP (Tailscale)
 ├──▶ supervisor :3200 (NodePort 30320 on koala) — skill workers: tdd, debug, spec, …
+├──▶ routing :3210 (NodePort 30310 on koala) — Mode 2 only: review, debug, retrospective, trainer
 └──▶ brain :3300 (NodePort 30330 on koala) — brain_query, brain_write, brain_ingest, session_log
 │
 └─ also serves the legacy REST endpoints (/query, /write, /ingest, …)
@@ -112,6 +113,16 @@ The supervisor probes connectivity at call time:
 | `INGEST_BASE_URL` | `http://localhost:3300` | Supervisor → ingestion |
 | `LITELLM_BASE_URL` | — | LiteLLM proxy for Tier 2 model routing |
 | `SUPERVISOR_MCP_TOKEN` | — | Optional bearer token for the supervisor MCP HTTP endpoint; when empty, no auth is enforced |
+| `ROUTING_PORT` | `3210` | Routing pod's listen port |
+| `ROUTING_MCP_TOKEN` | — | Optional bearer token for the routing MCP HTTP endpoint |
+| `BRAIN_URL` | `http://ingestion.supervisor:3300` | Routing pod → brain (in-cluster) |
+| `HYPERGUILD_LOCAL_MODEL` | `qwen35` | Local model for routed-to-local skill calls |
+| `HYPERGUILD_CLAUDE_MODEL` | `claude-sonnet-4-6` | Claude model for routed-to-Claude skill calls |
+| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | At or above this pass rate, route to local |
+| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Below this pass rate, route to Claude. Between CEIL and FLOOR is the sample band. |
+| `HYPERGUILD_PASS_RATE_TTL_SECONDS` | `60` | Per-skill pass-rate cache TTL |
+
+> **Operator note:** LiteLLM at `LITELLM_BASE_URL` must register both `HYPERGUILD_LOCAL_MODEL` and `HYPERGUILD_CLAUDE_MODEL` for routing to do useful work. If a model is missing, LiteLLM returns 4xx for calls to it. If both are missing, the routing pod's local route fails, the fail-open retry on Claude fails too, and the only signal is `final_status: "fail"` on `_routing` entries in the brain.
 
 ## Phase 2 (planned)
 
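The two thresholds in the README table read more easily as code. The sketch below is a guess at how a policy with `Floor` and `Ceil` fields could pick a route. It is not the contents of `internal/routing/policy.go`, which this diff does not show. The `Decide` method, its signature, and the injected `coin` function are assumptions made for illustration. The 50/50 split inside the band follows the comment on `RouteLocalFloor` in `internal/config/routing.go`.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Policy mirrors the two README thresholds. The field names follow the diff;
// the Decide method below is hypothetical.
type Policy struct {
	Floor float64 // HYPERGUILD_ROUTE_LOCAL_FLOOR: at or above, route to local
	Ceil  float64 // HYPERGUILD_ROUTE_LOCAL_CEIL: below, route to Claude
}

// Decide returns "local" or "claude". coin must return a value in [0, 1) and
// is consulted only inside the sample band (Ceil <= passRate < Floor).
func (p Policy) Decide(passRate float64, coin func() float64) string {
	switch {
	case passRate >= p.Floor:
		return "local"
	case passRate < p.Ceil:
		return "claude"
	case coin() < 0.5:
		return "local" // sample band, first half of the coin flip
	default:
		return "claude" // sample band, second half of the coin flip
	}
}

func main() {
	p := Policy{Floor: 0.90, Ceil: 0.70}
	for _, rate := range []float64{0.95, 0.80, 0.50} {
		fmt.Printf("pass_rate=%.2f -> %s\n", rate, p.Decide(rate, rand.Float64))
	}
}
```

Injecting `coin` keeps the band behaviour testable: a test can pass a fixed function instead of `rand.Float64`.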
@@ -128,6 +128,11 @@ tasks:
 -H "Content-Type: application/json" \
 -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | jq .
 
+smoke:routing:
+desc: Boot the routing pod against live LiteLLM + brain and verify _routing logs land
+cmds:
+- bash scripts/smoke-routing.sh
+
 # ── Git / Release ──────────────────────────────────────────────────────────
 
 tag:
@@ -115,9 +115,10 @@ Flags:
 Modes:
 
 - **cloud** — brain MCP only. Claude Code with no routing.
-- **client-local** — brain + routing placeholder. The routing entry's
-URL points at `koala:30310/mcp`; a `_routing_pending` field marks it
-as awaiting Plan 6 of the hyperguild migration.
+- **client-local** — brain + routing pod. The `routing` entry points at
+`koala:30310/mcp` (the routing pod, deployed in Plan 6). The
+`X-Hyperguild-Mode: client-local` header is forward-compat for future
+modes; the pod treats absent or unknown values as `client-local`.
 - **sovereign** — brain only, with a `_mode_note` explaining that this
 mode primarily uses Crush + LiteLLM and the `.mcp.json` is a Claude
 Code fallback for emergency offline use.
@@ -80,7 +80,9 @@ func modeClientLocal(brainURL string) map[string]any {
 			"routing": map[string]any{
 				"url": "http://koala:30310/mcp",
 				"description": "Mode 2 routing pod — routes skill calls to LiteLLM/local",
-				"_routing_pending": "Plan 6 — routing pod not deployed yet; this URL is a placeholder",
+				"headers": map[string]any{
+					"X-Hyperguild-Mode": "client-local",
+				},
 			},
 		},
 	}
@@ -39,7 +39,7 @@ func TestRunMode_Cloud_Default(t *testing.T) {
 	assert.NotContains(t, got, "_mode_note")
 }
 
-func TestRunMode_ClientLocal_HasRoutingPlaceholder(t *testing.T) {
+func TestRunMode_ClientLocal_HasRoutingEntry(t *testing.T) {
 	dir := t.TempDir()
 	outPath := filepath.Join(dir, ".mcp.json")
 	t.Setenv("BRAIN_URL", "http://koala:30330")
@@ -54,7 +54,32 @@ func TestRunMode_ClientLocal_HasRoutingPlaceholder(t *testing.T) {
 	require.Contains(t, servers, "routing")
 
 	routing := servers["routing"].(map[string]any)
-	assert.Contains(t, routing, "_routing_pending")
+	assert.NotContains(t, routing, "_routing_pending", "placeholder should be removed once Plan 6 ships")
+
+	headers, ok := routing["headers"].(map[string]any)
+	require.True(t, ok, "routing entry should have headers block")
+	assert.Equal(t, "client-local", headers["X-Hyperguild-Mode"])
+}
+
+func TestModeClientLocalHasRoutingHeader(t *testing.T) {
+	tmp := t.TempDir() + "/mcp.json"
+	out := &bytes.Buffer{}
+	stderr := &bytes.Buffer{}
+	require.NoError(t, runMode(context.Background(), []string{"client-local", "--out", tmp}, nil, out, stderr))
+
+	body, err := os.ReadFile(tmp)
+	require.NoError(t, err)
+	var doc map[string]any
+	require.NoError(t, json.Unmarshal(body, &doc))
+
+	servers := doc["mcpServers"].(map[string]any)
+	routing := servers["routing"].(map[string]any)
+	assert.Equal(t, "http://koala:30310/mcp", routing["url"])
+	assert.NotContains(t, routing, "_routing_pending", "placeholder should be removed once Plan 6 ships")
+
+	headers, ok := routing["headers"].(map[string]any)
+	require.True(t, ok, "routing entry should have headers block")
+	assert.Equal(t, "client-local", headers["X-Hyperguild-Mode"])
 }
 
 func TestRunMode_Sovereign_HasModeNote(t *testing.T) {
123
cmd/routing/main.go
Normal file
@@ -0,0 +1,123 @@
+package main
+
+// The internal/skills/{debug,retrospective,review,trainer} packages imported
+// below are also imported by cmd/supervisor. Plan 7 (supervisor retirement)
+// MUST NOT delete these four packages — the routing pod is their second
+// consumer. Plan 7 deletes only internal/skills/{tdd,spec,tier} (the skills
+// that don't route to local), the supervisor binary, and supervisor manifests.
+// See docs/superpowers/specs/2026-05-04-mode-2-routing-pod-design.md (Constraints).
+
+import (
+	"context"
+	"log/slog"
+	"net/http"
+	"os"
+	"time"
+
+	"github.com/mathiasbq/supervisor/internal/config"
+	iexec "github.com/mathiasbq/supervisor/internal/exec"
+	"github.com/mathiasbq/supervisor/internal/mcp"
+	"github.com/mathiasbq/supervisor/internal/registry"
+	"github.com/mathiasbq/supervisor/internal/routing"
+	"github.com/mathiasbq/supervisor/internal/skills/debug"
+	"github.com/mathiasbq/supervisor/internal/skills/retrospective"
+	"github.com/mathiasbq/supervisor/internal/skills/review"
+	"github.com/mathiasbq/supervisor/internal/skills/trainer"
+)
+
+func main() {
+	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
+	slog.SetDefault(logger)
+
+	cfg, err := config.LoadRouting()
+	if err != nil {
+		logger.Error("config load failed", "err", err)
+		os.Exit(1)
+	}
+
+	configDir := envOr("SUPERVISOR_CONFIG_DIR", "/app/config/supervisor")
+	mustRead := func(path string) string {
+		b, err := os.ReadFile(configDir + "/" + path)
+		if err != nil {
+			logger.Error("read prompt failed", "path", path, "err", err)
+			os.Exit(1)
+		}
+		return string(b)
+	}
+
+	llm := iexec.NewLiteLLM(cfg.LiteLLMBaseURL, cfg.LiteLLMAPIKey, 0)
+
+	router := &routing.Router{
+		Fetcher: routing.NewFetcher(cfg.BrainURL, "7d", time.Duration(cfg.PassRateTTLSeconds)*time.Second),
+		Logger: routing.NewLogger(cfg.BrainURL),
+		Policy: routing.Policy{Floor: cfg.RouteLocalFloor, Ceil: cfg.RouteLocalCeil},
+		LocalModel: cfg.LocalModel,
+		ClaudeModel: cfg.ClaudeModel,
+		Complete: llm.Complete,
+	}
+
+	// Skill packages call CompleteFunc(ctx, model, system, user) — no session_id
+	// or project_root in the signature. Rather than modifying every skill's API
+	// (and inflating Plan 6's blast radius), the routing pod logs every decision
+	// under a fixed session_id "_routing". Operators query
+	// `GET /pass-rate?skill=_routing&window=...` to inspect routing health.
+	const routingSessionID = "_routing"
+	wrap := func(skillName string) routing.CompleteFunc {
+		return func(ctx context.Context, _, system, user string) (string, int64, error) {
+			// The model param is ignored: the router picks the model based on policy.
+			return router.Run(ctx, routing.RunInput{
+				Skill: skillName,
+				System: system,
+				User: user,
+				SessionID: routingSessionID,
+				ProjectRoot: "",
+			})
+		}
+	}
+
+	reg := registry.New()
+	reg.Register(review.New(review.Config{
+		SkillPrompt: mustRead("review.md"),
+		DefaultModel: cfg.LocalModel,
+		CompleteFunc: review.CompleteFunc(wrap("review")),
+	}))
+	reg.Register(debug.New(debug.Config{
+		SkillPrompt: mustRead("debug.md"),
+		DefaultModel: cfg.LocalModel,
+		CompleteFunc: debug.CompleteFunc(wrap("debug")),
+	}))
+	reg.Register(retrospective.New(retrospective.Config{
+		SkillPrompt: mustRead("retrospective.md"),
+		DefaultModel: cfg.LocalModel,
+		CompleteFunc: retrospective.CompleteFunc(wrap("retrospective")),
+	}))
+	reg.Register(trainer.New(trainer.Config{
+		ReaderPrompt: mustRead("trainer-reader.md"),
+		WriterPrompt: mustRead("trainer-writer.md"),
+		DefaultModel: cfg.LocalModel,
+		CompleteFunc: trainer.CompleteFunc(wrap("trainer")),
+	}))
+
+	srv := mcp.NewServer(reg, cfg.MCPAuthToken)
+	mux := http.NewServeMux()
+	mux.Handle("/mcp", srv)
+	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
+		w.WriteHeader(http.StatusOK)
+	})
+
+	addr := ":" + cfg.Port
+	logger.Info("routing pod starting", "addr", addr,
+		"local", cfg.LocalModel, "claude", cfg.ClaudeModel,
+		"floor", cfg.RouteLocalFloor, "ceil", cfg.RouteLocalCeil)
+	if err := http.ListenAndServe(addr, mux); err != nil { //nolint:gosec
+		logger.Error("server stopped", "err", err)
+		os.Exit(1)
+	}
+}
+
+func envOr(key, def string) string {
+	if v := os.Getenv(key); v != "" {
+		return v
+	}
+	return def
+}
123
cmd/routing/main_test.go
Normal file
@@ -0,0 +1,123 @@
+package main_test
+
+import (
+	"context"
+	"encoding/json"
+	"io"
+	"net/http"
+	"net/http/httptest"
+	"os/exec"
+	"strings"
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+// TestRoutingPodEndToEnd boots the binary against fake LiteLLM + brain servers,
+// calls tools/list and one tools/call, and verifies the brain saw a session_log POST.
+func TestRoutingPodEndToEnd(t *testing.T) {
+	if testing.Short() {
+		t.Skip("end-to-end binary boot")
+	}
+
+	var brainHits int
+	llm := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
+		_ = json.NewEncoder(w).Encode(map[string]any{
+			"choices": []map[string]any{{"message": map[string]any{"role": "assistant", "content": "stub"}}},
+		})
+	}))
+	defer llm.Close()
+
+	brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		switch r.URL.Path {
+		case "/pass-rate":
+			brainHits++
+			_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.95})
+		case "/mcp":
+			brainHits++
+			_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
+		}
+	}))
+	defer brain.Close()
+
+	bin := buildRouting(t)
+	cmd := exec.Command(bin)
+	cmd.Env = append(cmd.Env,
+		"ROUTING_PORT=33310",
+		"LITELLM_BASE_URL="+llm.URL,
+		"LITELLM_API_KEY=stub",
+		"BRAIN_URL="+brain.URL,
+		"SUPERVISOR_CONFIG_DIR=../../config/supervisor",
+		"PATH="+osPath(),
+	)
+	require.NoError(t, cmd.Start())
+	t.Cleanup(func() { _ = cmd.Process.Kill() })
+
+	require.NoError(t, waitForPort(t, "127.0.0.1:33310", 5*time.Second))
+
+	resp := mcpCall(t, "http://127.0.0.1:33310/mcp", `{"jsonrpc":"2.0","id":1,"method":"tools/list"}`)
+	assert.Contains(t, resp, `"review"`)
+	assert.Contains(t, resp, `"debug"`)
+	assert.Contains(t, resp, `"retrospective"`)
+	assert.Contains(t, resp, `"trainer"`)
+
+	resp = mcpCall(t, "http://127.0.0.1:33310/mcp", `{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"review","arguments":{"project_root":"/tmp","files":["README.md"]}}}`)
+	_ = resp // shape varies by skill; we only need a 200
+
+	// Wait briefly for the async session_log to land.
+	deadline := time.Now().Add(2 * time.Second)
+	for time.Now().Before(deadline) && brainHits < 2 {
+		time.Sleep(50 * time.Millisecond)
+	}
+	assert.GreaterOrEqual(t, brainHits, 2, "expected at least one /pass-rate hit and one /mcp session_log hit")
+}
+
+func buildRouting(t *testing.T) string {
+	t.Helper()
+	bin := t.TempDir() + "/routing"
+	out, err := exec.Command("go", "build", "-o", bin, "github.com/mathiasbq/supervisor/cmd/routing").CombinedOutput()
+	require.NoError(t, err, "build failed: %s", out)
+	return bin
+}
+
+func waitForPort(_ *testing.T, addr string, dur time.Duration) error {
+	deadline := time.Now().Add(dur)
+	for time.Now().Before(deadline) {
+		c, err := http.Get("http://" + addr + "/healthz") //nolint:noctx
+		if err == nil {
+			_ = c.Body.Close()
+			return nil
+		}
+		conn, err := http.NewRequest(http.MethodPost, "http://"+addr+"/mcp", strings.NewReader(`{}`))
+		if err == nil {
+			r, err := http.DefaultClient.Do(conn)
+			if err == nil {
+				_ = r.Body.Close()
+				return nil
+			}
+		}
+		time.Sleep(50 * time.Millisecond)
+	}
+	return context.DeadlineExceeded
+}
+
+func mcpCall(t *testing.T, url, body string) string {
+	t.Helper()
+	r, err := http.Post(url, "application/json", strings.NewReader(body)) //nolint:noctx
+	require.NoError(t, err)
+	defer func() { _ = r.Body.Close() }()
+	raw, err := io.ReadAll(r.Body)
+	require.NoError(t, err)
+	return string(raw)
+}
+
+func osPath() string {
+	for _, e := range exec.Command("env").Environ() {
+		if strings.HasPrefix(e, "PATH=") {
+			return strings.TrimPrefix(e, "PATH=")
+		}
+	}
+	return "/usr/bin:/bin"
+}
84
internal/config/routing.go
Normal file
@@ -0,0 +1,84 @@
|
|||||||
|
package config
|
||||||
|
|
||||||
|
import (
|
||||||
|
"fmt"
|
||||||
|
"os"
|
||||||
|
"strconv"
|
||||||
|
)
|
||||||
|
|
||||||
|
// RoutingConfig holds the runtime configuration for the routing pod.
|
||||||
|
// Separate from Config because the routing pod's surface differs from the supervisor's.
|
||||||
|
type RoutingConfig struct {
|
||||||
|
Port string // ROUTING_PORT, default 3210
|
||||||
|
MCPAuthToken string // ROUTING_MCP_TOKEN, optional bearer token
|
||||||
|
LiteLLMBaseURL string // LITELLM_BASE_URL, default http://piguard:4000
|
||||||
|
LiteLLMAPIKey string // LITELLM_API_KEY
|
||||||
|
BrainURL string // BRAIN_URL, default http://ingestion.supervisor:3300
|
||||||
|
LocalModel string // HYPERGUILD_LOCAL_MODEL, default qwen35
|
||||||
|
ClaudeModel string // HYPERGUILD_CLAUDE_MODEL, default claude-sonnet-4-6
|
||||||
|
// RouteLocalFloor and RouteLocalCeil intentionally invert the usual
|
||||||
|
// floor < ceil mathematical convention: Floor (default 0.90) is the
|
||||||
|
// UPPER boundary — at/above it, always route local; Ceil (default 0.70)
|
||||||
|
// is the LOWER boundary — below it, always route Claude. The band in
|
||||||
|
// between is the 50/50 sample zone. The naming follows the spec's policy
|
||||||
|
// vocabulary; see internal/routing/policy.go for the consumer.
|
||||||
|
RouteLocalFloor float64 // HYPERGUILD_ROUTE_LOCAL_FLOOR, default 0.90
|
||||||
|
RouteLocalCeil float64 // HYPERGUILD_ROUTE_LOCAL_CEIL, default 0.70
|
||||||
|
PassRateTTLSeconds int // HYPERGUILD_PASS_RATE_TTL_SECONDS, default 60
|
||||||
|
}
|
||||||
|
|
||||||
|
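// LoadRouting reads RoutingConfig from the environment, applying the defaults
// documented on each field. It returns an error if a numeric variable is set
// but does not parse. It does not check that RouteLocalFloor >= RouteLocalCeil.
func LoadRouting() (RoutingConfig, error) {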
|
||||||
|
cfg := RoutingConfig{
|
||||||
|
Port: envOr("ROUTING_PORT", "3210"),
|
||||||
|
MCPAuthToken: os.Getenv("ROUTING_MCP_TOKEN"),
|
||||||
|
LiteLLMBaseURL: envOr("LITELLM_BASE_URL", "http://piguard:4000"),
|
||||||
|
LiteLLMAPIKey: os.Getenv("LITELLM_API_KEY"),
|
||||||
|
BrainURL: envOr("BRAIN_URL", "http://ingestion.supervisor:3300"),
|
||||||
|
LocalModel: envOr("HYPERGUILD_LOCAL_MODEL", "qwen35"),
|
||||||
|
ClaudeModel: envOr("HYPERGUILD_CLAUDE_MODEL", "claude-sonnet-4-6"),
|
||||||
|
}
|
||||||
|
|
||||||
|
floor, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_FLOOR", 0.90)
|
||||||
|
if err != nil {
|
||||||
|
return RoutingConfig{}, err
|
||||||
|
}
|
||||||
|
cfg.RouteLocalFloor = floor
|
||||||
|
|
||||||
|
ceil, err := parseFloatEnv("HYPERGUILD_ROUTE_LOCAL_CEIL", 0.70)
|
||||||
|
if err != nil {
|
||||||
|
return RoutingConfig{}, err
|
||||||
|
}
|
||||||
|
cfg.RouteLocalCeil = ceil
|
||||||
|
|
||||||
|
ttl, err := parseIntEnv("HYPERGUILD_PASS_RATE_TTL_SECONDS", 60)
|
||||||
|
if err != nil {
|
||||||
|
return RoutingConfig{}, err
|
||||||
|
}
|
||||||
|
cfg.PassRateTTLSeconds = ttl
|
||||||
|
|
||||||
|
return cfg, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
func parseFloatEnv(key string, def float64) (float64, error) {
|
||||||
|
v := os.Getenv(key)
|
||||||
|
if v == "" {
|
||||||
|
return def, nil
|
||||||
|
}
|
||||||
|
f, err := strconv.ParseFloat(v, 64)
|
||||||
|
if err != nil {
|
||||||
|
return 0, fmt.Errorf("config: %s: %w", key, err)
|
||||||
|
}
|
||||||
|
return f, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
func parseIntEnv(key string, def int) (int, error) {
|
||||||
|
v := os.Getenv(key)
|
||||||
|
if v == "" {
|
||||||
|
return def, nil
|
||||||
|
}
|
||||||
|
n, err := strconv.Atoi(v)
|
||||||
|
if err != nil {
|
||||||
|
return 0, fmt.Errorf("config: %s: %w", key, err)
|
||||||
|
}
|
||||||
|
return n, nil
|
||||||
|
}
|
||||||
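The inverted Floor/Ceil naming described in the RoutingConfig comment is easy to misconfigure, and LoadRouting as shown accepts any parseable pair. A minimal sketch of a sanity check that could run after loading follows; `validateThresholds` is a hypothetical helper and not part of this diff.

```go
package main

import (
	"errors"
	"fmt"
)

// validateThresholds is a hypothetical helper, not part of the diff.
// It enforces the inverted convention: ceil is the lower boundary and
// floor is the upper boundary, and both must lie within [0, 1].
func validateThresholds(floor, ceil float64) error {
	if floor < 0 || floor > 1 || ceil < 0 || ceil > 1 {
		return errors.New("config: routing thresholds must be within [0, 1]")
	}
	if ceil > floor {
		return fmt.Errorf("config: ceil %.2f must not exceed floor %.2f", ceil, floor)
	}
	return nil
}

func main() {
	fmt.Println(validateThresholds(0.90, 0.70)) // defaults from the diff
	fmt.Println(validateThresholds(0.70, 0.90)) // values swapped by mistake
}
```

Whether a swapped pair should be rejected at startup or only logged is a design choice the diff does not state; the sketch rejects it.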
73
internal/config/routing_test.go
Normal file
@@ -0,0 +1,73 @@
|
|||||||
|
package config_test
|
||||||
|
|
||||||
|
import (
|
||||||
|
"testing"
|
||||||
|
|
||||||
|
"github.com/mathiasbq/supervisor/internal/config"
|
||||||
|
"github.com/stretchr/testify/assert"
|
||||||
|
"github.com/stretchr/testify/require"
|
||||||
|
)
|
||||||
|
|
||||||
|
func TestLoadRoutingDefaults(t *testing.T) {
|
||||||
|
for _, k := range []string{
|
||||||
|
"ROUTING_PORT", "ROUTING_MCP_TOKEN", "LITELLM_BASE_URL", "LITELLM_API_KEY",
|
||||||
|
"BRAIN_URL", "HYPERGUILD_LOCAL_MODEL", "HYPERGUILD_CLAUDE_MODEL",
|
||||||
|
"HYPERGUILD_ROUTE_LOCAL_FLOOR", "HYPERGUILD_ROUTE_LOCAL_CEIL",
|
||||||
|
"HYPERGUILD_PASS_RATE_TTL_SECONDS",
|
||||||
|
} {
|
||||||
|
t.Setenv(k, "")
|
||||||
|
}
|
||||||
|
|
||||||
|
cfg, err := config.LoadRouting()
|
||||||
|
require.NoError(t, err)
|
||||||
|
assert.Equal(t, "3210", cfg.Port)
|
||||||
|
assert.Equal(t, "", cfg.MCPAuthToken)
|
||||||
|
assert.Equal(t, "http://piguard:4000", cfg.LiteLLMBaseURL)
|
||||||
|
assert.Equal(t, "http://ingestion.supervisor:3300", cfg.BrainURL)
|
||||||
|
assert.Equal(t, "qwen35", cfg.LocalModel)
|
||||||
|
assert.Equal(t, "claude-sonnet-4-6", cfg.ClaudeModel)
|
||||||
|
assert.InDelta(t, 0.90, cfg.RouteLocalFloor, 1e-9)
|
||||||
|
assert.InDelta(t, 0.70, cfg.RouteLocalCeil, 1e-9)
|
||||||
|
assert.Equal(t, 60, cfg.PassRateTTLSeconds)
|
||||||
|
assert.Equal(t, "", cfg.LiteLLMAPIKey)
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestLoadRoutingFromEnv(t *testing.T) {
|
||||||
|
t.Setenv("ROUTING_PORT", "3250")
|
||||||
|
t.Setenv("ROUTING_MCP_TOKEN", "tok-xyz")
|
||||||
|
t.Setenv("LITELLM_BASE_URL", "http://localhost:4000")
|
||||||
|
t.Setenv("LITELLM_API_KEY", "lk")
|
||||||
|
t.Setenv("BRAIN_URL", "http://localhost:3300")
|
||||||
|
t.Setenv("HYPERGUILD_LOCAL_MODEL", "qwen2-7b")
|
||||||
|
t.Setenv("HYPERGUILD_CLAUDE_MODEL", "claude-opus-4-7")
|
||||||
|
t.Setenv("HYPERGUILD_ROUTE_LOCAL_FLOOR", "0.85")
|
||||||
|
t.Setenv("HYPERGUILD_ROUTE_LOCAL_CEIL", "0.65")
|
||||||
|
t.Setenv("HYPERGUILD_PASS_RATE_TTL_SECONDS", "30")
|
||||||
|
|
||||||
|
cfg, err := config.LoadRouting()
|
||||||
|
require.NoError(t, err)
|
||||||
|
assert.Equal(t, "3250", cfg.Port)
|
||||||
|
assert.Equal(t, "tok-xyz", cfg.MCPAuthToken)
|
||||||
|
assert.Equal(t, "http://localhost:4000", cfg.LiteLLMBaseURL)
|
||||||
|
assert.Equal(t, "lk", cfg.LiteLLMAPIKey)
|
||||||
|
assert.Equal(t, "http://localhost:3300", cfg.BrainURL)
|
||||||
|
assert.Equal(t, "qwen2-7b", cfg.LocalModel)
|
||||||
|
assert.Equal(t, "claude-opus-4-7", cfg.ClaudeModel)
|
||||||
|
assert.InDelta(t, 0.85, cfg.RouteLocalFloor, 1e-9)
|
||||||
|
assert.InDelta(t, 0.65, cfg.RouteLocalCeil, 1e-9)
|
||||||
|
assert.Equal(t, 30, cfg.PassRateTTLSeconds)
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestLoadRoutingRejectsBadFloat(t *testing.T) {
|
||||||
|
t.Setenv("HYPERGUILD_ROUTE_LOCAL_FLOOR", "not-a-number")
|
||||||
|
_, err := config.LoadRouting()
|
||||||
|
require.Error(t, err)
|
||||||
|
assert.Contains(t, err.Error(), "HYPERGUILD_ROUTE_LOCAL_FLOOR")
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestLoadRoutingRejectsBadInt(t *testing.T) {
|
||||||
|
t.Setenv("HYPERGUILD_PASS_RATE_TTL_SECONDS", "not-a-number")
|
||||||
|
_, err := config.LoadRouting()
|
||||||
|
require.Error(t, err)
|
||||||
|
assert.Contains(t, err.Error(), "HYPERGUILD_PASS_RATE_TTL_SECONDS")
|
||||||
|
}
|
||||||
21
internal/routing/hash.go
Normal file
@@ -0,0 +1,21 @@
|
|||||||
|
package routing
|
||||||
|
|
||||||
|
import (
|
||||||
|
"crypto/sha256"
|
||||||
|
"encoding/binary"
|
||||||
|
)
|
||||||
|
|
||||||
|
// CanonicalHash returns a deterministic 64-bit hash of (system, user).
|
||||||
|
// Used to make sample-band routing decisions reproducible: identical input
|
||||||
|
// strings produce the same hash on every call, independent of process state.
|
||||||
|
//
|
||||||
|
// Inputs are joined with a 0x00 byte separator before hashing — distinguishes
|
||||||
|
// (system="ab", user="cd") from (system="abcd", user=""). Inputs that
// themselves contain a 0x00 byte can still map to the same byte stream.
|
||||||
|
func CanonicalHash(system, user string) uint64 {
|
||||||
|
h := sha256.New()
|
||||||
|
h.Write([]byte(system))
|
||||||
|
h.Write([]byte{0})
|
||||||
|
h.Write([]byte(user))
|
||||||
|
sum := h.Sum(nil)
|
||||||
|
return binary.BigEndian.Uint64(sum[:8])
|
||||||
|
}
|
||||||
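To illustrate what the separator in CanonicalHash does and does not guarantee, here is a small standalone sketch. `canonicalHash` is a local copy of the function above so the example runs on its own; the inputs are made up for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// canonicalHash mirrors CanonicalHash from the diff: SHA-256 over
// system + 0x00 + user, truncated to the first 8 bytes (big-endian).
func canonicalHash(system, user string) uint64 {
	h := sha256.New()
	h.Write([]byte(system))
	h.Write([]byte{0})
	h.Write([]byte(user))
	return binary.BigEndian.Uint64(h.Sum(nil)[:8])
}

func main() {
	// Moving text across the system/user boundary changes the hashed bytes.
	fmt.Println(canonicalHash("ab", "cd") != canonicalHash("abcd", ""))
	// Inputs that already contain 0x00 can yield identical byte streams:
	// both calls below hash the bytes "a", 0x00, "b", 0x00, "c".
	fmt.Println(canonicalHash("a\x00b", "c") == canonicalHash("a", "b\x00c"))
	// The low bit is what the sample band consults.
	fmt.Println(canonicalHash("sys", "user") & 1)
}
```

For prompts that never contain NUL bytes the separator is sufficient; a length prefix would be one way to close the remaining gap if that assumption does not hold.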
46
internal/routing/hash_test.go
Normal file
@@ -0,0 +1,46 @@
|
|||||||
|
package routing_test
|
||||||
|
|
||||||
|
import (
|
||||||
|
"testing"
|
||||||
|
|
||||||
|
"github.com/mathiasbq/supervisor/internal/routing"
|
||||||
|
"github.com/stretchr/testify/assert"
|
||||||
|
)
|
||||||
|
|
||||||
|
func TestCanonicalHashDeterministic(t *testing.T) {
|
||||||
|
a := routing.CanonicalHash("system one", "user one")
|
||||||
|
b := routing.CanonicalHash("system one", "user one")
|
||||||
|
assert.Equal(t, a, b, "same inputs must produce same hash")
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestCanonicalHashDistinguishesInputs(t *testing.T) {
|
||||||
|
cases := [][2]string{
|
||||||
|
{"sys", "user"},
|
||||||
|
{"sys", "user2"},
|
||||||
|
{"sys2", "user"},
|
||||||
|
{"", "system\x00user"}, // separator collision attempt
|
||||||
|
{"system\x00user", ""},
|
||||||
|
}
|
||||||
|
seen := make(map[uint64]bool)
|
||||||
|
for _, c := range cases {
|
||||||
|
h := routing.CanonicalHash(c[0], c[1])
|
||||||
|
assert.False(t, seen[h], "collision on %v", c)
|
||||||
|
seen[h] = true
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestCanonicalHashLowBitDistribution(t *testing.T) {
|
||||||
|
// Sanity check: across 1000 distinct inputs, low-bit split is roughly even.
|
||||||
|
zeros, ones := 0, 0
|
||||||
|
for i := 0; i < 1000; i++ {
|
||||||
|
h := routing.CanonicalHash("sys", string(rune('a'+(i%26)))+string(rune(i)))
|
||||||
|
if h&1 == 0 {
|
||||||
|
zeros++
|
||||||
|
} else {
|
||||||
|
ones++
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Allow ±150 around 500/500 (15% of the 1000 samples). Tighter would be flaky on real data.
|
||||||
|
assert.InDelta(t, 500, zeros, 150)
|
||||||
|
assert.InDelta(t, 500, ones, 150)
|
||||||
|
}
|
||||||
79
internal/routing/log.go
Normal file
@@ -0,0 +1,79 @@
|
|||||||
|
package routing
|
||||||
|
|
||||||
|
import (
|
||||||
|
"bytes"
|
||||||
|
"context"
|
||||||
|
"encoding/json"
|
||||||
|
"fmt"
|
||||||
|
"net/http"
|
||||||
|
"time"
|
||||||
|
)
|
||||||
|
|
||||||
|
// LogEntry describes a single routing decision to log via the brain MCP.
|
||||||
|
type LogEntry struct {
|
||||||
|
SessionID string
|
||||||
|
Skill string // the original skill the call routed (e.g., "review")
|
||||||
|
Decision string // "local", "claude", or "claude_fallback"
|
||||||
|
Message string // free-form, e.g. "model=qwen35, pass_rate=0.94"
|
||||||
|
ProjectRoot string
|
||||||
|
DurationMs int64
|
||||||
|
Failed bool // true → final_status: "fail"; false → "skip"
|
||||||
|
}
|
||||||
|
|
||||||
|
// Logger posts session_log entries to a brain MCP at BrainURL + /mcp.
|
||||||
|
type Logger struct {
|
||||||
|
BrainURL string
|
||||||
|
HTTP *http.Client
|
||||||
|
}
|
||||||
|
|
||||||
|
// NewLogger creates a Logger with a 2-second HTTP timeout.
|
||||||
|
func NewLogger(brainURL string) *Logger {
|
||||||
|
return &Logger{
|
||||||
|
BrainURL: brainURL,
|
||||||
|
HTTP: &http.Client{Timeout: 2 * time.Second},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// LogDecision posts a session_log MCP call. Errors are returned but the caller
|
||||||
|
// MUST NOT block real work on them — logging is best-effort.
|
||||||
|
func (l *Logger) LogDecision(ctx context.Context, e LogEntry) error {
|
||||||
|
status := "skip"
|
||||||
|
if e.Failed {
|
||||||
|
status = "fail"
|
||||||
|
}
|
||||||
|
payload := map[string]any{
|
||||||
|
"jsonrpc": "2.0",
|
||||||
|
"id": 1,
|
||||||
|
"method": "tools/call",
|
||||||
|
"params": map[string]any{
|
||||||
|
"name": "session_log",
|
||||||
|
"arguments": map[string]any{
|
||||||
|
"session_id": e.SessionID,
|
||||||
|
"skill": "_routing",
|
||||||
|
"phase": "decide",
|
||||||
|
"final_status": status,
|
||||||
|
"message": fmt.Sprintf("%s: %s — %s", e.Skill, e.Decision, e.Message),
|
||||||
|
"duration_ms": e.DurationMs,
|
||||||
|
"project_root": e.ProjectRoot,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
body, err := json.Marshal(payload)
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("log: marshal: %w", err)
|
||||||
|
}
|
||||||
|
req, err := http.NewRequestWithContext(ctx, http.MethodPost, l.BrainURL+"/mcp", bytes.NewReader(body))
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("log: build request: %w", err)
|
||||||
|
}
|
||||||
|
req.Header.Set("Content-Type", "application/json")
|
||||||
|
resp, err := l.HTTP.Do(req)
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("log: request: %w", err)
|
||||||
|
}
|
||||||
|
defer func() { _ = resp.Body.Close() }()
|
||||||
|
if resp.StatusCode != http.StatusOK {
|
||||||
|
return fmt.Errorf("log: server returned status %d", resp.StatusCode)
|
||||||
|
}
|
||||||
|
return nil
|
||||||
|
}
|
||||||
81
internal/routing/log_test.go
Normal file
@@ -0,0 +1,81 @@
|
|||||||
|
package routing_test
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"encoding/json"
|
||||||
|
"io"
|
||||||
|
"net/http"
|
||||||
|
"net/http/httptest"
|
||||||
|
"testing"
|
||||||
|
|
||||||
|
"github.com/mathiasbq/supervisor/internal/routing"
|
||||||
|
"github.com/stretchr/testify/assert"
|
||||||
|
"github.com/stretchr/testify/require"
|
||||||
|
)
|
||||||
|
|
||||||
|
func TestLoggerLogDecision(t *testing.T) {
|
||||||
|
var captured map[string]any
|
||||||
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||||
|
assert.Equal(t, http.MethodPost, r.Method)
|
||||||
|
assert.Equal(t, "/mcp", r.URL.Path)
|
||||||
|
body, _ := io.ReadAll(r.Body)
|
||||||
|
assert.NoError(t, json.Unmarshal(body, &captured)) // assert, not require: FailNow must only run on the test goroutine
|
||||||
|
_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{"content": []map[string]any{{"type": "text", "text": "ok"}}}})
|
||||||
|
}))
|
||||||
|
defer srv.Close()
|
||||||
|
|
||||||
|
l := routing.NewLogger(srv.URL)
|
||||||
|
err := l.LogDecision(context.Background(), routing.LogEntry{
|
||||||
|
SessionID: "sess-1",
|
||||||
|
Skill: "review",
|
||||||
|
Decision: "local",
|
||||||
|
Message: "model=qwen35, pass_rate=0.94",
|
||||||
|
ProjectRoot: "/home/x/proj",
|
||||||
|
DurationMs: 1234,
|
||||||
|
Failed: false,
|
||||||
|
})
|
||||||
|
require.NoError(t, err)
|
||||||
|
|
||||||
|
params := captured["params"].(map[string]any)
|
||||||
|
assert.Equal(t, "tools/call", captured["method"])
|
||||||
|
assert.Equal(t, "session_log", params["name"])
|
||||||
|
|
||||||
|
args := params["arguments"].(map[string]any)
|
||||||
|
assert.Equal(t, "_routing", args["skill"])
|
||||||
|
assert.Equal(t, "decide", args["phase"])
|
||||||
|
assert.Equal(t, "skip", args["final_status"])
|
||||||
|
assert.Contains(t, args["message"].(string), "review: local")
|
||||||
|
assert.Equal(t, "sess-1", args["session_id"])
|
||||||
|
assert.Equal(t, "/home/x/proj", args["project_root"])
|
||||||
|
assert.Equal(t, float64(1234), args["duration_ms"])
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestLoggerLogFailure(t *testing.T) {
|
||||||
|
var captured map[string]any
|
||||||
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||||
|
body, _ := io.ReadAll(r.Body)
|
||||||
|
_ = json.Unmarshal(body, &captured)
|
||||||
|
_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
|
||||||
|
}))
|
||||||
|
defer srv.Close()
|
||||||
|
|
||||||
|
l := routing.NewLogger(srv.URL)
|
||||||
|
err := l.LogDecision(context.Background(), routing.LogEntry{
|
||||||
|
SessionID: "s", Skill: "debug", Decision: "local", Message: "litellm down", Failed: true,
|
||||||
|
})
|
||||||
|
require.NoError(t, err)
|
||||||
|
|
||||||
|
args := captured["params"].(map[string]any)["arguments"].(map[string]any)
|
||||||
|
assert.Equal(t, "fail", args["final_status"])
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestLoggerSurfacesUpstreamError(t *testing.T) {
|
||||||
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||||
|
http.Error(w, "down", http.StatusBadGateway)
|
||||||
|
}))
|
||||||
|
defer srv.Close()
|
||||||
|
|
||||||
|
l := routing.NewLogger(srv.URL)
|
||||||
|
err := l.LogDecision(context.Background(), routing.LogEntry{Skill: "x", SessionID: "y", Decision: "local"})
|
||||||
|
require.Error(t, err)
|
||||||
|
}
|
||||||
85
internal/routing/passrate.go
Normal file
@@ -0,0 +1,85 @@
|
|||||||
|
package routing
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"encoding/json"
|
||||||
|
"fmt"
|
||||||
|
"net/http"
|
||||||
|
"net/url"
|
||||||
|
"sync"
|
||||||
|
"time"
|
||||||
|
)
|
||||||
|
|
||||||
|
// Fetcher reads /pass-rate from the brain pod with a per-skill TTL cache.
|
||||||
|
type Fetcher struct {
|
||||||
|
BaseURL string
|
||||||
|
Window string
|
||||||
|
TTL time.Duration
|
||||||
|
HTTP *http.Client
|
||||||
|
|
||||||
|
mu sync.Mutex
|
||||||
|
cache map[string]cachedRate
|
||||||
|
}
|
||||||
|
|
||||||
|
type cachedRate struct {
|
||||||
|
value *float64
|
||||||
|
at time.Time
|
||||||
|
}
|
||||||
|
|
||||||
|
type passRateResponse struct {
|
||||||
|
PassRate *float64 `json:"pass_rate"`
|
||||||
|
}
|
||||||
|
|
||||||
|
// NewFetcher returns a Fetcher that calls baseURL + /pass-rate with the
|
||||||
|
// given window string. If ttl is zero, defaults to 60 seconds. The HTTP
|
||||||
|
// client uses a 1-second total timeout.
|
||||||
|
func NewFetcher(baseURL, window string, ttl time.Duration) *Fetcher {
|
||||||
|
if ttl == 0 {
|
||||||
|
ttl = 60 * time.Second
|
||||||
|
}
|
||||||
|
return &Fetcher{
|
||||||
|
BaseURL: baseURL,
|
||||||
|
Window: window,
|
||||||
|
TTL: ttl,
|
||||||
|
HTTP: &http.Client{Timeout: time.Second},
|
||||||
|
cache: make(map[string]cachedRate),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Get returns the pass rate for the named skill, or nil if no data exists,
|
||||||
|
// or an error if the brain is unreachable or answers with a non-200 status.
// Successful responses, including nil (no data), are cached for TTL.
|
||||||
|
func (f *Fetcher) Get(ctx context.Context, skill string) (*float64, error) {
|
||||||
|
f.mu.Lock()
|
||||||
|
if c, ok := f.cache[skill]; ok && time.Since(c.at) < f.TTL {
|
||||||
|
v := c.value
|
||||||
|
f.mu.Unlock()
|
||||||
|
return v, nil
|
||||||
|
}
|
||||||
|
f.mu.Unlock()
|
||||||
|
|
||||||
|
u := fmt.Sprintf("%s/pass-rate?skill=%s&window=%s",
|
||||||
|
f.BaseURL, url.QueryEscape(skill), url.QueryEscape(f.Window))
|
||||||
|
req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
|
||||||
|
if err != nil {
|
||||||
|
return nil, fmt.Errorf("passrate: build request: %w", err)
|
||||||
|
}
|
||||||
|
resp, err := f.HTTP.Do(req)
|
||||||
|
if err != nil {
|
||||||
|
return nil, fmt.Errorf("passrate: request: %w", err)
|
||||||
|
}
|
||||||
|
defer func() { _ = resp.Body.Close() }()
|
||||||
|
if resp.StatusCode != http.StatusOK {
|
||||||
|
return nil, fmt.Errorf("passrate: server returned status %d", resp.StatusCode)
|
||||||
|
}
|
||||||
|
|
||||||
|
var body passRateResponse
|
||||||
|
if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
|
||||||
|
return nil, fmt.Errorf("passrate: decode: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
f.mu.Lock()
|
||||||
|
f.cache[skill] = cachedRate{value: body.PassRate, at: time.Now()}
|
||||||
|
f.mu.Unlock()
|
||||||
|
|
||||||
|
return body.PassRate, nil
|
||||||
|
}
|
||||||
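The caching step in Fetcher.Get can be isolated as a small TTL map. The sketch below is a simplified stand-in, not the code from the diff: `ttlCache`, `entry`, and the injectable `now` clock are hypothetical names added so that expiry can be shown without sleeping.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry pairs a cached value with the time it was stored.
type entry struct {
	value *float64 // nil means "no data reported"; it is cached like any value
	at    time.Time
}

// ttlCache is a hypothetical, simplified stand-in for the map inside Fetcher.
type ttlCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // injectable clock, an assumption for testability
	entries map[string]entry
}

func newTTLCache(ttl time.Duration, now func() time.Time) *ttlCache {
	return &ttlCache{ttl: ttl, now: now, entries: make(map[string]entry)}
}

// get returns the cached value and true while the entry is younger than ttl.
func (c *ttlCache) get(key string) (*float64, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[key]
	if !ok || c.now().Sub(e.at) >= c.ttl {
		return nil, false
	}
	return e.value, true
}

// put stores v under key, stamped with the current clock reading.
func (c *ttlCache) put(key string, v *float64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = entry{value: v, at: c.now()}
}

// demoCache reports whether a lookup hits right after put and after the TTL.
func demoCache() (fresh, stale bool) {
	clock := time.Unix(0, 0)
	c := newTTLCache(60*time.Second, func() time.Time { return clock })
	rate := 0.94
	c.put("review", &rate)
	_, fresh = c.get("review")
	clock = clock.Add(61 * time.Second)
	_, stale = c.get("review")
	return fresh, stale
}

func main() {
	fresh, stale := demoCache()
	fmt.Println("hit while fresh:", fresh, "hit after ttl:", stale)
}
```

As in Fetcher.Get, the lock is held only around map access, so two concurrent misses for the same key would both go upstream; that trade-off keeps the code simple at the cost of occasional duplicate requests.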
94
internal/routing/passrate_test.go
Normal file
@@ -0,0 +1,94 @@
|
|||||||
|
package routing_test
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"encoding/json"
|
||||||
|
"net/http"
|
||||||
|
"net/http/httptest"
|
||||||
|
"sync/atomic"
|
||||||
|
"testing"
|
||||||
|
"time"
|
||||||
|
|
||||||
|
"github.com/mathiasbq/supervisor/internal/routing"
|
||||||
|
"github.com/stretchr/testify/assert"
|
||||||
|
"github.com/stretchr/testify/require"
|
||||||
|
)
|
||||||
|
|
||||||
|
func TestFetcherGetReturnsPassRate(t *testing.T) {
|
||||||
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||||
|
assert.Equal(t, http.MethodGet, r.Method)
|
||||||
|
assert.Equal(t, "/pass-rate", r.URL.Path)
|
||||||
|
assert.Equal(t, "tdd", r.URL.Query().Get("skill"))
|
||||||
|
assert.Equal(t, "7d", r.URL.Query().Get("window"))
|
||||||
|
w.Header().Set("Content-Type", "application/json")
|
||||||
|
_ = json.NewEncoder(w).Encode(map[string]any{"skill": "tdd", "pass_rate": 0.94})
|
||||||
|
}))
|
||||||
|
defer srv.Close()
|
||||||
|
|
||||||
|
f := routing.NewFetcher(srv.URL, "7d", time.Minute)
|
||||||
|
pr, err := f.Get(context.Background(), "tdd")
|
||||||
|
require.NoError(t, err)
|
||||||
|
require.NotNil(t, pr)
|
||||||
|
assert.InDelta(t, 0.94, *pr, 1e-9)
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestFetcherGetReturnsNilWhenNoData(t *testing.T) {
|
||||||
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
|
||||||
|
_ = json.NewEncoder(w).Encode(map[string]any{"skill": "novel", "pass_rate": nil})
|
||||||
|
}))
|
||||||
|
defer srv.Close()
|
||||||
|
|
||||||
|
f := routing.NewFetcher(srv.URL, "7d", time.Minute)
|
||||||
|
pr, err := f.Get(context.Background(), "novel")
|
||||||
|
require.NoError(t, err)
|
||||||
|
assert.Nil(t, pr)
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestFetcherCachesWithinTTL(t *testing.T) {
|
||||||
|
var calls int32
|
||||||
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||||
|
atomic.AddInt32(&calls, 1)
|
||||||
|
_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.5})
|
||||||
|
}))
|
||||||
|
defer srv.Close()
|
||||||
|
|
||||||
|
f := routing.NewFetcher(srv.URL, "7d", time.Minute)
|
||||||
|
for i := 0; i < 5; i++ {
|
||||||
|
_, err := f.Get(context.Background(), "tdd")
|
||||||
|
require.NoError(t, err)
|
||||||
|
}
|
||||||
|
assert.Equal(t, int32(1), atomic.LoadInt32(&calls), "should hit upstream once and serve four times from cache")
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestFetcherFetchesAgainAfterTTLExpires(t *testing.T) {
|
||||||
|
var calls int32
|
||||||
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||||
|
atomic.AddInt32(&calls, 1)
|
||||||
|
_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.5})
|
||||||
|
}))
|
||||||
|
defer srv.Close()
|
||||||
|
|
||||||
|
// Tight TTL so the test stays fast.
|
||||||
|
f := routing.NewFetcher(srv.URL, "7d", 5*time.Millisecond)
|
||||||
|
_, err := f.Get(context.Background(), "tdd")
|
||||||
|
require.NoError(t, err)
|
||||||
|
assert.Equal(t, int32(1), atomic.LoadInt32(&calls))
|
||||||
|
|
||||||
|
// Sleep past TTL, then a second Get should hit upstream again.
|
||||||
|
time.Sleep(15 * time.Millisecond)
|
||||||
|
_, err = f.Get(context.Background(), "tdd")
|
||||||
|
require.NoError(t, err)
|
||||||
|
assert.Equal(t, int32(2), atomic.LoadInt32(&calls), "expected fresh upstream call after TTL expiry")
|
||||||
|
}
|
||||||
|
|
||||||
|
func TestFetcherSurfacesUpstreamError(t *testing.T) {
|
||||||
|
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
|
||||||
|
http.Error(w, "boom", http.StatusInternalServerError)
|
||||||
|
}))
|
||||||
|
defer srv.Close()
|
||||||
|
|
||||||
|
f := routing.NewFetcher(srv.URL, "7d", time.Minute)
|
||||||
|
pr, err := f.Get(context.Background(), "tdd")
|
||||||
|
require.Error(t, err)
|
||||||
|
assert.Nil(t, pr)
|
||||||
|
}
|
||||||
47
internal/routing/policy.go
Normal file
@@ -0,0 +1,47 @@
|
|||||||
|
package routing
|
||||||
|
|
||||||
|
// Decision is the route picked for a single skill call.
|
||||||
|
type Decision int
|
||||||
|
|
||||||
|
const (
|
||||||
|
DecideLocal Decision = iota
|
||||||
|
DecideClaude
|
||||||
|
)
|
||||||
|
|
||||||
|
// String returns "local" for DecideLocal and "claude" for any other value.
func (d Decision) String() string {
|
||||||
|
if d == DecideLocal {
|
||||||
|
return "local"
|
||||||
|
}
|
||||||
|
return "claude"
|
||||||
|
}
|
||||||
|
|
||||||
|
// Policy holds the floor/ceil thresholds for routing decisions.
|
||||||
|
//
|
||||||
|
// Rules (in order):
|
||||||
|
//
|
||||||
|
// 1. passRate == nil → DecideLocal (default-to-local for cost-routable skills)
|
||||||
|
// 2. *passRate >= Floor → DecideLocal (trust local)
|
||||||
|
// 3. *passRate < Ceil → DecideClaude (don't trust local)
|
||||||
|
// 4. otherwise (sample band) → requestHash low bit picks: 0=local, 1=claude
|
||||||
|
type Policy struct {
|
||||||
|
Floor float64
|
||||||
|
Ceil float64
|
||||||
|
}
|
||||||
|
|
||||||
|
// Decide returns the routing decision for a single call.
|
||||||
|
// requestHash is consulted only when passRate is in the sample band [Ceil, Floor).
|
||||||
|
func (p Policy) Decide(passRate *float64, requestHash uint64) Decision {
|
||||||
|
if passRate == nil {
|
||||||
|
return DecideLocal
|
||||||
|
}
|
||||||
|
if *passRate >= p.Floor {
|
||||||
|
return DecideLocal
|
||||||
|
}
|
||||||
|
if *passRate < p.Ceil {
|
||||||
|
return DecideClaude
|
||||||
|
}
|
||||||
|
if requestHash&1 == 0 {
|
||||||
|
return DecideLocal
|
||||||
|
}
|
||||||
|
return DecideClaude
|
||||||
|
}
|
||||||
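The four rules in Policy.Decide are easiest to check against concrete numbers. The sketch below restates them as a standalone function and walks the default thresholds (Floor 0.90, Ceil 0.70); `decide` and `ptr` are local illustrative names, not identifiers from the package.

```go
package main

import "fmt"

// ptr returns a pointer to f, standing in for a non-null pass rate.
func ptr(f float64) *float64 { return &f }

// decide restates Policy.Decide as a free function for illustration.
// floor is the upper boundary and ceil the lower one, as in the diff.
func decide(passRate *float64, floor, ceil float64, hash uint64) string {
	switch {
	case passRate == nil:
		return "local" // rule 1: no data defaults to local
	case *passRate >= floor:
		return "local" // rule 2: trust local
	case *passRate < ceil:
		return "claude" // rule 3: do not trust local
	case hash&1 == 0:
		return "local" // rule 4: sample band, even hash
	default:
		return "claude" // rule 4: sample band, odd hash
	}
}

func main() {
	const floor, ceil = 0.90, 0.70 // defaults from RoutingConfig
	fmt.Println("no data:         ", decide(nil, floor, ceil, 1))
	fmt.Println("0.90 (at floor): ", decide(ptr(0.90), floor, ceil, 1))
	fmt.Println("0.69 (below ceil):", decide(ptr(0.69), floor, ceil, 0))
	fmt.Println("0.80, even hash: ", decide(ptr(0.80), floor, ceil, 2))
	fmt.Println("0.80, odd hash:  ", decide(ptr(0.80), floor, ceil, 3))
}
```

Because the hash comes from the prompt text, an identical request in the sample band is routed the same way each time for as long as the pass rate stays in that band.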
36
internal/routing/policy_test.go
Normal file
@@ -0,0 +1,36 @@
|
|||||||
|
package routing_test
|
||||||
|
|
||||||
|
import (
|
||||||
|
"testing"
|
||||||
|
|
||||||
|
"github.com/mathiasbq/supervisor/internal/routing"
|
||||||
|
"github.com/stretchr/testify/assert"
|
||||||
|
)
|
||||||
|
|
||||||
|
func ptr(f float64) *float64 { return &f }
|
||||||
|
|
||||||
|
func TestPolicyDecide(t *testing.T) {
|
||||||
|
p := routing.Policy{Floor: 0.9, Ceil: 0.7}
|
||||||
|
|
||||||
|
cases := []struct {
|
||||||
|
name string
|
||||||
|
passRate *float64
|
||||||
|
hash uint64
|
||||||
|
want routing.Decision
|
||||||
|
}{
|
||||||
|
{"null pass rate → local", nil, 0, routing.DecideLocal},
|
||||||
|
{"null pass rate, hash irrelevant → local", nil, 0xDEADBEEF, routing.DecideLocal},
|
||||||
|
{"at floor → local", ptr(0.9), 0, routing.DecideLocal},
|
||||||
|
{"above floor → local", ptr(0.95), 0, routing.DecideLocal},
|
||||||
|
{"below ceil → claude", ptr(0.5), 0, routing.DecideClaude},
|
||||||
|
{"at ceil → sample-band even-hash → local", ptr(0.7), 0, routing.DecideLocal},
|
||||||
|
{"sample band, even hash → local", ptr(0.8), 2, routing.DecideLocal},
|
||||||
|
{"sample band, odd hash → claude", ptr(0.8), 3, routing.DecideClaude},
|
||||||
|
}
|
||||||
|
|
||||||
|
for _, tc := range cases {
|
||||||
|
t.Run(tc.name, func(t *testing.T) {
|
||||||
|
assert.Equal(t, tc.want, p.Decide(tc.passRate, tc.hash))
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
84
internal/routing/router.go
Normal file
@@ -0,0 +1,84 @@
|
|||||||
|
package routing
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"fmt"
|
||||||
|
"log/slog"
|
||||||
|
)
|
||||||
|
|
||||||
|
// CompleteFunc matches the signature used by every skill package's Config.
|
||||||
|
type CompleteFunc func(ctx context.Context, model, system, user string) (string, int64, error)
|
||||||
|
|
||||||
|
// RunInput captures the per-call inputs the dispatch wrapper needs.
|
||||||
|
type RunInput struct {
|
||||||
|
Skill string
|
||||||
|
System string
|
||||||
|
User string
|
||||||
|
SessionID string
|
||||||
|
ProjectRoot string
|
||||||
|
}
|
||||||
|
|
||||||
|
// Router composes a pass-rate fetcher, a decision policy, a session logger,
|
||||||
|
// and a LiteLLM completion function. Run takes a RunInput rather than the
// CompleteFunc argument list, so skill packages need an adapter to call it.
|
||||||
|
type Router struct {
|
||||||
|
Fetcher *Fetcher
|
||||||
|
Logger *Logger
|
||||||
|
Policy Policy
|
||||||
|
LocalModel string
|
||||||
|
ClaudeModel string
|
||||||
|
Complete CompleteFunc
|
||||||
|
}
|
||||||
|
|
||||||
|
// Run executes one skill call: decides local vs claude, calls LiteLLM, logs the
|
||||||
|
// decision. On local-side error, falls open by retrying once on the Claude model.
|
||||||
|
func (r *Router) Run(ctx context.Context, in RunInput) (string, int64, error) {
|
||||||
|
pr, ferr := r.Fetcher.Get(ctx, in.Skill)
|
||||||
|
if ferr != nil {
|
||||||
|
slog.Warn("router: pass-rate unreachable, defaulting to local", "skill", in.Skill, "err", ferr)
|
||||||
|
pr = nil
|
||||||
|
}
|
||||||
|
hash := CanonicalHash(in.System, in.User)
|
||||||
|
decision := r.Policy.Decide(pr, hash)
|
||||||
|
|
||||||
|
model := r.ClaudeModel
|
||||||
|
if decision == DecideLocal {
|
||||||
|
model = r.LocalModel
|
||||||
|
}
|
||||||
|
|
||||||
|
out, ms, err := r.Complete(ctx, model, in.System, in.User)
|
||||||
|
if lerr := r.Logger.LogDecision(ctx, LogEntry{
|
||||||
|
SessionID: in.SessionID,
|
||||||
|
Skill: in.Skill,
|
||||||
|
Decision: decision.String(),
|
||||||
|
Message: fmt.Sprintf("model=%s, pass_rate=%s", model, formatPassRate(pr)),
|
||||||
|
ProjectRoot: in.ProjectRoot,
|
||||||
|
DurationMs: ms,
|
||||||
|
Failed: err != nil,
|
||||||
|
}); lerr != nil {
|
||||||
|
slog.Warn("router: log decision failed", "skill", in.Skill, "err", lerr)
|
||||||
|
}
|
||||||
|
|
||||||
|
if err != nil && decision == DecideLocal {
|
||||||
|
slog.Warn("router: local failed, falling open to claude", "skill", in.Skill, "err", err)
|
||||||
|
out, ms, err = r.Complete(ctx, r.ClaudeModel, in.System, in.User)
|
||||||
|
if lerr := r.Logger.LogDecision(ctx, LogEntry{
|
||||||
|
SessionID: in.SessionID,
|
||||||
|
Skill: in.Skill,
|
||||||
|
Decision: "claude_fallback",
|
||||||
|
Message: fmt.Sprintf("model=%s, after-local-error", r.ClaudeModel),
|
||||||
|
ProjectRoot: in.ProjectRoot,
|
||||||
|
DurationMs: ms,
|
||||||
|
Failed: err != nil,
|
||||||
|
}); lerr != nil {
|
||||||
|
slog.Warn("router: log decision failed", "skill", in.Skill, "err", lerr)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return out, ms, err
|
||||||
|
}
|
||||||
|
|
||||||
|
func formatPassRate(pr *float64) string {
|
||||||
|
if pr == nil {
|
||||||
|
return "null"
|
||||||
|
}
|
||||||
|
return fmt.Sprintf("%.2f", *pr)
|
||||||
|
}
|
||||||
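The fall-open step in Router.Run can be read in isolation as "try the chosen model, and if the local model errors, retry once on Claude". A minimal sketch under that reading follows; `completeWithFallback`, `stubComplete`, and `demo` are hypothetical names, and the stub stands in for a real LiteLLM client. Logging is left out.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// completeFunc has the same shape as CompleteFunc in the diff.
type completeFunc func(ctx context.Context, model, system, user string) (string, int64, error)

// completeWithFallback is a hypothetical condensation of the fall-open logic:
// call primary, and on error retry exactly once on fallback. When primary
// already is the fallback model, the error is returned without a retry.
func completeWithFallback(ctx context.Context, complete completeFunc, primary, fallback, system, user string) (string, int64, error) {
	out, ms, err := complete(ctx, primary, system, user)
	if err == nil || primary == fallback {
		return out, ms, err
	}
	return complete(ctx, fallback, system, user)
}

// stubComplete fails only for the local model name used in the diff's tests.
func stubComplete(_ context.Context, model, _, _ string) (string, int64, error) {
	if model == "qwen35" {
		return "", 0, errors.New("local model unavailable")
	}
	return "answer from " + model, 100, nil
}

// demo routes one call that starts on the failing local model.
func demo() (string, int64, error) {
	return completeWithFallback(context.Background(), stubComplete,
		"qwen35", "claude-sonnet-4-6", "sys", "user")
}

func main() {
	out, ms, err := demo()
	fmt.Println(out, ms, err)
}
```

Retrying only in one direction matches the diff: a Claude failure is surfaced to the caller rather than retried on the local model.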
136
internal/routing/router_test.go
Normal file
@@ -0,0 +1,136 @@
package routing_test

import (
	"context"
	"encoding/json"
	"errors"
	"net/http"
	"net/http/httptest"
	"sync"
	"testing"
	"time"

	"github.com/mathiasbq/supervisor/internal/routing"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

type fakeLLM struct {
	mu    sync.Mutex
	calls []struct{ Model, System, User string }
	resp  string
	err   error
	errOn string // if non-empty, only the named model errors
}

func (f *fakeLLM) Complete(_ context.Context, model, system, user string) (string, int64, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.calls = append(f.calls, struct{ Model, System, User string }{model, system, user})
	if f.errOn == model {
		return "", 0, f.err
	}
	if f.err != nil && f.errOn == "" {
		return "", 0, f.err
	}
	return f.resp, 100, nil
}

func newRouter(t *testing.T, llm *fakeLLM, passRate float64) (*routing.Router, *httptest.Server, *httptest.Server) {
	t.Helper()
	brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		switch r.URL.Path {
		case "/pass-rate":
			_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": passRate})
		case "/mcp":
			_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
		}
	}))
	t.Cleanup(brain.Close)

	r := &routing.Router{
		Fetcher:     routing.NewFetcher(brain.URL, "7d", time.Minute),
		Logger:      routing.NewLogger(brain.URL),
		Policy:      routing.Policy{Floor: 0.9, Ceil: 0.7},
		LocalModel:  "qwen35",
		ClaudeModel: "claude-sonnet-4-6",
		Complete:    llm.Complete,
	}
	return r, brain, brain
}

func TestRouterRoutesLocalAtHighPassRate(t *testing.T) {
	llm := &fakeLLM{resp: "ok"}
	r, _, _ := newRouter(t, llm, 0.95)

	out, _, err := r.Run(context.Background(), routing.RunInput{
		Skill: "review", System: "sys", User: "user", SessionID: "s1", ProjectRoot: "/p",
	})
	require.NoError(t, err)
	assert.Equal(t, "ok", out)

	llm.mu.Lock()
	defer llm.mu.Unlock()
	require.Len(t, llm.calls, 1)
	assert.Equal(t, "qwen35", llm.calls[0].Model)
}

func TestRouterRoutesClaudeAtLowPassRate(t *testing.T) {
	llm := &fakeLLM{resp: "ok"}
	r, _, _ := newRouter(t, llm, 0.3)

	_, _, err := r.Run(context.Background(), routing.RunInput{
		Skill: "review", System: "sys", User: "user", SessionID: "s2",
	})
	require.NoError(t, err)

	llm.mu.Lock()
	defer llm.mu.Unlock()
	require.Len(t, llm.calls, 1)
	assert.Equal(t, "claude-sonnet-4-6", llm.calls[0].Model)
}

func TestRouterFailsOpenLocalErrorToClaude(t *testing.T) {
	llm := &fakeLLM{resp: "ok-after-fallback", err: errors.New("local boom"), errOn: "qwen35"}
	r, _, _ := newRouter(t, llm, 0.95) // would route local

	out, _, err := r.Run(context.Background(), routing.RunInput{
		Skill: "review", System: "sys", User: "user", SessionID: "s3",
	})
	require.NoError(t, err)
	assert.Equal(t, "ok-after-fallback", out)

	llm.mu.Lock()
	defer llm.mu.Unlock()
	require.Len(t, llm.calls, 2)
	assert.Equal(t, "qwen35", llm.calls[0].Model)
	assert.Equal(t, "claude-sonnet-4-6", llm.calls[1].Model)
}

func TestRouterDefaultsToLocalWhenBrainUnreachable(t *testing.T) {
	// Brain returns 500 → fetcher errors → router treats pass rate as nil → local.
	brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		http.Error(w, "down", http.StatusInternalServerError)
	}))
	defer brain.Close()

	llm := &fakeLLM{resp: "ok"}
	r := &routing.Router{
		Fetcher:     routing.NewFetcher(brain.URL, "7d", time.Minute),
		Logger:      routing.NewLogger(brain.URL),
		Policy:      routing.Policy{Floor: 0.9, Ceil: 0.7},
		LocalModel:  "qwen35",
		ClaudeModel: "claude-sonnet-4-6",
		Complete:    llm.Complete,
	}

	_, _, err := r.Run(context.Background(), routing.RunInput{
		Skill: "review", System: "sys", User: "user", SessionID: "s4",
	})
	require.NoError(t, err)

	llm.mu.Lock()
	defer llm.mu.Unlock()
	require.Len(t, llm.calls, 1)
	assert.Equal(t, "qwen35", llm.calls[0].Model)
}
80
internal/routing/snapshot_test.go
Normal file
@@ -0,0 +1,80 @@
package routing_test

import (
	"context"
	"encoding/json"
	"os"
	"sort"
	"testing"

	"github.com/mathiasbq/supervisor/internal/registry"
	"github.com/mathiasbq/supervisor/internal/skills/debug"
	"github.com/mathiasbq/supervisor/internal/skills/retrospective"
	"github.com/mathiasbq/supervisor/internal/skills/review"
	"github.com/mathiasbq/supervisor/internal/skills/trainer"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// TestToolsListMatchesSupervisorSnapshot pins the four routed skills' tool
// definitions to the supervisor's current advertisement. A deliberate schema
// change must be reflected here by updating testdata/tools_list.snapshot.json.
func TestToolsListMatchesSupervisorSnapshot(t *testing.T) {
	complete := func(_ context.Context, _, _, _ string) (string, int64, error) {
		return "", 0, nil
	}

	reg := registry.New()
	reg.Register(review.New(review.Config{
		SkillPrompt:  "stub",
		DefaultModel: "stub",
		CompleteFunc: complete,
	}))
	reg.Register(debug.New(debug.Config{
		SkillPrompt:  "stub",
		DefaultModel: "stub",
		CompleteFunc: complete,
	}))
	reg.Register(retrospective.New(retrospective.Config{
		SkillPrompt:  "stub",
		DefaultModel: "stub",
		CompleteFunc: complete,
	}))
	reg.Register(trainer.New(trainer.Config{
		ReaderPrompt: "stub",
		WriterPrompt: "stub",
		DefaultModel: "stub",
		CompleteFunc: complete,
	}))

	wanted := map[string]bool{
		"review":        true,
		"debug":         true,
		"retrospective": true,
		"trainer":       true,
	}
	var routed []registry.ToolDef
	for _, td := range reg.Tools() {
		if wanted[td.Name] {
			routed = append(routed, td)
		}
	}
	sort.Slice(routed, func(i, j int) bool { return routed[i].Name < routed[j].Name })

	got, err := json.MarshalIndent(routed, "", " ")
	require.NoError(t, err)

	want, err := os.ReadFile("testdata/tools_list.snapshot.json")
	require.NoError(t, err)

	// Normalize both via re-encode so whitespace differences don't dominate.
	var gotV, wantV any
	require.NoError(t, json.Unmarshal(got, &gotV))
	require.NoError(t, json.Unmarshal(want, &wantV))

	gotN, _ := json.MarshalIndent(gotV, "", " ")
	wantN, _ := json.MarshalIndent(wantV, "", " ")

	assert.Equal(t, string(wantN), string(gotN),
		"tool advertisement drifted from supervisor snapshot — update testdata/tools_list.snapshot.json deliberately if the schema change is intentional")
}
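The snapshot test decodes and re-encodes both sides before comparing them, so formatting and object key order cannot cause spurious failures. A small sketch of that normalization in isolation follows; `normalizeJSON` is a hypothetical helper name, not something defined in the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// normalizeJSON decodes raw JSON into generic Go values and re-encodes it.
// encoding/json writes map keys in sorted order, so two documents that differ
// only in whitespace or object key order normalize to the same string.
func normalizeJSON(raw []byte) (string, error) {
	var v any
	if err := json.Unmarshal(raw, &v); err != nil {
		return "", err
	}
	out, err := json.MarshalIndent(v, "", "  ")
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	a := []byte(`{"type":"object","required":["session_id"]}`)
	b := []byte(`{
		"required": ["session_id"],
		"type": "object"
	}`)

	na, errA := normalizeJSON(a)
	nb, errB := normalizeJSON(b)
	if errA != nil || errB != nil {
		fmt.Println("decode failed:", errA, errB)
		return
	}
	fmt.Println(na == nb) // prints: true
}
```

Array element order is still significant after this step, which is consistent with the test sorting the tool definitions by name before encoding them.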
97
internal/routing/testdata/tools_list.snapshot.json
vendored
Normal file
@@ -0,0 +1,97 @@
[
  {
    "name": "debug",
    "description": "Consult a local model to analyse an error and return hypotheses ordered by likelihood, each with a concrete verification step.",
    "inputSchema": {
      "properties": {
        "context": {
          "type": "string"
        },
        "error": {
          "type": "string"
        },
        "model": {
          "type": "string"
        },
        "project_root": {
          "type": "string"
        },
        "session_id": {
          "type": "string"
        }
      },
      "required": [
        "project_root",
        "error"
      ],
      "type": "object"
    }
  },
  {
    "name": "retrospective",
    "description": "Consult a local model to analyse a completed session and identify what is novel or worth preserving as organizational knowledge.",
    "inputSchema": {
      "type": "object",
      "required": [
        "session_id"
      ],
      "properties": {
        "session_id": {
          "type": "string"
        },
        "model": {
          "type": "string"
        }
      }
    }
  },
  {
    "name": "review",
    "description": "Consult a local model for a structured code review of the specified files. Returns findings with severity levels.",
    "inputSchema": {
      "properties": {
        "context": {
          "type": "string"
        },
        "files": {
          "items": {
            "type": "string"
          },
          "type": "array"
        },
        "model": {
          "type": "string"
        },
        "project_root": {
          "type": "string"
        },
        "session_id": {
          "type": "string"
        }
      },
      "required": [
        "project_root",
        "files"
      ],
      "type": "object"
    }
  },
  {
    "name": "trainer",
    "description": "Consult a local model to identify learning moments from a session log and suggest knowledge to preserve in the brain.",
    "inputSchema": {
      "properties": {
        "model": {
          "type": "string"
        },
        "session_id": {
          "type": "string"
        }
      },
      "required": [
        "session_id"
      ],
      "type": "object"
    }
  }
]
64
scripts/smoke-routing.sh
Executable file
@@ -0,0 +1,64 @@
#!/usr/bin/env bash
set -euo pipefail

# Boot the routing binary and exercise its four tools against live deps.
# Skipped when LITELLM_BASE_URL or BRAIN_URL is unreachable.

LITELLM_BASE_URL="${LITELLM_BASE_URL:-http://piguard:4000}"
BRAIN_URL="${BRAIN_URL:-http://koala:30330}"

if ! curl -sS --max-time 2 "${LITELLM_BASE_URL}/v1/models" >/dev/null 2>&1; then
  echo "SKIP: LITELLM at ${LITELLM_BASE_URL} unreachable"
  exit 0
fi
if ! curl -sS --max-time 2 "${BRAIN_URL}/query" -X POST -d '{"query":"x","k":1}' -H 'Content-Type: application/json' >/dev/null 2>&1; then
  echo "SKIP: BRAIN at ${BRAIN_URL} unreachable"
  exit 0
fi

PORT=33310
BIN=$(mktemp)
trap 'rm -f $BIN; pkill -P $$ -f "$BIN" 2>/dev/null || true' EXIT

go build -o "$BIN" ./cmd/routing

LITELLM_BASE_URL="$LITELLM_BASE_URL" BRAIN_URL="$BRAIN_URL" \
  ROUTING_PORT="$PORT" SUPERVISOR_CONFIG_DIR="$(pwd)/config/supervisor" \
  "$BIN" &

# Wait for the binary to bind.
for _ in $(seq 1 50); do
  curl -sS "http://127.0.0.1:${PORT}/healthz" >/dev/null 2>&1 && break
  sleep 0.1
done

call_tool() {
  local tool="$1"
  local args="$2"
  curl -sS -X POST "http://127.0.0.1:${PORT}/mcp" \
    -H 'Content-Type: application/json' \
    -d "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\",\"params\":{\"name\":\"${tool}\",\"arguments\":${args}}}" \
    | jq -e '.result // .error' > /dev/null
}

echo "calling tools/list..."
curl -sS -X POST "http://127.0.0.1:${PORT}/mcp" \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' \
  | jq -r '.result.tools | map(.name) | sort | .[]'

echo "calling each tool..."
call_tool review '{"project_root":"/tmp","files":["README.md"],"session_id":"smoke-1"}'
call_tool debug '{"project_root":"/tmp","error":"smoke test","session_id":"smoke-1"}'
call_tool retrospective '{"session_id":"smoke-1"}'
call_tool trainer '{"session_id":"smoke-1"}'

echo "checking brain has _routing entries..."
sleep 2
COUNT=$(curl -sS "${BRAIN_URL}/pass-rate?skill=_routing&window=1h" | jq -r '.total // 0')
if [ "${COUNT}" -lt 4 ]; then
  echo "FAIL: expected >=4 _routing entries in last 1h, got ${COUNT}"
  exit 1
fi

echo "PASS: smoke:routing"