hyperguild
An MCP server that acts as a disciplined AI supervisor for Claude Code sessions. Instead of letting Claude Code do whatever it wants, hyperguild enforces structured workflows (TDD red/green/refactor), logs every session, and accumulates learnings into a searchable brain.
How it works
Your Claude Code session (in any project)
│
│ MCP over HTTP (Tailscale)
├──▶ supervisor :3200 (NodePort 30320 on koala) — skill workers: tdd, debug, spec, …
├──▶ routing :3210 (NodePort 30310 on koala) — Mode 2 only: review, debug, retrospective, trainer
└──▶ brain :3300 (NodePort 30330 on koala) — brain_query, brain_write, brain_ingest, session_log
│
└─ also serves the legacy REST endpoints (/query, /write, /ingest, …)
│
▼
brain/
├── sessions/ — JSONL log, one file per session_id
├── wiki/ — searchable knowledge (full-text)
│ ├── concepts/
│ ├── entities/
│ └── sources/
├── raw/ — retrospective output, staged for review
└── training-data/ — SFT/DPO/RL data (Phase 2)
Phase 1 tools (available now)
| Tool | What it does |
|---|---|
| tdd_red | Writes a failing test for a spec, verifies it fails |
| tdd_green | Writes the minimal implementation to make tests pass |
| tdd_refactor | Cleans up implementation while keeping tests green |
| session_log | Appends a structured entry to the session JSONL log |
| retrospective | Reads the session log, identifies novel learnings, writes to brain/raw/ |
| brain_query | Full-text search over brain/wiki/ |
| brain_write | Writes a note to brain/raw/ (with optional YAML frontmatter) |
| tier | Returns the current connectivity tier (1=cloud, 2=LAN, 3=offline) |
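To make the session_log row concrete, here is a minimal sketch of appending one structured entry to a per-session JSONL file. It is not the server's implementation: the file name pattern (`<session_id>.jsonl`), the `ts` field, and the entry keys are all assumptions chosen for illustration; only "one JSONL file per session_id" comes from this README.

```python
import json
import time
from pathlib import Path


def append_session_entry(sessions_dir: str, session_id: str, entry: dict) -> Path:
    """Append one JSON object as a single line to the session's log file.

    Assumption: the log lives at <sessions_dir>/<session_id>.jsonl.
    """
    path = Path(sessions_dir) / f"{session_id}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    # "ts" is a hypothetical field; the real schema may differ.
    record = {"ts": time.time(), **entry}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return path


def read_session(path: Path) -> list[dict]:
    """Read every entry back, one JSON object per non-empty line."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because each entry is a complete JSON object on its own line, a reader such as retrospective can process the log line by line without parsing the whole file first.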
Start the servers
# Requires goreman: go install github.com/mattn/goreman@latest
task start # starts ingestion (:3300) + supervisor (:3200) via goreman
task stop # kills both by port
Connect a project
Create .mcp.json in your project root:
{
  "mcpServers": {
    "supervisor": {
      "type": "http",
      "url": "http://koala:30320/mcp"
    },
    "brain": {
      "type": "http",
      "url": "http://koala:30330/mcp"
    }
  }
}
Two MCP servers are exposed today, both reachable over Tailscale:
- supervisor at koala:30320 provides the skill workers (tdd_red/green/refactor, review, debug, spec, retrospective, trainer, tier).
- brain at koala:30330 provides knowledge access (brain_query, brain_write, brain_ingest, brain_ingest_raw) and session_log. It is hosted by the ingestion service directly, with no separate pod.
No local binary or stdio shim is required — Claude Code talks to both via HTTP.
Open Claude Code in your project — run /mcp to confirm both servers are listed.
A typical TDD session
1. Call tdd_red → spec in, failing test file out
2. Call tdd_green → test path in, implementation out
3. Call tdd_refactor → impl + test in, cleaned code out
4. Call session_log → log each phase result
5. Call retrospective → extracts learnings → brain/raw/
6. Review brain/raw/, move worthy notes to brain/wiki/concepts/
7. Future sessions: call brain_query to retrieve relevant context
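The tool-calling steps above can be sketched as a sequence of MCP requests. MCP invokes tools through the JSON-RPC 2.0 method `tools/call` with a tool `name` and an `arguments` object; the argument keys used here (`spec`, `test_path`, `impl_path`, `session_id`, `phase`, `result`) are hypothetical, since this README does not document each tool's input schema. The sketch only builds the payloads and does not contact a server.

```python
def tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def tdd_cycle_requests(spec: str, test_path: str, impl_path: str,
                       session_id: str) -> list[dict]:
    """Return the requests for steps 1-5 in order.

    All argument names below are illustrative assumptions.
    """
    steps = [
        ("tdd_red", {"spec": spec}),
        ("tdd_green", {"test_path": test_path}),
        ("tdd_refactor", {"impl_path": impl_path, "test_path": test_path}),
        ("session_log", {"session_id": session_id,
                         "phase": "refactor", "result": "pass"}),
        ("retrospective", {"session_id": session_id}),
    ]
    return [tool_call(i, name, args) for i, (name, args) in enumerate(steps, start=1)]
```

In practice Claude Code issues these calls itself once the servers are listed under /mcp; the sketch is mainly useful for seeing which tool runs at which step.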
Tier detection
The supervisor probes connectivity at call time:
| Tier | Label | Condition |
|---|---|---|
| 1 | full-online | Can reach api.anthropic.com |
| 2 | lan-only | Can reach LiteLLM but not Anthropic |
| 3 | airplane | No external connectivity |
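The table reduces to a two-question decision, sketched below. This is an illustration of the rule, not the supervisor's code: the `can_connect` probe, its TCP-connect approach, and the timeout are assumptions, and the real probe may use HTTP requests instead.

```python
import socket
from enum import IntEnum


class Tier(IntEnum):
    FULL_ONLINE = 1  # can reach api.anthropic.com
    LAN_ONLY = 2     # can reach LiteLLM but not Anthropic
    AIRPLANE = 3     # no external connectivity


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Hypothetical probe: True if a TCP connection opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify_tier(anthropic_reachable: bool, litellm_reachable: bool) -> Tier:
    """Map the two probe results onto the tiers in the table."""
    if anthropic_reachable:
        return Tier.FULL_ONLINE
    if litellm_reachable:
        return Tier.LAN_ONLY
    return Tier.AIRPLANE
```

Anthropic reachability is checked first, so a host that can reach both endpoints is classified as tier 1.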
Key env vars
| Variable | Default | Purpose |
|---|---|---|
| INGEST_BRAIN_DIR | ../brain | Brain directory for ingestion server |
| INGEST_PORT | 3300 | Ingestion server port |
| SUPERVISOR_CONFIG_DIR | ./config/supervisor | Skill discipline files |
| SUPERVISOR_SESSIONS_DIR | ./brain/sessions | JSONL session logs |
| INGEST_BASE_URL | http://localhost:3300 | Supervisor → ingestion |
| LITELLM_BASE_URL | (none) | LiteLLM proxy for Tier 2 model routing |
| SUPERVISOR_MCP_TOKEN | (none) | Optional bearer token for the supervisor MCP HTTP endpoint; when empty, no auth is enforced |
| ROUTING_PORT | 3210 | Routing pod's listen port |
| ROUTING_MCP_TOKEN | (none) | Optional bearer token for the routing MCP HTTP endpoint |
| BRAIN_URL | http://ingestion.supervisor:3300 | Routing pod → brain (in-cluster) |
| HYPERGUILD_FAST_MODEL | koala/qwen35-9b-fast | Fast model for high-pass-rate skill calls |
| HYPERGUILD_THINKING_MODEL | iguana/gemma4-26b | Thinking model for low-pass-rate skill calls |
| HYPERGUILD_ROUTE_LOCAL_FLOOR | 0.90 | At/above pass rate, route to fast model |
| HYPERGUILD_ROUTE_LOCAL_CEIL | 0.70 | Below pass rate, route to thinking model. Between CEIL and FLOOR is the sample band. |
| HYPERGUILD_PASS_RATE_TTL_SECONDS | 60 | Per-skill pass-rate cache TTL |
Operator note: LiteLLM at LITELLM_BASE_URL must register both HYPERGUILD_FAST_MODEL and HYPERGUILD_THINKING_MODEL for routing to do useful work. If a model is missing, LiteLLM returns 4xx, the routing pod's fast route fails, the fail-open retry on the thinking model likely also fails (since both are missing), and the only signal is final_status: "fail" on _routing entries in the brain.
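The interaction between HYPERGUILD_ROUTE_LOCAL_FLOOR and HYPERGUILD_ROUTE_LOCAL_CEIL can be sketched as a small routing function. The two threshold rules and the default model names come from the table above; what happens inside the sample band is not specified in this README, so the random split below (and its `sample_fast_prob` parameter) is purely an assumption for illustration.

```python
import random
from typing import Callable

FAST_MODEL = "koala/qwen35-9b-fast"   # default of HYPERGUILD_FAST_MODEL
THINKING_MODEL = "iguana/gemma4-26b"  # default of HYPERGUILD_THINKING_MODEL


def choose_model(pass_rate: float,
                 floor: float = 0.90,
                 ceil: float = 0.70,
                 sample_fast_prob: float = 0.5,
                 rng: Callable[[], float] = random.random) -> str:
    """Pick a model for a skill call from its recent pass rate."""
    if pass_rate >= floor:
        # At or above FLOOR: the skill passes often enough for the fast model.
        return FAST_MODEL
    if pass_rate < ceil:
        # Below CEIL: route to the thinking model.
        return THINKING_MODEL
    # Sample band (CEIL <= pass_rate < FLOOR).
    # Assumption: split traffic at random; the real policy may differ.
    return FAST_MODEL if rng() < sample_fast_prob else THINKING_MODEL
```

Passing `rng` as a parameter keeps the sample-band branch deterministic under test, for example `choose_model(0.80, rng=lambda: 0.0)`.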
Phase 2 (planned)
- review skill: structured code review with iron law enforcement
- debug skill: hypothesis-driven debugging sessions
- spec skill: generates specs from conversations
- trainer: extracts SFT/DPO pairs from session logs for fine-tuning