# Spec: hyperguild CLI > Plan 4 of 7 — Hyperguild Skill Migration. Loaded after `feature-spec` skill. ## Problem Statement Three needs converge on a single small Go binary: 1. **Tier probing as MCP is overkill.** The supervisor's `tier` MCP runs on `koala:30320` and answers a one-shot question (which models are reachable right now?). Pulling Claude Code through MCP startup, tool listing, and a JSON-RPC call for a 2-second probe is wasteful and adds a network hop the answer doesn't need. 2. **Brain access from shell scripts has no good front door.** The brain's HTTP REST API exists (Plan 1) at `koala:3300` for non-MCP clients, but every shell script that wants to query or write to the brain re-implements the curl invocation. A CLI gives shell pipelines, ad-hoc agent prompts, and quick-debug scenarios a stable interface. 3. **Mode bootstrap is manual.** Each new project that wants to operate in a chosen mode (cloud / client-local / sovereign) needs a `.mcp.json` written by hand. Without automation, mode adoption is gated on remembering the right MCP server URLs. **Why now:** Plans 1–3 are merged. The CLI is the next building block in shrinking the supervisor pod toward a thin Mode-2 routing layer. Plans 5 and 6 build on the CLI's tier and brain helpers. ## Success Criteria - [ ] `hyperguild tier` returns the same `tier.Info` that `internal/tier.Detect` produces for the same probe URLs, in < 3 s under all three tier conditions, with both human-readable and `--json` output. - [ ] `hyperguild brain query ` returns BM25 results from the brain HTTP REST `/query` endpoint, exit 0 on success and non-zero on transport failure. - [ ] `hyperguild brain write ` reads markdown content from stdin, posts to `/write` with the type and slug, and creates `brain/knowledge/.md`. A round-trip (`hyperguild brain query ` immediately after) finds the entry. - [ ] `hyperguild mode ` writes a parseable JSON file at the target path with the per-mode `mcpServers` entries; `jq -e .mcpServers` succeeds on the output. - [ ] All commands print usage on `--help`, exit 2 on unknown flags, exit non-zero on operational errors. - [ ] `task check` passes (lint + test + vet) on each task and on the merged branch. ## Constraints - **Stdlib only.** No `cobra`, `urfave/cli`, `viper`, etc. CLI router and flag parsing use `flag.NewFlagSet`. - **Go 1.26.1**, project default. - **Module:** `github.com/mathiasbq/supervisor`, peer to `cmd/supervisor/`. New code at `cmd/hyperguild/`. The module name keeps its historical `supervisor` value — renaming the module is out of scope and would touch every import. - **Reuse `internal/tier`** unchanged. The CLI is a thin wrapper around `tier.Detect`. - **Brain endpoint configurable** via `BRAIN_URL` env var (default `http://koala:30330` — Tailscale-exposed NodePort, both MCP at `/mcp` and HTTP REST at `/query`, `/write`, etc., share the port). No hostname literals embedded in the CLI body — sourced from env per the existing "logical-addresses-in-instructions" memory. - **Test discipline:** table-driven, testify, fakes for HTTP and tier probing. No live network in tests. - **Errors:** wrapped via `fmt.Errorf("op: %w", err)`. No naked returns. Stderr for errors, stdout for results. ## Out of Scope - The Mode 6 routing pod itself — `mode client-local` writes a placeholder entry pointing at the future routing URL with a `_routing_pending` annotation; the CLI does not provision the pod. - Pass-rate logging (Plan 5) — the CLI's `brain write` does not emit `session_log` events. - Skill worker CLIs (`hyperguild tdd_red`, `hyperguild review`, etc.) — those stay on the supervisor MCP until Plan 7. - Brain HTTP server changes — the REST endpoints already exist. - Authentication / TLS — Tailscale provides network isolation; no auth currently. - Windows/Linux binaries — macOS-only per the user's setup. `go build` is portable but no cross-compilation in CI. - A `crush` config writer for Mode 3 — Mode 3 (sovereign) writes a Claude-Code-compatible `.mcp.json` with brain-only MCP, on the assumption that even Crush-primary users may fall back to Claude Code with brain access. Crush's own config is owned by the user manually. - A unified `--config` file for the CLI — env var + flags is enough today. ## Technical Approach - **Single binary, inline subcommand router.** `cmd/hyperguild/main.go` dispatches on `os.Args[1]` to per-subcommand functions, each owning its own `flag.NewFlagSet`. Rationale: 4 top-level subcommands (`tier`, `brain`, `mode`, plus `--help`) and one nested level (`brain query`, `brain write`); ~80 lines of routing plumbing in stdlib beats pulling cobra's ~3 KLOC of dependencies for a tiny CLI. The router is testable by injecting `args []string` instead of reading `os.Args` directly. - **`tier` subcommand reuses `internal/tier.Detect` verbatim.** Probe URLs (`https://api.anthropic.com` and the LiteLLM base URL) come from environment: `ANTHROPIC_PROBE_URL` (default the literal Anthropic URL) and `LITELLM_BASE_URL` (no default — error if `--mode-needs-llm` and unset). Rationale: matching the supervisor's existing wiring means the CLI cannot disagree with the supervisor about tier; a single source of truth. - **`brain` subcommand calls the HTTP REST API.** Two nested subcommands: - `brain query ` issues `POST /query` with JSON body `{query, limit}` (default `--limit 5`), prints results in human-readable form by default and with `--json` for machine consumption. - `brain write ` reads stdin, posts `POST /write` with JSON body `{type, slug, content}`, prints the resulting path on success. Rationale: HTTP REST is simpler than MCP framing for a CLI. Per CLAUDE.md, the REST endpoints are documented as the official non-MCP interface. - **`mode ` writes a per-mode `.mcp.json` template.** Defaults to writing `./.mcp.json` (cwd); accepts `--out `. Per-mode bodies: - `cloud` — `mcpServers` contains only `brain` at `http://koala:30330/mcp`. - `client-local` — `mcpServers` contains `brain` at `http://koala:30330/mcp` and a `routing` placeholder entry with `url` set to a marker (`http://koala:30310/mcp`) and an extra field `"_routing_pending": "Plan 6 — routing pod not deployed yet"`. Rationale: keeping strict-JSON parseable means using a placeholder field rather than a JSON comment, which the spec parser would reject. - `sovereign` — `mcpServers` contains only `brain`, plus a top-level `"_mode_note": "Sovereign mode primarily uses Crush + LiteLLM. This .mcp.json is provided as Claude Code fallback."`. All three are valid JSON and all three round-trip through `jq` for verification. Rationale: a single subcommand with three clearly-different outputs is easier to evolve than three nearly-duplicate subcommands. The placeholder fields are intentional documentation in the file itself, which the user actually opens and edits. - **No global state.** Each subcommand is a function `(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error`, allowing table-driven tests to exercise full subcommand flows without `os.Exit` or fd capture. - **HTTP client injection.** A package-level `http.Client` with 5s timeout for `brain` calls, overridable in tests via a constructor. Real client for `main`, `httptest.Server` for tests. ## Risks - **`.mcp.json` schema may evolve.** Claude Code's MCP config format is defined by the harness, and Anthropic could change it. Mitigation: document the format in the CLI's `--help` text and in the spec; if it breaks, the fix is local to one template function. - **Brain endpoint hostname drift.** If the brain moves off `koala`, the env-var override avoids breaking the CLI but the `mode` template's hardcoded `koala:30330` becomes stale. Mitigation: source the URL in the `mode` template from the same env var (`BRAIN_URL`) so all three subcommands stay in lockstep with the user's actual environment. - **`tier` probe URL gap.** The CLI inherits the supervisor's hardcoded `https://api.anthropic.com` probe URL via `internal/tier`. If Anthropic changes the URL, both supervisor and CLI break together. Mitigation: env-var override `ANTHROPIC_PROBE_URL`; default unchanged. - **No HTTP retry logic.** The CLI returns first-error to the user. For ad-hoc shell use this is fine; for automation a future `--retry` flag may be needed. Out of scope for this iteration. - **Tests don't cover live network.** Pure-fake tests catch regression but not "does the brain pod actually answer." Mitigation: add a smoke-test `task hyperguild:smoke` in a follow-up that runs against the real brain — separate concern, not in Plan 4. - **Mode 3 sovereign output may surprise users** who expect Mode 3 to skip writing a `.mcp.json` entirely (since Crush is the primary harness). Mitigation: the `_mode_note` field explains the choice; the `--out /dev/null` escape hatch lets users skip the write if they want.