Commit Graph

68 Commits

Author SHA1 Message Date
Mathias Bergqvist
43a8255272 fix(mcp): add SSE GET handler for streamable HTTP transport
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
claude.ai probes with GET before initialize; without this the supervisor
returned application/json parse error instead of text/event-stream, causing
"Couldn't reach the MCP server" in the claude.ai connector setup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 23:27:56 +02:00
Mathias Bergqvist
bee4bb3c1f chore(routing): pre-merge cleanup — Plan 7 reminders, code_review→review, operator note
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 23:22:15 +02:00
Mathias Bergqvist
751f410ca6 test(routing): pin tool-schema parity with supervisor
Captures the four routed skills' (review, debug, retrospective, trainer)
tool definitions as a JSON snapshot and asserts the routing pod's registry
advertises byte-equal schemas. A deliberate schema change fails this test,
requiring an intentional snapshot update in lockstep with consumers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 22:59:06 +02:00
Mathias Bergqvist
3a99d5e20e refactor(routing): surface logger errors via slog.Warn
Replace silent `_ = r.Logger.LogDecision(...)` discards with an
if-err check that emits slog.Warn on failure. A brain outage now
produces a visible warn line instead of swallowing the telemetry
error entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 22:55:35 +02:00
Mathias Bergqvist
9a258ca32a feat(routing): router dispatch wrapper
Composes Fetcher + Policy + Logger + CompleteFunc into a single Run method.
Falls open to Claude on local-model errors; defaults to local when brain is
unreachable. Skill packages will receive Router.Run as their CompleteFunc.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 22:51:01 +02:00
Mathias Bergqvist
2a5a74f7c0 feat(routing): decision logger via brain MCP session_log
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:52:09 +02:00
Mathias Bergqvist
d40a5ac890 test(routing): cover TTL expiry in fetcher
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:50:01 +02:00
Mathias Bergqvist
b77820534a feat(routing): pass-rate fetcher with TTL cache
HTTP client that calls GET /pass-rate?skill=X&window=Y on the brain pod.
Caches *float64 results (including nil) per-skill for the configured TTL
(default 60s). On non-200 or network error returns (nil, err) so the
upstream router can fall through to default-to-local.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:46:11 +02:00
Mathias Bergqvist
db64ecb1d9 feat(routing): canonical request hash
SHA-256 of (system, user) joined with 0x00 separator, truncated to
uint64. Drives deterministic sample-band routing: identical prompt pair
→ same hash → same local-vs-Claude decision on every call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:41:42 +02:00
Mathias Bergqvist
ea29e5ebb8 feat(routing): decision policy
Pure-function Policy{Floor,Ceil} with Decide(*float64, uint64) Decision.
Rules in priority order: nil → local; ≥floor → local; <ceil → claude;
sample band → low bit of requestHash.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:36:59 +02:00
Mathias Bergqvist
ccf080db59 refactor(routing): clarify Floor/Ceil semantics + extend test coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:34:22 +02:00
Mathias Bergqvist
69c038478b feat(routing): RoutingConfig + LoadRouting
Typed config struct and env parser for the routing pod. Kept separate
from the supervisor Config to avoid forcing routing fields onto the
supervisor and vice versa. Uses the existing envOr helper from config.go.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:25:31 +02:00
Mathias Bergqvist
928f23ab1b feat(mcp): optional bearer-token auth via SUPERVISOR_MCP_TOKEN
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
Enables exposing the supervisor MCP via Tailscale Funnel for claude.ai
custom-connector tests. Auth is opt-in: empty SUPERVISOR_MCP_TOKEN
preserves the existing unauthenticated behavior for tailnet-internal
callers and local dev.

When the token is set, every request must carry
"Authorization: Bearer <token>" or it is rejected with HTTP 401 and a
JSON-RPC -32001 error. Comparison uses crypto/subtle.ConstantTimeCompare;
the token value and the supplied header are never logged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 07:31:29 +02:00
Mathias Bergqvist
7f7524c859 fix(mcp): do not respond to JSON-RPC notifications
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
The supervisor's MCP HTTP handler was answering every parsed request,
including notifications (messages with no id field). Per JSON-RPC 2.0,
notifications must not receive a response. The Apr-29 incident saw
Claude Code's MCP client receive a -32601 error for the standard
notifications/initialized handshake step and disconnect immediately
after a successful initialize.

Skip writing the response when req.ID == nil. Cover both the
known-method (notifications/initialized) and unknown-method paths
with tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:39:20 +02:00
Mathias Bergqvist
0a70d9e972 feat(pipeline): add POST /ingest-raw for direct batch ingestion without LLM
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Has been skipped
Allows callers to provide pre-structured RawPage data directly, bypassing the
LLM extraction step. The pipeline still handles slug computation, frontmatter,
link canonicalization, source back-references, and dedup — only the extraction
is skipped. Useful when a more capable model or manual curation produces the
structured data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 11:15:59 +02:00
Mathias Bergqvist
2ae6bfe81e fix(brain): enforce mutual exclusivity and clarify brain_ingest schema
- Return error when both path and content are supplied simultaneously
- Improve tool description to clearly state the two valid call forms
- Add per-field descriptions so LLMs understand what each parameter requires

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 23:03:03 +02:00
Mathias Bergqvist
a6dce972d6 feat(brain): add path field to brain_ingest for /ingest-path routing
Adds an optional path field to brain_ingest so Claude can ingest files
or directories directly by path without embedding content in the call.
Routing: path set → /ingest-path; content+source set → /ingest; neither → error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 23:01:05 +02:00
Mathias Bergqvist
c7341a2607 feat(config): add IngestSvcURL and KBRetrievalURL to supervisor config 2026-04-22 22:24:27 +02:00
Mathias Bergqvist
b5a0085c0a feat(brain): add brain_ingest, brain_search tools and extend search to wiki/ 2026-04-22 22:16:02 +02:00
Mathias Bergqvist
ca8a691241 fix(exec): strip trailing result-schema JSON from local model output
All checks were successful
cd / Build and deploy (push) Successful in 6s
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
Small models (phi4-mini) produce correct markdown analysis but then
append the old {status/phase/skill} JSON schema out of training habit.
stripResultJSON() detects and removes these trailing fences so Claude
Code receives clean prose regardless of model behaviour.

Non-schema json blocks (config examples etc) are preserved.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 16:55:53 +02:00
Mathias Bergqvist
ce45592730 refactor: replace orchestrator/verifier chain with direct LiteLLM calls
All checks were successful
cd / Build and deploy (push) Successful in 6s
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
Drop the three-layer Claude subprocess orchestration (local model →
Claude verifier → cloud escalation). Skills now call LiteLLM directly
and return plain text to Claude Code, which decides what to do with it.

- Delete executor, orchestrator, verifier, result, attempts packages
- Simplify LiteLLMExecutor: Run(Request)→Result becomes Complete(model,sys,user)→(string,int64,error)
- Replace ExecutorFn with CompleteFunc in all 6 skill configs
- Rewrite all skill handlers to call Complete and return {"text","model","duration_ms"}
- Simplify config/models: remove Verifier/LlamaSwapURL, add ModelFor
- Bump version to v0.5.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 16:19:09 +02:00
Mathias Bergqvist
f2bc39b500 feat(skills): inject brain context into review, debug, spec, tdd before spawning workers 2026-04-22 15:37:56 +02:00
Mathias Bergqvist
47df642836 feat(brain): add Query client for skill handler context injection 2026-04-22 15:34:09 +02:00
Mathias Bergqvist
3d8fc9dacd feat(skills): wire session.Append into retrospective and trainer 2026-04-22 13:37:43 +02:00
Mathias Bergqvist
f9f804cd49 feat(skills): wire session.Append and PrependHistory into tdd 2026-04-22 13:37:06 +02:00
Mathias Bergqvist
85f142ade0 feat(skills): wire session.Append and PrependHistory into spec 2026-04-22 13:36:35 +02:00
Mathias Bergqvist
0dfad02513 feat(skills): wire session.Append and PrependHistory into review and debug 2026-04-22 13:36:13 +02:00
Mathias Bergqvist
c44eb680b2 feat(exec): surface AttemptRecord slice on Result for session logging 2026-04-22 13:35:38 +02:00
Mathias Bergqvist
38ada998a2 feat(session): add AttemptsFrom converter for exec.AttemptRecord 2026-04-22 13:35:09 +02:00
Mathias Bergqvist
74547c2bdf feat(session): export PrependHistory for shared use across skills 2026-04-22 13:34:48 +02:00
Mathias Bergqvist
5cb272a869 feat(exec): add Orchestrator chain walker with verification and warm-state logging 2026-04-20 11:06:13 +02:00
Mathias Bergqvist
e96b39a812 feat(exec): add Claude verifier for local model output quality gate 2026-04-20 11:02:59 +02:00
Mathias Bergqvist
5db5b33cd7 feat(exec): add LiteLLM HTTP executor for local model dispatch 2026-04-20 10:46:08 +02:00
Mathias Bergqvist
a32457b5bc feat(exec): pass --model flag to claude subprocess for cloud-tier dispatch 2026-04-20 08:55:03 +02:00
Mathias Bergqvist
e0be5f0f98 feat(config): replace single-model config with chain-based routing
Implements escalation chains per skill with three-layer priority:
1. Caller override (model param) — no escalation
2. Per-skill chain from models.yaml
3. default_chain fallback

New APIs:
- Verifier() — fixed verifier for output validation
- LlamaSwapURL() — base URL for warm-state probing
- ChainFor(skill, override) — ordered model list for escalation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 08:48:33 +02:00
Mathias Bergqvist
6d410b810b feat(session): extend Attempt with tier, timing, and verdict fields 2026-04-20 08:35:27 +02:00
Mathias Bergqvist
509c04b6e4 fix(session): use fmt.Fprintf with nolint to satisfy both staticcheck and errcheck
Some checks failed
CI / Lint / Test / Vet (push) Successful in 1m7s
CI / Mirror to GitHub (push) Failing after 3s
2026-04-19 18:56:12 +02:00
Mathias Bergqvist
38fcac4cba feat(trainer): add trainer MCP skill with reader→writer sub-agent chain
Reader agent scans session logs for SFT/DPO candidates; writer receives
reader output and formats+writes training pairs to brain/training-data/.
Adds trainer-reader.md and trainer-writer.md discipline prompts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 14:06:00 +02:00
Mathias Bergqvist
7697e901d2 feat(spec): add spec writing MCP skill
Adds the spec skill that generates structured implementation specs from
requirements and writes them to a configurable output path in the project.
Follows the same pattern as review/debug skills with session history injection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 11:59:28 +02:00
Mathias Bergqvist
8cff57009a feat(debug): add debug MCP skill with hypothesis generation
Implements the debug skill following the same pattern as review. The skill
accepts project_root + error (+ optional context/model/session_id), prepends
session history, and calls the executor to produce 3-5 ordered hypotheses —
diagnosis only, no fixes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 11:29:58 +02:00
Mathias Bergqvist
8fb44affef feat(review): add code review MCP skill with session history injection
Implements the review skill following the same pattern as retrospective/tdd.
Validates project_root and files args, prepends session history when a
session_id is provided, and delegates to the executor with Read,Bash tools.
Iron-law discipline prompt enforces CRITICAL/WARNING/SUGGESTION output format.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 11:11:29 +02:00
Mathias Bergqvist
582ca5019b feat(tdd): inject session history into green and refactor worker prompts
Adds SessionsDir to tdd.Config, session_id to tool input schemas, and a
prependHistory method that reads the session JSONL log and prepends a
formatted history block to the task prompt before worker invocation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 10:18:23 +02:00
Mathias Bergqvist
858a9ba1a1 fix(exec): expand validPhases and remove schema enum constraint for phase 2026-04-19 10:03:21 +02:00
Mathias Bergqvist
cbef2da8de feat(session): add FormatHistory for worker context injection
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 09:40:41 +02:00
Mathias Bergqvist
b493651c26 fix(test): update executor test fixture to match --output-format json envelope
All checks were successful
CI / Lint / Test / Vet (push) Successful in 1m9s
CI / Mirror to GitHub (push) Has been skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 07:42:24 +02:00
Mathias Bergqvist
6169404f34 fix(lint): fix remaining errcheck in brain handlers_test
Some checks failed
CI / Lint / Test / Vet (push) Failing after 1m5s
CI / Mirror to GitHub (push) Has been skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 07:17:45 +02:00
Mathias Bergqvist
a67106026f fix(lint): satisfy errcheck for io.Copy, json.Encode, Body.Close, deferred Close
Some checks failed
CI / Lint / Test / Vet (push) Failing after 3s
CI / Mirror to GitHub (push) Has been skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 07:15:35 +02:00
Mathias Bergqvist
4bf5edb78e fix(exec): use --output-format json to get structured output from claude
--json-schema combined with --output-format text produces empty stdout.
The structured result is in the "structured_output" field of the json
envelope. Updated executor to unwrap the envelope.

Also removes --bare flag which disables OAuth keychain reads, causing
silent auth failure when ANTHROPIC_API_KEY is not set.

Adds goreman Procfile + stdio bridge (cmd/bridge) for Claude Code MCP
integration. Task start/stop replaced with goreman + port-kill.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 06:04:10 +02:00
Mathias Bergqvist
e98bb2ba65 feat: wire brain, org, sessionlog, retrospective skills into supervisor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:52:16 +02:00
Mathias Bergqvist
3dfc064353 fix: extend valid phases and return empty slice for missing session
Add "retrospective" to validPhases so non-TDD skills pass Validate().
Return []Entry{} instead of nil in session.Read when no file exists,
so JSON serialisation produces [] rather than null.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:48:32 +02:00