# Hyperguild Phase 2 Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add four new MCP skills (review, debug, spec, trainer) to the hyperguild supervisor, with automatic session history injection into all multi-phase skill workers.

**Architecture:** When `session_id` is provided, the supervisor reads prior session entries and injects them into each worker's task prompt, so the orchestrator no longer re-summarises history. The `trainer` skill runs a two-step sub-agent chain: a reader agent identifies learning moments from the session log, then a writer agent formats them as SFT/DPO pairs and writes to `brain/training-data/`. All other new skills are single-worker.

**Tech Stack:** Go 1.26, `internal/session` (JSONL log), `internal/exec` (claude subprocess executor), `internal/registry` (MCP tool registry), `config/supervisor/*.md` (discipline files), `config/models.yaml` (model routing)

---

## File Map

### New files

| File | Responsibility |
|------|----------------|
| `internal/session/history.go` | `FormatHistory(entries, excludePhase)`: formats prior session entries as a prompt block |
| `internal/session/history_test.go` | Unit tests for `FormatHistory` |
| `internal/skills/review/skill.go` | review skill: Config, Skill, New, Name, Tools |
| `internal/skills/review/handlers.go` | `Handle`, `prependHistory`: single-phase code review worker |
| `internal/skills/review/handlers_test.go` | review handler unit tests |
| `internal/skills/debug/skill.go` | debug skill |
| `internal/skills/debug/handlers.go` | `Handle`, `prependHistory`: hypothesis generation worker |
| `internal/skills/debug/handlers_test.go` | debug handler unit tests |
| `internal/skills/spec/skill.go` | spec skill |
| `internal/skills/spec/handlers.go` | `handleSpec`: spec writing worker |
| `internal/skills/spec/handlers_test.go` | spec handler unit tests |
| `internal/skills/trainer/skill.go` | trainer skill: Config with ReaderPrompt + WriterPrompt + BrainDir |
| `internal/skills/trainer/handlers.go` | `handleTrain`: calls ExecutorFn twice, reader then writer |
| `internal/skills/trainer/handlers_test.go` | trainer handler unit tests (verifies two-call chain) |
| `config/supervisor/review.md` | Review worker discipline file |
| `config/supervisor/debug.md` | Debug worker discipline file |
| `config/supervisor/spec.md` | Spec worker discipline file |
| `config/supervisor/trainer-reader.md` | Trainer reader agent discipline |
| `config/supervisor/trainer-writer.md` | Trainer writer agent discipline |

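The two-call chain that `internal/skills/trainer/handlers.go` is responsible for (reader first, then writer) can be sketched roughly as follows. This is an illustration under stated assumptions, not the planned implementation: `Request` and `Result` are cut-down stand-ins for the `internal/exec` types, `runTrainerChain` and `demo` are hypothetical names, and handing the reader's `RunnerOutput` to the writer as its task prompt is a guess at how the hand-off might work.

```go
package main

import (
	"context"
	"fmt"
)

// Request and Result are hypothetical, reduced stand-ins for iexec.Request
// and iexec.Result; the real types carry more fields.
type Request struct {
	SkillPrompt string
	TaskPrompt  string
}

type Result struct {
	Status       string
	RunnerOutput string
}

// ExecutorFn has the same shape as the executor function type used by the skills.
type ExecutorFn func(ctx context.Context, req Request) (Result, error)

// runTrainerChain (hypothetical) calls the executor twice: the reader extracts
// learning moments from the session log, then the writer formats them.
// Assumption: the reader's RunnerOutput becomes the writer's task prompt.
func runTrainerChain(ctx context.Context, run ExecutorFn, readerPrompt, writerPrompt, sessionLog string) (Result, error) {
	moments, err := run(ctx, Request{SkillPrompt: readerPrompt, TaskPrompt: sessionLog})
	if err != nil {
		return Result{}, fmt.Errorf("trainer reader: %w", err)
	}
	if moments.Status != "pass" {
		// Do not start the writer when the reader did not succeed.
		return moments, nil
	}
	pairs, err := run(ctx, Request{SkillPrompt: writerPrompt, TaskPrompt: moments.RunnerOutput})
	if err != nil {
		return Result{}, fmt.Errorf("trainer writer: %w", err)
	}
	return pairs, nil
}

// demo wires the chain to a fake executor that counts its invocations.
func demo() (Result, int, error) {
	calls := 0
	fake := func(_ context.Context, req Request) (Result, error) {
		calls++
		return Result{Status: "pass", RunnerOutput: req.SkillPrompt + " saw: " + req.TaskPrompt}, nil
	}
	res, err := runTrainerChain(context.Background(), fake, "reader", "writer", "session log")
	return res, calls, err
}

func main() {
	res, calls, err := demo()
	fmt.Println(res.RunnerOutput, calls, err)
}
```

A fake executor that counts calls, as in `demo`, is also the kind of check the trainer handler tests could use to verify the two-call chain.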
### Modified files

| File | Change |
|------|--------|
| `internal/exec/result.go` | Add review/debug/spec/trainer to `validPhases`; remove the `phase` enum constraint from the Schema |
| `internal/exec/result_test.go` | Add `TestValidateAcceptsAllPhases` |
| `internal/skills/tdd/skill.go` | Add `SessionsDir string` to Config; add `session_id` to green/refactor tool schemas |
| `internal/skills/tdd/handlers.go` | Add `session_id` to `greenArgs` and `refactorArgs`; inject history when `session_id` non-empty |
| `internal/skills/tdd/handlers_test.go` | Add tests for history injection in green and refactor |
| `cmd/supervisor/main.go` | Pass `SessionsDir` to tdd.Config; read discipline files and register 4 new skills |
| `config/models.yaml` | Add `trainer` model entry |

---

## Task 1: Session history utility

**Files:**
- Create: `internal/session/history.go`
- Create: `internal/session/history_test.go`

- [ ] **Step 1: Write the failing test**

```go
// internal/session/history_test.go
package session_test

import (
	"testing"
	"time"

	"github.com/mathiasbq/supervisor/internal/session"
	"github.com/stretchr/testify/assert"
)

func TestFormatHistoryEmpty(t *testing.T) {
	result := session.FormatHistory(nil, "")
	assert.Equal(t, "", result)
}

func TestFormatHistoryFormatsEntries(t *testing.T) {
	entries := []session.Entry{
		{
			Skill: "tdd", Phase: "red", FinalStatus: "pass",
			FilePath:  "internal/foo/foo_test.go",
			Message:   "wrote failing test for Foo",
			Timestamp: time.Now(),
		},
	}
	result := session.FormatHistory(entries, "")
	assert.Contains(t, result, "## Session history")
	assert.Contains(t, result, "Phase: red")
	assert.Contains(t, result, "wrote failing test for Foo")
	assert.Contains(t, result, "internal/foo/foo_test.go")
}

func TestFormatHistoryExcludesCurrentPhase(t *testing.T) {
	entries := []session.Entry{
		{Skill: "tdd", Phase: "red", Message: "red done", FinalStatus: "pass"},
		{Skill: "tdd", Phase: "green", Message: "green done", FinalStatus: "pass"},
	}
	result := session.FormatHistory(entries, "green")
	assert.Contains(t, result, "red done")
	assert.NotContains(t, result, "green done")
}
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
cd /path/to/supervisor
go test ./internal/session/... -run TestFormatHistory -v
```
Expected: `FAIL` (`session.FormatHistory undefined`)

- [ ] **Step 3: Implement FormatHistory**

```go
// internal/session/history.go
package session

import (
	"fmt"
	"strings"
)

// FormatHistory formats prior session entries as a structured block for
// injection into a worker task prompt. Entries matching excludePhase are
// omitted (pass the current phase to avoid circular injection).
func FormatHistory(entries []Entry, excludePhase string) string {
	var filtered []Entry
	for _, e := range entries {
		if e.Phase != excludePhase {
			filtered = append(filtered, e)
		}
	}
	if len(filtered) == 0 {
		return ""
	}

	var b strings.Builder
	b.WriteString("## Session history\n\n")
	for _, e := range filtered {
		fmt.Fprintf(&b, "### Phase: %s\n", e.Phase)
		fmt.Fprintf(&b, "- Skill: %s\n", e.Skill)
		fmt.Fprintf(&b, "- Status: %s\n", e.FinalStatus)
		if e.FilePath != "" {
			fmt.Fprintf(&b, "- File: %s\n", e.FilePath)
		}
		if e.Message != "" {
			fmt.Fprintf(&b, "- Summary: %s\n", e.Message)
		}
		b.WriteString("\n")
	}
	return b.String()
}
```
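To make the injected block concrete, here is a self-contained sketch that restates the `FormatHistory` logic in condensed form against a stand-in `Entry` type. The `Entry` struct below is an assumption limited to the fields the formatter reads; the real `session.Entry` carries more (the tests in this plan also set `SessionID` and `Timestamp`). The lower-case `formatHistory` name is only there to keep the sketch separate from the real function.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical stand-in for session.Entry, reduced to the
// fields the formatter reads.
type Entry struct {
	Skill       string
	Phase       string
	FinalStatus string
	FilePath    string
	Message     string
}

// formatHistory restates the FormatHistory logic from this task: entries
// whose Phase equals excludePhase are skipped, and an empty string is
// returned when nothing remains.
func formatHistory(entries []Entry, excludePhase string) string {
	var b strings.Builder
	for _, e := range entries {
		if e.Phase == excludePhase {
			continue
		}
		if b.Len() == 0 {
			// Write the heading only once, before the first kept entry.
			b.WriteString("## Session history\n\n")
		}
		fmt.Fprintf(&b, "### Phase: %s\n- Skill: %s\n- Status: %s\n", e.Phase, e.Skill, e.FinalStatus)
		if e.FilePath != "" {
			fmt.Fprintf(&b, "- File: %s\n", e.FilePath)
		}
		if e.Message != "" {
			fmt.Fprintf(&b, "- Summary: %s\n", e.Message)
		}
		b.WriteString("\n")
	}
	return b.String()
}

func main() {
	entries := []Entry{
		{Skill: "tdd", Phase: "red", FinalStatus: "pass", FilePath: "internal/foo/foo_test.go", Message: "wrote failing test for Foo"},
		{Skill: "tdd", Phase: "green", FinalStatus: "pass", Message: "green done"},
	}
	// The green worker asks for history with its own phase excluded.
	fmt.Print(formatHistory(entries, "green"))
}
```

Run with `excludePhase` set to `"green"`, the sketch emits the `## Session history` heading followed by a single `### Phase: red` block; the green entry is filtered out, which is what keeps a phase from being fed its own earlier output.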
- [ ] **Step 4: Run tests to confirm they pass**

```bash
go test ./internal/session/... -v
```
Expected: all `TestFormatHistory*` tests PASS

- [ ] **Step 5: Commit**

```bash
git add internal/session/history.go internal/session/history_test.go
git commit -m "feat(session): add FormatHistory for worker context injection"
```

---
## Task 2: Fix Schema enum and validPhases

**Files:**
- Modify: `internal/exec/result.go`
- Modify: `internal/exec/result_test.go`

The Schema's `phase` enum currently lists only `["red","green","refactor"]`. Workers running any other phase are forced to pick a value from that list (as a result, the retrospective worker was returning `"refactor"`). Fix this by removing the enum constraint from the schema, so that `phase` accepts any string, and rely solely on `validPhases` for server-side validation.

- [ ] **Step 1: Write the failing test**

```go
// Add to internal/exec/result_test.go (check existing file first for test helpers)
func TestValidateAcceptsAllPhases(t *testing.T) {
	phases := []string{"red", "green", "refactor", "retrospective", "review", "debug", "spec", "trainer"}
	for _, phase := range phases {
		r := exec.Result{Status: "pass", Phase: phase, Skill: "test", ModelUsed: "self", Message: "ok"}
		assert.NoError(t, r.Validate(), "phase %q should be valid", phase)
	}
}
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/exec/... -run TestValidateAcceptsAllPhases -v
```
Expected: FAIL, because `"review"`, `"debug"`, `"spec"`, and `"trainer"` fail validation

- [ ] **Step 3: Update result.go**

In `internal/exec/result.go`, make these two changes:

**Change 1:** update `validPhases`:
```go
var validPhases = map[string]bool{
	"red":           true,
	"green":         true,
	"refactor":      true,
	"retrospective": true,
	"review":        true,
	"debug":         true,
	"spec":          true,
	"trainer":       true,
}
```

**Change 2:** remove the enum constraint from the Schema `phase` property (replace the `"phase"` line):
```go
// Before:
"phase": {"type": "string", "enum": ["red","green","refactor"]},

// After:
"phase": {"type": "string"},
```

Also fix the error message in `Validate()`:
```go
// Before:
errs = append(errs, "phase must be red|green|refactor, got: "+r.Phase)

// After:
errs = append(errs, "phase must be one of red|green|refactor|retrospective|review|debug|spec|trainer, got: "+r.Phase)
```
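Hard-coding the phase list in the error message means the message has to be edited every time `validPhases` grows. One possible alternative, sketched here on the assumption that nothing depends on the exact message text, derives the list from the map itself. `validatePhase` is a hypothetical helper for illustration and is not a function in `internal/exec`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// validPhases mirrors the map from Change 1.
var validPhases = map[string]bool{
	"red":           true,
	"green":         true,
	"refactor":      true,
	"retrospective": true,
	"review":        true,
	"debug":         true,
	"spec":          true,
	"trainer":       true,
}

// validatePhase (hypothetical) rejects unknown phases and builds the list of
// accepted names from validPhases, so the message cannot drift from the map.
func validatePhase(phase string) error {
	if validPhases[phase] {
		return nil
	}
	names := make([]string, 0, len(validPhases))
	for name := range validPhases {
		names = append(names, name)
	}
	// Map iteration order is not stable; sort for a deterministic message.
	sort.Strings(names)
	return fmt.Errorf("phase must be one of %s, got: %s", strings.Join(names, "|"), phase)
}

func main() {
	fmt.Println(validatePhase("review"))
	fmt.Println(validatePhase("deploy"))
}
```

The trade-off is that the names appear in alphabetical order rather than workflow order; if the red, green, refactor ordering matters to readers of the error, the hard-coded message above is the simpler choice.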
- [ ] **Step 4: Run tests to confirm they pass**

```bash
go test ./internal/exec/... -v
```
Expected: all tests PASS, including `TestValidateAcceptsAllPhases`

- [ ] **Step 5: Commit**

```bash
git add internal/exec/result.go internal/exec/result_test.go
git commit -m "fix(exec): expand validPhases and remove schema enum constraint for phase"
```

---
## Task 3: Session history injection in TDD green and refactor

**Files:**
- Modify: `internal/skills/tdd/skill.go`
- Modify: `internal/skills/tdd/handlers.go`
- Modify: `internal/skills/tdd/handlers_test.go`
- Modify: `cmd/supervisor/main.go`

- [ ] **Step 1: Write the failing tests**

Add to `internal/skills/tdd/handlers_test.go`:

```go
func TestTDDGreenInjectsSessionHistory(t *testing.T) {
	sessDir := t.TempDir()
	require.NoError(t, session.Append(sessDir, "sess-1", session.Entry{
		SessionID: "sess-1", Skill: "tdd", Phase: "red", FinalStatus: "pass",
		FilePath: "internal/foo/foo_test.go",
		Message:  "wrote failing test for Foo",
	}))

	var capturedPrompt string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		capturedPrompt = req.TaskPrompt
		return iexec.Result{Status: "pass", Phase: "green", Skill: "tdd", Verified: true, ModelUsed: "self", Message: "ok"}, nil
	}

	sk := tdd.New(tdd.Config{SkillPrompt: "tdd", ExecutorFn: fakeFn, SessionsDir: sessDir})
	_, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage(
		`{"project_root":"/tmp","test_path":"internal/foo/foo_test.go","test_cmd":"go test ./...","session_id":"sess-1"}`,
	))
	require.NoError(t, err)
	assert.Contains(t, capturedPrompt, "## Session history")
	assert.Contains(t, capturedPrompt, "wrote failing test for Foo")
}

func TestTDDGreenNoHistoryWhenSessionIDEmpty(t *testing.T) {
	var capturedPrompt string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		capturedPrompt = req.TaskPrompt
		return iexec.Result{Status: "pass", Phase: "green", Skill: "tdd", Verified: true, ModelUsed: "self", Message: "ok"}, nil
	}

	sk := tdd.New(tdd.Config{SkillPrompt: "tdd", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
	_, err := sk.Handle(context.Background(), "tdd_green", json.RawMessage(
		`{"project_root":"/tmp","test_path":"internal/foo/foo_test.go"}`,
	))
	require.NoError(t, err)
	assert.NotContains(t, capturedPrompt, "## Session history")
}
```

Add these imports to the test file if they are not already present:
```go
import (
	"github.com/mathiasbq/supervisor/internal/session"
	iexec "github.com/mathiasbq/supervisor/internal/exec"
)
```

- [ ] **Step 2: Run tests to confirm they fail**

```bash
go test ./internal/skills/tdd/... -run TestTDDGreen -v
```
Expected: FAIL, because `tdd.Config` has no `SessionsDir` field and `session_id` is not yet read from the args

- [ ] **Step 3: Add SessionsDir to Config and session_id to tool schemas**

In `internal/skills/tdd/skill.go`, add `SessionsDir` to Config:
```go
type Config struct {
	SystemPrompt string
	SkillPrompt  string
	ExecutorFn   ExecutorFn
	DefaultModel string
	SessionsDir  string // optional: path to brain/sessions/ for history injection
}
```

In `Tools()`, add `"session_id"` as an optional property to `tdd_green` and `tdd_refactor` input schemas:
```go
// tdd_green InputSchema:
InputSchema: schema(
	[]string{"project_root", "test_path"},
	map[string]any{
		"project_root": strProp,
		"test_path":    strProp,
		"model":        strProp,
		"test_cmd":     strProp,
		"session_id":   strProp,
	},
),
// tdd_refactor InputSchema (same addition):
InputSchema: schema(
	[]string{"project_root", "test_path", "impl_path"},
	map[string]any{
		"project_root": strProp,
		"test_path":    strProp,
		"impl_path":    strProp,
		"model":        strProp,
		"test_cmd":     strProp,
		"session_id":   strProp,
	},
),
```

- [ ] **Step 4: Update greenArgs, refactorArgs, and inject history**

In `internal/skills/tdd/handlers.go`, add imports and update structs + handlers:

```go
import (
	// existing imports...
	"github.com/mathiasbq/supervisor/internal/session"
)

type greenArgs struct {
	ProjectRoot string `json:"project_root"`
	TestPath    string `json:"test_path"`
	Model       string `json:"model"`
	TestCmd     string `json:"test_cmd"`
	SessionID   string `json:"session_id"`
}

func (s *Skill) handleGreen(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
	var args greenArgs
	if err := json.Unmarshal(raw, &args); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if args.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if args.TestPath == "" {
		return nil, fmt.Errorf("test_path is required")
	}

	task := fmt.Sprintf(
		"phase: green\nproject_root: %s\ntest_path: %s\nmodel: %s\ntest_cmd: %s",
		args.ProjectRoot, args.TestPath, s.resolveModel(args.Model), args.TestCmd,
	)
	task = s.prependHistory(args.SessionID, "green", task)
	return s.execute(ctx, task)
}

type refactorArgs struct {
	ProjectRoot string `json:"project_root"`
	TestPath    string `json:"test_path"`
	ImplPath    string `json:"impl_path"`
	Model       string `json:"model"`
	TestCmd     string `json:"test_cmd"`
	SessionID   string `json:"session_id"`
}

func (s *Skill) handleRefactor(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
	var args refactorArgs
	if err := json.Unmarshal(raw, &args); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if args.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if args.TestPath == "" {
		return nil, fmt.Errorf("test_path is required")
	}
	if args.ImplPath == "" {
		return nil, fmt.Errorf("impl_path is required")
	}

	task := fmt.Sprintf(
		"phase: refactor\nproject_root: %s\ntest_path: %s\nimpl_path: %s\nmodel: %s\ntest_cmd: %s",
		args.ProjectRoot, args.TestPath, args.ImplPath, s.resolveModel(args.Model), args.TestCmd,
	)
	task = s.prependHistory(args.SessionID, "refactor", task)
	return s.execute(ctx, task)
}

// prependHistory reads the session log and prepends prior phase entries to the task prompt.
// If sessionID is empty or SessionsDir is not configured, task is returned unchanged.
func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
	if sessionID == "" || s.cfg.SessionsDir == "" {
		return task
	}
	entries, err := session.Read(s.cfg.SessionsDir, sessionID)
	if err != nil || len(entries) == 0 {
		return task
	}
	history := session.FormatHistory(entries, currentPhase)
	if history == "" {
		return task
	}
	return history + "\n---\n\n" + task
}
```

- [ ] **Step 5: Update main.go to pass SessionsDir to tdd.Config**

In `cmd/supervisor/main.go`, find the `tdd.New(tdd.Config{...})` call and add `SessionsDir`:
```go
reg.Register(tdd.New(tdd.Config{
	SystemPrompt: string(systemPrompt),
	SkillPrompt:  string(tddPrompt),
	DefaultModel: models.Resolve("tdd", ""),
	ExecutorFn:   executor.Run,
	SessionsDir:  cfg.SessionsDir, // ← add this line
}))
```

- [ ] **Step 6: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS

- [ ] **Step 7: Commit**

```bash
git add internal/skills/tdd/skill.go internal/skills/tdd/handlers.go internal/skills/tdd/handlers_test.go cmd/supervisor/main.go
git commit -m "feat(tdd): inject session history into green and refactor worker prompts"
```

---
## Task 4: review skill

**Files:**
- Create: `internal/skills/review/skill.go`
- Create: `internal/skills/review/handlers.go`
- Create: `internal/skills/review/handlers_test.go`
- Create: `config/supervisor/review.md`
- Modify: `cmd/supervisor/main.go`

- [ ] **Step 1: Write the failing test**

```go
// internal/skills/review/handlers_test.go
package review_test

import (
	"context"
	"encoding/json"
	"testing"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/skills/review"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestReviewToolRegistered(t *testing.T) {
	sk := review.New(review.Config{SkillPrompt: "review rules"})
	names := make([]string, 0)
	for _, tool := range sk.Tools() {
		names = append(names, tool.Name)
	}
	assert.Contains(t, names, "review")
}

func TestReviewRequiresProjectRoot(t *testing.T) {
	sk := review.New(review.Config{SkillPrompt: "r"})
	_, err := sk.Handle(context.Background(), "review", json.RawMessage(`{"files":["main.go"]}`))
	assert.ErrorContains(t, err, "project_root")
}

func TestReviewRequiresFiles(t *testing.T) {
	sk := review.New(review.Config{SkillPrompt: "r"})
	_, err := sk.Handle(context.Background(), "review", json.RawMessage(`{"project_root":"/tmp"}`))
	assert.ErrorContains(t, err, "files")
}

func TestReviewCallsExecutor(t *testing.T) {
	called := false
	var capturedTask string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		called = true
		capturedTask = req.TaskPrompt
		return iexec.Result{
			Status: "pass", Phase: "review", Skill: "review",
			Verified: true, ModelUsed: "self", Message: "2 warnings found",
		}, nil
	}

	sk := review.New(review.Config{SkillPrompt: "review rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
	out, err := sk.Handle(context.Background(), "review", json.RawMessage(
		`{"project_root":"/tmp/proj","files":["internal/foo/foo.go"],"context":"PR: add Foo helper"}`,
	))
	require.NoError(t, err)
	assert.True(t, called)
	assert.Contains(t, capturedTask, "internal/foo/foo.go")
	assert.Contains(t, capturedTask, "PR: add Foo helper")

	var result iexec.Result
	require.NoError(t, json.Unmarshal(out, &result))
	assert.Equal(t, "pass", result.Status)
	assert.Equal(t, "review", result.Phase)
}
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/skills/review/... -v
```
Expected: FAIL, because package `review` does not exist yet

- [ ] **Step 3: Create skill.go**

```go
// internal/skills/review/skill.go
package review

import (
	"context"
	"encoding/json"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/registry"
)

type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)

type Config struct {
	SkillPrompt  string
	DefaultModel string
	ExecutorFn   ExecutorFn
	SessionsDir  string
}

type Skill struct{ cfg Config }

func New(cfg Config) *Skill { return &Skill{cfg: cfg} }

func (s *Skill) Name() string { return "review" }

func (s *Skill) Tools() []registry.ToolDef {
	schema := func(required []string, props map[string]any) json.RawMessage {
		b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
		return b
	}
	return []registry.ToolDef{
		{
			Name:        "review",
			Description: "Perform a structured code review of the specified files. Returns findings with severity levels.",
			InputSchema: schema(
				[]string{"project_root", "files"},
				map[string]any{
					"project_root": map[string]any{"type": "string"},
					"files":        map[string]any{"type": "array", "items": map[string]any{"type": "string"}},
					"context":      map[string]any{"type": "string"},
					"model":        map[string]any{"type": "string"},
					"session_id":   map[string]any{"type": "string"},
				},
			),
		},
	}
}
```

- [ ] **Step 4: Create handlers.go**

```go
// internal/skills/review/handlers.go
package review

import (
	"context"
	"encoding/json"
	"fmt"
	"strings"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
)

type reviewArgs struct {
	ProjectRoot string   `json:"project_root"`
	Files       []string `json:"files"`
	Context     string   `json:"context"`
	Model       string   `json:"model"`
	SessionID   string   `json:"session_id"`
}

func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
	if tool != "review" {
		return nil, fmt.Errorf("unknown tool: %s", tool)
	}
	var a reviewArgs
	if err := json.Unmarshal(args, &a); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if a.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if len(a.Files) == 0 {
		return nil, fmt.Errorf("files is required")
	}

	model := a.Model
	if model == "" {
		model = s.cfg.DefaultModel
	}

	task := fmt.Sprintf(
		"phase: review\nproject_root: %s\nfiles: %s\ncontext: %s\nmodel: %s",
		a.ProjectRoot, strings.Join(a.Files, ", "), a.Context, model,
	)
	task = s.prependHistory(a.SessionID, "review", task)

	if s.cfg.ExecutorFn == nil {
		return nil, fmt.Errorf("no executor configured")
	}
	result, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.SkillPrompt,
		TaskPrompt:  task,
		Model:       model,
		Tools:       "Read,Bash",
	})
	if err != nil {
		return nil, err
	}
	b, err := json.Marshal(result)
	if err != nil {
		return nil, fmt.Errorf("marshal result: %w", err)
	}
	return b, nil
}

// prependHistory prepends prior session entries to the task prompt; it returns
// task unchanged when no session is given or no history can be read.
func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
	if sessionID == "" || s.cfg.SessionsDir == "" {
		return task
	}
	entries, err := session.Read(s.cfg.SessionsDir, sessionID)
	if err != nil || len(entries) == 0 {
		return task
	}
	history := session.FormatHistory(entries, currentPhase)
	if history == "" {
		return task
	}
	return history + "\n---\n\n" + task
}
```

- [ ] **Step 5: Create config/supervisor/review.md**

````markdown
# Code Review Discipline

You are a disciplined code reviewer. Read files carefully before commenting.

## Iron laws
1. Never approve security vulnerabilities: command injection, SQL injection, credential exposure, path traversal, unchecked input at system boundaries
2. Never approve silently swallowed errors: `err != nil` without wrapping or handling is always wrong
3. Never approve missing validation at system boundaries (user input, external APIs, file reads)

## Output contract
Return JSON result with:
- `status`: "pass" if no blocking issues; "fail" if any iron law is violated
- `phase`: "review"
- `skill`: "review"
- `file_path`: first file reviewed
- `runner_output`: full review formatted as:
```
CRITICAL: <issue> at <file>:<line>
WARNING: <issue> at <file>:<line>
SUGGESTION: <issue> at <file>:<line>
```
- `verified`: true if you read all specified files; false if any were missing or unreadable
- `message`: "N critical, M warnings, K suggestions" or "clean: <which iron law checks passed and why>"

## Rules
1. Read every file listed before writing feedback
2. Check iron laws first: any violation is CRITICAL and sets status to "fail"
3. Then check: correctness, test coverage for new code, Go style conventions
4. Never rubber-stamp: if nothing is wrong, explain specifically which iron law checks you ran and why they passed
5. Line references are required for every finding; "roughly around the middle" is not acceptable
````
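The `runner_output` format and the `message` summary in the output contract are mechanically related: counting lines by severity prefix yields the summary, and any `CRITICAL` line implies `status: "fail"`. A minimal sketch of that relationship follows, assuming one finding per line. `summarise` is a hypothetical helper for illustration only; the plan does not add such a function to the supervisor, and the sketch does not cover the free-text "clean: ..." message variant.

```go
package main

import (
	"fmt"
	"strings"
)

// summarise (hypothetical) counts findings by severity prefix and derives the
// status and message fields described in the output contract.
func summarise(runnerOutput string) (status, message string) {
	var critical, warnings, suggestions int
	for _, line := range strings.Split(runnerOutput, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(line, "CRITICAL:"):
			critical++
		case strings.HasPrefix(line, "WARNING:"):
			warnings++
		case strings.HasPrefix(line, "SUGGESTION:"):
			suggestions++
		}
	}
	// Per the discipline file, any iron-law violation is CRITICAL and fails the review.
	status = "pass"
	if critical > 0 {
		status = "fail"
	}
	message = fmt.Sprintf("%d critical, %d warnings, %d suggestions", critical, warnings, suggestions)
	return status, message
}

func main() {
	// The file names and findings below are made up for the example.
	output := strings.Join([]string{
		"CRITICAL: unsanitised input passed to shell at cmd/run.go:12",
		"WARNING: error returned without wrapping at internal/foo/foo.go:40",
		"SUGGESTION: rename helper for clarity at internal/foo/foo.go:55",
	}, "\n")
	status, message := summarise(output)
	fmt.Println(status, "-", message)
}
```

A check like this could serve as a consistency test on worker results, flagging a review whose `status` or `message` disagrees with its own findings.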
- [ ] **Step 6: Wire into main.go**

In `cmd/supervisor/main.go`, add after existing `os.ReadFile` calls for discipline files:
```go
reviewPrompt, err := os.ReadFile(cfg.ConfigDir + "/review.md")
if err != nil {
	logger.Error("read review.md", "path", cfg.ConfigDir+"/review.md", "err", err)
	os.Exit(1)
}
```

Add import: `"github.com/mathiasbq/supervisor/internal/skills/review"`

Register after existing `reg.Register` calls:
```go
reg.Register(review.New(review.Config{
	SkillPrompt:  string(reviewPrompt),
	DefaultModel: models.Resolve("review", ""),
	ExecutorFn:   executor.Run,
	SessionsDir:  cfg.SessionsDir,
}))
```

- [ ] **Step 7: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS

- [ ] **Step 8: Commit**

```bash
git add internal/skills/review/ config/supervisor/review.md cmd/supervisor/main.go
git commit -m "feat(review): add code review MCP skill with session history injection"
```

---
## Task 5: debug skill

**Files:**
- Create: `internal/skills/debug/skill.go`
- Create: `internal/skills/debug/handlers.go`
- Create: `internal/skills/debug/handlers_test.go`
- Create: `config/supervisor/debug.md`
- Modify: `cmd/supervisor/main.go`

- [ ] **Step 1: Write the failing test**

```go
// internal/skills/debug/handlers_test.go
package debug_test

import (
	"context"
	"encoding/json"
	"testing"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/skills/debug"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestDebugToolRegistered(t *testing.T) {
	sk := debug.New(debug.Config{SkillPrompt: "debug rules"})
	names := make([]string, 0)
	for _, tool := range sk.Tools() {
		names = append(names, tool.Name)
	}
	assert.Contains(t, names, "debug")
}

func TestDebugRequiresProjectRoot(t *testing.T) {
	sk := debug.New(debug.Config{SkillPrompt: "d"})
	_, err := sk.Handle(context.Background(), "debug", json.RawMessage(`{"error":"panic: nil pointer"}`))
	assert.ErrorContains(t, err, "project_root")
}

func TestDebugRequiresError(t *testing.T) {
	sk := debug.New(debug.Config{SkillPrompt: "d"})
	_, err := sk.Handle(context.Background(), "debug", json.RawMessage(`{"project_root":"/tmp"}`))
	assert.ErrorContains(t, err, "error")
}

func TestDebugCallsExecutor(t *testing.T) {
	called := false
	var capturedTask string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		called = true
		capturedTask = req.TaskPrompt
		return iexec.Result{
			Status: "pass", Phase: "debug", Skill: "debug",
			RunnerOutput: "HYPOTHESIS 1 (likelihood: high): nil map access\nVERIFY: go test ./... → expected: panic line reference",
			Verified:     false, ModelUsed: "self", Message: "3 hypotheses for: panic nil pointer at foo.go:42",
		}, nil
	}

	sk := debug.New(debug.Config{SkillPrompt: "debug rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
	out, err := sk.Handle(context.Background(), "debug", json.RawMessage(
		`{"project_root":"/tmp/proj","error":"panic: nil pointer dereference at foo.go:42","context":"occurs on startup"}`,
	))
	require.NoError(t, err)
	assert.True(t, called)
	assert.Contains(t, capturedTask, "panic: nil pointer dereference")
	assert.Contains(t, capturedTask, "occurs on startup")

	var result iexec.Result
	require.NoError(t, json.Unmarshal(out, &result))
	assert.Equal(t, "debug", result.Phase)
}
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/skills/debug/... -v
```
Expected: FAIL, because package `debug` does not exist yet

- [ ] **Step 3: Create skill.go**

```go
// internal/skills/debug/skill.go
package debug

import (
	"context"
	"encoding/json"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/registry"
)

type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)

type Config struct {
	SkillPrompt  string
	DefaultModel string
	ExecutorFn   ExecutorFn
	SessionsDir  string
}

type Skill struct{ cfg Config }

func New(cfg Config) *Skill { return &Skill{cfg: cfg} }

func (s *Skill) Name() string { return "debug" }

func (s *Skill) Tools() []registry.ToolDef {
	schema := func(required []string, props map[string]any) json.RawMessage {
		b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
		return b
	}
	str := map[string]any{"type": "string"}
	return []registry.ToolDef{
		{
			Name:        "debug",
			Description: "Analyse an error and return 3-5 hypotheses ordered by likelihood, each with a concrete verification step.",
			InputSchema: schema(
				[]string{"project_root", "error"},
				map[string]any{
					"project_root": str,
					"error":        str,
					"context":      str,
					"model":        str,
					"session_id":   str,
				},
			),
		},
	}
}
```

- [ ] **Step 4: Create handlers.go**

```go
// internal/skills/debug/handlers.go
package debug

import (
	"context"
	"encoding/json"
	"fmt"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
)

type debugArgs struct {
	ProjectRoot string `json:"project_root"`
	Error       string `json:"error"`
	Context     string `json:"context"`
	Model       string `json:"model"`
	SessionID   string `json:"session_id"`
}

func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
	if tool != "debug" {
		return nil, fmt.Errorf("unknown tool: %s", tool)
	}
	var a debugArgs
	if err := json.Unmarshal(args, &a); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if a.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if a.Error == "" {
		return nil, fmt.Errorf("error is required")
	}

	model := a.Model
	if model == "" {
		model = s.cfg.DefaultModel
	}

	task := fmt.Sprintf(
		"phase: debug\nproject_root: %s\nerror: %s\ncontext: %s\nmodel: %s",
		a.ProjectRoot, a.Error, a.Context, model,
	)
	task = s.prependHistory(a.SessionID, "debug", task)

	if s.cfg.ExecutorFn == nil {
		return nil, fmt.Errorf("no executor configured")
	}
	result, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.SkillPrompt,
		TaskPrompt:  task,
		Model:       model,
		Tools:       "Read,Bash",
	})
	if err != nil {
		return nil, err
	}
	b, err := json.Marshal(result)
	if err != nil {
		return nil, fmt.Errorf("marshal result: %w", err)
	}
	return b, nil
}

func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
	if sessionID == "" || s.cfg.SessionsDir == "" {
		return task
	}
	entries, err := session.Read(s.cfg.SessionsDir, sessionID)
	if err != nil || len(entries) == 0 {
		return task
	}
	history := session.FormatHistory(entries, currentPhase)
	if history == "" {
		return task
	}
	return history + "\n---\n\n" + task
}
```

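To make the prompt layout concrete, here is a small standalone sketch of what the worker receives once history is prepended. `composePrompt` is a hypothetical stand-in for the last lines of `prependHistory`, and the history text is a placeholder, since the real format comes from `session.FormatHistory`.

```go
package main

import "fmt"

// composePrompt mirrors the tail of prependHistory: history first, then a
// horizontal-rule separator, then the task. An empty history leaves the
// task untouched. The function name is an assumption for this sketch.
func composePrompt(history, task string) string {
	if history == "" {
		return task
	}
	return history + "\n---\n\n" + task
}

func main() {
	task := fmt.Sprintf(
		"phase: debug\nproject_root: %s\nerror: %s\ncontext: %s\nmodel: %s",
		"/tmp/proj", "nil pointer dereference", "", "example-model",
	)
	// Placeholder history; the real block is produced by session.FormatHistory.
	fmt.Println(composePrompt("(prior session entries)", task))
}
```

The separator keeps injected history visually distinct from the task fields, so a worker can tell which lines describe earlier phases and which describe the current request.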
- [ ] **Step 5: Create config/supervisor/debug.md**

````markdown
# Debug Discipline

You are a systematic debugger. Form hypotheses before suggesting fixes.

## Iron laws
1. Never suggest "try X and see what happens"; every hypothesis must have a specific expected outcome if correct
2. Generate 3-5 hypotheses, ordered by likelihood (most likely first)
3. Never fix the bug. Diagnose only; the caller decides what to do with the hypotheses

## Output contract
Return JSON result with:
- `status`: "pass" (hypotheses generated) or "error" (error too ambiguous to analyse)
- `phase`: "debug"
- `skill`: "debug"
- `file_path`: the file most relevant to the error (read it)
- `runner_output`: your hypotheses, formatted as:
  ```
  HYPOTHESIS 1 (likelihood: high): <mechanism>
  VERIFY: <exact command or file to check> → expected if correct: <specific output>

  HYPOTHESIS 2 (likelihood: medium): <mechanism>
  VERIFY: <exact command or file to check> → expected if correct: <specific output>
  ```
- `verified`: false (verification is the caller's job)
- `message`: "N hypotheses for: <one-line error summary>"

## Rules
1. Read the error and any context files provided before forming hypotheses
2. Identify the failure mode first: what actually went wrong, not just what the error says
3. For each hypothesis: name the mechanism, explain why it would produce this exact error, give a concrete verification command with expected output
4. If the error is clearly a typo or trivial mistake, still form at least 3 hypotheses and surface the most likely cause as #1
````

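Since verification is left to the caller, the caller needs to read the `HYPOTHESIS` / `VERIFY` lines back out of `runner_output`. One way this could look is sketched below; the `Hypothesis` type and `parseHypotheses` function are illustrative assumptions, not something the plan defines.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothesis is a hypothetical caller-side type for one parsed entry.
type Hypothesis struct {
	Header string // the full "HYPOTHESIS N (likelihood: ...): <mechanism>" line
	Verify string // the text after "VERIFY:"
}

// parseHypotheses walks runner_output line by line and pairs each VERIFY
// line with the most recent HYPOTHESIS line. Other lines are ignored.
func parseHypotheses(out string) []Hypothesis {
	var hs []Hypothesis
	for _, line := range strings.Split(out, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(line, "HYPOTHESIS "):
			hs = append(hs, Hypothesis{Header: line})
		case strings.HasPrefix(line, "VERIFY:") && len(hs) > 0:
			hs[len(hs)-1].Verify = strings.TrimSpace(strings.TrimPrefix(line, "VERIFY:"))
		}
	}
	return hs
}

func main() {
	out := "HYPOTHESIS 1 (likelihood: high): nil map write\n" +
		"VERIFY: grep -n 'make(map' store.go → expected if correct: no match\n\n" +
		"HYPOTHESIS 2 (likelihood: medium): stale cache\n" +
		"VERIFY: go clean -cache → expected if correct: error disappears\n"
	for _, h := range parseHypotheses(out) {
		fmt.Printf("%s\n  check: %s\n", h.Header, h.Verify)
	}
}
```

A line-based parser like this tolerates extra prose from the worker, which matters because the format is enforced only by the discipline file, not by a schema.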
- [ ] **Step 6: Wire into main.go**

Add file read:
```go
debugPrompt, err := os.ReadFile(cfg.ConfigDir + "/debug.md")
if err != nil {
	logger.Error("read debug.md", "path", cfg.ConfigDir+"/debug.md", "err", err)
	os.Exit(1)
}
```

Add import: `"github.com/mathiasbq/supervisor/internal/skills/debug"`

Register skill:
```go
reg.Register(debug.New(debug.Config{
	SkillPrompt:  string(debugPrompt),
	DefaultModel: models.Resolve("debug", ""),
	ExecutorFn:   executor.Run,
	SessionsDir:  cfg.SessionsDir,
}))
```

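The same read-or-exit pattern recurs for every discipline file. If the repetition becomes a nuisance, a small helper along the following lines might be worth considering. This is only a sketch: `readPrompt` and `demoReadPrompt` are hypothetical names, and the plan itself inlines `os.ReadFile` per skill.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// readPrompt loads one discipline file from configDir and wraps any error
// with the full path. Hypothetical helper, not part of the plan.
func readPrompt(configDir, name string) (string, error) {
	path := filepath.Join(configDir, name)
	b, err := os.ReadFile(path)
	if err != nil {
		return "", fmt.Errorf("read %s: %w", path, err)
	}
	return string(b), nil
}

// demoReadPrompt exercises readPrompt against a throwaway directory and
// reports the file content plus whether a missing file produced an error.
func demoReadPrompt() (string, bool, error) {
	dir, err := os.MkdirTemp("", "prompts")
	if err != nil {
		return "", false, err
	}
	defer os.RemoveAll(dir)

	if err := os.WriteFile(filepath.Join(dir, "debug.md"), []byte("# Debug Discipline\n"), 0o644); err != nil {
		return "", false, err
	}
	content, err := readPrompt(dir, "debug.md")
	if err != nil {
		return "", false, err
	}
	_, missingErr := readPrompt(dir, "spec.md") // this file was never created
	return content, missingErr != nil, nil
}

func main() {
	content, missingFailed, err := demoReadPrompt()
	fmt.Printf("content=%q missingFailed=%v err=%v\n", content, missingFailed, err)
}
```

The caller in `main.go` would still decide whether to log and exit; the helper only centralises path joining and error wrapping.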
- [ ] **Step 7: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS

- [ ] **Step 8: Commit**

```bash
git add internal/skills/debug/ config/supervisor/debug.md cmd/supervisor/main.go
git commit -m "feat(debug): add debug MCP skill with hypothesis generation"
```

---

## Task 6: spec skill

**Files:**
- Create: `internal/skills/spec/skill.go`
- Create: `internal/skills/spec/handlers.go`
- Create: `internal/skills/spec/handlers_test.go`
- Create: `config/supervisor/spec.md`
- Modify: `cmd/supervisor/main.go`

- [ ] **Step 1: Write the failing test**

```go
// internal/skills/spec/handlers_test.go
package spec_test

import (
	"context"
	"encoding/json"
	"testing"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/skills/spec"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestSpecToolRegistered(t *testing.T) {
	sk := spec.New(spec.Config{SkillPrompt: "spec rules"})
	names := make([]string, 0)
	for _, tool := range sk.Tools() {
		names = append(names, tool.Name)
	}
	assert.Contains(t, names, "spec")
}

func TestSpecRequiresProjectRoot(t *testing.T) {
	sk := spec.New(spec.Config{SkillPrompt: "s"})
	_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"requirements":"add login"}`))
	assert.ErrorContains(t, err, "project_root")
}

func TestSpecRequiresRequirements(t *testing.T) {
	sk := spec.New(spec.Config{SkillPrompt: "s"})
	_, err := sk.Handle(context.Background(), "spec", json.RawMessage(`{"project_root":"/tmp"}`))
	assert.ErrorContains(t, err, "requirements")
}

func TestSpecCallsExecutor(t *testing.T) {
	called := false
	var capturedTask string
	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		called = true
		capturedTask = req.TaskPrompt
		return iexec.Result{
			Status: "pass", Phase: "spec", Skill: "spec",
			FilePath: "/tmp/proj/docs/login-spec.md",
			Verified: true, ModelUsed: "self", Message: "spec written: login feature",
		}, nil
	}

	sk := spec.New(spec.Config{SkillPrompt: "spec rules", ExecutorFn: fakeFn, SessionsDir: t.TempDir()})
	out, err := sk.Handle(context.Background(), "spec", json.RawMessage(
		`{"project_root":"/tmp/proj","requirements":"add OAuth2 login","output_path":"docs/login-spec.md"}`,
	))
	require.NoError(t, err)
	assert.True(t, called)
	assert.Contains(t, capturedTask, "OAuth2 login")
	assert.Contains(t, capturedTask, "docs/login-spec.md")

	var result iexec.Result
	require.NoError(t, json.Unmarshal(out, &result))
	assert.Equal(t, "spec", result.Phase)
}

var _ = require.New
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/skills/spec/... -v
```
Expected: FAIL (package does not exist)

- [ ] **Step 3: Create skill.go**

```go
// internal/skills/spec/skill.go
package spec

import (
	"context"
	"encoding/json"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/registry"
)

type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)

type Config struct {
	SkillPrompt  string
	DefaultModel string
	ExecutorFn   ExecutorFn
	SessionsDir  string
}

type Skill struct{ cfg Config }

func New(cfg Config) *Skill { return &Skill{cfg: cfg} }

func (s *Skill) Name() string { return "spec" }

func (s *Skill) Tools() []registry.ToolDef {
	schema := func(required []string, props map[string]any) json.RawMessage {
		b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
		return b
	}
	str := map[string]any{"type": "string"}
	return []registry.ToolDef{
		{
			Name:        "spec",
			Description: "Generate a structured implementation spec from requirements. Writes the spec to output_path in the project.",
			InputSchema: schema(
				[]string{"project_root", "requirements"},
				map[string]any{
					"project_root": str,
					"requirements": str,
					"output_path":  str,
					"context":      str,
					"model":        str,
					"session_id":   str,
				},
			),
		},
	}
}
```

- [ ] **Step 4: Create handlers.go**

```go
// internal/skills/spec/handlers.go
package spec

import (
	"context"
	"encoding/json"
	"fmt"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
)

type specArgs struct {
	ProjectRoot  string `json:"project_root"`
	Requirements string `json:"requirements"`
	OutputPath   string `json:"output_path"`
	Context      string `json:"context"`
	Model        string `json:"model"`
	SessionID    string `json:"session_id"`
}

func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
	if tool != "spec" {
		return nil, fmt.Errorf("unknown tool: %s", tool)
	}
	var a specArgs
	if err := json.Unmarshal(args, &a); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if a.ProjectRoot == "" {
		return nil, fmt.Errorf("project_root is required")
	}
	if a.Requirements == "" {
		return nil, fmt.Errorf("requirements is required")
	}
	outputPath := a.OutputPath
	if outputPath == "" {
		outputPath = "docs/spec.md"
	}

	model := a.Model
	if model == "" {
		model = s.cfg.DefaultModel
	}

	task := fmt.Sprintf(
		"phase: spec\nproject_root: %s\nrequirements: %s\noutput_path: %s\ncontext: %s\nmodel: %s",
		a.ProjectRoot, a.Requirements, outputPath, a.Context, model,
	)
	task = s.prependHistory(a.SessionID, "spec", task)

	if s.cfg.ExecutorFn == nil {
		return nil, fmt.Errorf("no executor configured")
	}
	result, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.SkillPrompt,
		TaskPrompt:  task,
		Model:       model,
		Tools:       "Read,Write",
	})
	if err != nil {
		return nil, err
	}
	b, err := json.Marshal(result)
	if err != nil {
		return nil, fmt.Errorf("marshal result: %w", err)
	}
	return b, nil
}

func (s *Skill) prependHistory(sessionID, currentPhase, task string) string {
	if sessionID == "" || s.cfg.SessionsDir == "" {
		return task
	}
	entries, err := session.Read(s.cfg.SessionsDir, sessionID)
	if err != nil || len(entries) == 0 {
		return task
	}
	history := session.FormatHistory(entries, currentPhase)
	if history == "" {
		return task
	}
	return history + "\n---\n\n" + task
}
```

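The handler passes `output_path` through as given and leaves path resolution to the worker. If the supervisor ever needs the absolute location itself (for example to compare it with the `file_path` the worker reports), something like the following could serve as a starting point. `absSpecPath` is a hypothetical helper, and the escape check is an extra precaution that the plan does not call for.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// absSpecPath joins output_path onto project_root and rejects results that
// climb out of the project. Hypothetical helper, not part of the plan.
func absSpecPath(projectRoot, outputPath string) (string, error) {
	if outputPath == "" {
		outputPath = "docs/spec.md" // same default the handler applies
	}
	full := filepath.Join(projectRoot, outputPath)
	rel, err := filepath.Rel(projectRoot, full)
	if err != nil {
		return "", err
	}
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("output_path %q escapes project_root", outputPath)
	}
	return full, nil
}

func main() {
	for _, p := range []string{"docs/login-spec.md", "", "../outside.md"} {
		full, err := absSpecPath("/tmp/proj", p)
		fmt.Printf("%q -> %q (err: %v)\n", p, full, err)
	}
}
```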
- [ ] **Step 5: Create config/supervisor/spec.md**

````markdown
# Spec Writing Discipline

You write structured implementation specs. Nothing is left ambiguous.

## Iron laws
1. Success criteria must be measurable: "the system is fast" is banned; "p99 < 200ms under 100 RPS" is valid
2. Always include an explicit "Out of scope" section; if you don't draw the boundary, the developer will guess wrong
3. Every technical decision in the approach must have a rationale

## Output contract
Return JSON result with:
- `status`: "pass" (spec written) or "error" (requirements too ambiguous to spec without more input)
- `phase`: "spec"
- `skill`: "spec"
- `file_path`: the output_path where the spec was written (absolute path)
- `runner_output`: ""
- `verified`: true if the file was written successfully
- `message`: "spec written: <one-line summary of what was specced>"

## Spec structure
Write the spec as markdown to the output_path:

```markdown
# [Feature] Spec

## Problem statement
[What problem does this solve? For whom? Why now?]

## Success criteria
- [ ] [Criterion 1: measurable and verifiable]
- [ ] [Criterion 2: measurable and verifiable]

## Constraints
[Non-negotiable requirements the solution must satisfy]

## Out of scope
[What we are explicitly NOT doing in this iteration]

## Technical approach
[Architecture decisions, key components, rationale for each choice]

## Risks
[What could go wrong, and how we'd mitigate it]
```

If the requirements are too vague to produce measurable success criteria, return status "error" with a message listing the specific questions that need answers.
````

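The spec structure above is enforced only by the prompt. A caller that wants a cheap structural check on the written file could look for the six section headings, roughly as sketched here. `missingSections` and `requiredSections` are illustrative names and not part of the plan.

```go
package main

import (
	"fmt"
	"strings"
)

// requiredSections lists the headings from the spec template in spec.md.
var requiredSections = []string{
	"## Problem statement",
	"## Success criteria",
	"## Constraints",
	"## Out of scope",
	"## Technical approach",
	"## Risks",
}

// missingSections returns the template headings that do not appear as a
// whole line in the spec text. Hypothetical checker for illustration.
func missingSections(spec string) []string {
	present := make(map[string]bool)
	for _, line := range strings.Split(spec, "\n") {
		present[strings.TrimSpace(line)] = true
	}
	var missing []string
	for _, h := range requiredSections {
		if !present[h] {
			missing = append(missing, h)
		}
	}
	return missing
}

func main() {
	spec := "# Login Spec\n\n## Problem statement\n...\n\n## Success criteria\n- [ ] p99 < 200ms\n"
	fmt.Println(missingSections(spec))
}
```

This checks presence only. Whether the success criteria are actually measurable still depends on the worker following the discipline file.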
- [ ] **Step 6: Wire into main.go**

Add file read:
```go
specPrompt, err := os.ReadFile(cfg.ConfigDir + "/spec.md")
if err != nil {
	logger.Error("read spec.md", "path", cfg.ConfigDir+"/spec.md", "err", err)
	os.Exit(1)
}
```

Add import: `"github.com/mathiasbq/supervisor/internal/skills/spec"`

Register skill:
```go
reg.Register(spec.New(spec.Config{
	SkillPrompt:  string(specPrompt),
	DefaultModel: models.Resolve("spec", ""),
	ExecutorFn:   executor.Run,
	SessionsDir:  cfg.SessionsDir,
}))
```

- [ ] **Step 7: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS

- [ ] **Step 8: Commit**

```bash
git add internal/skills/spec/ config/supervisor/spec.md cmd/supervisor/main.go
git commit -m "feat(spec): add spec writing MCP skill"
```

---

## Task 7: trainer skill

**Files:**
- Create: `internal/skills/trainer/skill.go`
- Create: `internal/skills/trainer/handlers.go`
- Create: `internal/skills/trainer/handlers_test.go`
- Create: `config/supervisor/trainer-reader.md`
- Create: `config/supervisor/trainer-writer.md`
- Modify: `cmd/supervisor/main.go`
- Modify: `config/models.yaml`

- [ ] **Step 1: Write the failing test**

```go
// internal/skills/trainer/handlers_test.go
package trainer_test

import (
	"context"
	"encoding/json"
	"testing"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
	"github.com/mathiasbq/supervisor/internal/skills/trainer"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func TestTrainerToolRegistered(t *testing.T) {
	sk := trainer.New(trainer.Config{ReaderPrompt: "r", WriterPrompt: "w"})
	names := make([]string, 0)
	for _, tool := range sk.Tools() {
		names = append(names, tool.Name)
	}
	assert.Contains(t, names, "trainer")
}

func TestTrainerRequiresSessionID(t *testing.T) {
	sk := trainer.New(trainer.Config{ReaderPrompt: "r", WriterPrompt: "w"})
	_, err := sk.Handle(context.Background(), "trainer", json.RawMessage(`{}`))
	assert.ErrorContains(t, err, "session_id")
}

func TestTrainerCallsReaderThenWriter(t *testing.T) {
	sessDir := t.TempDir()
	brainDir := t.TempDir()
	require.NoError(t, session.Append(sessDir, "sess-1", session.Entry{
		SessionID: "sess-1", Skill: "tdd", Phase: "red", FinalStatus: "pass",
		Message: "wrote failing test", FilePath: "internal/foo/foo_test.go",
	}))

	callCount := 0
	var readerTask, writerTask string

	fakeFn := func(_ context.Context, req iexec.Request) (iexec.Result, error) {
		callCount++
		if callCount == 1 {
			// reader call
			readerTask = req.TaskPrompt
			return iexec.Result{
				Status: "pass", Phase: "trainer", Skill: "trainer",
				RunnerOutput: `[{"type":"sft","moment":"first-pass clean TDD","score":4}]`,
				Verified:     true, ModelUsed: "self", Message: "1 sft candidate found",
			}, nil
		}
		// writer call
		writerTask = req.TaskPrompt
		return iexec.Result{
			Status: "pass", Phase: "trainer", Skill: "trainer",
			FilePath: brainDir + "/training-data/sft/sess-1.jsonl",
			Verified: true, ModelUsed: "self", Message: "1 sft pair written",
		}, nil
	}

	sk := trainer.New(trainer.Config{
		ReaderPrompt: "reader rules",
		WriterPrompt: "writer rules",
		ExecutorFn:   fakeFn,
		SessionsDir:  sessDir,
		BrainDir:     brainDir,
	})
	out, err := sk.Handle(context.Background(), "trainer", json.RawMessage(`{"session_id":"sess-1"}`))
	require.NoError(t, err)

	assert.Equal(t, 2, callCount, "executor must be called exactly twice: reader then writer")
	assert.Contains(t, readerTask, "role: reader")
	assert.Contains(t, readerTask, "sess-1")
	assert.Contains(t, readerTask, "wrote failing test") // session history in reader prompt
	assert.Contains(t, writerTask, "role: writer")
	assert.Contains(t, writerTask, "first-pass clean TDD") // reader RunnerOutput passed to writer

	var result iexec.Result
	require.NoError(t, json.Unmarshal(out, &result))
	assert.Equal(t, "trainer", result.Phase)
	assert.Equal(t, "pass", result.Status)
}

var _ = require.New
```

- [ ] **Step 2: Run test to confirm it fails**

```bash
go test ./internal/skills/trainer/... -v
```
Expected: FAIL (package does not exist)

- [ ] **Step 3: Create skill.go**

```go
// internal/skills/trainer/skill.go
package trainer

import (
	"context"
	"encoding/json"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/registry"
)

type ExecutorFn func(ctx context.Context, req iexec.Request) (iexec.Result, error)

type Config struct {
	ReaderPrompt string
	WriterPrompt string
	DefaultModel string
	ExecutorFn   ExecutorFn
	SessionsDir  string
	BrainDir     string // root of brain/ directory; writer writes to BrainDir/training-data/
}

type Skill struct{ cfg Config }

func New(cfg Config) *Skill { return &Skill{cfg: cfg} }

func (s *Skill) Name() string { return "trainer" }

func (s *Skill) Tools() []registry.ToolDef {
	schema := func(required []string, props map[string]any) json.RawMessage {
		b, _ := json.Marshal(map[string]any{"type": "object", "required": required, "properties": props})
		return b
	}
	return []registry.ToolDef{
		{
			Name:        "trainer",
			Description: "Extract SFT and DPO training pairs from a session log. Runs a reader→writer chain: reader identifies learning moments, writer formats and writes pairs to brain/training-data/.",
			InputSchema: schema(
				[]string{"session_id"},
				map[string]any{
					"session_id": map[string]any{"type": "string"},
					"model":      map[string]any{"type": "string"},
				},
			),
		},
	}
}
```

- [ ] **Step 4: Create handlers.go**

```go
// internal/skills/trainer/handlers.go
package trainer

import (
	"context"
	"encoding/json"
	"fmt"

	iexec "github.com/mathiasbq/supervisor/internal/exec"
	"github.com/mathiasbq/supervisor/internal/session"
)

type trainArgs struct {
	SessionID string `json:"session_id"`
	Model     string `json:"model"`
}

func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
	if tool != "trainer" {
		return nil, fmt.Errorf("unknown tool: %s", tool)
	}
	var a trainArgs
	if err := json.Unmarshal(args, &a); err != nil {
		return nil, fmt.Errorf("parse args: %w", err)
	}
	if a.SessionID == "" {
		return nil, fmt.Errorf("session_id is required")
	}
	if s.cfg.ExecutorFn == nil {
		return nil, fmt.Errorf("no executor configured")
	}

	model := a.Model
	if model == "" {
		model = s.cfg.DefaultModel
	}

	entries, err := session.Read(s.cfg.SessionsDir, a.SessionID)
	if err != nil {
		return nil, fmt.Errorf("read session log: %w", err)
	}

	// ── Step 1: Reader agent ─────────────────────────────────────────────────
	history := session.FormatHistory(entries, "")
	readerTask := fmt.Sprintf(
		"role: reader\nsession_id: %s\nbrain_dir: %s\n\n%s",
		a.SessionID, s.cfg.BrainDir, history,
	)
	readerResult, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.ReaderPrompt,
		TaskPrompt:  readerTask,
		Model:       model,
		Tools:       "Read",
	})
	if err != nil {
		return nil, fmt.Errorf("reader agent: %w", err)
	}

	// ── Step 2: Writer agent (receives reader candidates) ────────────────────
	writerTask := fmt.Sprintf(
		"role: writer\nsession_id: %s\nbrain_dir: %s\n\nreader_candidates:\n%s",
		a.SessionID, s.cfg.BrainDir, readerResult.RunnerOutput,
	)
	writerResult, err := s.cfg.ExecutorFn(ctx, iexec.Request{
		SkillPrompt: s.cfg.WriterPrompt,
		TaskPrompt:  writerTask,
		Model:       model,
		Tools:       "Read,Write",
	})
	if err != nil {
		return nil, fmt.Errorf("writer agent: %w", err)
	}

	b, err := json.Marshal(writerResult)
	if err != nil {
		return nil, fmt.Errorf("marshal result: %w", err)
	}
	return b, nil
}
```

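The reader→writer hand-off can be tried in isolation with a fake executor. The sketch below uses trimmed stand-ins for `iexec.Request` and `iexec.Result` (only the fields the chain touches), and `runChain` and `demo` are hypothetical names that do not appear in the plan.

```go
package main

import (
	"context"
	"fmt"
)

// Request and Result are reduced stand-ins for the iexec types.
type Request struct {
	SkillPrompt string
	TaskPrompt  string
}

type Result struct {
	Status       string
	RunnerOutput string
	Message      string
}

type ExecutorFn func(ctx context.Context, req Request) (Result, error)

// runChain calls the reader first, then hands the reader's RunnerOutput to
// the writer inside the writer's task prompt.
func runChain(ctx context.Context, exec ExecutorFn, sessionID, history string) (Result, error) {
	reader, err := exec(ctx, Request{
		SkillPrompt: "reader rules",
		TaskPrompt:  fmt.Sprintf("role: reader\nsession_id: %s\n\n%s", sessionID, history),
	})
	if err != nil {
		return Result{}, fmt.Errorf("reader agent: %w", err)
	}
	writer, err := exec(ctx, Request{
		SkillPrompt: "writer rules",
		TaskPrompt:  fmt.Sprintf("role: writer\nsession_id: %s\n\nreader_candidates:\n%s", sessionID, reader.RunnerOutput),
	})
	if err != nil {
		return Result{}, fmt.Errorf("writer agent: %w", err)
	}
	return writer, nil
}

// demo runs the chain against a fake executor and records each task prompt.
func demo() (Result, []string, error) {
	var prompts []string
	fake := func(_ context.Context, req Request) (Result, error) {
		prompts = append(prompts, req.TaskPrompt)
		if len(prompts) == 1 {
			return Result{Status: "pass", RunnerOutput: `[{"type":"sft","score":4}]`, Message: "1 sft candidate found"}, nil
		}
		return Result{Status: "pass", Message: "1 sft pair written"}, nil
	}
	res, err := runChain(context.Background(), fake, "sess-1", "(history placeholder)")
	return res, prompts, err
}

func main() {
	res, prompts, err := demo()
	fmt.Println(res.Message, len(prompts), err)
	fmt.Println(prompts[1])
}
```

Only `RunnerOutput` crosses from reader to writer; the reader's `Message` is not forwarded. Assertions on the writer prompt therefore have to match candidate content, not the reader's summary message.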
- [ ] **Step 5: Create config/supervisor/trainer-reader.md**

```markdown
# Trainer Reader Discipline

You scan session logs and identify candidate learning moments worth converting to training data.

## What to look for
- **SFT candidates**: the worker did exactly the right thing: a clean pattern worth reinforcing
- **DPO candidates**: the worker first produced a wrong or suboptimal response, then corrected it, so both a rejected and a chosen response are available

## Scoring (1–5)
- 5: novel pattern, clearly correct, generalises across projects
- 4: good pattern, correct, somewhat project-specific but still useful
- 3: correct but obvious; include only if especially clean
- 2 or below: skip (too ambiguous or too context-specific)

## Output contract
Return JSON result with:
- `status`: "pass" or "error"
- `phase`: "trainer"
- `skill`: "trainer"
- `file_path`: ""
- `runner_output`: JSON array of candidates (valid JSON, not markdown):
  [{"type":"sft","moment":"<what happened>","prompt":"<what was asked>","completion":"<what was done right>","score":4},
  {"type":"dpo","moment":"<what happened>","prompt":"<what was asked>","chosen":"<correct>","rejected":"<incorrect>","score":3}]
- `verified`: true
- `message`: "N sft candidates, M dpo candidates found"

## Rules
1. Read all session entries in the task prompt
2. Score each entry; only include entries scoring >= 3
3. Prompt/completion fields must be phrased to generalise: no project-specific paths or names
4. If no candidates score >= 3, return an empty array `[]`; never force low-quality candidates
```

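The candidate array in `runner_output` is ordinary JSON, so it can be decoded and filtered on the Go side if the supervisor ever wants to validate it before invoking the writer. The sketch below is one possible shape; the `Candidate` struct and `keepCandidates` function are assumptions derived from the field names in the contract above, not types that exist in the plan.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Candidate mirrors the fields named in the reader's output contract.
// Hypothetical type: the plan passes the array through as an opaque string.
type Candidate struct {
	Type       string `json:"type"`
	Moment     string `json:"moment"`
	Prompt     string `json:"prompt"`
	Completion string `json:"completion,omitempty"`
	Chosen     string `json:"chosen,omitempty"`
	Rejected   string `json:"rejected,omitempty"`
	Score      int    `json:"score"`
}

// keepCandidates decodes the reader's runner_output and keeps entries whose
// score is at least minScore.
func keepCandidates(runnerOutput string, minScore int) ([]Candidate, error) {
	var all []Candidate
	if err := json.Unmarshal([]byte(runnerOutput), &all); err != nil {
		return nil, fmt.Errorf("parse reader candidates: %w", err)
	}
	kept := make([]Candidate, 0, len(all))
	for _, c := range all {
		if c.Score >= minScore {
			kept = append(kept, c)
		}
	}
	return kept, nil
}

func main() {
	out := `[{"type":"sft","moment":"clean red phase","prompt":"p","completion":"c","score":4},
	         {"type":"dpo","moment":"retry","prompt":"p","chosen":"a","rejected":"b","score":2}]`
	kept, err := keepCandidates(out, 3)
	fmt.Println(len(kept), err)
}
```

Decoding early would also surface malformed reader output as a supervisor-side error, instead of leaving the writer to report it.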
- [ ] **Step 6: Create config/supervisor/trainer-writer.md**

```markdown
# Trainer Writer Discipline

You receive candidate learning moments from the reader and write clean SFT/DPO training pairs.

## Quality gate (apply before writing)
- SFT: prompt must be phrased so it could come from any project, not just this one
- DPO: chosen and rejected must be clearly distinguishable; skip if a reader can't tell which is better
- Never include project-specific paths, variable names, or identifiers in any pair

## Output contract
Return JSON result with:
- `status`: "pass" (pairs written or skipped due to quality) or "error" (candidates JSON was malformed)
- `phase`: "trainer"
- `skill`: "trainer"
- `file_path`: path of the last file written (empty if nothing passed quality gate)
- `runner_output`: "N SFT pairs written to brain/training-data/sft/, M DPO pairs to brain/training-data/dpo/" or "0 pairs passed quality gate"
- `verified`: true if files were written; false if nothing passed
- `message`: "N sft + M dpo pairs for session <id>" or "no pairs passed quality gate"

## File format
JSONL: one JSON object per line.

SFT: `{"prompt": "...", "completion": "..."}`
DPO: `{"prompt": "...", "chosen": "...", "rejected": "..."}`

Write SFT to: `<brain_dir>/training-data/sft/<session_id>.jsonl`
Write DPO to: `<brain_dir>/training-data/dpo/<session_id>.jsonl`

Append to existing files if they exist (don't overwrite).

## Rules
1. Parse the `reader_candidates` JSON from the task prompt
2. For each candidate: apply quality gate
3. Write passing SFT candidates to sft JSONL, DPO candidates to dpo JSONL
4. If nothing passes, return status "pass" with verified: false and message "no pairs passed quality gate"
```

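In the plan the writer worker creates these files through its `Write` tool. For reference, the same append-only JSONL behaviour expressed directly in Go might look like the sketch below. `appendJSONL`, `SFTPair`, and `demoAppend` are hypothetical names, and the temporary directory stands in for `brain_dir`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// SFTPair matches the SFT line format from the writer discipline.
type SFTPair struct {
	Prompt     string `json:"prompt"`
	Completion string `json:"completion"`
}

// appendJSONL marshals v as one JSON line and appends it to path, creating
// parent directories and the file if needed. Existing lines are preserved.
func appendJSONL(path string, v any) error {
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	line, err := json.Marshal(v)
	if err != nil {
		return err
	}
	_, err = f.Write(append(line, '\n'))
	return err
}

// demoAppend writes two pairs in separate calls and returns the file content.
func demoAppend() (string, error) {
	dir, err := os.MkdirTemp("", "brain")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "training-data", "sft", "sess-1.jsonl")
	pairs := []SFTPair{
		{Prompt: "p1", Completion: "c1"},
		{Prompt: "p2", Completion: "c2"},
	}
	for _, p := range pairs {
		if err := appendJSONL(path, p); err != nil {
			return "", err
		}
	}
	b, err := os.ReadFile(path)
	return string(b), err
}

func main() {
	content, err := demoAppend()
	fmt.Print(content)
	fmt.Println("err:", err)
}
```

Opening with `os.O_APPEND` is what keeps a second trainer run on the same session from overwriting earlier pairs.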
- [ ] **Step 7: Wire into main.go**

Add file reads:
```go
trainerReaderPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-reader.md")
if err != nil {
	logger.Error("read trainer-reader.md", "path", cfg.ConfigDir+"/trainer-reader.md", "err", err)
	os.Exit(1)
}
trainerWriterPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-writer.md")
if err != nil {
	logger.Error("read trainer-writer.md", "path", cfg.ConfigDir+"/trainer-writer.md", "err", err)
	os.Exit(1)
}
```

Add import: `"github.com/mathiasbq/supervisor/internal/skills/trainer"`

Register skill:
```go
reg.Register(trainer.New(trainer.Config{
	ReaderPrompt: string(trainerReaderPrompt),
	WriterPrompt: string(trainerWriterPrompt),
	DefaultModel: models.Resolve("trainer", ""),
	ExecutorFn:   executor.Run,
	SessionsDir:  cfg.SessionsDir,
	BrainDir:     cfg.BrainDir,
}))
```

- [ ] **Step 8: Update config/models.yaml**

Add `trainer` entry following existing format:
```yaml
skills:
  tdd: ollama/qwen3-coder-30b-tuned
  review: ollama/devstral-tuned
  debug: ollama/deepseek-r1-tuned
  retrospective: ollama/qwen3-coder-30b-tuned
  spec: ollama/qwen3-coder-30b-tuned
  trainer: ollama/qwen3-coder-30b-tuned
```

- [ ] **Step 9: Run all tests**

```bash
go test ./... -race -count=1
```
Expected: all tests PASS, including `TestTrainerCallsReaderThenWriter`

- [ ] **Step 10: Commit**

```bash
git add internal/skills/trainer/ config/supervisor/trainer-reader.md config/supervisor/trainer-writer.md config/models.yaml cmd/supervisor/main.go
git commit -m "feat(trainer): add trainer MCP skill with reader→writer sub-agent chain"
```

---

## Task 8: Integration smoke test and CI push

**Files:** none new; this task validates the full system

- [ ] **Step 1: Run task check (full quality gate)**

```bash
task check
```
Expected: lint, test, and vet all pass for both modules

- [ ] **Step 2: Verify all 12 MCP tools are registered**

With servers running (`task start` in another terminal):
```bash
curl -s -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
  | python3 -c "import json,sys; tools=json.load(sys.stdin)['result']['tools']; [print(t['name']) for t in tools]"
```

Expected output (12 tools):
```
tdd_red
tdd_green
tdd_refactor
brain_query
brain_write
tier
session_log
retrospective
review
debug
spec
trainer
```

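Comparing twelve names by eye is error-prone. If this check is ever automated, the core of it is a set difference against the expected list, as in the sketch below. `missingTools` and `expectedTools` are hypothetical names, and fetching the live list from the `tools/list` endpoint is deliberately left out.

```go
package main

import "fmt"

// expectedTools is the list from the step above.
var expectedTools = []string{
	"tdd_red", "tdd_green", "tdd_refactor",
	"brain_query", "brain_write", "tier",
	"session_log", "retrospective",
	"review", "debug", "spec", "trainer",
}

// missingTools returns every expected tool name that is absent from got,
// in the order of expectedTools. Hypothetical helper for illustration.
func missingTools(got []string) []string {
	seen := make(map[string]bool, len(got))
	for _, name := range got {
		seen[name] = true
	}
	var missing []string
	for _, want := range expectedTools {
		if !seen[want] {
			missing = append(missing, want)
		}
	}
	return missing
}

func main() {
	// Simulate a server where the trainer skill was not registered.
	registered := expectedTools[:len(expectedTools)-1]
	fmt.Println(missingTools(registered))
}
```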
- [ ] **Step 3: Smoke test each new skill**

```bash
# review
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"review","arguments":{"project_root":"/tmp","files":["nonexistent.go"],"context":"test"}}}'

# debug
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"debug","arguments":{"project_root":"/tmp","error":"test error"}}}'

# spec
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"spec","arguments":{"project_root":"/tmp","requirements":"add login button"}}}'

# trainer (requires a session log entry)
curl -s --max-time 10 -X POST http://localhost:3200/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":5,"method":"tools/call","params":{"name":"trainer","arguments":{"session_id":"2026-04-17-validate-hyperguild"}}}'
```

Each call should produce a JSON-RPC result, not a `-32000` error. A real worker run can take longer than 10 seconds; in that case curl exits on its timeout (exit code 28) without printing a response. `--max-time 10` is only there to confirm that the MCP dispatch layer accepts the call, so raise it if you want to see the full worker result.

- [ ] **Step 4: Push and verify CI**

```bash
git push origin main
```

Watch the CI run complete on Gitea. Check for:
- `check` job: green (lint + test + vet)
- `mirror` job: green (pushed to GitHub)

```bash
curl -s "https://gitea.d-ma.be/api/v1/repos/mathias/hyperguild/actions/runs?limit=1" \
  -H "Authorization: token $GITEA_TOKEN" | python3 -c "
import json,sys
d=json.load(sys.stdin)
for r in d.get('workflow_runs',[]):
    print(f'#{r[\"id\"]} {r[\"status\"]:12} {(r.get(\"conclusion\") or \"\"):8} {r[\"display_title\"][:50]}')
"
```

- [ ] **Step 5: Tag v0.2.0**

```bash
task tag version=v0.2.0
```

---

## Self-review

**Spec coverage check:**
- Session history injection into tdd_green and tdd_refactor → Task 3 ✓
- All new skills accept `session_id` for history injection → Tasks 4–7 ✓
- Trainer reader→writer chain → Task 7 ✓
- Schema enum fixed (was causing retrospective to return wrong phase) → Task 2 ✓
- Phase 2 skills registered in main.go → Tasks 4–7 each include main.go wiring ✓
- CI passes → Task 8 ✓

**Placeholder scan:** None found; all steps include complete code.

**Type consistency:**
- `ExecutorFn` is defined in each skill package as `func(ctx context.Context, req iexec.Request) (iexec.Result, error)`, consistent across tdd, review, debug, spec, trainer ✓
- `Config.SessionsDir` present in all new skills ✓
- `trainer.Config.BrainDir` used in the Task 7 handlers.go reader and writer task prompts ✓
- `session.FormatHistory` signature `(entries []Entry, excludePhase string) string` used consistently ✓
- `prependHistory` method is defined identically in review, debug, spec handlers; this duplication is intentional (YAGNI: too few skills to justify extracting a shared helper) ✓

**Note on tasks 4–7:** These are independent of each other and can be executed in parallel by separate subagents after Tasks 1–3 are complete.
