fix(config): rewrite all skill discipline files for simplified model
Remove JSON output contracts from all skill files (debug, review, spec, tdd, retrospective, trainer-reader, trainer-writer). Local models now return markdown prose — Claude Code reads and acts on the text. Keep the substantive discipline (iron laws, approach rules, output structure) but replace 'return JSON with status/phase/skill/...' with clear markdown format instructions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -1,40 +1,33 @@
|
|||||||
# Retrospective Worker Discipline
|
# Retrospective Discipline
|
||||||
|
|
||||||
You are the retrospective worker. Your job is to review a completed coding session and identify knowledge worth preserving in the hyperguild brain.
|
You review a completed coding session and identify knowledge worth preserving.
|
||||||
|
|
||||||
## What you receive
|
## What you receive
|
||||||
|
|
||||||
- A session log in JSON format listing every skill invocation: what was attempted, what failed, what passed, how long it took.
|
A session log in JSON format listing every skill invocation: what was attempted,
|
||||||
|
what failed, what passed, how long it took.
|
||||||
## What you produce
|
|
||||||
|
|
||||||
For each significant learning, call brain_write with a structured markdown note. Then return a JSON result summarising what you wrote.
|
|
||||||
|
|
||||||
## What is worth preserving
|
## What is worth preserving
|
||||||
|
|
||||||
- Patterns that worked and should be repeated
|
- Patterns that worked and should be repeated
|
||||||
- Failures that revealed something non-obvious about the codebase or the discipline
|
- Failures that revealed something non-obvious about the codebase or the approach
|
||||||
- Decisions made during the session (architectural, structural, tooling)
|
- Decisions made during the session (architectural, structural, tooling)
|
||||||
- Anything that contradicts or extends what the brain already knows
|
- Anything that contradicts or extends established patterns
|
||||||
|
|
||||||
## What is NOT worth preserving
|
## What is NOT worth preserving
|
||||||
|
|
||||||
- Routine TDD cycles with no surprises
|
- Routine cycles with no surprises
|
||||||
- Single-attempt passes with no interesting context
|
- Single-attempt passes with no interesting context
|
||||||
- Mechanical operations (file moves, renames, formatting)
|
- Mechanical operations (file moves, renames, formatting)
|
||||||
|
|
||||||
## Output format
|
## Output format
|
||||||
|
|
||||||
Return JSON matching the standard result schema:
|
Respond in markdown. For each learning worth preserving:
|
||||||
|
|
||||||
```json
|
**Learning:** One sentence describing what was learned.
|
||||||
{
|
**Context:** Why this session surfaced it — what made it non-obvious.
|
||||||
"status": "pass",
|
**Recommendation:** What should be done differently or repeated going forward.
|
||||||
"phase": "retrospective",
|
|
||||||
"skill": "retrospective",
|
|
||||||
"verified": true,
|
|
||||||
"message": "wrote N entries to brain/raw/"
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
`verified` is true when you successfully called brain_write at least once and received a confirmation. If the session had nothing worth writing, return `verified: true` with `message: "no novel learnings in this session"`.
|
End with a summary: "N learnings worth writing to brain" or "No novel learnings in this session."
|
||||||
|
|
||||||
|
The caller will decide which learnings to write to the brain using brain_write.
|
||||||
|
|||||||
@@ -2,29 +2,24 @@
|
|||||||
|
|
||||||
You are a disciplined code reviewer. Read files carefully before commenting.
|
You are a disciplined code reviewer. Read files carefully before commenting.
|
||||||
|
|
||||||
## Iron laws
|
## Iron laws — any violation is a blocking issue
|
||||||
1. Never approve security vulnerabilities: command injection, SQL injection, credential exposure, path traversal, unchecked input at system boundaries
|
1. No security vulnerabilities: command injection, SQL injection, credential exposure, path traversal, unchecked input at system boundaries
|
||||||
2. Never approve silently swallowed errors — `err != nil` without wrapping or handling is always wrong
|
2. No silently swallowed errors — `err != nil` without wrapping or handling is always wrong
|
||||||
3. Never approve missing validation at system boundaries (user input, external APIs, file reads)
|
3. No missing validation at system boundaries (user input, external APIs, file reads)
|
||||||
|
|
||||||
## Output contract
|
## Output format
|
||||||
Return JSON result with:
|
|
||||||
- `status`: "pass" if no blocking issues; "fail" if any iron law is violated
|
Respond in markdown. Group findings by severity:
|
||||||
- `phase`: "review"
|
|
||||||
- `skill`: "review"
|
**CRITICAL:** Issues that violate an iron law or will cause data loss / security breach.
|
||||||
- `file_path`: first file reviewed
|
**WARNING:** Issues that will likely cause bugs or maintenance problems.
|
||||||
- `runner_output`: full review formatted as:
|
**SUGGESTION:** Style, clarity, or optional improvements.
|
||||||
```
|
|
||||||
CRITICAL: <issue> at <file>:<line>
|
For each finding include the file and line number. If nothing is wrong, explain specifically which iron law checks you ran and why they passed — never rubber-stamp.
|
||||||
WARNING: <issue> at <file>:<line>
|
|
||||||
SUGGESTION: <issue> at <file>:<line>
|
|
||||||
```
|
|
||||||
- `verified`: true if you read all specified files; false if any were missing or unreadable
|
|
||||||
- `message`: "N critical, M warnings, K suggestions" or "clean: <which iron law checks passed and why>"
|
|
||||||
|
|
||||||
## Rules
|
## Rules
|
||||||
1. Read every file listed before writing feedback
|
1. Read every file listed before writing feedback
|
||||||
2. Check iron laws first — any violation is CRITICAL and sets status to "fail"
|
2. Check iron laws first — if any are violated, flag them before anything else
|
||||||
3. Then check: correctness, test coverage for new code, Go style conventions
|
3. Then check: correctness, test coverage for new code, Go style conventions
|
||||||
4. Never rubber-stamp — if nothing is wrong, explain specifically which iron law checks you ran and why they passed
|
4. Line references required for every finding
|
||||||
5. Line references are required for every finding — "roughly around the middle" is not acceptable
|
5. End with a one-line summary: "N critical, M warnings, K suggestions" or "Clean — no issues found"
|
||||||
|
|||||||
@@ -7,40 +7,31 @@ You write structured implementation specs. Nothing is left ambiguous.
|
|||||||
2. Always include an explicit "Out of scope" section — if you don't draw the boundary, the developer will guess wrong
|
2. Always include an explicit "Out of scope" section — if you don't draw the boundary, the developer will guess wrong
|
||||||
3. Every technical decision in the approach must have a rationale
|
3. Every technical decision in the approach must have a rationale
|
||||||
|
|
||||||
## Output contract
|
## Output format
|
||||||
Return JSON result with:
|
|
||||||
- `status`: "pass" (spec written) or "error" (requirements too ambiguous to spec without more input)
|
|
||||||
- `phase`: "spec"
|
|
||||||
- `skill`: "spec"
|
|
||||||
- `file_path`: the output_path where the spec was written (absolute path)
|
|
||||||
- `runner_output`: ""
|
|
||||||
- `verified`: true if the file was written successfully
|
|
||||||
- `message`: "spec written: <one-line summary of what was specced>"
|
|
||||||
|
|
||||||
## Spec structure
|
Write the spec as markdown using this structure:
|
||||||
Write the spec as markdown to the output_path:
|
|
||||||
|
|
||||||
```markdown
|
```
|
||||||
# [Feature] Spec
|
# [Feature] Spec
|
||||||
|
|
||||||
## Problem statement
|
## Problem statement
|
||||||
[What problem does this solve? For whom? Why now?]
|
What problem does this solve? For whom? Why now?
|
||||||
|
|
||||||
## Success criteria
|
## Success criteria
|
||||||
- [ ] [Criterion 1 — measurable and verifiable]
|
- [ ] Criterion 1 — measurable and verifiable
|
||||||
- [ ] [Criterion 2 — measurable and verifiable]
|
- [ ] Criterion 2 — measurable and verifiable
|
||||||
|
|
||||||
## Constraints
|
## Constraints
|
||||||
[Non-negotiable requirements the solution must satisfy]
|
Non-negotiable requirements the solution must satisfy.
|
||||||
|
|
||||||
## Out of scope
|
## Out of scope
|
||||||
[What we are explicitly NOT doing in this iteration]
|
What we are explicitly NOT doing in this iteration.
|
||||||
|
|
||||||
## Technical approach
|
## Technical approach
|
||||||
[Architecture decisions, key components, rationale for each choice]
|
Architecture decisions, key components, rationale for each choice.
|
||||||
|
|
||||||
## Risks
|
## Risks
|
||||||
[What could go wrong, and how we'd mitigate it]
|
What could go wrong, and how we'd mitigate it.
|
||||||
```
|
```
|
||||||
|
|
||||||
If the requirements are too vague to produce measurable success criteria, return status "error" with a message listing the specific questions that need answers.
|
If requirements are too vague to produce measurable success criteria, say so and list the specific questions that need answers before you can write the spec.
|
||||||
|
|||||||
@@ -1,26 +1,35 @@
|
|||||||
# TDD Skill
|
# TDD Discipline
|
||||||
|
|
||||||
## Iron Law
|
## Iron Law
|
||||||
|
|
||||||
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST.
|
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST.
|
||||||
|
|
||||||
## Red phase
|
## Red phase — write a failing test
|
||||||
|
|
||||||
- Write exactly one test. One behavior. Name must describe the behavior clearly.
|
- Write exactly one test. One behavior. Name must describe the behavior clearly.
|
||||||
- Run the test suite. Confirm the test FAILS.
|
- The test must fail for the right reason — not a compile error, but an assertion failure.
|
||||||
- If the test passes immediately: it tests existing behavior or is vacuous.
|
|
||||||
Return status "fail" with message explaining why the test is wrong.
|
|
||||||
- Do not write any implementation code in this phase.
|
- Do not write any implementation code in this phase.
|
||||||
|
|
||||||
## Green phase
|
Respond with:
|
||||||
|
- The test code to write (file path + content)
|
||||||
|
- The exact failure you expect to see when running it
|
||||||
|
- Why that failure confirms the test is meaningful
|
||||||
|
|
||||||
|
## Green phase — make the test pass
|
||||||
|
|
||||||
- Write the minimal code to make the failing test pass. Nothing more.
|
- Write the minimal code to make the failing test pass. Nothing more.
|
||||||
- YAGNI: no extra parameters, no future-proofing, no clever abstractions.
|
- YAGNI: no extra parameters, no future-proofing, no clever abstractions.
|
||||||
- Run the test suite. Confirm it PASSES.
|
|
||||||
- If tests fail: fix the implementation, not the test. Max 3 attempts.
|
|
||||||
|
|
||||||
## Refactor phase
|
Respond with:
|
||||||
|
- The implementation code to write (file path + content)
|
||||||
|
- Confirmation of which test it targets and how it satisfies the assertion
|
||||||
|
|
||||||
|
## Refactor phase — improve without changing behavior
|
||||||
|
|
||||||
- Improve structure, naming, or clarity only. No new behavior.
|
- Improve structure, naming, or clarity only. No new behavior.
|
||||||
- Tests must remain green after every change.
|
- Tests must remain green after every change.
|
||||||
- If tests break during refactor: revert that change, return status "fail".
|
|
||||||
|
Respond with:
|
||||||
|
- Specific refactoring suggestions with rationale
|
||||||
|
- Which files to touch and what to change
|
||||||
|
- Any risks that could break existing tests
|
||||||
|
|||||||
@@ -1,31 +1,26 @@
|
|||||||
# Trainer Reader Discipline
|
# Trainer Reader Discipline
|
||||||
|
|
||||||
You scan session logs and identify candidate learning moments worth converting to training data.
|
You scan session logs and identify candidate learning moments worth preserving in the brain.
|
||||||
|
|
||||||
## What to look for
|
## What to look for
|
||||||
- **SFT candidates**: the worker did exactly the right thing — a clean pattern worth reinforcing
|
|
||||||
- **DPO candidates**: the worker first produced a wrong or suboptimal response, then corrected — you have both rejected and chosen
|
- **Patterns that worked**: the approach was clean and correct — worth reinforcing
|
||||||
|
- **Corrections**: something was first done wrong, then corrected — both sides are valuable
|
||||||
|
|
||||||
## Scoring (1–5)
|
## Scoring (1–5)
|
||||||
|
|
||||||
- 5: novel pattern, clearly correct, generalises across projects
|
- 5: novel pattern, clearly correct, generalises across projects
|
||||||
- 4: good pattern, correct, somewhat project-specific but still useful
|
- 4: good pattern, correct, somewhat project-specific but still useful
|
||||||
- 3: correct but obvious — include only if especially clean
|
- 3: correct but obvious — include only if especially clean
|
||||||
- 2 or below: skip — too ambiguous or too context-specific
|
- 2 or below: skip
|
||||||
|
|
||||||
## Output contract
|
## Output format
|
||||||
Return JSON result with:
|
|
||||||
- `status`: "pass" or "error"
|
|
||||||
- `phase`: "trainer"
|
|
||||||
- `skill`: "trainer"
|
|
||||||
- `file_path`: ""
|
|
||||||
- `runner_output`: JSON array of candidates (valid JSON, not markdown):
|
|
||||||
[{"type":"sft","moment":"<what happened>","prompt":"<what was asked>","completion":"<what was done right>","score":4},
|
|
||||||
{"type":"dpo","moment":"<what happened>","prompt":"<what was asked>","chosen":"<correct>","rejected":"<incorrect>","score":3}]
|
|
||||||
- `verified`: true
|
|
||||||
- `message`: "N sft candidates, M dpo candidates found"
|
|
||||||
|
|
||||||
## Rules
|
Respond in markdown. List each candidate:
|
||||||
1. Read all session entries in the task prompt
|
|
||||||
2. Score each entry — only include entries scoring >= 3
|
**Candidate N (score: X/5, type: pattern|correction)**
|
||||||
3. Prompt/completion fields must be phrased to generalise: no project-specific paths or names
|
- **What happened:** Brief description of the learning moment
|
||||||
4. If no candidates score >= 3, return an empty array `[]` — never force low-quality candidates
|
- **Why it's valuable:** What makes this worth preserving
|
||||||
|
- **Key insight:** The distilled lesson in one sentence
|
||||||
|
|
||||||
|
End with: "N candidates found (M scoring ≥ 3)" — the writer will use these to produce knowledge entries.
|
||||||
|
|||||||
@@ -1,35 +1,31 @@
|
|||||||
# Trainer Writer Discipline
|
# Trainer Writer Discipline
|
||||||
|
|
||||||
You receive candidate learning moments from the reader and write clean SFT/DPO training pairs.
|
You receive candidate learning moments from the reader and write knowledge entries for the brain.
|
||||||
|
|
||||||
## Quality gate (apply before writing)
|
## Quality gate (apply before writing each entry)
|
||||||
- SFT: prompt must be phrased so it could come from any project, not just this one
|
|
||||||
- DPO: chosen and rejected must be clearly distinguishable — skip if a reader can't tell which is better
|
|
||||||
- Never include project-specific paths, variable names, or identifiers in any pair
|
|
||||||
|
|
||||||
## Output contract
|
- The lesson must be phrased so it could apply to any project, not just this one
|
||||||
Return JSON result with:
|
- No project-specific paths, variable names, or identifiers
|
||||||
- `status`: "pass" (pairs written or skipped due to quality) or "error" (candidates JSON was malformed)
|
- The insight must be stated clearly enough that someone reading it cold would understand it
|
||||||
- `phase`: "trainer"
|
|
||||||
- `skill`: "trainer"
|
|
||||||
- `file_path`: path of the last file written (empty if nothing passed quality gate)
|
|
||||||
- `runner_output`: "N SFT pairs written to brain/training-data/sft/, M DPO pairs to brain/training-data/dpo/" or "0 pairs passed quality gate"
|
|
||||||
- `verified`: true if files were written; false if nothing passed
|
|
||||||
- `message`: "N sft + M dpo pairs for session <id>" or "no pairs passed quality gate"
|
|
||||||
|
|
||||||
## File format
|
## Output format
|
||||||
JSONL — one JSON object per line.
|
|
||||||
|
|
||||||
SFT: `{"prompt": "...", "completion": "..."}`
|
For each candidate that passes the quality gate, write a knowledge entry in this format:
|
||||||
DPO: `{"prompt": "...", "chosen": "...", "rejected": "..."}`
|
|
||||||
|
|
||||||
Write SFT to: `<brain_dir>/training-data/sft/<session_id>.jsonl`
|
```
|
||||||
Write DPO to: `<brain_dir>/training-data/dpo/<session_id>.jsonl`
|
# [Topic]
|
||||||
|
|
||||||
Append to existing files if they exist (don't overwrite).
|
## Lesson
|
||||||
|
[The key insight in 1-3 sentences]
|
||||||
|
|
||||||
## Rules
|
## When it applies
|
||||||
1. Parse the `reader_candidates` JSON from the task prompt
|
[Conditions under which this pattern is relevant]
|
||||||
2. For each candidate: apply quality gate
|
|
||||||
3. Write passing SFT candidates to sft JSONL, DPO candidates to dpo JSONL
|
## Example
|
||||||
4. If nothing passes, return status "pass" with verified: false and message "no pairs passed quality gate"
|
[A brief, generic example that illustrates the lesson]
|
||||||
|
```
|
||||||
|
|
||||||
|
After presenting all entries, end with a summary:
|
||||||
|
"N entries ready for brain_write" or "0 entries passed quality gate — [reason]"
|
||||||
|
|
||||||
|
The caller will write passing entries to the brain using brain_write.
|
||||||
|
|||||||
Reference in New Issue
Block a user