# Trainer Writer Discipline

You receive candidate learning moments from the reader and write clean SFT/DPO training pairs.

## Quality gate (apply before writing)

- SFT: prompt must be phrased so it could come from any project, not just this one
- DPO: chosen and rejected must be clearly distinguishable; skip if a reader can't tell which is better
- Never include project-specific paths, variable names, or identifiers in any pair

## Output contract

Return JSON result with:

- `status`: "pass" (pairs written or skipped due to quality) or "error" (candidates JSON was malformed)
- `phase`: "trainer"
- `skill`: "trainer"
- `file_path`: path of the last file written (empty if nothing passed quality gate)
- `runner_output`: "N SFT pairs written to brain/training-data/sft/, M DPO pairs to brain/training-data/dpo/" or "0 pairs passed quality gate"
- `verified`: true if files were written; false if nothing passed
- `message`: "N sft + M dpo pairs for session " or "no pairs passed quality gate"

## File format

JSONL: one JSON object per line.

- SFT: `{"prompt": "...", "completion": "..."}`
- DPO: `{"prompt": "...", "chosen": "...", "rejected": "..."}`

Write SFT to: `/training-data/sft/.jsonl`

Write DPO to: `/training-data/dpo/.jsonl`

Append to existing files if they exist (don't overwrite).

## Rules

1. Parse the `reader_candidates` JSON from the task prompt
2. For each candidate: apply quality gate
3. Write passing SFT candidates to sft JSONL, DPO candidates to dpo JSONL
4. If nothing passes, return status "pass" with verified: false and message "no pairs passed quality gate"
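
The write-and-report flow described above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the actual trainer implementation: the function names (`is_well_formed`, `append_jsonl`, `write_pairs`) and the candidate shape (`{"kind": "sft" | "dpo", "pair": {...}}`) are hypothetical, and the output paths are passed in by the caller instead of being derived here. The check shown is structural only; judging whether a prompt is project-neutral or whether chosen and rejected are clearly distinguishable still requires a reader. The `runner_output` and `message` fields, and the "error" status for malformed candidates JSON, are left out of the sketch.

```python
import json
from pathlib import Path

# Required keys per pair type, taken from the file format described above.
FIELDS = {
    "sft": ("prompt", "completion"),
    "dpo": ("prompt", "chosen", "rejected"),
}


def is_well_formed(pair: dict, kind: str) -> bool:
    """Structural check only (hypothetical helper); it does not judge quality."""
    keys = FIELDS[kind]
    if not all(isinstance(pair.get(k), str) and pair[k].strip() for k in keys):
        return False
    # A DPO pair whose two sides are identical cannot be told apart.
    return kind != "dpo" or pair["chosen"].strip() != pair["rejected"].strip()


def append_jsonl(path: Path, records: list) -> None:
    """Append one JSON object per line; mode 'a' never overwrites existing data."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def write_pairs(candidates: list, sft_path: Path, dpo_path: Path) -> dict:
    """Filter candidates, append the passing ones, and return a partial result."""
    paths = {"sft": sft_path, "dpo": dpo_path}
    passing = {"sft": [], "dpo": []}
    for cand in candidates:
        kind = cand.get("kind")  # assumed candidate field, not from the source
        pair = cand.get("pair")  # assumed candidate field, not from the source
        if kind in passing and isinstance(pair, dict) and is_well_formed(pair, kind):
            passing[kind].append({k: pair[k] for k in FIELDS[kind]})

    last_path = ""
    for kind in ("sft", "dpo"):
        if passing[kind]:
            append_jsonl(paths[kind], passing[kind])
            last_path = str(paths[kind])

    return {
        "status": "pass",
        "phase": "trainer",
        "skill": "trainer",
        "file_path": last_path,       # empty if nothing passed
        "verified": bool(last_path),  # true only if a file was written
    }
```

Opening the file in append mode is what keeps earlier sessions' pairs intact when the same file is written twice, and returning `"pass"` with `verified` set to false covers the case where every candidate is skipped.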