skills/spec-driven-dev/SKILL.md
Mathias d6a71e370e
chore: bootstrap skills library — 19 skills + installer + CI auto-tag
Phase 1 of mathias/skills extraction (infra#62 Track D — homelab
next-step plan addendum). Imports ~/dev/.skills/ verbatim (19 skill
dirs + SKILLS_INDEX.md) and adds the installation surface:

- Taskfile.yml — install / update / list / release / check targets
- install.sh — bootstrap installer for hosts without Task. Idempotent
  symlink wirer; default checkout at ~/.local/share/skills/ on every
  host; SKILLS_REF env var pins a tag (default: main).
- .gitea/workflows/release.yml — auto-tag every push to main by
  Bump-Type footer (major/minor/patch, default patch). Skipped when
  commit contains [skip-release].
- README — usage, versioning, contribution flow, secret-hygiene rule.

Phase 1 wires Claude Code only (~/.claude/skills/<name> global +
<repo>/.claude/skills/<name> per-repo). Phase 2 adds Crush, opencode,
antigravity, and gitea-resident agents (cobalt-dingo, agentsquad)
once their skill conventions are researched.

Public repo, markdown-only — no secrets, no client names. Verified
via pre-push grep before initial push.

[skip-release]
2026-05-24 14:59:54 +02:00


name: spec-driven-dev
description: Write a structured specification before writing any code. Use when starting a new project, feature, or significant change. Adapted for a PM-first workflow where why comes before how.

Spec-Driven Development

Overview

Write a structured specification before writing any code. The spec is the shared source of truth — it defines what we're building, why it matters, and how we'll know it's done.

Code without a spec is guessing.

A spec doesn't need to be long. A two-paragraph spec beats no spec. The value is in forcing clarity before code is written, not in the length of the document.

Mathias PM Context

As a digital product manager building software:

  • Why before how: The spec must capture the business context and user need before technical decisions. Agents reading the spec should understand why this matters, not just what to build.
  • Explicit success criteria: Vague requirements produce vague results. Every spec must have testable success criteria.
  • Surfaced assumptions: The spec's primary job is to expose misunderstandings before they become expensive code.
  • Living document: Update the spec when decisions change. An outdated spec is still better than no spec.

When to Use

Always create a spec when:

  • Starting a new project or feature
  • Requirements are ambiguous or incomplete
  • The change touches multiple files or modules
  • You're about to make an architectural decision
  • The task would take more than a day to implement

When NOT to use: Single-line fixes, typos, or changes where requirements are unambiguous and self-contained.

The Gated Workflow

Do not advance to the next phase without validation at each gate.

SPECIFY → [review] → PLAN → [review] → TASKS → [review] → IMPLEMENT

Each gate is a deliberate pause: does the next phase make sense given what we know?
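
The gate rule can be sketched as a small state machine. This is only an illustration of the idea, not part of the skill itself: the `Workflow` type, `Approve`, `Advance`, and `ErrGateClosed` are hypothetical names introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// Phase is one stage of the gated workflow.
type Phase int

const (
	Specify Phase = iota
	Plan
	Tasks
	Implement
)

var phaseNames = [...]string{"SPECIFY", "PLAN", "TASKS", "IMPLEMENT"}

func (p Phase) String() string { return phaseNames[p] }

// ErrGateClosed is returned when the current phase has not been reviewed yet.
var ErrGateClosed = errors.New("gate not approved")

// Workflow tracks the current phase and whether its review gate has passed.
// Hypothetical type, used only to illustrate the gating rule.
type Workflow struct {
	Current  Phase
	approved bool
}

// Approve records that a reviewer validated the current phase's output.
func (w *Workflow) Approve() { w.approved = true }

// Advance moves to the next phase only if the current gate is open.
func (w *Workflow) Advance() error {
	if w.Current == Implement {
		return errors.New("already at the final phase")
	}
	if !w.approved {
		return fmt.Errorf("advance from %s: %w", w.Current, ErrGateClosed)
	}
	w.Current++
	w.approved = false // every phase needs its own review
	return nil
}

func main() {
	w := &Workflow{}

	// Skipping the review is rejected.
	fmt.Println(w.Advance())

	// After review, the workflow moves on to PLAN.
	w.Approve()
	err := w.Advance()
	fmt.Println(err, w.Current)
}
```

Resetting `approved` on every transition is the point of the sketch: an approval for SPECIFY says nothing about whether the PLAN is sound.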

Phase 1: SPECIFY

Surface Assumptions Immediately

Before writing spec content, list what you're assuming:

ASSUMPTIONS I'M MAKING:
1. This is a Go backend service (no frontend changes)
2. Authentication is handled by the existing middleware
3. The database is PostgreSQL (matching the rest of the stack)
4. The feature is used by authenticated users only, not public
→ Confirm or correct before I proceed.

Don't silently fill in ambiguous requirements.

Write the Spec

A spec covers six areas:

1. Objective — WHY are we building this?

This is the most important section. It must answer:

  • What user problem does this solve?
  • Who is the user?
  • What does success look like from the user's perspective?
  • Why now?
## Objective

Invoice importers at small accounting firms manually copy payment details
from PDF invoices into their banking system, taking 10-20 minutes per invoice.

**User:** Invoice processor at an accounting firm (10-50 invoices/day)
**Problem:** Manual data entry is slow, error-prone, and creates compliance risk
**Goal:** Reduce per-invoice processing time from ~15 minutes to < 2 minutes

Success: Invoice processor can extract and queue a payment from a PDF in under 2 minutes,
with confidence the data is correct.

2. Commands — Full executable commands

Build:   task build
Test:    task test (or: go test ./...)
Lint:    task lint (or: golangci-lint run)
Dev:     task dev
Deploy:  task deploy:staging

3. Project Structure — Where things live

internal/
  domain/          → Core types and interfaces
  service/         → Business logic
  store/           → Database implementations
  handler/         → HTTP handlers
cmd/
  server/          → Main entry point

4. Code Style — One real example beats three paragraphs

// Error handling: always wrap with context
if err != nil {
    return fmt.Errorf("parse invoice PDF: %w", err)
}

// Dependency injection: accept interfaces
func NewInvoiceService(store InvoiceStore, parser PDFParser) *InvoiceService { ... }

// Context: always first parameter for I/O operations
func (s *InvoiceService) ProcessPDF(ctx context.Context, r io.Reader) (Invoice, error) { ... }
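
For reference, the three patterns above can be combined into one compilable sketch. Everything beyond the two signatures shown in the style example is an assumption made for illustration: the `Invoice` fields, the `Parse` and `Save` method names, the in-memory stand-ins, and the `processString` helper.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"strings"
)

// Invoice is a placeholder domain type; the real fields are not specified here.
type Invoice struct {
	Raw string
}

// PDFParser and InvoiceStore are the interfaces the service accepts.
// The method names are assumptions for this sketch.
type PDFParser interface {
	Parse(ctx context.Context, r io.Reader) (Invoice, error)
}

type InvoiceStore interface {
	Save(ctx context.Context, inv Invoice) error
}

type InvoiceService struct {
	store  InvoiceStore
	parser PDFParser
}

// Dependency injection: accept interfaces.
func NewInvoiceService(store InvoiceStore, parser PDFParser) *InvoiceService {
	return &InvoiceService{store: store, parser: parser}
}

// Context: first parameter for I/O operations.
func (s *InvoiceService) ProcessPDF(ctx context.Context, r io.Reader) (Invoice, error) {
	inv, err := s.parser.Parse(ctx, r)
	if err != nil {
		// Error handling: wrap with context.
		return Invoice{}, fmt.Errorf("parse invoice PDF: %w", err)
	}
	if err := s.store.Save(ctx, inv); err != nil {
		return Invoice{}, fmt.Errorf("save invoice: %w", err)
	}
	return inv, nil
}

// memStore is an in-memory InvoiceStore for demos and unit tests.
type memStore struct{ saved []Invoice }

func (m *memStore) Save(_ context.Context, inv Invoice) error {
	m.saved = append(m.saved, inv)
	return nil
}

var errEmpty = errors.New("empty document")

// textParser stands in for a real PDF parser: it only reads the bytes.
type textParser struct{}

func (textParser) Parse(_ context.Context, r io.Reader) (Invoice, error) {
	b, err := io.ReadAll(r)
	if err != nil {
		return Invoice{}, err
	}
	if len(b) == 0 {
		return Invoice{}, errEmpty
	}
	return Invoice{Raw: string(b)}, nil
}

// processString wires the stand-ins together for a quick demonstration.
func processString(s string) (Invoice, error) {
	svc := NewInvoiceService(&memStore{}, textParser{})
	return svc.ProcessPDF(context.Background(), strings.NewReader(s))
}

func main() {
	_, err := processString("")
	fmt.Println(err)                      // parse invoice PDF: empty document
	fmt.Println(errors.Is(err, errEmpty)) // true
}
```

Because the error is wrapped with `%w`, callers still see the original cause through `errors.Is` while the message says which step failed.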

5. Testing Strategy

Framework:  testing package + testify
Locations:  *_test.go files in same package
Unit tests: table-driven, in-memory implementations for stores
Fast path:  go test ./... (unit tests only)
Full suite: go test -tags=integration ./... (includes DB tests)
Coverage:   >80% for business logic packages

6. Boundaries

Always:
  - Run go test ./... before committing
  - Wrap errors with fmt.Errorf("context: %w", err)
  - Pass ctx as first parameter to any I/O function
  - Run govulncheck before adding new dependencies

Ask first:
  - Schema changes
  - Adding new external dependencies
  - Changing public API contracts
  - Performance changes that affect existing behavior

Never:
  - Commit secrets or API keys
  - Remove or skip failing tests
  - Send client data to external APIs
  - Use naked returns on errors

Spec Template

# Spec: [Feature/Project Name]

## Objective
[What we're building and why. User story or problem statement.]
[Who is the user? What does success look like from their perspective?]

## Tech Stack
[Language, key libraries, relevant existing infrastructure]

## Commands
Build:   [full command]
Test:    [full command]
Lint:    [full command]
Dev:     [full command]

## Project Structure
[Directory layout with descriptions]

## Code Style
[One real code example showing the patterns to follow]

## Testing Strategy
[Framework, test locations, what to unit test vs integration test]

## Boundaries
- Always: [...]
- Ask first: [...]
- Never: [...]

## Success Criteria
[Specific, testable conditions that define "done"]
- [ ] [Condition 1: metric/threshold/method]
- [ ] [Condition 2: ...]

## Open Questions
[Anything unresolved that needs input before implementation begins]

Reframing Vague Requirements

When you receive a vague requirement, translate it into specific success criteria before writing any spec content:

Vague: "Make the invoice parser more reliable"

Reframed success criteria:
- Parser correctly extracts IBAN from 95% of Swedish invoice formats
- Parser correctly extracts total amount from 98% of tested invoices
- Parser returns a structured error (not a panic) for unrecognized formats
- Processing time < 2 seconds for PDFs up to 10MB
→ Are these the right targets?
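
A criterion phrased this way can be checked mechanically. The sketch below assumes a hand-labelled sample set exists; `Sample`, `ExtractionRate`, and `MeetsTarget` are hypothetical names, and the sample values are placeholders rather than real measurements.

```go
package main

import "fmt"

// Sample pairs what the parser extracted with the hand-verified value.
// Hypothetical shape, for illustration only.
type Sample struct {
	Got  string
	Want string
}

// ExtractionRate returns the fraction of samples the parser got right.
// An empty sample set yields 0 so that "no evidence" never counts as success.
func ExtractionRate(samples []Sample) float64 {
	if len(samples) == 0 {
		return 0
	}
	correct := 0
	for _, s := range samples {
		if s.Got == s.Want {
			correct++
		}
	}
	return float64(correct) / float64(len(samples))
}

// MeetsTarget reports whether the measured rate reaches the spec's threshold.
func MeetsTarget(samples []Sample, target float64) bool {
	return ExtractionRate(samples) >= target
}

func main() {
	// Placeholder data: 20 labelled samples, 19 extracted correctly.
	samples := make([]Sample, 0, 20)
	for i := 0; i < 19; i++ {
		samples = append(samples, Sample{Got: "expected", Want: "expected"})
	}
	samples = append(samples, Sample{Got: "", Want: "expected"})

	fmt.Printf("rate=%.2f meets 95%% target: %v\n",
		ExtractionRate(samples), MeetsTarget(samples, 0.95))
}
```

The useful side effect of writing the check first is that it forces a decision about what the sample set is and who labels it, which is usually the real open question.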

Phase 2: PLAN

With a validated spec, create a technical implementation plan:

  1. Identify major components and their dependencies
  2. Determine implementation order (foundations first)
  3. Note risks and unknowns
  4. Identify what can be built in parallel vs. what must be sequential
  5. Define verification checkpoints between phases
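
Steps 1 and 2 amount to a topological sort of the dependency graph. The sketch below is one way to do it, using the package names from the example project structure; `BuildOrder` is a hypothetical helper, not something the skill provides.

```go
package main

import (
	"fmt"
	"sort"
)

// BuildOrder returns the components ordered so that each one appears after
// everything it depends on (foundations first). deps maps a component to
// the components it depends on. A dependency cycle is reported as an error.
func BuildOrder(deps map[string][]string) ([]string, error) {
	const (
		unvisited = iota
		visiting
		done
	)
	state := make(map[string]int, len(deps))
	var order []string

	var visit func(name string) error
	visit = func(name string) error {
		switch state[name] {
		case done:
			return nil
		case visiting:
			return fmt.Errorf("dependency cycle at %q", name)
		}
		state[name] = visiting

		// Sort a copy so the result is deterministic.
		ds := append([]string(nil), deps[name]...)
		sort.Strings(ds)
		for _, d := range ds {
			if err := visit(d); err != nil {
				return err
			}
		}
		state[name] = done
		order = append(order, name)
		return nil
	}

	names := make([]string, 0, len(deps))
	for name := range deps {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if err := visit(name); err != nil {
			return nil, err
		}
	}
	return order, nil
}

func main() {
	deps := map[string][]string{
		"handler": {"service"},
		"service": {"domain", "store"},
		"store":   {"domain"},
		"domain":  nil,
	}
	order, err := BuildOrder(deps)
	fmt.Println(order, err) // [domain store service handler] <nil>
}
```

Components with no path between them in the graph are the candidates for parallel work in step 4.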

The plan should be reviewable: anyone should be able to read it and say "yes, that's the right approach" or "no, change X."

Load the planning skill for detailed task breakdown.

Phase 3: TASKS

Break the plan into discrete, implementable tasks. Load the planning skill for the full task breakdown methodology.

Each task must have:

  • Acceptance criteria
  • Verification step (test command, build, manual check)
  • File count estimate (no task should touch more than ~5 files)
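
These three requirements are easy to check before a task is accepted. A minimal sketch, assuming tasks are tracked as structured records; the `Task` type and its field names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// maxFiles mirrors the "~5 files" guideline; treat it as a soft limit.
const maxFiles = 5

// Task is a hypothetical record for one unit of work from the plan.
type Task struct {
	Title              string
	AcceptanceCriteria []string
	Verification       string // a test command, a build, or a manual check
	EstimatedFiles     int
}

// Problems lists every rule the task violates.
// An empty result means the task is ready to implement.
func (t Task) Problems() []string {
	var out []string
	if len(t.AcceptanceCriteria) == 0 {
		out = append(out, "missing acceptance criteria")
	}
	if strings.TrimSpace(t.Verification) == "" {
		out = append(out, "missing verification step")
	}
	switch {
	case t.EstimatedFiles <= 0:
		out = append(out, "missing file count estimate")
	case t.EstimatedFiles > maxFiles:
		out = append(out, fmt.Sprintf("touches %d files, split it (limit %d)",
			t.EstimatedFiles, maxFiles))
	}
	return out
}

func main() {
	t := Task{
		Title:          "Add IBAN extraction",
		Verification:   "go test ./...",
		EstimatedFiles: 8,
	}
	for _, p := range t.Problems() {
		fmt.Println(p)
	}
}
```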

Phase 4: IMPLEMENT

Execute tasks one at a time. For each task:

  1. Load tdd skill — write failing tests first
  2. Implement minimal code to pass
  3. Load clean-code skill — refactor

Keeping the Spec Alive

  • Update when decisions change — spec first, then code
  • Update when scope changes
  • Commit the spec — it belongs in version control
  • Reference the spec in PRs — link to the section each PR implements

Common Rationalizations

| Rationalization | Reality |
| --- | --- |
| "This is simple, I don't need a spec" | Simple tasks still need acceptance criteria. A two-line spec is fine. |
| "I'll write the spec after" | That's documentation, not specification. The spec's value is forcing clarity before code. |
| "The spec will slow us down" | A 15-minute spec prevents hours of rework. |
| "Requirements will change anyway" | That's why the spec is a living document. |
| "The user knows what they want" | Even clear requests have implicit assumptions. The spec surfaces those. |

Verification

Before starting implementation:

  • Spec covers all six core areas
  • Mathias has reviewed and approved the spec
  • Success criteria are specific and testable
  • Boundaries (Always/Ask First/Never) are defined
  • Open questions are resolved or accepted as unknowns
  • Spec is saved to a file in the repository

Brain MCP Integration

Logging

Call session_log once at the end of every phase to record the outcome. Pass-rate is computed downstream by the /pass-rate HTTP endpoint, which treats pass as success, fail as failure, skip as neither.

At end of each phase:

  • session_log with {skill: "spec-driven-dev", phase: "<phase-name>", final_status: "pass" | "fail" | "skip", message: "<one-line summary>", duration_ms: <wall-clock>, project_root: "<absolute path>"}

Phases for this skill: specify, plan, tasks, implement

Status semantics:

  • pass — the phase's intended outcome was reached (gate passed).
  • fail — the phase's intended outcome was NOT reached (gate blocked, rework required).
  • skip — phase was skipped intentionally.
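
As an illustration, the payload can be built and validated before it is handed to `session_log`. The field names, phase names, and status values come from the text above; the `SessionLog` struct, the `NewSessionLog` helper, and the validation rules are assumptions. How the payload reaches the MCP tool is not shown, since that depends on the client in use.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
)

// SessionLog mirrors the fields listed for the session_log call.
type SessionLog struct {
	Skill       string `json:"skill"`
	Phase       string `json:"phase"`
	FinalStatus string `json:"final_status"`
	Message     string `json:"message"`
	DurationMS  int64  `json:"duration_ms"`
	ProjectRoot string `json:"project_root"`
}

var (
	validPhases   = map[string]bool{"specify": true, "plan": true, "tasks": true, "implement": true}
	validStatuses = map[string]bool{"pass": true, "fail": true, "skip": true}
)

// NewSessionLog is a hypothetical helper that rejects malformed entries early.
func NewSessionLog(phase, status, message, projectRoot string, durationMS int64) (SessionLog, error) {
	if !validPhases[phase] {
		return SessionLog{}, fmt.Errorf("unknown phase %q", phase)
	}
	if !validStatuses[status] {
		return SessionLog{}, fmt.Errorf("unknown final_status %q", status)
	}
	// project_root is documented as an absolute path.
	if !filepath.IsAbs(projectRoot) {
		return SessionLog{}, fmt.Errorf("project_root must be absolute, got %q", projectRoot)
	}
	return SessionLog{
		Skill:       "spec-driven-dev",
		Phase:       phase,
		FinalStatus: status,
		Message:     message,
		DurationMS:  durationMS,
		ProjectRoot: projectRoot,
	}, nil
}

func main() {
	// Example values only; the path and duration are placeholders.
	entry, err := NewSessionLog("specify", "pass", "spec approved", "/srv/invoices", 1200)
	if err != nil {
		fmt.Println("invalid log entry:", err)
		return
	}
	payload, err := json.Marshal(entry)
	if err != nil {
		fmt.Println("encode log entry:", err)
		return
	}
	fmt.Println(string(payload))
}
```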

Why this matters: the routing pod (Plan 6) reads pass-rate to decide whether to route a future call to a local model. If your skill never logs, the routing pod sees no data.
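
To make the status semantics concrete, here is one plausible reading of "skip counts as neither": skipped phases are left out of the denominator. The formula pass / (pass + fail) is an assumption; the actual /pass-rate endpoint may compute it differently.

```go
package main

import "fmt"

// PassRate returns pass / (pass + fail), ignoring "skip" and any other value.
// ok is false when there is no pass or fail entry to compute a rate from.
// Assumed formula, not taken from the /pass-rate implementation.
func PassRate(statuses []string) (rate float64, ok bool) {
	var pass, fail int
	for _, s := range statuses {
		switch s {
		case "pass":
			pass++
		case "fail":
			fail++
		}
	}
	if pass+fail == 0 {
		return 0, false
	}
	return float64(pass) / float64(pass+fail), true
}

func main() {
	rate, ok := PassRate([]string{"pass", "pass", "fail", "skip"})
	fmt.Printf("%.2f %v\n", rate, ok) // 0.67 true
}
```

Under this reading, a skill that only ever logs skip looks the same as one that never logs at all.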

Cross-References

  • Load problem-analysis skill for deep requirement understanding before speccing
  • Load user-stories skill to decompose the spec into stories
  • Load planning skill for task breakdown
  • Load feature-spec skill once implementation begins, to scope individual features inside the project
  • Load tdd skill during implementation