Commit Graph

7 Commits

Author SHA1 Message Date
Mathias
e34cd6c12b docs(eval): record post-fix scorer run — phase 1 lift insufficient
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
Top-1 stayed at 20% (4/20), top-3 +5pt (65→70%) after:
- extract.go wing/topic parser fix (commit 3084c41)
- qwen35-9b-fast entity pad (was 239-byte stub → full entity)
- grafana entry: add "pod restart" synonym to lesson body
- dangling refs stripped from index.md + entities/k3s.md

The only retrieval move: qwen35-9b-fast climbed from rank 0 (off top-5)
to rank 2 — the entity pad worked. Other 5 misses are ranker behaviour
on already-keyword-overlapping entries; BM25 doesn't weight the right
slugs to the top.

Per the proposal's gate (≥10pt lift = stop, <10pt = Phase 2 justified),
the DIKW tier redesign earns its cost. Next session: tier column +
file moves + tier-weighted retrieval, then re-measure against this
same eval set.
2026-05-24 22:48:48 +02:00
Mathias
3084c4173d fix(graph): route wiki/<flat>.md to Type=knowledge, not Type=hall with filename-as-wing
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
classifyByPath had a hole: paths like wiki/index.md or wiki/<slug>.md
(direct children of wiki/, no subdirectory) hit the default branch and
wrote Wing=parts[1] — which IS the filename, not a wing. Symptom in
brain_entities: rows like (slug=index, wing=index.md) and
(slug=autobe-..., wing=autobe-evaluation-pattern-....md).

Fix: when len(parts) < 3 (no subdirectory at all), fall through to
Type=knowledge and let frontmatter set wing/hall if present.

Add brain/eval/ artifacts at the same time:
- qa-2026-05.md — 20 hand-authored Q→expected-slug pairs covering the
  homelab knowledge corpus across mcp, dex, gitops, postgres, go,
  models, methodology
- score.py — calls brain_query for each pair, scores top-1 + top-3,
  emits per-question detail. BRAIN_MCP_TOKEN via env.

Pre-fix baseline against the live brain: top-1 = 20% (4/20),
top-3 = 65% (13/20). Six hard misses where the expected slug doesn't
even land in the top-5.

Used to gate the phase 2 DIKW redesign (infra#62 follow-up): if
phase 1 fixes (this parser fix + 20 backlink authoring on top
orphans) lift top-1 by <10 absolute points, structure is the
bottleneck and the tier redesign is justified.
2026-05-24 22:33:04 +02:00
Mathias Bergqvist
537aebc302 feat(pipeline): update system prompt for new LLM JSON contract (no slugs)
- Change prompt to reflect new output format: title, type, subtype, domain, content
- Remove slug/path generation responsibility from LLM — pipeline now handles it
- Wikilinks change from [[slug|Display Name]] to [[Display Name]] only
- LLM no longer includes frontmatter or paths in output

docs(schema): update LLM output format and wikilink convention for Level 3

- Specify JSON schema: title, type, subtype, domain, content fields
- Remove frontmatter requirements from schema output (handled by pipeline)
- Simplify wikilink format to [[Display Name]] — no slug or pipe
- Pipeline now responsible for slug generation and frontmatter construction

These changes shift slug/frontmatter generation from LLM to pipeline,
reducing cognitive load on the model and improving control over output.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-23 19:45:21 +02:00
Mathias Bergqvist
1b0706f270 chore(brain): rename CLAUDE.md to schema.md for clarity
CLAUDE.md has a specific meaning in the Claude Code ecosystem (agent
instructions). The wiki schema for the ingestion pipeline should live
in schema.md to avoid confusion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 23:06:32 +02:00
Mathias Bergqvist
08dd7b9365 docs(brain): add wiki schema document for ingest prompt 2026-04-22 22:25:52 +02:00
Mathias Bergqvist
344def20bb test: phase 1 integration smoke test passing
All 8 MCP tools verified (tdd_red, tdd_green, tdd_refactor, brain_query,
brain_write, tier, session_log, retrospective). Ingestion write/query,
brain_query, tier, and session_log all return correct responses end-to-end.
Brain note written during smoke test committed to raw/ and wiki/concepts/.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:18:08 +02:00
Mathias Bergqvist
23dd355b8a feat: add protocols.md, retrospective discipline, and brain directory structure 2026-04-17 20:49:56 +02:00