hyperguild

Author	SHA1	Message	Date
Mathias	2b7bbe38c7	docs(eval): record M4 + M4b scorer runs — phase 2 gate cleared (infra#72)

Tier-weighted retrieval against the qa-2026-05.md 20-question set:

| run | top-1 | top-3 |
|--------------------------------|-------|-------|
| baseline (pre-phase-1) | 20% | 65% |
| post phase 1 (parser+content) | 20% | 70% |
| post M4 (tier weighting) | 30% | 75% |
| post M4b (entities → K tier) | 35% | 80% |

Net Phase 2 lift: +15pt top-1 and +15pt top-3 over the pre-phase-1 baseline (+15pt and +10pt over the post-phase-1 run), which meets the ≥10pt close-gate set in infra#72.

Three remaining misses are content-keyword issues, not structure issues: the questions don't share enough lexical surface with the target entries to surface via BM25 alone. Vector search would help here, but the iguana embedder is off-mesh (see infra#64).	2026-05-25 18:51:29 +02:00
Mathias	e34cd6c12b	docs(eval): record post-fix scorer run — phase 1 lift insufficient

Top-1 stayed at 20% (4/20), top-3 +5pt (65→70%) after:
- extract.go wing/topic parser fix (commit `3084c41`)
- qwen35-9b-fast entity pad (was 239-byte stub → full entity)
- grafana entry: add "pod restart" synonym to lesson body
- dangling refs stripped from index.md + entities/k3s.md

The only retrieval move: qwen35-9b-fast climbed from rank 0 (off top-5) to rank 2, so the entity pad worked. The other 5 misses are ranker behaviour on already-keyword-overlapping entries; BM25 doesn't weight the right slugs to the top.

Per the proposal's gate (≥10pt lift = stop, <10pt = Phase 2 justified), the DIKW tier redesign earns its cost. Next session: tier column + file moves + tier-weighted retrieval, then re-measure against this same eval set.	2026-05-24 22:48:48 +02:00
Mathias	3084c4173d	fix(graph): route wiki/<flat>.md to Type=knowledge, not Type=hall with filename-as-wing

classifyByPath had a hole: paths like wiki/index.md or wiki/<slug>.md (direct children of wiki/, no subdirectory) hit the default branch and wrote Wing=parts[1], which IS the filename, not a wing. Symptom in brain_entities: rows like (slug=index, wing=index.md) and (slug=autobe-..., wing=autobe-evaluation-pattern-....md).

Fix: when len(parts) < 3 (no subdirectory at all), fall through to Type=knowledge and let frontmatter set wing/hall if present.

Add brain/eval/ artifacts at the same time:
- qa-2026-05.md: 20 hand-authored Q→expected-slug pairs covering the homelab knowledge corpus across mcp, dex, gitops, postgres, go, models, methodology
- score.py: calls brain_query for each pair, scores top-1 + top-3, emits per-question detail. BRAIN_MCP_TOKEN via env.

Pre-fix baseline against the live brain: top-1 = 20% (4/20), top-3 = 65% (13/20). Six hard misses where the expected slug doesn't even land in the top-5.

Used to gate the phase 2 DIKW redesign (infra#62 follow-up): if phase 1 fixes (this parser fix + 20 backlink authoring on top orphans) lift top-1 by <10 absolute points, structure is the bottleneck and the tier redesign is justified.	2026-05-24 22:33:04 +02:00
Mathias Bergqvist	537aebc302	feat(pipeline): update system prompt for new LLM JSON contract (no slugs)

- Change prompt to reflect new output format: title, type, subtype, domain, content
- Remove slug/path generation responsibility from LLM; pipeline now handles it
- Wikilinks change from [[slug|Display Name]] to [[Display Name]] only
- LLM no longer includes frontmatter or paths in output

docs(schema): update LLM output format and wikilink convention for Level 3

- Specify JSON schema: title, type, subtype, domain, content fields
- Remove frontmatter requirements from schema output (handled by pipeline)
- Simplify wikilink format to [[Display Name]], with no slug or pipe
- Pipeline now responsible for slug generation and frontmatter construction

These changes shift slug/frontmatter generation from LLM to pipeline, reducing cognitive load on the model and improving control over output.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 19:45:21 +02:00
Mathias Bergqvist	1b0706f270	chore(brain): rename CLAUDE.md to schema.md for clarity CLAUDE.md has a specific meaning in the Claude Code ecosystem (agent instructions). The wiki schema for the ingestion pipeline should live in schema.md to avoid confusion. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 23:06:32 +02:00
Mathias Bergqvist	08dd7b9365	docs(brain): add wiki schema document for ingest prompt	2026-04-22 22:25:52 +02:00
Mathias Bergqvist	344def20bb	test: phase 1 integration smoke test passing All 8 MCP tools verified (tdd_red, tdd_green, tdd_refactor, brain_query, brain_write, tier, session_log, retrospective). Ingestion write/query, brain_query, tier, and session_log all return correct responses end-to-end. Brain note written during smoke test committed to raw/ and wiki/concepts/. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 21:18:08 +02:00
Mathias Bergqvist	23dd355b8a	feat: add protocols.md, retrospective discipline, and brain directory structure	2026-04-17 20:49:56 +02:00

8 Commits
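
The top-1 / top-3 scoring and the ≥10pt gate that commits 3084c4173d, e34cd6c12b and 2b7bbe38c7 refer to can be sketched roughly as below. This is a minimal illustration, not the repository's score.py: the function names `hit_rate` and `lift_clears_gate`, the shape of the `ranked` input, and the sample data are all assumptions, and the real script obtains rankings by calling brain_query, which is not reproduced here.

```python
from typing import Dict, Sequence


def hit_rate(ranked: Dict[str, Sequence[str]],
             expected: Dict[str, str],
             k: int) -> float:
    """Percentage of questions whose expected slug appears in the top-k results.

    `ranked` maps each question to its result slugs, best first (hypothetical
    shape); `expected` maps each question to the one slug that should surface.
    """
    hits = sum(
        1
        for question, slug in expected.items()
        if slug in list(ranked.get(question, []))[:k]
    )
    return 100.0 * hits / len(expected)


def lift_clears_gate(before_pct: float, after_pct: float,
                     gate_pts: float = 10.0) -> bool:
    """True when the absolute lift in percentage points meets the gate."""
    return after_pct - before_pct >= gate_pts


# Made-up sample data, for illustration only.
EXPECTED = {"q1": "a", "q2": "b", "q3": "c", "q4": "d"}
RANKED = {
    "q1": ["a", "x", "y"],   # hit at rank 1
    "q2": ["x", "b", "y"],   # hit at rank 2
    "q3": ["x", "y", "z"],   # miss
    "q4": ["x", "y", "d"],   # hit at rank 3
}

if __name__ == "__main__":
    print(hit_rate(RANKED, EXPECTED, k=1))   # 25.0
    print(hit_rate(RANKED, EXPECTED, k=3))   # 75.0
    # Top-1 figures from the log: 20% -> 20% after phase 1, 20% -> 35% after M4b.
    print(lift_clears_gate(20.0, 20.0))      # False
    print(lift_clears_gate(20.0, 35.0))      # True
```

Measuring the lift in absolute percentage points against a fixed question set is what makes the runs in the log comparable with one another.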