Commit Graph

4 Commits

Author SHA1 Message Date
Mathias
1b00cbc0ae fix(search,graph): M4b wiki/entities/ → tier=knowledge
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 3s
Initial M4 mapping put wiki/entities/* in tier=note. Post-M4 eval
regressed qwen35-9b-fast from rank 2 → off top-5: knowledge entries
that cite the entity in passing now outscore the entity page itself
(1.5× weight vs 1.0×).

Entity anchor pages are durable facts about concrete things — they
map cleanly to the knowledge/facts/ slot in the post-M3 layout
target. Promote them now so the path inference matches.

Eval re-run after deploy is in infra#72.
2026-05-25 18:49:37 +02:00
Mathias
d5f112b600 feat(graph,graphstore): M2 parse tier+topic from frontmatter, persist via Upsert (infra#72)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 4s
extract.go now reads `tier:` and `topic:` from YAML frontmatter, with
a path-based fallback when frontmatter is absent (the pre-M3 state on
every existing entry):

  knowledge/* → tier=knowledge
  notes/*     → tier=note
  wiki/**     → tier=note   (sources + concepts + entities are I-level)
  inbox/**, raw/**, sessions/**, clips/** → tier=inbox

Frontmatter wins when present — covers the M3-migrated case where an
entry's path may not match the tier the author chose for it.

UpsertEntity persists both columns. M1's schema already has them.

Backfill on next pod start populates tier for the whole corpus
without any file moves; M3 will follow up with the actual layout
migration and explicit frontmatter writes.
2026-05-25 12:35:38 +02:00
Mathias
3084c4173d fix(graph): route wiki/<flat>.md to Type=knowledge, not Type=hall with filename-as-wing
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
classifyByPath had a hole: paths like wiki/index.md or wiki/<slug>.md
(direct children of wiki/, no subdirectory) hit the default branch and
wrote Wing=parts[1] — which IS the filename, not a wing. Symptom in
brain_entities: rows like (slug=index, wing=index.md) and
(slug=autobe-..., wing=autobe-evaluation-pattern-....md).

Fix: when len(parts) < 3 (no subdirectory at all), fall through to
Type=knowledge and let frontmatter set wing/hall if present.

Add brain/eval/ artifacts at the same time:
- qa-2026-05.md — 20 hand-authored Q→expected-slug pairs covering the
  homelab knowledge corpus across mcp, dex, gitops, postgres, go,
  models, methodology
- score.py — calls brain_query for each pair, scores top-1 + top-3,
  emits per-question detail. BRAIN_MCP_TOKEN via env.

Pre-fix baseline against the live brain: top-1 = 20% (4/20),
top-3 = 65% (13/20). Six hard misses where the expected slug doesn't
even land in the top-5.

Used to gate the phase 2 DIKW redesign (infra#62 follow-up): if
phase 1 fixes (this parser fix + 20 backlink authoring on top
orphans) lift top-1 by <10 absolute points, structure is the
bottleneck and the tier redesign is justified.
2026-05-24 22:33:04 +02:00
Mathias
f53ee18cb6 feat(graph): add brain_entities + brain_edges store and wikilink parser
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 3s
Foundation for Track A (GraphRAG on top of existing wiki). Two new
packages, both unwired — service behaviour unchanged until commit 2
hooks the pipeline.

- internal/graph: pure parser. Extract() walks markdown + frontmatter
  and emits one Entity + N wikilink Edges per doc. Dedupes per (dst,
  line), ignores self-references, classifies hall/concept/entity/
  source/knowledge from path layout.

- internal/graphstore: pgx-backed PGStore mirroring vectorstore's
  shape. Idempotent Init() creates brain_entities + brain_edges with
  indexes on src_slug, dst_slug, src_doc, wing, type. Operations:
  UpsertEntity, ReplaceEdgesForDoc (tx), DeleteByDoc, Neighbors,
  Subgraph (recursive CTE, depth ≤6), Path (shortest path, depth ≤8).

Schema lives on the shared postgres18 instance alongside the
brain_embeddings table — no new datastore. See
docs/superpowers/specs/2026-05-homelab-training-graph-next-step.md
in infra repo + infra#62.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 15:18:08 +02:00