fix(search,graph): M4b wiki/entities/ → tier=knowledge

Initial M4 mapping put wiki/entities/* in tier=note. The post-M4 eval
regressed qwen35-9b-fast from rank 2 to outside the top 5: knowledge
entries that cite the entity in passing now outscore the entity page
itself (1.5× weight vs 1.0×).
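
The arithmetic behind that regression can be sketched as follows. Only the 1.5× and 1.0× multipliers come from this message; the raw scores, the tierWeight map, and the weighted helper are hypothetical, and the real ranker may combine weights with relevance differently.

```go
package main

import "fmt"

// tierWeight holds the two multipliers quoted in the commit message.
// Other tiers are omitted because their weights are not stated here.
var tierWeight = map[string]float64{
	"knowledge": 1.5,
	"note":      1.0,
}

// weighted applies the tier multiplier to a raw relevance score.
// Assumption: the weight is a plain multiplier on the raw score.
func weighted(raw float64, tier string) float64 {
	return raw * tierWeight[tier]
}

func main() {
	// Invented raw scores: the entity page matches the query better (0.80)
	// than a knowledge entry that only mentions the entity (0.60).
	entityAsNote := weighted(0.80, "note")           // stays at 0.80
	passingMention := weighted(0.60, "knowledge")    // about 0.90
	entityAsKnowledge := weighted(0.80, "knowledge") // about 1.20

	// With the entity page in tier=note, the passing mention wins.
	fmt.Println(passingMention > entityAsNote)
	// With the entity page in tier=knowledge, it ranks first again.
	fmt.Println(entityAsKnowledge > passingMention)
}
```

Under these made-up numbers, a weaker match in the knowledge tier beats a stronger match in the note tier, which is the failure mode described above.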

Entity anchor pages are durable facts about concrete things — they
map cleanly to the knowledge/facts/ slot in the post-M3 layout
target. Promote them now so the path inference matches.

Eval re-run after deploy is in infra#72.
Mathias
2026-05-25 18:47:25 +02:00
parent 4f78fecd06
commit 1b00cbc0ae
2 changed files with 17 additions and 5 deletions

@@ -99,11 +99,18 @@ func inferTierFromPath(e *Entity, docPath string) {
 	case "knowledge":
 		e.Tier = "knowledge"
 	case "wiki":
-		// Pre-M3 wiki layout: sources are synth output of raw inbox
-		// material (I tier); concepts + entities are reference notes
-		// (also I tier); top-level wiki/<slug>.md is unstructured
-		// reference too. None of these are reusable lessons (K).
-		e.Tier = "note"
+		// Pre-M3 wiki layout. Most subdirs are I-level:
+		// wiki/sources/ — synth summaries of raw inbox material
+		// wiki/concepts/ — definitions, not lessons
+		// One exception: wiki/entities/ holds anchor facts about
+		// concrete things (models, services, people) that the eval
+		// expects to surface when queried directly. Those map to K
+		// to match the post-M3 layout target (knowledge/facts/).
+		if len(parts) >= 2 && parts[1] == "entities" {
+			e.Tier = "knowledge"
+		} else {
+			e.Tier = "note"
+		}
 	case "raw", "sessions", "clips":
 		e.Tier = "inbox"
 	}

@@ -343,6 +343,11 @@ func extractTier(content, relPath string) string {
 	case "notes":
 		return "note"
 	case "wiki":
+		// wiki/entities/ anchor pages map to knowledge (see
+		// graph.inferTierFromPath for the rationale).
+		if len(parts) >= 2 && parts[1] == "entities" {
+			return "knowledge"
+		}
 		return "note"
 	case "knowledge":
 		return "knowledge"