fix(pipeline): skip RawPages with empty title in BuildPages instead of producing broken paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat(pipeline): update system prompt for new LLM JSON contract (no slugs)
2026-04-23 19:55:37 +02:00 · 2026-04-23 19:45:21 +02:00 · 2026-04-23 19:07:33 +02:00 · 2026-04-23 18:59:10 +02:00 · 2026-04-23 18:56:39 +02:00 · 2026-04-23 18:50:37 +02:00
37 changed files with 4338 additions and 184 deletions
--- a/.gitea/workflows/cd.yml
+++ b/.gitea/workflows/cd.yml
@@ -1,13 +1,16 @@
 name: cd
 on:
-  push:
+  workflow_run:
    workflows: ["CI"]
    types: [completed]
    branches: [main]
 jobs:
  deploy:
    name: Build and deploy
    runs-on: self-hosted
    if: ${{ github.event.workflow_run.conclusion == 'success' && github.event.workflow_run.event == 'push' }}
    env:
      SERVICE: supervisor
      IMAGE: gitea.d-ma.be/mathias/supervisor
--- a/brain/schema.md
+++ b/brain/schema.md
@@ -3,21 +3,34 @@
 This document defines the three page types in the brain wiki.
 The LLM must follow this schema exactly when generating wiki pages.
 ## Output Format
 Return a JSON array. Each element:
 ```json
 {
  "title":   "exact page title",
  "type":    "source | concept | entity",
  "subtype": "see below — omit for concept",
  "domain":  "see domains — omit if none fits",
  "content": "Markdown body only — no frontmatter, no path"
 }
 ```
 - `subtype` for **source**: `article | pdf | book | video | note | project`
 - `subtype` for **entity**: `person | company | tool | model | framework | technology`
 - The pipeline computes slugs and frontmatter — never include them in output.
 ## Wikilink Format
-All cross-references use `[[slug|Display Text]]`.
+All cross-references use `[[Display Name]]` — just the display name, no slug, no pipe.
 Rules:
- slug = lowercase filename without .md, spaces → hyphens, strip all non-alphanumeric except hyphens
+- Only link to pages in the inventory or pages you are creating in this response
- The `|` separator is REQUIRED — never use `[[Title]]` without a slug
+- The pipeline converts `[[Display Name]]` to `[[slug|Display Name]]` automatically
- Examples: `[[domain-driven-design|Domain Driven Design]]`, `[[ryan-singer|Ryan Singer]]`
+- Section links must match their section type (Related Concepts → concept pages only, etc.)
 - Slugs must resolve to an existing file in the inventory, or a file you are creating in this response
-Slug generation examples:
+Examples: `[[Domain Driven Design]]`, `[[Ryan Singer]]`, `[[Shape Up]]`
 - "Domain Driven Design" → `domain-driven-design`
 - "It's Complicated" → `its-complicated`
 - "gRPC" → `grpc`
 - "GPT-4o" → `gpt-4o`
 ## Domains
@@ -30,17 +43,6 @@ Use one of: `ai-llm`, `software-engineering`, `product-strategy`, `finance-marke
 One page per ingested source. Books are NEVER split across multiple source pages — update the existing one.
 Required frontmatter:
 ```yaml
 title: <exact title>
 type: article | pdf | book | video | note | project
 domain: <domain>
 date_ingested: YYYY-MM-DD
 last_updated: YYYY-MM-DD
 aliases:
  - <exact title>
 ```
 Body sections (in this order):
 ### Summary
@@ -50,10 +52,10 @@ Body sections (in this order):
 Bulleted list. Paraphrase — no verbatim quotes or code.
 ### Concepts Introduced or Reinforced
-Wikilinks to wiki/concepts/ ONLY. One per line.
+Wikilinks to concept pages ONLY. One per line.
 ### Entities Mentioned
-Wikilinks to wiki/entities/ ONLY. One per line.
+Wikilinks to entity pages ONLY. One per line.
 ### Open Questions Raised
 Gaps or follow-up questions from this source.
@@ -75,15 +77,6 @@ Dated entries appended on re-ingestion. NEVER rewrite — only append.
 One page per idea, framework, methodology, or pattern.
 Required frontmatter:
 ```yaml
 title: <concept name>
 domain: <domain>
 last_updated: YYYY-MM-DD
 aliases:
  - <exact title>
 ```
 Body sections (in this order):
 ### Definition
@@ -93,13 +86,13 @@ One-paragraph plain-language explanation.
 Practical significance. Why should anyone care?
 ### Related Concepts
-Wikilinks to wiki/concepts/ ONLY.
+Wikilinks to concept pages ONLY.
 ### Related Entities
-Wikilinks to wiki/entities/ ONLY.
+Wikilinks to entity pages ONLY.
 ### Sources
-Wikilinks to wiki/sources/ ONLY.
+Wikilinks to source pages ONLY.
 ### Evolving Notes
 Updated as new sources arrive. Append, do not rewrite.
@@ -110,16 +103,6 @@ Updated as new sources arrive. Append, do not rewrite.
 One page per person, tool, organisation, technology, or product.
 Required frontmatter:
 ```yaml
 title: <name>
 type: person | company | tool | model | framework | technology
 domain: <domain>
 last_updated: YYYY-MM-DD
 aliases:
  - <exact title>
 ```
 Body sections (in this order):
 ### Description
@@ -132,23 +115,23 @@ Why this entity matters to this knowledge base.
 With dates where known.
 ### Related Concepts
-Wikilinks to wiki/concepts/ ONLY.
+Wikilinks to concept pages ONLY.
 ### Related Entities
-Wikilinks to wiki/entities/ ONLY.
+Wikilinks to entity pages ONLY.
 ### Sources
-Wikilinks to wiki/sources/ ONLY.
+Wikilinks to source pages ONLY.
 ---
 ## Non-Negotiable Rules
 1. Output ONLY a valid JSON array — no markdown fences, no prose before or after
-2. Each element: `{"path": "wiki/<type>/<slug>.md", "content": "...full markdown..."}`
+2. Each element: `{"title": "...", "type": "...", "subtype": "...", "domain": "...", "content": "..."}`
-3. Slugs are kebab-case: lowercase, spaces→hyphens, strip special characters
+3. Never include slugs, paths, or frontmatter in output — the pipeline handles these
-4. Every wikilink must be `[[slug|Display Text]]` — the pipe separator is required
+4. Wikilinks: `[[Display Name]]` only — no pipe, no slug
-5. Dates always YYYY-MM-DD
+5. Dates always YYYY-MM-DD (used only in content body where contextually relevant)
 6. Never reproduce verbatim code — describe the pattern or technique
-7. Section links must match their section type (Related Concepts → concepts/ only, etc.)
+7. Section links must match their section type
 8. One source page per book — if inventory shows it exists, include it as an UPDATE
--- a/docs/superpowers/plans/2026-04-22-brain-ingestion-quality.md
+++ b/docs/superpowers/plans/2026-04-22-brain-ingestion-quality.md
@@ -0,0 +1,858 @@
 # Brain Ingestion Quality: PDF Extraction + Entity Resolution
 > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
 **Goal:** Fix PDF ingestion (currently passes raw bytes to LLM) and add fuzzy entity resolution (prevents slug proliferation at scale).
 **Architecture:** Two independent improvements wired into the existing pipeline. A new `extract` package handles text extraction by file type (pdftotext subprocess, passthrough for .md/.txt). A new `resolve.go` in the `pipeline` package normalizes proposed entity/concept titles against the loaded inventory to reuse existing slugs instead of creating duplicates. Both changes are wired into `watcher.go` and `api/handler.go` with no new dependencies except `poppler-utils` in the Docker image.
 **Tech Stack:** Go stdlib (`os/exec`, `bufio`, `strings`), testify, poppler-utils (`pdftotext`)
 ---
 ## File Structure
 **New files:**
 - `ingestion/internal/extract/extract.go` — `Text(path string) (string, error)` dispatcher
 - `ingestion/internal/extract/pdf.go` — `pdftotext` subprocess extraction
 - `ingestion/internal/extract/extract_test.go` — table-driven tests for all paths
 - `ingestion/internal/pipeline/resolve.go` — `Resolve(proposed []wiki.Page, inventory map[wiki.PageType][]wiki.Entry) []wiki.Page`
 - `ingestion/internal/pipeline/resolve_test.go` — table-driven tests
 **Modified files:**
 - `ingestion/internal/wiki/types.go` — add `Aliases []string` to `Entry`
 - `ingestion/internal/wiki/inventory.go` — `readFrontmatter` reads both title and aliases
 - `ingestion/internal/wiki/inventory_test.go` — add alias coverage
 - `ingestion/internal/pipeline/pipeline.go` — call `Resolve` after `ParsePages`
 - `ingestion/internal/watcher/watcher.go` — call `extract.Text` instead of `os.ReadFile`
 - `ingestion/internal/api/handler.go` — call `extract.Text` for path-based ingestion
 - `ingestion/Dockerfile` — `apk add poppler-utils`
 ---
 ### Task 1: `extract` package — Text() dispatcher with .md/.txt passthrough
 **Files:**
 - Create: `ingestion/internal/extract/extract.go`
 - Create: `ingestion/internal/extract/extract_test.go`
 - [ ] **Step 1: Write the failing test**
 ```go
 // ingestion/internal/extract/extract_test.go
 package extract
 import (
 	"os"
 	"path/filepath"
 	"testing"
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 )
 func TestText_Markdown(t *testing.T) {
 	dir := t.TempDir()
 	path := filepath.Join(dir, "note.md")
 	require.NoError(t, os.WriteFile(path, []byte("# Hello\n\nWorld."), 0o644))
 	got, err := Text(path)
 	require.NoError(t, err)
 	assert.Equal(t, "# Hello\n\nWorld.", got)
 }
 func TestText_Txt(t *testing.T) {
 	dir := t.TempDir()
 	path := filepath.Join(dir, "note.txt")
 	require.NoError(t, os.WriteFile(path, []byte("plain text"), 0o644))
 	got, err := Text(path)
 	require.NoError(t, err)
 	assert.Equal(t, "plain text", got)
 }
 func TestText_UnsupportedExtension(t *testing.T) {
 	dir := t.TempDir()
 	path := filepath.Join(dir, "data.csv")
 	require.NoError(t, os.WriteFile(path, []byte("a,b,c"), 0o644))
 	_, err := Text(path)
 	assert.ErrorContains(t, err, "unsupported")
 }
 ```
 - [ ] **Step 2: Run to verify it fails**
 ```bash
 cd ingestion && go test ./internal/extract/... -v
 ```
 Expected: compile error — package does not exist yet.
 - [ ] **Step 3: Implement extract.go**
 ```go
 // ingestion/internal/extract/extract.go
 package extract
 import (
 	"fmt"
 	"os"
 	"strings"
 )
 // Text reads the file at path and returns its plain-text content.
 // Supported extensions: .md, .txt (passthrough), .pdf (via pdftotext).
 func Text(path string) (string, error) {
 	ext := strings.ToLower(fileExt(path))
 	switch ext {
 	case ".md", ".txt":
 		b, err := os.ReadFile(path)
 		if err != nil {
 			return "", fmt.Errorf("read %s: %w", path, err)
 		}
 		return string(b), nil
 	case ".pdf":
 		return extractPDF(path)
 	default:
 		return "", fmt.Errorf("unsupported file extension: %s", ext)
 	}
 }
 // fileExt returns the file extension including the dot, with its original case;
 // the caller is responsible for lowercasing it.
 func fileExt(path string) string {
 	for i := len(path) - 1; i >= 0; i-- {
 		if path[i] == '.' {
 			return path[i:]
 		}
 		if path[i] == '/' || path[i] == '\\' {
 			break
 		}
 	}
 	return ""
 }
 ```
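For comparison, the standard library's `filepath.Ext` covers the same cases as the hand-rolled `fileExt` in this step: it returns the suffix starting at the last dot of the final path element. The sketch below is only an illustration and not part of the plan; `normalizedExt` is a hypothetical helper name. One difference to keep in mind is that on non-Windows systems `filepath.Ext` does not treat a backslash as a path separator, whereas `fileExt` does.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// normalizedExt is a hypothetical helper (not defined by the plan). It returns
// the lowercased extension of path, including the leading dot, or "" when the
// final path element contains no dot.
func normalizedExt(path string) string {
	return strings.ToLower(filepath.Ext(path))
}

func main() {
	for _, p := range []string{"notes/Report.PDF", "note.md", "dir.v1/README", "archive.tar.gz"} {
		fmt.Printf("%q -> %q\n", p, normalizedExt(p))
	}
}
```

If the dispatcher used a helper like this, the `switch` in `Text` could stay unchanged, since it only compares against lowercased extensions.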
 - [ ] **Step 4: Add pdf.go stub so it compiles**
 ```go
 // ingestion/internal/extract/pdf.go
 package extract
 import "fmt"
 func extractPDF(_ string) (string, error) {
 	return "", fmt.Errorf("PDF extraction not implemented")
 }
 ```
 - [ ] **Step 5: Run tests to verify they pass**
 ```bash
 cd ingestion && go test ./internal/extract/... -v
 ```
 Expected: PASS — 3 tests passing.
 - [ ] **Step 6: Commit**
 ```bash
 cd ingestion && git add internal/extract/
 git commit -m "feat(extract): add Text() dispatcher with md/txt passthrough"
 ```
 ---
 ### Task 2: PDF extraction via pdftotext
 **Files:**
 - Modify: `ingestion/internal/extract/pdf.go`
 - Modify: `ingestion/internal/extract/extract_test.go`
 - [ ] **Step 1: Add PDF test (skip if pdftotext absent)**
 Append to `extract_test.go`:
 ```go
 func TestText_PDF(t *testing.T) {
 	if _, err := exec.LookPath("pdftotext"); err != nil {
 		t.Skip("pdftotext not available")
 	}
 	// Build a minimal single-page PDF inline and verify the round-trip:
 	// a PDF containing "Hello PDF" should yield that string.
 	dir := t.TempDir()
 	pdfPath := filepath.Join(dir, "test.pdf")
 	// Hand-written minimal PDF containing the text "Hello PDF". The xref offsets
 	// and stream /Length are not byte-exact; the test assumes pdftotext tolerates
 	// an inaccurate cross-reference table.
 	minimalPDF := "%PDF-1.4\n1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj\n" +
 		"2 0 obj<</Type/Pages/Kids[3 0 R]/Count 1>>endobj\n" +
 		"3 0 obj<</Type/Page/MediaBox[0 0 612 792]/Parent 2 0 R/Contents 4 0 R/Resources<</Font<</F1<</Type/Font/Subtype/Type1/BaseFont/Helvetica>>>>>>>>endobj\n" +
 		"4 0 obj<</Length 44>>\nstream\nBT /F1 12 Tf 100 700 Td (Hello PDF) Tj ET\nendstream\nendobj\n" +
 		"xref\n0 5\n0000000000 65535 f\n0000000009 00000 n\n0000000058 00000 n\n0000000115 00000 n\n0000000310 00000 n\n" +
 		"trailer<</Size 5/Root 1 0 R>>\nstartxref\n406\n%%EOF\n"
 	require.NoError(t, os.WriteFile(pdfPath, []byte(minimalPDF), 0o644))
 	got, err := Text(pdfPath)
 	require.NoError(t, err)
 	assert.Contains(t, got, "Hello PDF")
 }
 ```
 Add `"os/exec"` to imports in `extract_test.go`.
 - [ ] **Step 2: Run to verify it fails (or skips)**
 ```bash
 cd ingestion && go test ./internal/extract/... -v -run TestText_PDF
 ```
 Expected: SKIP (pdftotext not installed locally) or FAIL with "not implemented".
 - [ ] **Step 3: Implement pdf.go**
 ```go
 // ingestion/internal/extract/pdf.go
 package extract
 import (
 	"bytes"
 	"fmt"
 	"os/exec"
 	"strings"
 )
 // extractPDF runs pdftotext on path and returns the extracted text.
 // pdftotext must be installed (package: poppler-utils on Alpine/Debian, poppler on Homebrew).
 func extractPDF(path string) (string, error) {
 	cmd := exec.Command("pdftotext", "-q", path, "-")
 	var stdout, stderr bytes.Buffer
 	cmd.Stdout = &stdout
 	cmd.Stderr = &stderr
 	if err := cmd.Run(); err != nil {
 		errMsg := strings.TrimSpace(stderr.String())
 		if errMsg == "" {
 			errMsg = err.Error()
 		}
 		return "", fmt.Errorf("pdftotext: %s", errMsg)
 	}
 	return strings.TrimSpace(stdout.String()), nil
 }
 ```
 - [ ] **Step 4: Run all extract tests**
 ```bash
 cd ingestion && go test ./internal/extract/... -v
 ```
 Expected: PASS (PDF test skips if pdftotext absent, passes if present).
 - [ ] **Step 5: Commit**
 ```bash
 cd ingestion && git add internal/extract/pdf.go internal/extract/extract_test.go
 git commit -m "feat(extract): implement PDF extraction via pdftotext"
 ```
 ---
 ### Task 3: `Entry.Aliases` + inventory reads aliases from frontmatter
 **Files:**
 - Modify: `ingestion/internal/wiki/types.go`
 - Modify: `ingestion/internal/wiki/inventory.go`
 - Modify: `ingestion/internal/wiki/inventory_test.go`
 - [ ] **Step 1: Write failing test for alias loading**
 Add to `inventory_test.go`:
 ```go
 func TestLoadInventory_ReadsAliases(t *testing.T) {
 	dir := t.TempDir()
 	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki", "entities"), 0o755))
 	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki", "concepts"), 0o755))
 	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki", "sources"), 0o755))
 	require.NoError(t, os.WriteFile(
 		filepath.Join(dir, "wiki", "entities", "ryan-singer.md"),
 		[]byte("---\ntitle: Ryan Singer\naliases:\n  - Singer\n  - R. Singer\n---\n\n## Description\n\nDesigner.\n"),
 		0o644,
 	))
 	inv, err := LoadInventory(dir)
 	require.NoError(t, err)
 	require.Len(t, inv[PageTypeEntity], 1)
 	e := inv[PageTypeEntity][0]
 	assert.Equal(t, "Ryan Singer", e.Title)
 	assert.Equal(t, []string{"Singer", "R. Singer"}, e.Aliases)
 }
 ```
 - [ ] **Step 2: Run to verify it fails**
 ```bash
 cd ingestion && go test ./internal/wiki/... -v -run TestLoadInventory_ReadsAliases
 ```
 Expected: compile error — `Entry` has no `Aliases` field.
 - [ ] **Step 3: Add Aliases to Entry in types.go**
 ```go
 // Entry is a summary of an existing wiki page used to build the inventory.
 type Entry struct {
 	Slug    string
 	Title   string
 	Aliases []string
 	Type    PageType
 }
 ```
 - [ ] **Step 4: Replace readTitle with readFrontmatter in inventory.go**
 Replace the `readTitle` function and its call site:
 ```go
 // readFrontmatter extracts title and aliases from YAML frontmatter.
 // Falls back to slug for title and empty aliases on any error.
 func readFrontmatter(path, fallbackSlug string) (title string, aliases []string) {
 	title = fallbackSlug
 	f, err := os.Open(path)
 	if err != nil {
 		return
 	}
 	defer f.Close()
 	scanner := bufio.NewScanner(f)
 	inFM := false
 	inAliases := false
 	for scanner.Scan() {
 		line := scanner.Text()
 		if strings.TrimSpace(line) == "---" {
 			if !inFM {
 				inFM = true
 				continue
 			}
 			break // end of frontmatter
 		}
 		if !inFM {
 			continue
 		}
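 		// Collect alias list items (indented lines starting with "- ").
 		// Surrounding quotes are trimmed, matching how title is handled.
 		if inAliases {
 			trimmed := strings.TrimSpace(line)
 			if strings.HasPrefix(trimmed, "- ") {
 				alias := strings.Trim(strings.TrimSpace(strings.TrimPrefix(trimmed, "- ")), `"'`)
 				aliases = append(aliases, alias)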
 				continue
 			}
 			inAliases = false // end of alias block
 		}
 		key, val, ok := strings.Cut(line, ":")
 		if !ok {
 			continue
 		}
 		switch strings.TrimSpace(key) {
 		case "title":
 			title = strings.Trim(strings.TrimSpace(val), `"'`)
 		case "aliases":
 			inAliases = true
 		}
 	}
 	return
 }
 ```
 Update `LoadInventory` to use `readFrontmatter`:
 ```go
 title, aliases := readFrontmatter(path, slug)
 result[pt] = append(result[pt], Entry{Slug: slug, Title: title, Aliases: aliases, Type: pt})
 ```
 Remove the old `readTitle` function entirely.
 - [ ] **Step 5: Run all wiki tests**
 ```bash
 cd ingestion && go test ./internal/wiki/... -v
 ```
 Expected: PASS — all existing tests plus new alias test.
 - [ ] **Step 6: Commit**
 ```bash
 cd ingestion && git add internal/wiki/types.go internal/wiki/inventory.go internal/wiki/inventory_test.go
 git commit -m "feat(wiki): add Aliases to Entry and read from YAML frontmatter"
 ```
 ---
 ### Task 4: Fuzzy entity resolution
 **Files:**
 - Create: `ingestion/internal/pipeline/resolve.go`
 - Create: `ingestion/internal/pipeline/resolve_test.go`
 - [ ] **Step 1: Write failing tests**
 ```go
 // ingestion/internal/pipeline/resolve_test.go
 package pipeline
 import (
 	"testing"
 	"github.com/stretchr/testify/assert"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 func TestResolve_NoMatch(t *testing.T) {
 	proposed := []wiki.Page{
 		{Path: "wiki/entities/new-person.md", Content: "---\ntitle: New Person\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeEntity: {
 			{Slug: "ryan-singer", Title: "Ryan Singer", Aliases: []string{"Singer"}},
 		},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	assert.Equal(t, "wiki/entities/new-person.md", got[0].Path)
 }
 func TestResolve_TitleMatchRedirectsSlug(t *testing.T) {
 	// Proposed slug differs from existing but title matches.
 	proposed := []wiki.Page{
 		{Path: "wiki/entities/ryan-singer-the-designer.md", Content: "---\ntitle: Ryan Singer\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeEntity: {
 			{Slug: "ryan-singer", Title: "Ryan Singer", Aliases: nil},
 		},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	assert.Equal(t, "wiki/entities/ryan-singer.md", got[0].Path)
 }
 func TestResolve_AliasMatchRedirectsSlug(t *testing.T) {
 	// Proposed title matches an existing alias.
 	proposed := []wiki.Page{
 		{Path: "wiki/entities/singer.md", Content: "---\ntitle: Singer\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeEntity: {
 			{Slug: "ryan-singer", Title: "Ryan Singer", Aliases: []string{"Singer", "R. Singer"}},
 		},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	assert.Equal(t, "wiki/entities/ryan-singer.md", got[0].Path)
 }
 func TestResolve_NormalizationCaseAndArticles(t *testing.T) {
 	// "the shape up method" normalizes to "shape up method" which matches "Shape Up Method".
 	proposed := []wiki.Page{
 		{Path: "wiki/concepts/the-shape-up-method.md", Content: "---\ntitle: The Shape Up Method\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeConcept: {
 			{Slug: "shape-up-method", Title: "Shape Up Method", Aliases: nil},
 		},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	assert.Equal(t, "wiki/concepts/shape-up-method.md", got[0].Path)
 }
 func TestResolve_OnlyMatchesSamePageType(t *testing.T) {
 	// A concept slug must not redirect to an entity with the same normalized name.
 	proposed := []wiki.Page{
 		{Path: "wiki/concepts/ryan-singer.md", Content: "---\ntitle: Ryan Singer\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeEntity: {
 			{Slug: "ryan-singer", Title: "Ryan Singer", Aliases: nil},
 		},
 		wiki.PageTypeConcept: {},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	// Not redirected — different page type.
 	assert.Equal(t, "wiki/concepts/ryan-singer.md", got[0].Path)
 }
 func TestResolve_EmptyInventory(t *testing.T) {
 	proposed := []wiki.Page{
 		{Path: "wiki/entities/first.md", Content: "---\ntitle: First\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{}
 	got := Resolve(proposed, inventory)
 	assert.Equal(t, proposed, got)
 }
 ```
 - [ ] **Step 2: Run to verify it fails**
 ```bash
 cd ingestion && go test ./internal/pipeline/... -v -run TestResolve
 ```
 Expected: compile error — `Resolve` not defined.
 - [ ] **Step 3: Implement resolve.go**
 ```go
 // ingestion/internal/pipeline/resolve.go
 package pipeline
 import (
 	"path/filepath"
 	"strings"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 // Resolve remaps proposed pages to existing slugs when a fuzzy title match is found.
 // It only matches within the same page type (entities→entities, concepts→concepts).
 // Pages with no inventory match are returned unchanged.
 func Resolve(proposed []wiki.Page, inventory map[wiki.PageType][]wiki.Entry) []wiki.Page {
 	// Build normalized lookup: normalized_title → canonical slug, keyed by page type.
 	type key struct {
 		pt         wiki.PageType
 		normalized string
 	}
 	lookup := make(map[key]string) // key → canonical slug
 	for pt, entries := range inventory {
 		for _, e := range entries {
 			k := key{pt: pt, normalized: normalizeTitle(e.Title)}
 			lookup[k] = e.Slug
 			for _, alias := range e.Aliases {
 				ak := key{pt: pt, normalized: normalizeTitle(alias)}
 				if _, exists := lookup[ak]; !exists {
 					lookup[ak] = e.Slug
 				}
 			}
 		}
 	}
 	out := make([]wiki.Page, 0, len(proposed))
 	for _, page := range proposed {
 		pt := pageTypeFromPath(page.Path)
 		title := extractTitle(page.Content)
 		k := key{pt: pt, normalized: normalizeTitle(title)}
 		if canonicalSlug, ok := lookup[k]; ok {
 			// Redirect path to canonical slug.
 			dir := filepath.Dir(page.Path)
 			page.Path = dir + "/" + canonicalSlug + ".md"
 		}
 		out = append(out, page)
 	}
 	return out
 }
 // normalizeTitle lowercases, removes leading articles, collapses whitespace.
 // "The Shape Up Method" → "shape up method"
 func normalizeTitle(s string) string {
 	s = strings.ToLower(strings.TrimSpace(s))
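 	// Strip at most one leading article, so "The A Team" becomes "a team", not "team".
 	for _, article := range []string{"the ", "a ", "an "} {
 		if strings.HasPrefix(s, article) {
 			s = strings.TrimPrefix(s, article)
 			break
 		}
 	}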
 	// Collapse internal whitespace and replace hyphens.
 	s = strings.ReplaceAll(s, "-", " ")
 	return strings.Join(strings.Fields(s), " ")
 }
 // pageTypeFromPath extracts the wiki.PageType from a path like "wiki/entities/foo.md".
 func pageTypeFromPath(path string) wiki.PageType {
 	parts := strings.Split(filepath.ToSlash(path), "/")
 	if len(parts) >= 2 {
 		return wiki.PageType(parts[1])
 	}
 	return ""
 }
 // extractTitle reads the title field from YAML frontmatter in content.
 // Falls back to empty string if not found.
 func extractTitle(content string) string {
 	lines := strings.SplitN(content, "\n", 30)
 	inFM := false
 	for _, line := range lines {
 		if strings.TrimSpace(line) == "---" {
 			if !inFM {
 				inFM = true
 				continue
 			}
 			break
 		}
 		if inFM {
 			key, val, ok := strings.Cut(line, ":")
 			if ok && strings.TrimSpace(key) == "title" {
 				return strings.Trim(strings.TrimSpace(val), `"'`)
 			}
 		}
 	}
 	return ""
 }
 ```
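To make the matching rules concrete, here is a standalone sketch of the normalization that `Resolve` relies on: lowercase, drop a leading article, treat hyphens as spaces, collapse whitespace. The function name `normalize` is hypothetical. It strips at most one leading article, which may differ from the listing in this step for titles that begin with two articles in a row.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize is an illustrative stand-in for the plan's title normalization.
// Two titles are treated as the same page when their normalized forms match.
func normalize(s string) string {
	s = strings.ToLower(strings.TrimSpace(s))
	for _, article := range []string{"the ", "a ", "an "} {
		if strings.HasPrefix(s, article) {
			s = strings.TrimPrefix(s, article)
			break // strip at most one leading article
		}
	}
	// Hyphens become spaces, then runs of whitespace collapse to one space.
	s = strings.ReplaceAll(s, "-", " ")
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	titles := []string{"The Shape Up Method", "shape-up   method", "Ryan Singer", "GPT-4o"}
	for _, title := range titles {
		fmt.Printf("%q -> %q\n", title, normalize(title))
	}
}
```

Under these rules "The Shape Up Method" and "shape-up   method" both normalize to "shape up method", so a proposed page with either title would be redirected to the same existing slug.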
 - [ ] **Step 4: Run resolve tests**
 ```bash
 cd ingestion && go test ./internal/pipeline/... -v -run TestResolve
 ```
 Expected: PASS — 6 tests passing.
 - [ ] **Step 5: Commit**
 ```bash
 cd ingestion && git add internal/pipeline/resolve.go internal/pipeline/resolve_test.go
 git commit -m "feat(pipeline): add fuzzy entity resolution to prevent slug proliferation"
 ```
 ---
 ### Task 5: Wire Resolve into pipeline.Run
 **Files:**
 - Modify: `ingestion/internal/pipeline/pipeline.go`
 - [ ] **Step 1: Add Resolve call after ParsePages in Run()**
 In `pipeline.go`, locate the loop that builds `allPages`. Once the loop finishes, `allPages` holds the pages from every chunk. `Resolve` must run at that point, after all chunks are merged, against the inventory snapshot loaded at the start of the run.
 Replace the `merged := mergeAll(allPages)` line with:
 ```go
 resolved := Resolve(allPages, inventory)
 merged := mergeAll(resolved)
 ```
 The full relevant section of `Run` after this change:
 ```go
 for _, chunk := range chunks {
    userPrompt := BuildPrompt(schema, source, chunk, inventory)
    output, err := cfg.Complete(ctx, systemPrompt, userPrompt)
    if err != nil {
        return Result{}, fmt.Errorf("LLM call: %w", err)
    }
    pages, warnings := ParsePages(output)
    allPages = append(allPages, pages...)
    allWarnings = append(allWarnings, warnings...)
 }
 resolved := Resolve(allPages, inventory)
 merged := mergeAll(resolved)
 ```
 - [ ] **Step 2: Run all pipeline tests**
 ```bash
 cd ingestion && go test ./internal/pipeline/... -v
 ```
 Expected: PASS — all existing tests still pass (Resolve is a no-op when inventory is empty or no title matches).
 - [ ] **Step 3: Commit**
 ```bash
 cd ingestion && git add internal/pipeline/pipeline.go
 git commit -m "feat(pipeline): resolve proposed pages against inventory before writing"
 ```
 ---
 ### Task 6: Wire extract.Text into watcher and handler
 **Files:**
 - Modify: `ingestion/internal/watcher/watcher.go`
 - Modify: `ingestion/internal/api/handler.go`
 - [ ] **Step 1: Update watcher.go**
 In `processFile`, replace:
 ```go
 content, err := os.ReadFile(path)
 if err != nil {
    return fmt.Errorf("read file: %w", err)
 }
 _, runErr := pipeline.Run(ctx, cfg.Pipeline, cfg.BrainDir, string(content), source, false)
 ```
 With:
 ```go
 content, err := extract.Text(path)
 if err != nil {
    return fmt.Errorf("extract text: %w", err)
 }
 _, runErr := pipeline.Run(ctx, cfg.Pipeline, cfg.BrainDir, content, source, false)
 ```
 Add import: `"github.com/mathiasbq/hyperguild/ingestion/internal/extract"`
 Keep the `"os"` import: `os` is still used for `os.MkdirAll`, `os.WriteFile`, and `os.Stat`.
 - [ ] **Step 2: Update handler.go — single-file path**
 In `IngestPath`, the single-file branch reads:
 ```go
 content, readErr := os.ReadFile(req.Path)
 if readErr != nil {
    writeError(w, http.StatusInternalServerError, fmt.Sprintf("read file: %v", readErr))
    return
 }
 ```
 Replace with:
 ```go
 content, readErr := extract.Text(req.Path)
 if readErr != nil {
    writeError(w, http.StatusInternalServerError, fmt.Sprintf("extract text: %v", readErr))
    return
 }
 ```
 - [ ] **Step 3: Update handler.go — directory walk branch**
 In `IngestPath`, the directory walk reads:
 ```go
 content, readErr := os.ReadFile(path)
 if readErr != nil {
    allWarnings = append(allWarnings, fmt.Sprintf("read %s: %v", path, readErr))
    return nil
 }
 source := req.Source
 if source == "" {
    source = filepath.Base(path)
 }
 result, runErr := pipeline.Run(r.Context(), h.pipeline, h.brainDir, string(content), source, req.DryRun)
 ```
 Replace with:
 ```go
 content, readErr := extract.Text(path)
 if readErr != nil {
    allWarnings = append(allWarnings, fmt.Sprintf("extract %s: %v", path, readErr))
    return nil
 }
 source := req.Source
 if source == "" {
    source = filepath.Base(path)
 }
 result, runErr := pipeline.Run(r.Context(), h.pipeline, h.brainDir, content, source, req.DryRun)
 ```
 Add import: `"github.com/mathiasbq/hyperguild/ingestion/internal/extract"` to handler.go.
 - [ ] **Step 4: Build to verify no compile errors**
 ```bash
 cd ingestion && go build ./...
 ```
 Expected: success, no errors.
 - [ ] **Step 5: Run all tests**
 ```bash
 cd ingestion && go test ./...
 ```
 Expected: PASS — all tests pass (watcher tests use .md files, already covered by extract passthrough).
 - [ ] **Step 6: Commit**
 ```bash
 cd ingestion && git add internal/watcher/watcher.go internal/api/handler.go
 git commit -m "feat(watcher,api): use extract.Text() for file reading — fixes PDF ingestion"
 ```
 ---
 ### Task 7: Add poppler-utils to Dockerfile
 **Files:**
 - Modify: `ingestion/Dockerfile`
 - [ ] **Step 1: Add apk install for poppler-utils**
 In `ingestion/Dockerfile`, add `poppler-utils` to the Alpine runtime stage. The current final stage is:
 ```dockerfile
 FROM alpine:3.21
 COPY --from=builder /out/ingestion /usr/local/bin/ingestion
 RUN addgroup -S ingestion && adduser -S -G ingestion ingestion
 ```
 Replace with:
 ```dockerfile
 FROM alpine:3.21
 RUN apk add --no-cache poppler-utils
 COPY --from=builder /out/ingestion /usr/local/bin/ingestion
 RUN addgroup -S ingestion && adduser -S -G ingestion ingestion
 ```
 - [ ] **Step 2: Verify Dockerfile builds (local Docker)**
 ```bash
 cd ingestion && docker build -t ingestion:test .
 ```
 Expected: image builds successfully; `pdftotext` is available inside.
 - [ ] **Step 3: Verify pdftotext is accessible in the image**
 ```bash
 docker run --rm ingestion:test pdftotext -v
 ```
 Expected: prints version string like `pdftotext version 24.x.x`.
 - [ ] **Step 4: Commit**
 ```bash
 cd ingestion && git add Dockerfile
 git commit -m "chore(docker): add poppler-utils for PDF text extraction"
 ```
 ---
 ## Self-Review
 **Spec coverage check:**
 | Requirement | Task |
 |---|---|
 | PDF extraction via pdftotext | Tasks 2, 6, 7 |
 | .md and .txt passthrough (no regression) | Task 1 |
 | Unsupported extension → clear error | Task 1 |
 | Entry.Aliases loaded from frontmatter | Task 3 |
 | Fuzzy normalization (case, articles, hyphens) | Task 4 |
 | Alias matching | Task 4 |
 | Title matching across different proposed slugs | Task 4 |
 | Cross-page-type isolation (concept ≠ entity) | Task 4 |
 | Resolve wired into pipeline.Run | Task 5 |
 | extract.Text wired into watcher | Task 6 |
 | extract.Text wired into handler (single + dir) | Task 6 |
 | Dockerfile includes poppler-utils | Task 7 |
 **Placeholder scan:** None found.
 **Type consistency:**
 - `Resolve([]wiki.Page, map[wiki.PageType][]wiki.Entry) []wiki.Page` — consistent across Tasks 4 and 5.
 - `extract.Text(path string) (string, error)` — consistent across Tasks 1, 2, and 6.
 - `Entry.Aliases []string` — added in Task 3, used by Resolve in Task 4 (reads `e.Aliases`).
 - `readFrontmatter` replaces `readTitle` entirely in Task 3 — no lingering `readTitle` calls.
--- a/docs/superpowers/plans/2026-04-23-level3-slug-authority.md
+++ b/docs/superpowers/plans/2026-04-23-level3-slug-authority.md
--- a/docs/superpowers/plans/2026-04-23-source-backrefs.md
+++ b/docs/superpowers/plans/2026-04-23-source-backrefs.md
@@ -0,0 +1,433 @@
 # Source Back-References Implementation Plan
 > **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
 **Goal:** After the LLM produces wiki pages for an ingestion, automatically inject a `## Sources` back-reference on every concept and entity page that the source page links to.
 **Architecture:** A new `injectSourceRefs` post-processing step is inserted between `Resolve` and `mergeAll` in `pipeline.Run`. It finds the source page in the proposed batch, extracts all `[[slug|...]]` wikilinks, then calls `wiki.Merge` with a minimal patch page to add the back-reference. `wiki.Merge` already treats `## Sources` as a bullet section with deduplication — no custom section parsing is needed. For concepts/entities that exist on disk but weren't proposed in the current batch (the common case on re-ingestion), the function loads them from disk and adds them to the pages list so they are updated.
 **Tech Stack:** Go stdlib (`regexp`, `os`, `path/filepath`, `strings`), existing `wiki.Merge` and `wiki.Page` types.
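The deduplication property the architecture relies on can be illustrated in isolation. The sketch below is not `wiki.Merge`; `appendSourceBullet` is a hypothetical, simplified stand-in that assumes `## Sources` is the last section of a page when it exists. It only shows the intended effect: the first injection creates the section, a repeated injection is a no-op, and a second source is appended as a new bullet.

```go
package main

import (
	"fmt"
	"strings"
)

// appendSourceBullet is a hypothetical stand-in for the effect addSourceRef
// relies on. It is NOT wiki.Merge. Assumption: when a "## Sources" section
// exists, it is the last section of the page.
func appendSourceBullet(content, ref string) string {
	body := strings.TrimRight(content, "\n")
	inSources := false
	for _, line := range strings.Split(body, "\n") {
		switch {
		case strings.TrimSpace(line) == "## Sources":
			inSources = true
		case inSources && strings.TrimSpace(line) == ref:
			// The bullet is already listed under Sources: nothing to do.
			return content
		}
	}
	if !inSources {
		// No Sources section yet: create it at the end of the page.
		return body + "\n\n## Sources\n\n" + ref + "\n"
	}
	// Sources exists and is last: append one more bullet.
	return body + "\n" + ref + "\n"
}

func main() {
	page := "---\ntitle: DDD\n---\n\n## Definition\n\nA thing.\n"
	ref := "- [[my-article|My Article]]"
	once := appendSourceBullet(page, ref)
	twice := appendSourceBullet(once, ref) // re-ingestion of the same source
	fmt.Print(twice)
	fmt.Println("idempotent:", once == twice)
}
```

Because the second call returns its input unchanged, re-ingesting the same source leaves the page byte-for-byte identical, which is the behaviour `TestInjectSourceRefs_DeduplicatesOnReingestion` checks against the real implementation.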
 ---
 ## File Structure
 **New files:**
 - `ingestion/internal/pipeline/refs.go` — `injectSourceRefs`, `addSourceRef`, `extractWikilinks`, `findSourcePage`, `findInInventory`
 - `ingestion/internal/pipeline/refs_test.go` — table-driven tests
 **Modified files:**
 - `ingestion/internal/pipeline/pipeline.go` — insert `injectSourceRefs` call between `Resolve` and `mergeAll`
 ---
 ### Task 1: `refs.go` — source back-reference injection
 **Files:**
 - Create: `ingestion/internal/pipeline/refs_test.go`
 - Create: `ingestion/internal/pipeline/refs.go`
 - [ ] **Step 1: Write the failing tests**
 ```go
 // ingestion/internal/pipeline/refs_test.go
 package pipeline
 import (
 	"os"
 	"path/filepath"
 	"testing"
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 // makeInventory builds a minimal inventory for test use.
 func makeInventory(concepts, entities []string) map[wiki.PageType][]wiki.Entry {
 	inv := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeConcept: {},
 		wiki.PageTypeEntity:  {},
 		wiki.PageTypeSource:  {},
 	}
 	for _, slug := range concepts {
 		inv[wiki.PageTypeConcept] = append(inv[wiki.PageTypeConcept], wiki.Entry{Slug: slug, Title: slug})
 	}
 	for _, slug := range entities {
 		inv[wiki.PageTypeEntity] = append(inv[wiki.PageTypeEntity], wiki.Entry{Slug: slug, Title: slug})
 	}
 	return inv
 }
 func TestInjectSourceRefs_NoSourcePage(t *testing.T) {
 	pages := []wiki.Page{
 		{Path: "wiki/concepts/foo.md", Content: "---\ntitle: Foo\n---\n\n## Definition\n\nFoo.\n"},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	assert.Equal(t, pages, got)
 }
 func TestInjectSourceRefs_InjectsIntoProposedConcept(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/my-article.md",
 			Content: "---\ntitle: My Article\n---\n\n## Summary\n\nSee [[domain-driven-design|Domain Driven Design]].\n",
 		},
 		{
 			Path:    "wiki/concepts/domain-driven-design.md",
 			Content: "---\ntitle: Domain Driven Design\n---\n\n## Definition\n\nA methodology.\n",
 		},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	require.Len(t, got, 2)
 	assert.Contains(t, got[1].Content, "## Sources")
 	assert.Contains(t, got[1].Content, "[[my-article|My Article]]")
 }
 func TestInjectSourceRefs_LoadsConceptFromDisk(t *testing.T) {
 	brainDir := t.TempDir()
 	conceptDir := filepath.Join(brainDir, "wiki", "concepts")
 	require.NoError(t, os.MkdirAll(conceptDir, 0o755))
 	require.NoError(t, os.WriteFile(
 		filepath.Join(conceptDir, "shape-up.md"),
 		[]byte("---\ntitle: Shape Up\n---\n\n## Definition\n\nA methodology.\n"),
 		0o644,
 	))
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/my-article.md",
 			Content: "---\ntitle: My Article\n---\n\n## Summary\n\nSee [[shape-up|Shape Up]].\n",
 		},
 	}
 	inv := makeInventory([]string{"shape-up"}, nil)
 	got := injectSourceRefs(pages, inv, brainDir)
 	// Should have loaded shape-up.md from disk and added it with source ref.
 	require.Len(t, got, 2)
 	var conceptPage wiki.Page
 	for _, p := range got {
 		if p.Path == "wiki/concepts/shape-up.md" {
 			conceptPage = p
 		}
 	}
 	assert.Contains(t, conceptPage.Content, "## Sources")
 	assert.Contains(t, conceptPage.Content, "[[my-article|My Article]]")
 	// Original content preserved.
 	assert.Contains(t, conceptPage.Content, "## Definition")
 }
 func TestInjectSourceRefs_NoSelfReference(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/my-article.md",
 			Content: "---\ntitle: My Article\n---\n\n## Summary\n\nSelf-link [[my-article|My Article]].\n",
 		},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	// Only one page — source should not reference itself.
 	assert.Len(t, got, 1)
 }
 func TestInjectSourceRefs_DeduplicatesOnReingestion(t *testing.T) {
 	// Concept already has source ref from a prior ingestion.
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/my-article.md",
 			Content: "---\ntitle: My Article\n---\n\n## Summary\n\nSee [[ddd|DDD]].\n",
 		},
 		{
 			Path:    "wiki/concepts/ddd.md",
 			Content: "---\ntitle: DDD\n---\n\n## Definition\n\nA thing.\n\n## Sources\n\n- [[my-article|My Article]]\n",
 		},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	require.Len(t, got, 2)
 	// The source ref must appear exactly once.
 	count := 0
 	for _, line := range splitLines(got[1].Content) {
 		if line == "- [[my-article|My Article]]" {
 			count++
 		}
 	}
 	assert.Equal(t, 1, count, "source ref should appear exactly once")
 }
 func TestInjectSourceRefs_InjectsIntoEntity(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/book.md",
 			Content: "---\ntitle: Book\n---\n\n## Summary\n\nBy [[ryan-singer|Ryan Singer]].\n",
 		},
 		{
 			Path:    "wiki/entities/ryan-singer.md",
 			Content: "---\ntitle: Ryan Singer\n---\n\n## Description\n\nA designer.\n",
 		},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	require.Len(t, got, 2)
 	var entity wiki.Page
 	for _, p := range got {
 		if p.Path == "wiki/entities/ryan-singer.md" {
 			entity = p
 		}
 	}
 	assert.Contains(t, entity.Content, "[[book|Book]]")
 }
 func TestExtractWikilinks(t *testing.T) {
 	content := "See [[foo|Foo]] and [[bar|Bar]] and [[foo|Foo again]]."
 	got := extractWikilinks(content)
 	assert.True(t, got["foo"])
 	assert.True(t, got["bar"])
 	assert.Len(t, got, 2, "duplicate slugs should be deduplicated")
 }
 // splitLines is a test helper that returns the non-empty lines of s.
 func splitLines(s string) []string {
 	var out []string
 	for _, l := range splitNewlines(s) {
 		if l != "" {
 			out = append(out, l)
 		}
 	}
 	return out
 }
 func splitNewlines(s string) []string {
 	var lines []string
 	start := 0
 	for i, c := range s {
 		if c == '\n' {
 			lines = append(lines, s[start:i])
 			start = i + 1
 		}
 	}
 	lines = append(lines, s[start:])
 	return lines
 }
 ```
 - [ ] **Step 2: Run to verify they fail**
 ```bash
 cd /Users/mathias/Documents/local-dev/AI/hyperguild/.worktrees/feat-source-backrefs/ingestion && go test ./internal/pipeline/... -run "TestInjectSourceRefs|TestExtractWikilinks" -v
 ```
 Expected: compile error — `injectSourceRefs` and `extractWikilinks` not defined.
 - [ ] **Step 3: Implement refs.go**
 ```go
 // ingestion/internal/pipeline/refs.go
 package pipeline
 import (
 	"os"
 	"path/filepath"
 	"regexp"
 	"strings"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 var wikilinkRE = regexp.MustCompile(`\[\[([^|\]]+)\|`)
 // injectSourceRefs finds the source page in the proposed batch, extracts its wikilinks,
 // and injects a back-reference into every linked concept or entity page.
 // Pages that exist on disk but are not in the current batch are loaded and appended
 // so they will be updated on write.
 func injectSourceRefs(pages []wiki.Page, inventory map[wiki.PageType][]wiki.Entry, brainDir string) []wiki.Page {
 	sourceSlug, sourceTitle, found := findSourcePage(pages)
 	if !found {
 		return pages
 	}
 	// Locate source page content for wikilink extraction.
 	var sourceContent string
 	for _, p := range pages {
 		if strings.HasPrefix(p.Path, "wiki/sources/") &&
 			strings.TrimSuffix(filepath.Base(p.Path), ".md") == sourceSlug {
 			sourceContent = p.Content
 			break
 		}
 	}
 	linkedSlugs := extractWikilinks(sourceContent)
 	sourceRef := "- [[" + sourceSlug + "|" + sourceTitle + "]]"
 	// Build slug → index map for proposed pages (excluding wiki/sources/).
 	bySlug := make(map[string]int, len(pages))
 	for i, p := range pages {
 		if !strings.HasPrefix(p.Path, "wiki/sources/") {
 			bySlug[strings.TrimSuffix(filepath.Base(p.Path), ".md")] = i
 		}
 	}
 	for slug := range linkedSlugs {
 		if slug == sourceSlug {
 			continue // no self-reference
 		}
 		if idx, ok := bySlug[slug]; ok {
 			// Concept/entity is in the proposed batch — inject inline.
 			pages[idx] = addSourceRef(pages[idx], sourceRef)
 			continue
 		}
 		// Not in proposed batch — look for it in the inventory (exists on disk).
 		pt, ok := findInInventory(slug, inventory)
 		if !ok {
 			continue
 		}
 		diskPath := filepath.Join(brainDir, "wiki", string(pt), slug+".md")
 		b, err := os.ReadFile(diskPath)
 		if err != nil {
 			continue // page not found on disk; skip
 		}
 		page := wiki.Page{
 			Path:    "wiki/" + string(pt) + "/" + slug + ".md",
 			Content: string(b),
 		}
 		pages = append(pages, addSourceRef(page, sourceRef))
 	}
 	return pages
 }
 // addSourceRef injects sourceRef into the ## Sources bullet section of page.
 // Uses wiki.Merge so that existing Sources entries are deduplicated and all
 // other sections are preserved unchanged.
 func addSourceRef(page wiki.Page, sourceRef string) wiki.Page {
 	patch := wiki.Page{
 		Path:    page.Path,
 		Content: "\n## Sources\n\n" + sourceRef + "\n",
 	}
 	return wiki.Merge(page, patch)
 }
 // extractWikilinks returns the set of slugs referenced as [[slug|...]] in content.
 func extractWikilinks(content string) map[string]bool {
 	slugs := make(map[string]bool)
 	for _, m := range wikilinkRE.FindAllStringSubmatch(content, -1) {
 		slugs[m[1]] = true
 	}
 	return slugs
 }
 // findSourcePage returns the slug and title of the first wiki/sources/ page in pages.
 func findSourcePage(pages []wiki.Page) (slug, title string, found bool) {
 	for _, p := range pages {
 		if strings.HasPrefix(p.Path, "wiki/sources/") {
 			slug = strings.TrimSuffix(filepath.Base(p.Path), ".md")
 			title = extractTitle(p.Content)
 			if title == "" {
 				title = slug
 			}
 			return slug, title, true
 		}
 	}
 	return "", "", false
 }
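 // findInInventory returns the PageType for a slug if it appears in the inventory.
 // If the same slug exists under several page types, map iteration order decides which one is returned.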
 func findInInventory(slug string, inventory map[wiki.PageType][]wiki.Entry) (wiki.PageType, bool) {
 	for pt, entries := range inventory {
 		for _, e := range entries {
 			if e.Slug == slug {
 				return pt, true
 			}
 		}
 	}
 	return "", false
 }
 ```
 - [ ] **Step 4: Run all pipeline tests**
 ```bash
 cd /Users/mathias/Documents/local-dev/AI/hyperguild/.worktrees/feat-source-backrefs/ingestion && go test ./internal/pipeline/... -v
 ```
 Expected: all existing tests PASS + 7 new refs tests PASS.
 - [ ] **Step 5: Commit**
 ```bash
 cd /Users/mathias/Documents/local-dev/AI/hyperguild/.worktrees/feat-source-backrefs && git add ingestion/internal/pipeline/refs.go ingestion/internal/pipeline/refs_test.go && git commit -m "feat(pipeline): inject source back-references into concept and entity pages"
 ```
 ---
 ### Task 2: Wire injectSourceRefs into pipeline.Run
 **Files:**
 - Modify: `ingestion/internal/pipeline/pipeline.go`
 - [ ] **Step 1: Insert the call**
 In `pipeline.go`, locate:
 ```go
 	resolved := Resolve(allPages, inventory)
 	merged := mergeAll(resolved)
 ```
 Replace with:
 ```go
 	resolved := Resolve(allPages, inventory)
 	withRefs := injectSourceRefs(resolved, inventory, brainDir)
 	merged := mergeAll(withRefs)
 ```
 No import changes needed — same package.
 - [ ] **Step 2: Run all pipeline tests**
 ```bash
 cd /Users/mathias/Documents/local-dev/AI/hyperguild/.worktrees/feat-source-backrefs/ingestion && go test ./internal/pipeline/... -v
 ```
 Expected: all tests PASS. The existing `TestRun_WritesPages` and `TestRun_DryRunDoesNotWrite` use LLM mocks that return source pages with no wikilinks to concepts — `injectSourceRefs` is a no-op for them.
 - [ ] **Step 3: Run full test suite + lint**
 ```bash
 cd /Users/mathias/Documents/local-dev/AI/hyperguild/.worktrees/feat-source-backrefs/ingestion && go test ./... && golangci-lint run ./...
 ```
 Expected: all packages PASS, 0 lint issues.
 - [ ] **Step 4: Commit**
 ```bash
 cd /Users/mathias/Documents/local-dev/AI/hyperguild/.worktrees/feat-source-backrefs && git add ingestion/internal/pipeline/pipeline.go && git commit -m "feat(pipeline): wire source back-reference injection into Run"
 ```
 ---
 ## Self-Review
 **Spec coverage:**
 | Requirement | Task |
 |---|---|
 | Concepts get `## Sources` back-link to ingested source | Task 1 |
 | Entities get `## Sources` back-link | Task 1 (TestInjectSourceRefs_InjectsIntoEntity) |
 | Existing pages on disk get updated with new source | Task 1 (TestInjectSourceRefs_LoadsConceptFromDisk) |
 | Re-ingestion of same source does not duplicate the ref | Task 1 (TestInjectSourceRefs_DeduplicatesOnReingestion) |
 | Source page does not reference itself | Task 1 (TestInjectSourceRefs_NoSelfReference) |
 | No-op when batch has no source page | Task 1 (TestInjectSourceRefs_NoSourcePage) |
 | Wired into Run between Resolve and mergeAll | Task 2 |
 | Full test suite and lint pass | Task 2 Step 3 |
 **Placeholder scan:** None.
 **Type consistency:** `injectSourceRefs([]wiki.Page, map[wiki.PageType][]wiki.Entry, string) []wiki.Page` — used identically in refs.go (definition) and pipeline.go (call site).
--- a/docs/superpowers/specs/2026-04-23-level3-slug-authority-design.md
+++ b/docs/superpowers/specs/2026-04-23-level3-slug-authority-design.md
@@ -0,0 +1,148 @@
 # Level 3: Strip Slug Authority from LLM — Design Spec
 ## Problem
 The ingestion pipeline currently asks the LLM to produce full wiki pages including the file path (e.g. `wiki/sources/finbert-huggingface.md`). This causes two classes of bug:
 1. **Slug proliferation** — the LLM invents different slugs for the same concept across chunks or runs, producing duplicate pages that diverge in content.
 2. **Unstable paths** — the LLM may shorten, expand, or vary titles, making deduplication via `Resolve` unreliable because the slug mismatch is upstream of the normalizer.
 ## Solution
 Strip slug authority from the LLM entirely. The LLM returns a minimal structured object. The pipeline computes all slugs deterministically from titles using `wiki.Slug(title)`.
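This spec does not show the body of `wiki.Slug`. As a minimal sketch of what "deterministic" means here, the hypothetical `slugify` below assumes the rule previously documented in `brain/schema.md`: lowercase, spaces to hyphens, strip everything except alphanumerics and hyphens. The real function may differ in edge cases such as non-ASCII letters.

```go
package main

import (
	"fmt"
	"strings"
)

// slugify is a hypothetical stand-in for wiki.Slug (assumption: ASCII-only
// slugs). It lowercases the title, splits on whitespace, drops every rune
// that is not a-z, 0-9 or '-', and joins the surviving words with hyphens.
func slugify(title string) string {
	var parts []string
	for _, field := range strings.Fields(strings.ToLower(title)) {
		var b strings.Builder
		for _, r := range field {
			if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-' {
				b.WriteRune(r)
			}
		}
		// Words made only of stripped characters (e.g. "&") vanish entirely.
		if b.Len() > 0 {
			parts = append(parts, b.String())
		}
	}
	return strings.Join(parts, "-")
}

func main() {
	fmt.Println(slugify("Domain Driven Design"))
	fmt.Println(slugify("FinBERT (Hugging Face)"))
}
```

The same title always yields the same slug, so two chunks that both emit "Domain Driven Design" land on one path. Title variations such as "DDD" still produce different slugs; normalizing those remains the job of `Resolve`.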
 ---
 ## LLM JSON Contract
 ### Output format (per page)
 ```json
 {
  "title": "FinBERT",
  "type": "concept",
  "domain": "ai-llm",
  "content": "## Definition\n\nA BERT-based model fine-tuned for financial sentiment...\n\n## Related\n\n- [[Sentiment Analysis]]\n- [[Hugging Face]]\n"
 }
 ```
 **Fields:**
 | Field | Required | Values |
 |-------|----------|--------|
 | `title` | yes | Human-readable title, e.g. "FinBERT" |
 | `type` | yes | `"source"` \| `"concept"` \| `"entity"` |
 | `subtype` | for entity/source | entity: `person\|company\|tool\|model\|framework\|technology`; source: `article\|pdf\|book\|video\|note\|project` |
 | `domain` | no | tag string, e.g. `ai-llm`, `finance` |
 | `content` | yes | Markdown body sections only — no frontmatter, no path |
 **Wikilinks in content:** `[[Display Name]]` only. No slug. The pipeline canonicalizes to `[[slug|Display Name]]` in a post-processing step.
 **The LLM never writes slugs, paths, or frontmatter.**
 ---
 ## Pipeline Changes
 ### New type: `RawPage`
 ```go
 type RawPage struct {
    Title   string
    Type    string // "source" | "concept" | "entity"
    Subtype string
    Domain  string
    Content string
 }
 ```
 ### New step order
 ```
 ParseRawPages → BuildPages → Resolve → CanonicalizeLinks → injectSourceRefs → mergeAll → write
 ```
 ### Step descriptions
 **`ParseRawPages(output string) ([]RawPage, []string)`**
 Replaces `ParsePages`. Deserializes JSON objects with the new schema. Same truncation-recovery logic as today. Returns `(pages, warnings)`.
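A minimal sketch of the deserialization half of this step, under stated assumptions: the JSON tags on `RawPage`, the validation rules (skip pages with a missing title or unknown type), and the warning wording are illustrative, and the truncation-recovery logic is deliberately left out. `parseRawPages` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// RawPage mirrors the spec's type; the json tags are an assumption.
type RawPage struct {
	Title   string `json:"title"`
	Type    string `json:"type"`
	Subtype string `json:"subtype"`
	Domain  string `json:"domain"`
	Content string `json:"content"`
}

// parseRawPages decodes a JSON array of pages. Elements with a missing title
// or an unknown type are skipped and reported as warnings (assumed policy).
// Truncation recovery is intentionally not shown.
func parseRawPages(output string) ([]RawPage, []string) {
	var decoded []RawPage
	if err := json.Unmarshal([]byte(output), &decoded); err != nil {
		return nil, []string{fmt.Sprintf("parse LLM output: %v", err)}
	}
	var pages []RawPage
	var warnings []string
	for i, p := range decoded {
		switch {
		case strings.TrimSpace(p.Title) == "":
			warnings = append(warnings, fmt.Sprintf("page %d: missing title, skipped", i))
		case p.Type != "source" && p.Type != "concept" && p.Type != "entity":
			warnings = append(warnings, fmt.Sprintf("page %d (%q): unknown type %q, skipped", i, p.Title, p.Type))
		default:
			pages = append(pages, p)
		}
	}
	return pages, warnings
}

func main() {
	pages, warnings := parseRawPages(`[{"title":"FinBERT","type":"entity","subtype":"model","content":"## Description\n"},{"title":"","type":"concept","content":"x"}]`)
	fmt.Println(len(pages), "page(s),", len(warnings), "warning(s)")
}
```

Rejecting untitled pages this early is one option; the same check could equally live in `BuildPages`, where an empty title would otherwise produce an empty slug and a broken path.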
 **`BuildPages(rawPages []RawPage, sourceSlug, date string) []wiki.Page`**
 Converts `RawPage → wiki.Page`:
 - Computes slug: `wiki.Slug(page.Title)`
 - Computes path: `wiki/<type-dir>/<slug>.md`, where `<type-dir>` is the plural directory for the type (`sources`, `concepts`, `entities`)
 - Assembles frontmatter:
  ```
  ---
  title: <Title>
  type: <type>
  subtype: <subtype>        # omitted if empty
  domain: <domain>          # omitted if empty
  created: <date>
  source: <sourceSlug>      # omitted for the source page itself
  ---
  ```
 - Concatenates frontmatter + content
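The conversion above can be sketched as follows. This is an illustration, not the contents of `build.go`: `buildPage`, the local `RawPage`/`Page` types, and the `typeDirs` map are hypothetical, the slug is passed in rather than computed, and the plural directory names are inferred from paths such as `wiki/sources/...` elsewhere in this change.

```go
package main

import (
	"fmt"
	"strings"
)

// RawPage and Page are simplified stand-ins for the pipeline's RawPage and
// wiki.Page types.
type RawPage struct{ Title, Type, Subtype, Domain, Content string }
type Page struct{ Path, Content string }

// typeDirs maps a page type to its directory (assumed plural names).
var typeDirs = map[string]string{
	"source":  "sources",
	"concept": "concepts",
	"entity":  "entities",
}

// buildPage assembles frontmatter and path for one page. It assumes raw.Type
// was validated earlier and that the title needs no YAML quoting.
func buildPage(raw RawPage, slug, sourceSlug, date string) Page {
	var b strings.Builder
	b.WriteString("---\n")
	fmt.Fprintf(&b, "title: %s\n", raw.Title)
	fmt.Fprintf(&b, "type: %s\n", raw.Type)
	if raw.Subtype != "" {
		fmt.Fprintf(&b, "subtype: %s\n", raw.Subtype)
	}
	if raw.Domain != "" {
		fmt.Fprintf(&b, "domain: %s\n", raw.Domain)
	}
	fmt.Fprintf(&b, "created: %s\n", date)
	if raw.Type != "source" {
		// The source page itself carries no source: field.
		fmt.Fprintf(&b, "source: %s\n", sourceSlug)
	}
	b.WriteString("---\n\n")
	b.WriteString(raw.Content)
	return Page{
		Path:    "wiki/" + typeDirs[raw.Type] + "/" + slug + ".md",
		Content: b.String(),
	}
}

func main() {
	p := buildPage(
		RawPage{Title: "FinBERT", Type: "entity", Subtype: "model", Content: "## Description\n"},
		"finbert", "my-article", "2026-04-23",
	)
	fmt.Println(p.Path)
	fmt.Print(p.Content)
}
```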
 **`Resolve(pages []wiki.Page, inventory) []wiki.Page`**
 Unchanged. Normalizes near-duplicate titles to existing inventory slugs.
 **`CanonicalizeLinks(pages []wiki.Page, inventory) ([]wiki.Page, []string)`**
 New. Builds a title→slug map from inventory + current batch. Replaces `[[Display Name]]` with `[[slug|Display Name]]` in each page's content. Titles with no known slug are left as-is and returned as warnings.
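A sketch of the per-page rewrite, assuming an exact, case-sensitive title lookup. `canonicalize` and `bareLinkRE` are hypothetical names, and the title→slug map is taken as given rather than built from the inventory and batch.

```go
package main

import (
	"fmt"
	"regexp"
)

// bareLinkRE matches [[Display Name]] only. Links that already contain a
// pipe, such as [[slug|Display Name]], do not match because '|' is excluded
// from the captured name.
var bareLinkRE = regexp.MustCompile(`\[\[([^\[\]|]+)\]\]`)

// canonicalize rewrites bare wikilinks to [[slug|Display Name]]. Titles that
// are missing from titleToSlug are left untouched and reported as warnings.
func canonicalize(content string, titleToSlug map[string]string) (string, []string) {
	var warnings []string
	out := bareLinkRE.ReplaceAllStringFunc(content, func(link string) string {
		name := link[2 : len(link)-2] // strip the surrounding [[ and ]]
		slug, ok := titleToSlug[name]
		if !ok {
			warnings = append(warnings, fmt.Sprintf("no slug for wikilink %q", name))
			return link
		}
		return "[[" + slug + "|" + name + "]]"
	})
	return out, warnings
}

func main() {
	slugs := map[string]string{"Sentiment Analysis": "sentiment-analysis"}
	out, warnings := canonicalize("See [[Sentiment Analysis]] and [[Unknown]].", slugs)
	fmt.Println(out)
	fmt.Println(warnings)
}
```

Since already-canonical links are skipped, running the step twice over the same content produces the same text, which keeps re-ingestion from nesting slugs.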
 **`injectSourceRefs`**
 Unchanged. Reads `[[slug|...]]` links (post-canonicalization) to inject back-references.
 **`mergeAll → write`**
 Unchanged.
 ### `pipeline.Run` signature (unchanged)
 ```go
 func Run(ctx context.Context, cfg Config, brainDir, content, source string, dryRun bool) (Result, error)
 ```
 `source` is already passed (it's the display name / filename). A new internal `sourceSlug` is derived from it via `wiki.Slug(source)` before calling `BuildPages`. No API change needed.
 ---
 ## Files Changed
 | File | Change |
 |------|--------|
 | `ingestion/internal/pipeline/parse.go` | Replace `ParsePages` with `ParseRawPages` + `RawPage` type |
 | `ingestion/internal/pipeline/build.go` | New file: `BuildPages` |
 | `ingestion/internal/pipeline/links.go` | New file: `CanonicalizeLinks` |
 | `ingestion/internal/pipeline/pipeline.go` | Wire new steps; derive `sourceSlug` from `source` |
 | `ingestion/internal/pipeline/prompt.go` | New system prompt + `BuildPrompt` for new JSON format |
 | `brain/schema.md` | Update wikilink format and JSON schema docs |
 `resolve.go`, `refs.go`, `backfill.go`, `merge.go` — no changes.
 ---
 ## Wikilink Format
 - **LLM output**: `[[Display Name]]`
 - **Stored on disk**: `[[slug|Display Name]]`
 - **`CanonicalizeLinks`** converts the LLM form into the stored form, using a title→slug map built from the inventory plus the current batch
 This matches Obsidian's display-alias syntax that the existing codebase already uses.
 ---
 ## Testing Strategy
 - `ParseRawPages`: table-driven, cover valid JSON, truncated output, unknown type, missing title
 - `BuildPages`: table-driven, cover slug computation, frontmatter assembly, source page (no `source:` field), entity with subtype
 - `CanonicalizeLinks`: cover known title → replaced, unknown title → left as-is + warning, multiple links in one page
 - Integration test: full `Run` call with mock LLM returning new JSON format, assert no slug duplication across two chunks of the same source
 ---
 ## Out of Scope
 - Re-ingesting existing pages (user will trigger manually after deploy)
 - Changing the `BackfillRefs` endpoint (already correct, slug-based)
 - Changing the `Resolve` fuzzy-match algorithm
--- a/ingestion/Dockerfile
+++ b/ingestion/Dockerfile
@@ -15,6 +15,8 @@ RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
 FROM alpine:3.21
 RUN apk add --no-cache poppler-utils
 COPY --from=builder /out/ingestion /usr/local/bin/ingestion
 RUN addgroup -S ingestion && adduser -S -G ingestion ingestion
--- a/ingestion/cmd/server/main.go
+++ b/ingestion/cmd/server/main.go
@@ -68,6 +68,7 @@ func main() {
 	mux.HandleFunc("POST /write", h.Write)
 	mux.HandleFunc("POST /ingest", h.Ingest)
 	mux.HandleFunc("POST /ingest-path", h.IngestPath)
 	mux.HandleFunc("POST /backfill-refs", h.BackfillRefs)
 	addr := ":" + port
 	watchIntervalLog := "disabled"
--- a/ingestion/internal/api/handler.go
+++ b/ingestion/internal/api/handler.go
@@ -11,6 +11,7 @@ import (
 	"strings"
 	"time"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/extract"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/search"
 )
@@ -214,16 +215,16 @@ func (h *Handler) IngestPath(w http.ResponseWriter, r *http.Request) {
 			if !supportedExtensions[ext] {
 				return nil
 			}
-			content, readErr := os.ReadFile(path)
+			content, readErr := extract.Text(path)
 			if readErr != nil {
-				allWarnings = append(allWarnings, fmt.Sprintf("read %s: %v", path, readErr))
+				allWarnings = append(allWarnings, fmt.Sprintf("extract %s: %v", path, readErr))
 				return nil
 			}
 			source := req.Source
 			if source == "" {
 				source = filepath.Base(path)
 			}
-			result, runErr := pipeline.Run(r.Context(), h.pipeline, h.brainDir, string(content), source, req.DryRun)
+			result, runErr := pipeline.Run(r.Context(), h.pipeline, h.brainDir, content, source, req.DryRun)
 			if runErr != nil {
 				allWarnings = append(allWarnings, fmt.Sprintf("ingest %s: %v", path, runErr))
 				return nil
@@ -243,16 +244,16 @@ func (h *Handler) IngestPath(w http.ResponseWriter, r *http.Request) {
 			writeError(w, http.StatusBadRequest, fmt.Sprintf("unsupported file extension: %s", ext))
 			return
 		}
-		content, readErr := os.ReadFile(req.Path)
+		content, readErr := extract.Text(req.Path)
 		if readErr != nil {
-			writeError(w, http.StatusInternalServerError, fmt.Sprintf("read file: %v", readErr))
+			writeError(w, http.StatusInternalServerError, fmt.Sprintf("extract text: %v", readErr))
 			return
 		}
 		source := req.Source
 		if source == "" {
 			source = filepath.Base(req.Path)
 		}
-		result, runErr := pipeline.Run(r.Context(), h.pipeline, h.brainDir, string(content), source, req.DryRun)
+		result, runErr := pipeline.Run(r.Context(), h.pipeline, h.brainDir, content, source, req.DryRun)
 		if runErr != nil {
 			h.logger.Error("ingest-path failed", "path", req.Path, "err", runErr)
 			writeError(w, http.StatusInternalServerError, "ingest error")
@@ -271,6 +272,18 @@ func (h *Handler) IngestPath(w http.ResponseWriter, r *http.Request) {
 	writeJSON(w, ingestResponse{Pages: allPages, Warnings: allWarnings})
 }
 // BackfillRefs handles POST /backfill-refs — injects source back-references
 // into all concept and entity pages based on existing wiki/sources/ pages.
 func (h *Handler) BackfillRefs(w http.ResponseWriter, r *http.Request) {
 	n, err := pipeline.BackfillRefs(r.Context(), h.brainDir)
 	if err != nil {
 		h.logger.Error("backfill-refs failed", "err", err)
 		writeError(w, http.StatusInternalServerError, "backfill error")
 		return
 	}
 	writeJSON(w, map[string]int{"updated": n})
 }
 func writeJSON(w http.ResponseWriter, v any) {
 	w.Header().Set("Content-Type", "application/json")
 	json.NewEncoder(w).Encode(v) //nolint:errcheck
--- a/ingestion/internal/api/handler_test.go
+++ b/ingestion/internal/api/handler_test.go
@@ -20,9 +20,9 @@ import (
 	"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
 )
-// stubComplete returns a fixed JSON page so tests never call a real LLM.
+// stubComplete returns a fixed JSON RawPage so tests never call a real LLM.
 func stubComplete(_ context.Context, _, _ string) (string, error) {
-	return `[{"path":"wiki/sources/test-source.md","content":"# Test Source\n\nSome content here.\n"}]`, nil
+	return `[{"title":"Test Source","type":"source","subtype":"article","content":"## Summary\n\nSome content here.\n"}]`, nil
 }
 func stubPipelineCfg() pipeline.Config {
--- a/ingestion/internal/extract/extract.go
+++ b/ingestion/internal/extract/extract.go
@@ -0,0 +1,39 @@
 // ingestion/internal/extract/extract.go
 package extract
 import (
 	"fmt"
 	"os"
 	"strings"
 )
 // Text reads the file at path and returns its plain-text content.
 // Supported extensions: .md, .txt (passthrough), .pdf (via pdftotext).
 func Text(path string) (string, error) {
 	ext := strings.ToLower(fileExt(path))
 	switch ext {
 	case ".md", ".txt":
 		b, err := os.ReadFile(path)
 		if err != nil {
 			return "", fmt.Errorf("read %s: %w", path, err)
 		}
 		return string(b), nil
 	case ".pdf":
 		return extractPDF(path)
 	default:
 		return "", fmt.Errorf("unsupported file extension: %s", ext)
 	}
 }
 // fileExt returns the file extension including the dot ("" if none); the caller lowercases it.
 func fileExt(path string) string {
 	for i := len(path) - 1; i >= 0; i-- {
 		if path[i] == '.' {
 			return path[i:]
 		}
 		if path[i] == '/' || path[i] == '\\' {
 			break
 		}
 	}
 	return ""
 }
--- a/ingestion/internal/extract/extract_test.go
+++ b/ingestion/internal/extract/extract_test.go
@@ -0,0 +1,62 @@
 // ingestion/internal/extract/extract_test.go
 package extract
 import (
 	"os"
 	"os/exec"
 	"path/filepath"
 	"testing"
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 )
 func TestText_Markdown(t *testing.T) {
 	dir := t.TempDir()
 	path := filepath.Join(dir, "note.md")
 	require.NoError(t, os.WriteFile(path, []byte("# Hello\n\nWorld."), 0o644))
 	got, err := Text(path)
 	require.NoError(t, err)
 	assert.Equal(t, "# Hello\n\nWorld.", got)
 }
 func TestText_Txt(t *testing.T) {
 	dir := t.TempDir()
 	path := filepath.Join(dir, "note.txt")
 	require.NoError(t, os.WriteFile(path, []byte("plain text"), 0o644))
 	got, err := Text(path)
 	require.NoError(t, err)
 	assert.Equal(t, "plain text", got)
 }
 func TestText_UnsupportedExtension(t *testing.T) {
 	dir := t.TempDir()
 	path := filepath.Join(dir, "data.csv")
 	require.NoError(t, os.WriteFile(path, []byte("a,b,c"), 0o644))
 	_, err := Text(path)
 	assert.ErrorContains(t, err, "unsupported")
 }
 func TestText_PDF(t *testing.T) {
 	if _, err := exec.LookPath("pdftotext"); err != nil {
 		t.Skip("pdftotext not available")
 	}
 	dir := t.TempDir()
 	pdfPath := filepath.Join(dir, "test.pdf")
 	// Minimal valid PDF containing the text "Hello PDF".
 	minimalPDF := "%PDF-1.4\n1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj\n" +
 		"2 0 obj<</Type/Pages/Kids[3 0 R]/Count 1>>endobj\n" +
 		"3 0 obj<</Type/Page/MediaBox[0 0 612 792]/Parent 2 0 R/Contents 4 0 R/Resources<</Font<</F1<</Type/Font/Subtype/Type1/BaseFont/Helvetica>>>>>>>>endobj\n" +
 		"4 0 obj<</Length 44>>\nstream\nBT /F1 12 Tf 100 700 Td (Hello PDF) Tj ET\nendstream\nendobj\n" +
 		"xref\n0 5\n0000000000 65535 f\n0000000009 00000 n\n0000000058 00000 n\n0000000115 00000 n\n0000000310 00000 n\n" +
 		"trailer<</Size 5/Root 1 0 R>>\nstartxref\n406\n%%EOF\n"
 	require.NoError(t, os.WriteFile(pdfPath, []byte(minimalPDF), 0o644))
 	got, err := Text(pdfPath)
 	require.NoError(t, err)
 	assert.Contains(t, got, "Hello PDF")
 }
--- a/ingestion/internal/extract/pdf.go
+++ b/ingestion/internal/extract/pdf.go
@@ -0,0 +1,28 @@
 // ingestion/internal/extract/pdf.go
 package extract
 import (
 	"bytes"
 	"fmt"
 	"os/exec"
 	"strings"
 )
 // extractPDF runs pdftotext on path and returns the extracted text.
 // pdftotext must be installed (package: poppler-utils on Alpine/Debian, poppler on Homebrew).
 func extractPDF(path string) (string, error) {
 	cmd := exec.Command("pdftotext", "-q", path, "-")
 	var stdout, stderr bytes.Buffer
 	cmd.Stdout = &stdout
 	cmd.Stderr = &stderr
 	if err := cmd.Run(); err != nil {
 		errMsg := strings.TrimSpace(stderr.String())
 		if errMsg == "" {
 			errMsg = err.Error()
 		}
 		return "", fmt.Errorf("pdftotext: %s", errMsg)
 	}
 	return strings.TrimSpace(stdout.String()), nil
 }
--- a/ingestion/internal/llm/client.go
+++ b/ingestion/internal/llm/client.go
@@ -81,7 +81,7 @@ func (c *Client) Complete(ctx context.Context, system, user string) (string, err
 	}
 	if resp.StatusCode == http.StatusTooManyRequests {
-		resp.Body.Close()
+		_ = resp.Body.Close()
 		wait := 5 * time.Second
 		if ra := resp.Header.Get("Retry-After"); ra != "" {
 			if secs, err := strconv.Atoi(ra); err == nil {
@@ -98,7 +98,7 @@ func (c *Client) Complete(ctx context.Context, system, user string) (string, err
 			return "", fmt.Errorf("retry LLM call: %w", err)
 		}
 	}
-	defer resp.Body.Close()
+	defer resp.Body.Close() //nolint:errcheck
 	out, err := io.ReadAll(resp.Body)
 	if err != nil {
--- a/ingestion/internal/llm/client_test.go
+++ b/ingestion/internal/llm/client_test.go
@@ -18,7 +18,7 @@ func mockServer(t *testing.T, response string) *httptest.Server {
 		assert.Equal(t, "/chat/completions", r.URL.Path)
 		assert.Equal(t, "application/json", r.Header.Get("Content-Type"))
 		w.Header().Set("Content-Type", "application/json")
-		json.NewEncoder(w).Encode(map[string]any{
+		_ = json.NewEncoder(w).Encode(map[string]any{
 			"choices": []map[string]any{
 				{"message": map[string]any{"role": "assistant", "content": response}},
 			},
@@ -51,7 +51,7 @@ func TestClient_SendsAuthHeader(t *testing.T) {
 	var gotAuth string
 	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
 		gotAuth = r.Header.Get("Authorization")
-		json.NewEncoder(w).Encode(map[string]any{
+		_ = json.NewEncoder(w).Encode(map[string]any{
 			"choices": []map[string]any{{"message": map[string]any{"content": "ok"}}},
 		})
 	}))
@@ -72,7 +72,7 @@ func TestClient_Retries429(t *testing.T) {
 			w.WriteHeader(http.StatusTooManyRequests)
 			return
 		}
-		json.NewEncoder(w).Encode(map[string]any{
+		_ = json.NewEncoder(w).Encode(map[string]any{
 			"choices": []map[string]any{{"message": map[string]any{"content": "retried"}}},
 		})
 	}))
--- a/ingestion/internal/pipeline/backfill.go
+++ b/ingestion/internal/pipeline/backfill.go
@@ -0,0 +1,91 @@
 // ingestion/internal/pipeline/backfill.go
 package pipeline
 import (
 	"context"
 	"fmt"
 	"os"
 	"path/filepath"
 	"strings"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 // BackfillRefs walks wiki/sources/ and injects source back-references into every
 // concept and entity page that each source links to.
 // Changes for all sources are accumulated in memory before writing, so multiple
 // sources referencing the same concept are merged in one pass.
 // Deduplication is handled by wiki.Merge — running this multiple times is safe.
 // Returns the number of concept/entity pages written, or 0 with an error (some pages may already have been written by then).
 func BackfillRefs(ctx context.Context, brainDir string) (int, error) {
 	inventory, err := wiki.LoadInventory(brainDir)
 	if err != nil {
 		return 0, fmt.Errorf("load inventory: %w", err)
 	}
 	sourcesDir := filepath.Join(brainDir, "wiki", "sources")
 	entries, err := os.ReadDir(sourcesDir)
 	if err != nil {
 		if os.IsNotExist(err) {
 			return 0, nil
 		}
 		return 0, fmt.Errorf("read sources dir: %w", err)
 	}
 	// Accumulate all changes before writing: relPath → updated Page.
 	// Collecting first means two sources that both link the same concept
 	// get both refs merged before a single write.
 	pending := make(map[string]wiki.Page)
 	for _, e := range entries {
 		if ctx.Err() != nil {
 			return 0, ctx.Err()
 		}
 		if e.IsDir() || !strings.HasSuffix(e.Name(), ".md") {
 			continue
 		}
 		b, err := os.ReadFile(filepath.Join(sourcesDir, e.Name()))
 		if err != nil {
 			continue
 		}
 		sourceContent := string(b)
 		sourceSlug := strings.TrimSuffix(e.Name(), ".md")
 		sourceTitle := extractTitle(sourceContent)
 		if sourceTitle == "" {
 			sourceTitle = sourceSlug
 		}
 		sourceRef := "- [[" + sourceSlug + "|" + sourceTitle + "]]"
 		for slug := range extractWikilinks(sourceContent) {
 			if slug == sourceSlug {
 				continue
 			}
 			pt, ok := findInInventory(slug, inventory)
 			if !ok {
 				continue
 			}
 			relPath := "wiki/" + string(pt) + "/" + slug + ".md"
 			// Start from already-accumulated version if we've seen this page.
 			page, seen := pending[relPath]
 			if !seen {
 				raw, err := os.ReadFile(filepath.Join(brainDir, filepath.FromSlash(relPath)))
 				if err != nil {
 					continue
 				}
 				page = wiki.Page{Path: relPath, Content: string(raw)}
 			}
 			pending[relPath] = addSourceRef(page, sourceRef)
 		}
 	}
 	for relPath, page := range pending {
 		dest := filepath.Join(brainDir, filepath.FromSlash(relPath))
 		if err := os.WriteFile(dest, []byte(page.Content), 0o644); err != nil {
 			return 0, fmt.Errorf("write %s: %w", relPath, err)
 		}
 	}
 	return len(pending), nil
 }
--- a/ingestion/internal/pipeline/backfill_test.go
+++ b/ingestion/internal/pipeline/backfill_test.go
@@ -0,0 +1,107 @@
 // ingestion/internal/pipeline/backfill_test.go
 package pipeline
 import (
 	"context"
 	"os"
 	"path/filepath"
 	"testing"
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 )
 func setupBrainDir(t *testing.T) string {
 	t.Helper()
 	dir := t.TempDir()
 	for _, sub := range []string{"wiki/sources", "wiki/concepts", "wiki/entities"} {
 		require.NoError(t, os.MkdirAll(filepath.Join(dir, sub), 0o755))
 	}
 	return dir
 }
 func writeFile(t *testing.T, path, content string) {
 	t.Helper()
 	require.NoError(t, os.MkdirAll(filepath.Dir(path), 0o755))
 	require.NoError(t, os.WriteFile(path, []byte(content), 0o644))
 }
 func TestBackfillRefs_UpdatesConcept(t *testing.T) {
 	dir := setupBrainDir(t)
 	writeFile(t, filepath.Join(dir, "wiki/sources/shape-up.md"),
 		"---\ntitle: Shape Up\n---\n\n## Summary\n\nSee [[betting|Betting]].\n")
 	writeFile(t, filepath.Join(dir, "wiki/concepts/betting.md"),
 		"---\ntitle: Betting\n---\n\n## Definition\n\nA resource allocation technique.\n")
 	n, err := BackfillRefs(context.Background(), dir)
 	require.NoError(t, err)
 	assert.Equal(t, 1, n)
 	got, err := os.ReadFile(filepath.Join(dir, "wiki/concepts/betting.md"))
 	require.NoError(t, err)
 	assert.Contains(t, string(got), "## Sources")
 	assert.Contains(t, string(got), "[[shape-up|Shape Up]]")
 	assert.Contains(t, string(got), "## Definition") // original content preserved
 }
 func TestBackfillRefs_Deduplication(t *testing.T) {
 	dir := setupBrainDir(t)
 	writeFile(t, filepath.Join(dir, "wiki/sources/shape-up.md"),
 		"---\ntitle: Shape Up\n---\n\n## Summary\n\nSee [[betting|Betting]].\n")
 	writeFile(t, filepath.Join(dir, "wiki/concepts/betting.md"),
 		"---\ntitle: Betting\n---\n\n## Definition\n\nA technique.\n")
 	// Run twice — should not duplicate the ref.
 	_, err := BackfillRefs(context.Background(), dir)
 	require.NoError(t, err)
 	_, err = BackfillRefs(context.Background(), dir)
 	require.NoError(t, err)
 	got, err := os.ReadFile(filepath.Join(dir, "wiki/concepts/betting.md"))
 	require.NoError(t, err)
 	count := 0
 	for _, line := range splitLines(string(got)) {
 		if line == "- [[shape-up|Shape Up]]" {
 			count++
 		}
 	}
 	assert.Equal(t, 1, count, "ref should appear exactly once after two runs")
 }
 func TestBackfillRefs_MultipleSources(t *testing.T) {
 	dir := setupBrainDir(t)
 	writeFile(t, filepath.Join(dir, "wiki/sources/book-a.md"),
 		"---\ntitle: Book A\n---\n\n## Summary\n\nSee [[shaping|Shaping]].\n")
 	writeFile(t, filepath.Join(dir, "wiki/sources/book-b.md"),
 		"---\ntitle: Book B\n---\n\n## Summary\n\nAlso [[shaping|Shaping]].\n")
 	writeFile(t, filepath.Join(dir, "wiki/concepts/shaping.md"),
 		"---\ntitle: Shaping\n---\n\n## Definition\n\nA design activity.\n")
 	n, err := BackfillRefs(context.Background(), dir)
 	require.NoError(t, err)
 	assert.Equal(t, 1, n) // one concept page written
 	got, err := os.ReadFile(filepath.Join(dir, "wiki/concepts/shaping.md"))
 	require.NoError(t, err)
 	assert.Contains(t, string(got), "[[book-a|Book A]]")
 	assert.Contains(t, string(got), "[[book-b|Book B]]")
 }
 func TestBackfillRefs_NoSourcesDir(t *testing.T) {
 	dir := t.TempDir() // no wiki/sources subdir
 	n, err := BackfillRefs(context.Background(), dir)
 	require.NoError(t, err)
 	assert.Equal(t, 0, n)
 }
 func TestBackfillRefs_SkipsUnknownSlugs(t *testing.T) {
 	dir := setupBrainDir(t)
 	// Source links to a slug not in inventory and not on disk.
 	writeFile(t, filepath.Join(dir, "wiki/sources/article.md"),
 		"---\ntitle: Article\n---\n\n## Summary\n\nSee [[ghost-slug|Ghost]].\n")
 	n, err := BackfillRefs(context.Background(), dir)
 	require.NoError(t, err)
 	assert.Equal(t, 0, n)
 }
--- a/ingestion/internal/pipeline/build.go
+++ b/ingestion/internal/pipeline/build.go
@@ -0,0 +1,106 @@
 // ingestion/internal/pipeline/build.go
 package pipeline
 import (
 	"fmt"
 	"strings"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 // BuildPages converts RawPages from the LLM into wiki.Pages with computed slugs,
 // paths, and YAML frontmatter. sourceSlug is the slug of the source being ingested
 // (derived from the filename, not the LLM title). A page whose slug comes out empty
 // (for example, because its title is empty) is skipped, and a warning is returned
 // for it in place of a page.
 func BuildPages(rawPages []RawPage, sourceSlug, date string) ([]wiki.Page, []string) {
 	out := make([]wiki.Page, 0, len(rawPages))
 	var warnings []string
 	for _, rp := range rawPages {
 		slug := computeSlug(rp, sourceSlug)
 		if slug == "" {
 			warnings = append(warnings, fmt.Sprintf("skipped page with empty title (type: %s)", rp.Type))
 			continue
 		}
 		out = append(out, buildPage(rp, sourceSlug, date))
 	}
 	return out, warnings
 }
 func computeSlug(rp RawPage, sourceSlug string) string {
 	if rp.Type == "source" {
 		return sourceSlug
 	}
 	return wiki.Slug(rp.Title)
 }
 func buildPage(rp RawPage, sourceSlug, date string) wiki.Page {
 	var slug, dir string
 	switch rp.Type {
 	case "source":
 		slug = sourceSlug
 		dir = "wiki/sources"
 	case "concept":
 		slug = wiki.Slug(rp.Title)
 		dir = "wiki/concepts"
 	case "entity":
 		slug = wiki.Slug(rp.Title)
 		dir = "wiki/entities"
 	default:
 		slug = wiki.Slug(rp.Title)
 		dir = "wiki/" + rp.Type
 	}
 	path := dir + "/" + slug + ".md"
 	fm := buildFrontmatter(rp, date)
 	return wiki.Page{
 		Path:    path,
 		Content: fm + "\n" + rp.Content,
 	}
 }
 func buildFrontmatter(rp RawPage, date string) string {
 	var sb strings.Builder
 	sb.WriteString("---\n")
 	fmt.Fprintf(&sb, "title: %s\n", yamlScalar(rp.Title))
 	switch rp.Type {
 	case "source":
 		subtype := rp.Subtype
 		if subtype == "" {
 			subtype = "article"
 		}
 		fmt.Fprintf(&sb, "type: %s\n", yamlScalar(subtype))
 		if rp.Domain != "" {
 			fmt.Fprintf(&sb, "domain: %s\n", yamlScalar(rp.Domain))
 		}
 		fmt.Fprintf(&sb, "date_ingested: %s\n", date)
 		fmt.Fprintf(&sb, "last_updated: %s\n", date)
 	case "concept":
 		if rp.Domain != "" {
 			fmt.Fprintf(&sb, "domain: %s\n", yamlScalar(rp.Domain))
 		}
 		fmt.Fprintf(&sb, "last_updated: %s\n", date)
 	case "entity":
 		if rp.Subtype != "" {
 			fmt.Fprintf(&sb, "type: %s\n", yamlScalar(rp.Subtype))
 		}
 		if rp.Domain != "" {
 			fmt.Fprintf(&sb, "domain: %s\n", yamlScalar(rp.Domain))
 		}
 		fmt.Fprintf(&sb, "last_updated: %s\n", date)
 	default:
 		if rp.Domain != "" {
 			fmt.Fprintf(&sb, "domain: %s\n", yamlScalar(rp.Domain))
 		}
 		fmt.Fprintf(&sb, "last_updated: %s\n", date)
 	}
 	fmt.Fprintf(&sb, "aliases:\n  - %s\n", yamlScalar(rp.Title))
 	sb.WriteString("---\n")
 	return sb.String()
 }
 func yamlScalar(s string) string {
 	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
 }
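Review note: `yamlScalar` always emits a single-quoted YAML scalar and doubles any embedded single quote. The tests below cover a title with a colon but not one with an apostrophe, so here is a small sketch of that case. `yamlScalar` is copied from the diff; `frontmatterTitle` is a hypothetical helper added only for illustration.

```go
package main

import "strings"

// yamlScalar is copied from build.go in this diff: wrap in single quotes
// and escape each embedded single quote by doubling it.
func yamlScalar(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// frontmatterTitle is a hypothetical helper (not in the PR) that renders
// the title line the way buildFrontmatter does.
func frontmatterTitle(title string) string {
	return "title: " + yamlScalar(title) + "\n"
}
```

With this quoting, `frontmatterTitle("It's: a test")` yields `title: 'It''s: a test'`, so neither the colon nor the apostrophe ends the scalar early.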
--- a/ingestion/internal/pipeline/build_test.go
+++ b/ingestion/internal/pipeline/build_test.go
@@ -0,0 +1,167 @@
 // ingestion/internal/pipeline/build_test.go
 package pipeline
 import (
 	"strings"
 	"testing"
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 )
 func TestBuildPages_SourcePage(t *testing.T) {
 	raw := []RawPage{
 		{
 			Title:   "Shape Up",
 			Type:    "source",
 			Subtype: "book",
 			Domain:  "product-strategy",
 			Content: "## Summary\n\nA book about shaping product work.\n",
 		},
 	}
 	pages, warnings := BuildPages(raw, "shape-up", "2026-04-23")
 	require.Len(t, pages, 1)
 	assert.Empty(t, warnings)
 	p := pages[0]
 	assert.Equal(t, "wiki/sources/shape-up.md", p.Path)
 	assert.Contains(t, p.Content, "title: 'Shape Up'")
 	assert.Contains(t, p.Content, "type: 'book'")
 	assert.Contains(t, p.Content, "domain: 'product-strategy'")
 	assert.Contains(t, p.Content, "date_ingested: 2026-04-23")
 	assert.Contains(t, p.Content, "last_updated: 2026-04-23")
 	assert.Contains(t, p.Content, "aliases:\n  - 'Shape Up'")
 	assert.Contains(t, p.Content, "## Summary")
 	assert.True(t, strings.HasPrefix(p.Content, "---\n"), "content must start with frontmatter")
 }
 func TestBuildPages_ConceptPage(t *testing.T) {
 	raw := []RawPage{
 		{
 			Title:   "Betting",
 			Type:    "concept",
 			Domain:  "product-strategy",
 			Content: "## Definition\n\nA resource allocation technique.\n",
 		},
 	}
 	pages, warnings := BuildPages(raw, "shape-up", "2026-04-23")
 	require.Len(t, pages, 1)
 	assert.Empty(t, warnings)
 	p := pages[0]
 	assert.Equal(t, "wiki/concepts/betting.md", p.Path)
 	assert.Contains(t, p.Content, "title: 'Betting'")
 	assert.Contains(t, p.Content, "domain: 'product-strategy'")
 	assert.Contains(t, p.Content, "last_updated: 2026-04-23")
 	assert.Contains(t, p.Content, "aliases:\n  - 'Betting'")
 	assert.NotContains(t, p.Content, "date_ingested")
 	assert.Contains(t, p.Content, "## Definition")
 }
 func TestBuildPages_EntityPage(t *testing.T) {
 	raw := []RawPage{
 		{
 			Title:   "Ryan Singer",
 			Type:    "entity",
 			Subtype: "person",
 			Domain:  "product-strategy",
 			Content: "## Description\n\nA product designer.\n",
 		},
 	}
 	pages, warnings := BuildPages(raw, "shape-up", "2026-04-23")
 	require.Len(t, pages, 1)
 	assert.Empty(t, warnings)
 	p := pages[0]
 	assert.Equal(t, "wiki/entities/ryan-singer.md", p.Path)
 	assert.Contains(t, p.Content, "title: 'Ryan Singer'")
 	assert.Contains(t, p.Content, "type: 'person'")
 	assert.Contains(t, p.Content, "domain: 'product-strategy'")
 	assert.Contains(t, p.Content, "last_updated: 2026-04-23")
 	assert.Contains(t, p.Content, "aliases:\n  - 'Ryan Singer'")
 	assert.NotContains(t, p.Content, "date_ingested")
 }
 func TestBuildPages_SourceSlugUsedForSourcePage(t *testing.T) {
 	// LLM title differs from filename — pipeline uses sourceSlug for the source page path.
 	raw := []RawPage{
 		{Title: "FinBERT: A Pretrained Model", Type: "source", Subtype: "article", Content: "## Summary\n\nA model.\n"},
 	}
 	pages, _ := BuildPages(raw, "finbert-huggingface", "2026-04-23")
 	require.Len(t, pages, 1)
 	assert.Equal(t, "wiki/sources/finbert-huggingface.md", pages[0].Path)
 }
 func TestBuildPages_ConceptSlugDerivedFromTitle(t *testing.T) {
 	raw := []RawPage{
 		{Title: "Domain-Driven Design", Type: "concept", Content: "## Definition\n\nFoo.\n"},
 	}
 	pages, _ := BuildPages(raw, "some-source", "2026-04-23")
 	require.Len(t, pages, 1)
 	assert.Equal(t, "wiki/concepts/domain-driven-design.md", pages[0].Path)
 }
 func TestBuildPages_SourceDefaultSubtype(t *testing.T) {
 	// If subtype is omitted for a source, default to "article"
 	raw := []RawPage{
 		{Title: "Some Post", Type: "source", Content: "## Summary\n\nA post.\n"},
 	}
 	pages, _ := BuildPages(raw, "some-post", "2026-04-23")
 	require.Len(t, pages, 1)
 	assert.Contains(t, pages[0].Content, "type: 'article'")
 }
 func TestBuildPages_OmitsDomainWhenEmpty(t *testing.T) {
 	raw := []RawPage{
 		{Title: "Betting", Type: "concept", Content: "## Definition\n\nFoo.\n"},
 	}
 	pages, _ := BuildPages(raw, "src", "2026-04-23")
 	require.Len(t, pages, 1)
 	assert.NotContains(t, pages[0].Content, "domain:")
 }
 func TestBuildPages_MultiplePages(t *testing.T) {
 	raw := []RawPage{
 		{Title: "Shape Up", Type: "source", Subtype: "book", Content: "## Summary\n\nA book.\n"},
 		{Title: "Betting", Type: "concept", Content: "## Definition\n\nA technique.\n"},
 		{Title: "Ryan Singer", Type: "entity", Subtype: "person", Content: "## Description\n\nA designer.\n"},
 	}
 	pages, _ := BuildPages(raw, "shape-up", "2026-04-23")
 	require.Len(t, pages, 3)
 	assert.Equal(t, "wiki/sources/shape-up.md", pages[0].Path)
 	assert.Equal(t, "wiki/concepts/betting.md", pages[1].Path)
 	assert.Equal(t, "wiki/entities/ryan-singer.md", pages[2].Path)
 }
 func TestBuildPages_TitleWithColon(t *testing.T) {
 	raw := []RawPage{
 		{Title: "Shape Up: The Basecamp Method", Type: "source", Subtype: "book", Content: "## Summary\n\nA book.\n"},
 	}
 	pages, _ := BuildPages(raw, "shape-up", "2026-04-23")
 	require.Len(t, pages, 1)
 	// Title with colon must be quoted in YAML
 	assert.Contains(t, pages[0].Content, "title: 'Shape Up: The Basecamp Method'")
 	assert.Contains(t, pages[0].Content, "aliases:\n  - 'Shape Up: The Basecamp Method'")
 }
 func TestBuildPages_EntityNoSubtype(t *testing.T) {
 	raw := []RawPage{
 		{Title: "Basecamp", Type: "entity", Content: "## Description\n\nA company.\n"},
 	}
 	pages, _ := BuildPages(raw, "src", "2026-04-23")
 	require.Len(t, pages, 1)
 	assert.NotContains(t, pages[0].Content, "type:")
 	assert.Contains(t, pages[0].Content, "title: 'Basecamp'")
 }
 func TestBuildPages_EmptyTitleSkippedWithWarning(t *testing.T) {
 	raw := []RawPage{
 		{Title: "", Type: "concept", Content: "## Definition\n\nFoo.\n"},
 		{Title: "Betting", Type: "concept", Content: "## Definition\n\nA technique.\n"},
 	}
 	pages, warnings := BuildPages(raw, "src", "2026-04-23")
 	require.Len(t, pages, 1, "empty-title page should be skipped")
 	assert.Equal(t, "wiki/concepts/betting.md", pages[0].Path)
 	assert.Len(t, warnings, 1)
 	assert.Contains(t, warnings[0], "empty title")
 }
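Review note: `TestBuildPages_EmptyTitleSkippedWithWarning` depends on `wiki.Slug("")` returning an empty string, and `wiki.Slug` itself is not part of this diff. The sketch below is an assumption about its behavior, based on the rule the old schema stated (lowercase, spaces to hyphens, strip everything that is not alphanumeric or a hyphen). `slugify` is a hypothetical name, and the real function may differ, for instance in how it treats repeated hyphens.

```go
package main

import "strings"

// slugify is a hypothetical approximation of wiki.Slug (assumption: the
// real implementation is not shown in this diff).
func slugify(title string) string {
	var sb strings.Builder
	for _, r := range strings.ToLower(strings.TrimSpace(title)) {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9', r == '-':
			sb.WriteRune(r) // keep ASCII letters, digits, and hyphens
		case r == ' ':
			sb.WriteRune('-') // spaces become hyphens
		}
		// every other rune is dropped
	}
	return sb.String()
}
```

Under this rule a title made only of punctuation also produces an empty slug, so the skip in `BuildPages` would cover more than the literal empty-title case.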
--- a/ingestion/internal/pipeline/links.go
+++ b/ingestion/internal/pipeline/links.go
@@ -0,0 +1,70 @@
 // ingestion/internal/pipeline/links.go
 package pipeline
 import (
 	"fmt"
 	"path/filepath"
 	"regexp"
 	"strings"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 // plainLinkRE matches [[Display Name]] — wikilinks without a slug pipe.
 // It does NOT match [[slug|Display]] (those already have a pipe).
 var plainLinkRE = regexp.MustCompile(`\[\[([^\]|]+)\]\]`)
 // CanonicalizeLinks converts [[Display Name]] wikilinks to [[slug|Display Name]]
 // using a title→slug map built from the inventory and current batch.
 // Unknown titles are left as-is and returned as warnings.
 func CanonicalizeLinks(pages []wiki.Page, inventory map[wiki.PageType][]wiki.Entry) ([]wiki.Page, []string) {
 	titleToSlug := buildTitleMap(pages, inventory)
 	var allWarnings []string
 	out := make([]wiki.Page, len(pages))
 	for i, p := range pages {
 		newContent, warnings := canonicalizeContent(p.Content, titleToSlug)
 		p.Content = newContent
 		out[i] = p
 		allWarnings = append(allWarnings, warnings...)
 	}
 	return out, allWarnings
 }
 // buildTitleMap builds a lowercase-title → slug map from inventory and current batch.
 // Current batch entries take precedence over inventory (they may be updates).
 func buildTitleMap(pages []wiki.Page, inventory map[wiki.PageType][]wiki.Entry) map[string]string {
 	m := make(map[string]string)
 	for _, entries := range inventory {
 		for _, e := range entries {
 			m[strings.ToLower(e.Title)] = e.Slug
 		}
 	}
 	// Current batch overrides inventory
 	for _, p := range pages {
 		title := extractTitle(p.Content)
 		slug := strings.TrimSuffix(filepath.Base(p.Path), ".md")
 		if title != "" && slug != "" {
 			m[strings.ToLower(title)] = slug
 		}
 	}
 	return m
 }
 func canonicalizeContent(content string, titleToSlug map[string]string) (string, []string) {
 	var warnings []string
 	result := plainLinkRE.ReplaceAllStringFunc(content, func(match string) string {
 		sub := plainLinkRE.FindStringSubmatch(match)
 		if len(sub) < 2 {
 			return match
 		}
 		displayName := sub[1]
 		slug, ok := titleToSlug[strings.ToLower(displayName)]
 		if !ok {
 			warnings = append(warnings, fmt.Sprintf("unknown wikilink: [[%s]]", displayName))
 			return match
 		}
 		return "[[" + slug + "|" + displayName + "]]"
 	})
 	return result, warnings
 }
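Review note: the key property of `plainLinkRE` is that the character class excludes `|`, so links that already carry a slug never match and cannot be wrapped twice. A self-contained sketch of the same rewrite is shown here, using a plain `map[string]string` in place of the inventory types. `rewriteLinks` is a hypothetical name, and slicing off the brackets replaces the second regex call purely as a simplification.

```go
package main

import (
	"regexp"
	"strings"
)

// Same pattern as plainLinkRE in links.go: [[Name]] with no pipe inside.
var plainLink = regexp.MustCompile(`\[\[([^\]|]+)\]\]`)

// rewriteLinks is a hypothetical, simplified variant of canonicalizeContent.
// titleToSlug keys are assumed to be lowercase titles.
func rewriteLinks(content string, titleToSlug map[string]string) (string, []string) {
	var unknown []string
	out := plainLink.ReplaceAllStringFunc(content, func(m string) string {
		name := m[2 : len(m)-2] // strip the surrounding [[ and ]]
		slug, ok := titleToSlug[strings.ToLower(name)]
		if !ok {
			unknown = append(unknown, name)
			return m // leave unknown links untouched
		}
		return "[[" + slug + "|" + name + "]]"
	})
	return out, unknown
}
```

Given the map `{"betting": "betting"}`, the input `See [[Betting]], [[betting|Betting]] and [[Ghost]].` keeps its second link unchanged, rewrites the first to `[[betting|Betting]]`, and reports `Ghost` as unknown.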
--- a/ingestion/internal/pipeline/links_test.go
+++ b/ingestion/internal/pipeline/links_test.go
@@ -0,0 +1,125 @@
 // ingestion/internal/pipeline/links_test.go
 package pipeline
 import (
 	"testing"
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 func TestCanonicalizeLinks_KnownTitle(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/shape-up.md",
 			Content: "---\ntitle: 'Shape Up'\n---\n\n## Summary\n\nSee [[Betting]].\n",
 		},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeConcept: {
 			{Slug: "betting", Title: "Betting"},
 		},
 	}
 	got, warnings := CanonicalizeLinks(pages, inventory)
 	require.Len(t, got, 1)
 	assert.Empty(t, warnings)
 	assert.Contains(t, got[0].Content, "[[betting|Betting]]")
 	assert.NotContains(t, got[0].Content, "[[Betting]]")
 }
 func TestCanonicalizeLinks_UnknownTitleLeftAsIs(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/shape-up.md",
 			Content: "---\ntitle: 'Shape Up'\n---\n\n## Summary\n\nSee [[Ghost Concept]].\n",
 		},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{}
 	got, warnings := CanonicalizeLinks(pages, inventory)
 	require.Len(t, got, 1)
 	assert.NotEmpty(t, warnings)
 	assert.Contains(t, got[0].Content, "[[Ghost Concept]]")
 }
 func TestCanonicalizeLinks_AlreadyCanonicalLinkUntouched(t *testing.T) {
 	// Links already in [[slug|Display]] format must not be double-converted
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/shape-up.md",
 			Content: "---\ntitle: 'Shape Up'\n---\n\n## Summary\n\nSee [[betting|Betting]].\n",
 		},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeConcept: {
 			{Slug: "betting", Title: "Betting"},
 		},
 	}
 	got, warnings := CanonicalizeLinks(pages, inventory)
 	require.Len(t, got, 1)
 	assert.Empty(t, warnings)
 	// Should remain exactly as-is — not double-wrapped
 	assert.Contains(t, got[0].Content, "[[betting|Betting]]")
 	assert.NotContains(t, got[0].Content, "[[betting|[[betting|Betting]]]]")
 }
 func TestCanonicalizeLinks_CaseInsensitiveMatch(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/foo.md",
 			Content: "---\ntitle: 'Foo'\n---\n\n## Summary\n\nSee [[domain driven design]].\n",
 		},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeConcept: {
 			{Slug: "domain-driven-design", Title: "Domain Driven Design"},
 		},
 	}
 	got, warnings := CanonicalizeLinks(pages, inventory)
 	require.Len(t, got, 1)
 	assert.Empty(t, warnings)
 	assert.Contains(t, got[0].Content, "[[domain-driven-design|domain driven design]]")
 }
 func TestCanonicalizeLinks_CurrentBatchPagesResolved(t *testing.T) {
 	// A concept created in the same batch should be canonicalizable
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/shape-up.md",
 			Content: "---\ntitle: 'Shape Up'\n---\n\n## Summary\n\nSee [[Betting]].\n",
 		},
 		{
 			Path:    "wiki/concepts/betting.md",
 			Content: "---\ntitle: 'Betting'\n---\n\n## Definition\n\nA technique.\n",
 		},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{} // empty — Betting is in the batch, not inventory
 	got, warnings := CanonicalizeLinks(pages, inventory)
 	require.Len(t, got, 2)
 	assert.Empty(t, warnings)
 	assert.Contains(t, got[0].Content, "[[betting|Betting]]")
 }
 func TestCanonicalizeLinks_MultipleLinksInOnePage(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/foo.md",
 			Content: "---\ntitle: 'Foo'\n---\n\n## Summary\n\nSee [[Betting]] and [[Shape Up]].\n",
 		},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeConcept: {
 			{Slug: "betting", Title: "Betting"},
 		},
 		wiki.PageTypeSource: {
 			{Slug: "shape-up", Title: "Shape Up"},
 		},
 	}
 	got, warnings := CanonicalizeLinks(pages, inventory)
 	require.Len(t, got, 1)
 	assert.Empty(t, warnings)
 	assert.Contains(t, got[0].Content, "[[betting|Betting]]")
 	assert.Contains(t, got[0].Content, "[[shape-up|Shape Up]]")
 }
--- a/ingestion/internal/pipeline/parse.go
+++ b/ingestion/internal/pipeline/parse.go
@@ -5,13 +5,21 @@ import (
 	"encoding/json"
 	"fmt"
 	"strings"
-	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
-// ParsePages parses LLM output as a JSON array of {path, content} objects.
+// RawPage is the LLM's output format — minimal structured data with no path or frontmatter.
 // The pipeline derives slugs, paths, and frontmatter from these fields.
 type RawPage struct {
 	Title   string `json:"title"`
 	Type    string `json:"type"`    // "source" | "concept" | "entity"
 	Subtype string `json:"subtype"` // entity: person|company|tool|model|framework|technology; source: article|pdf|book|video|note|project
 	Domain  string `json:"domain"`
 	Content string `json:"content"` // Markdown body only — no frontmatter
 }
 // ParseRawPages parses LLM output as a JSON array of RawPage objects.
 // If the array is truncated mid-object (token limit), it salvages all complete objects.
-func ParsePages(output string) ([]wiki.Page, []string) {
+func ParseRawPages(output string) ([]RawPage, []string) {
 	output = strings.TrimSpace(output)
 	if output == "" {
 		return nil, []string{"LLM returned empty output"}
@@ -19,7 +27,7 @@ func ParsePages(output string) ([]wiki.Page, []string) {
 	output = stripFences(output)
-	var pages []wiki.Page
+	var pages []RawPage
 	if err := json.Unmarshal([]byte(output), &pages); err == nil {
 		return pages, nil
 	}
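Review note: the fallback that "salvages all complete objects" lies outside this hunk. One way such a recovery can be written with the standard library is to stream the array with `json.Decoder` and stop at the first element that fails to decode. This is a sketch under that assumption, not the PR's implementation; `salvagePages` and the trimmed-down `rawPage` type are hypothetical.

```go
package main

import (
	"encoding/json"
	"strings"
)

// rawPage is a reduced copy of RawPage, for illustration only.
type rawPage struct {
	Title   string `json:"title"`
	Type    string `json:"type"`
	Content string `json:"content"`
}

// salvagePages is a hypothetical helper: it returns every array element
// that decodes cleanly and reports whether the input ended early.
func salvagePages(output string) (pages []rawPage, truncated bool) {
	dec := json.NewDecoder(strings.NewReader(output))

	// The first token must be the opening bracket of the array.
	if tok, err := dec.Token(); err != nil || tok != json.Delim('[') {
		return nil, true
	}

	// Decode one element at a time; a cut-off element ends the loop.
	for dec.More() {
		var p rawPage
		if err := dec.Decode(&p); err != nil {
			return pages, true
		}
		pages = append(pages, p)
	}
	return pages, false
}
```

A caller could turn `truncated == true` into the warning that `TestParseRawPages_TruncationRecovery` expects, while still passing the salvaged pages on to `BuildPages`.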
--- a/ingestion/internal/pipeline/parse_test.go
+++ b/ingestion/internal/pipeline/parse_test.go
@@ -8,39 +8,54 @@ import (
 	"github.com/stretchr/testify/require"
 )
-func TestParsePages_ValidJSON(t *testing.T) {
+func TestParseRawPages_ValidJSON(t *testing.T) {
-	input := `[{"path":"wiki/sources/foo.md","content":"# Foo"},{"path":"wiki/concepts/bar.md","content":"# Bar"}]`
+	input := `[{"title":"Shape Up","type":"source","subtype":"book","domain":"product-strategy","content":"## Summary\n\nFoo."},{"title":"Betting","type":"concept","content":"## Definition\n\nA technique."}]`
-	pages, warnings := ParsePages(input)
+	pages, warnings := ParseRawPages(input)
 	require.Len(t, pages, 2)
 	assert.Empty(t, warnings)
-	assert.Equal(t, "wiki/sources/foo.md", pages[0].Path)
+	assert.Equal(t, "Shape Up", pages[0].Title)
-	assert.Equal(t, "wiki/concepts/bar.md", pages[1].Path)
+	assert.Equal(t, "source", pages[0].Type)
 	assert.Equal(t, "book", pages[0].Subtype)
 	assert.Equal(t, "product-strategy", pages[0].Domain)
 	assert.Equal(t, "Betting", pages[1].Title)
 	assert.Equal(t, "concept", pages[1].Type)
 	assert.Empty(t, pages[1].Subtype)
 }
-func TestParsePages_StripsFences(t *testing.T) {
+func TestParseRawPages_StripsFences(t *testing.T) {
-	input := "```json\n[{\"path\":\"wiki/sources/foo.md\",\"content\":\"# Foo\"}]\n```"
+	input := "```json\n[{\"title\":\"Foo\",\"type\":\"concept\",\"content\":\"## Definition\\n\\nFoo.\"}]\n```"
-	pages, warnings := ParsePages(input)
+	pages, warnings := ParseRawPages(input)
 	assert.Len(t, pages, 1)
 	assert.Empty(t, warnings)
 }
 func TestParsePages_TruncationRecovery(t *testing.T) {
 	input := `[{"path":"wiki/sources/foo.md","content":"# Foo"},{"path":"wiki/concepts/bar.md","content":"trunc`
 	pages, warnings := ParsePages(input)
 	require.Len(t, pages, 1)
-	assert.Equal(t, "wiki/sources/foo.md", pages[0].Path)
+	assert.Empty(t, warnings)
 	assert.Equal(t, "Foo", pages[0].Title)
 }
 func TestParseRawPages_TruncationRecovery(t *testing.T) {
 	input := `[{"title":"Foo","type":"concept","content":"## Definition\n\nFoo."},{"title":"Bar","type":"concept","content":"trunc`
 	pages, warnings := ParseRawPages(input)
 	require.Len(t, pages, 1)
 	assert.Equal(t, "Foo", pages[0].Title)
 	assert.NotEmpty(t, warnings)
 }
-func TestParsePages_EmptyInput(t *testing.T) {
+func TestParseRawPages_EmptyInput(t *testing.T) {
-	pages, warnings := ParsePages("")
+	pages, warnings := ParseRawPages("")
 	assert.Empty(t, pages)
 	assert.NotEmpty(t, warnings)
 }
-func TestParsePages_PlainFence(t *testing.T) {
+func TestParseRawPages_PlainFence(t *testing.T) {
-	input := "```\n[{\"path\":\"wiki/sources/foo.md\",\"content\":\"ok\"}]\n```"
+	input := "```\n[{\"title\":\"Foo\",\"type\":\"concept\",\"content\":\"ok\"}]\n```"
-	pages, warnings := ParsePages(input)
+	pages, warnings := ParseRawPages(input)
-	assert.Len(t, pages, 1)
+	require.Len(t, pages, 1)
 	assert.Empty(t, warnings)
 }
 func TestParseRawPages_MissingTitle(t *testing.T) {
 	// Missing title — still parsed, Title is empty string
 	input := `[{"type":"concept","content":"## Definition\n\nFoo."}]`
 	pages, warnings := ParseRawPages(input)
 	require.Len(t, pages, 1)
 	assert.Empty(t, warnings)
 	assert.Empty(t, pages[0].Title)
 }
--- a/ingestion/internal/pipeline/pipeline.go
+++ b/ingestion/internal/pipeline/pipeline.go
@@ -41,9 +41,11 @@ func Run(ctx context.Context, cfg Config, brainDir, content, source string, dryR
 		schema = loadSchema(brainDir)
 	}
 	sourceSlug := wiki.Slug(source)
 	date := time.Now().UTC().Format("2006-01-02")
 	chunks := Chunk(content, cfg.ChunkSize)
-	var allPages []wiki.Page
+	var allRaw []RawPage
 	var allWarnings []string
 	for _, chunk := range chunks {
@@ -52,16 +54,20 @@ func Run(ctx context.Context, cfg Config, brainDir, content, source string, dryR
 		if err != nil {
 			return Result{}, fmt.Errorf("LLM call: %w", err)
 		}
-		pages, warnings := ParsePages(output)
+		raw, warnings := ParseRawPages(output)
-		allPages = append(allPages, pages...)
+		allRaw = append(allRaw, raw...)
 		allWarnings = append(allWarnings, warnings...)
 	}
-	merged := mergeAll(allPages)
+	pages, buildWarnings := BuildPages(allRaw, sourceSlug, date)
 	allWarnings = append(allWarnings, buildWarnings...)
 	resolved := Resolve(pages, inventory)
 	canonicalized, linkWarnings := CanonicalizeLinks(resolved, inventory)
 	allWarnings = append(allWarnings, linkWarnings...)
 	withRefs := injectSourceRefs(canonicalized, inventory, brainDir)
 	merged := mergeAll(withRefs)
-	date := time.Now().UTC().Format("2006-01-02")
 	var written []string
 	for _, page := range merged {
 		if !dryRun {
 			dest := filepath.Join(brainDir, filepath.FromSlash(page.Path))
--- a/ingestion/internal/pipeline/pipeline_test.go
+++ b/ingestion/internal/pipeline/pipeline_test.go
@@ -15,7 +15,6 @@ import (
 	"github.com/stretchr/testify/require"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/llm"
-	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 func TestRun_WritesPages(t *testing.T) {
@@ -24,20 +23,25 @@ func TestRun_WritesPages(t *testing.T) {
 		require.NoError(t, os.MkdirAll(filepath.Join(brainDir, sub), 0o755))
 	}
-	llmResponse := mustJSON([]wiki.Page{
+	llmResponse := mustJSON([]RawPage{
 		{
-			Path:    "wiki/sources/test-article.md",
+			Title:   "Test Article",
-			Content: "---\ntitle: Test Article\ntype: article\ndomain: software-engineering\ndate_ingested: 2026-04-22\nlast_updated: 2026-04-22\naliases:\n  - Test Article\n---\n\n## Summary\n\nA test article.\n\n## Key Claims\n\n- It tests things.\n\n## Concepts Introduced or Reinforced\n\n## Entities Mentioned\n\n## Open Questions Raised\n",
+			Type:    "source",
 			Subtype: "article",
 			Domain:  "software-engineering",
 			Content: "## Summary\n\nA test article.\n\n## Key Claims\n\n- It tests things.\n\n## Concepts Introduced or Reinforced\n\n[[Testing]]\n\n## Entities Mentioned\n\n## Open Questions Raised\n",
 		},
 		{
-			Path:    "wiki/concepts/testing.md",
+			Title:   "Testing",
-			Content: "---\ntitle: Testing\ndomain: software-engineering\nlast_updated: 2026-04-22\naliases:\n  - Testing\n---\n\n## Definition\n\nThe practice of verifying software.\n\n## Why It Matters\n\nCatches bugs.\n\n## Related Concepts\n\n## Related Entities\n\n## Sources\n\n## Evolving Notes\n",
+			Type:    "concept",
 			Domain:  "software-engineering",
 			Content: "## Definition\n\nThe practice of verifying software.\n\n## Why It Matters\n\nCatches bugs.\n\n## Related Concepts\n\n## Related Entities\n\n## Sources\n\n## Evolving Notes\n",
 		},
 	})
 	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
 		w.Header().Set("Content-Type", "application/json")
-		json.NewEncoder(w).Encode(map[string]any{
+		_ = json.NewEncoder(w).Encode(map[string]any{
 			"choices": []map[string]any{
 				{"message": map[string]any{"role": "assistant", "content": llmResponse}},
 			},
@@ -53,7 +57,6 @@ func TestRun_WritesPages(t *testing.T) {
 	result, err := Run(context.Background(), cfg, brainDir, "An article about testing.", "test-article", false)
 	require.NoError(t, err)
 	assert.Len(t, result.Pages, 2)
 	assert.Empty(t, result.Warnings)
 	_, err = os.Stat(filepath.Join(brainDir, "wiki", "sources", "test-article.md"))
 	require.NoError(t, err)
@@ -71,13 +74,15 @@ func TestRun_DryRunDoesNotWrite(t *testing.T) {
 		require.NoError(t, os.MkdirAll(filepath.Join(brainDir, sub), 0o755))
 	}
-	llmResponse := mustJSON([]wiki.Page{{
+	llmResponse := mustJSON([]RawPage{{
-		Path:    "wiki/sources/foo.md",
+		Title:   "Foo",
-		Content: "---\ntitle: Foo\n---\n\n## Summary\n\nFoo.\n",
+		Type:    "source",
 		Subtype: "article",
 		Content: "## Summary\n\nFoo.\n",
 	}})
 	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		json.NewEncoder(w).Encode(map[string]any{
+		_ = json.NewEncoder(w).Encode(map[string]any{
 			"choices": []map[string]any{{"message": map[string]any{"content": llmResponse}}},
 		})
 	}))
@@ -98,14 +103,14 @@ func TestRun_MergesDuplicatePaths(t *testing.T) {
 		require.NoError(t, os.MkdirAll(filepath.Join(brainDir, sub), 0o755))
 	}
-	// LLM returns same path twice (simulates multi-chunk merge)
+	// LLM returns same title twice (simulates multi-chunk duplicate)
-	llmResponse := mustJSON([]wiki.Page{
+	llmResponse := mustJSON([]RawPage{
-		{Path: "wiki/concepts/foo.md", Content: "---\ntitle: Foo\n---\n\n## Definition\n\nFirst.\n\n## Related Concepts\n\n- [[bar|Bar]]\n"},
+		{Title: "Foo", Type: "concept", Content: "## Definition\n\nFirst.\n\n## Related Concepts\n\n[[Bar]]\n"},
-		{Path: "wiki/concepts/foo.md", Content: "---\ntitle: Foo\n---\n\n## Definition\n\nSecond.\n\n## Related Concepts\n\n- [[baz|Baz]]\n"},
+		{Title: "Foo", Type: "concept", Content: "## Definition\n\nSecond.\n\n## Related Concepts\n\n[[Baz]]\n"},
 	})
 	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		json.NewEncoder(w).Encode(map[string]any{
+		_ = json.NewEncoder(w).Encode(map[string]any{
 			"choices": []map[string]any{{"message": map[string]any{"content": llmResponse}}},
 		})
 	}))
@@ -120,8 +125,9 @@ func TestRun_MergesDuplicatePaths(t *testing.T) {
 	require.NoError(t, err)
 	// keep-first for Definition, union for Related Concepts
 	assert.Contains(t, string(content), "First.")
-	assert.Contains(t, string(content), "[[bar|Bar]]")
+	// Bar and Baz unknown in empty inventory → left as plain [[links]]
-	assert.Contains(t, string(content), "[[baz|Baz]]")
+	assert.Contains(t, string(content), "[[Bar]]")
 	assert.Contains(t, string(content), "[[Baz]]")
 }
 func mustJSON(v any) string {
--- a/ingestion/internal/pipeline/prompt.go
+++ b/ingestion/internal/pipeline/prompt.go
@@ -12,12 +12,15 @@ import (
 const systemPrompt = `You are a wiki agent. Read the source material and produce structured wiki pages following the schema provided.
 Output ONLY a valid JSON array — no markdown fences, no other text before or after.
-Each element must have:
+Each element must have exactly these fields:
-  "path"    — relative path within the wiki, e.g. "wiki/sources/foo.md"
+  "title"   — exact page title (e.g. "FinBERT", "Ryan Singer", "Shape Up")
-  "content" — full markdown content of the page including YAML frontmatter
+  "type"    — exactly one of: "source", "concept", "entity"
  "subtype" — for source: article|pdf|book|video|note|project; for entity: person|company|tool|model|framework|technology; omit for concept
  "domain"  — one of the domains in the schema (omit if none fits)
  "content" — Markdown body only — NO frontmatter, NO path, NO slug
-Follow the schema strictly: correct frontmatter fields, wikilinks as [[slug|Display Text]],
+Wikilinks in content: [[Display Name]] — just the display name, no slug, no pipe separator.
-dates in YYYY-MM-DD format, and paraphrase rather than quoting verbatim.`
+Only link to pages listed in the inventory or pages you are creating in this response.`
 // BuildPrompt constructs the user prompt for a single chunk.
 func BuildPrompt(schema, source, content string, inventory map[wiki.PageType][]wiki.Entry) string {
@@ -30,7 +33,7 @@ func BuildPrompt(schema, source, content string, inventory map[wiki.PageType][]w
 	sb.WriteString("\n\n")
 	sb.WriteString("## Existing wiki pages\n\n")
-	sb.WriteString("Link ONLY to pages in this inventory or pages you are creating in this response.\n\n")
+	sb.WriteString("Reference these pages by display name only — [[Display Name]] — in your content.\n\n")
 	for _, pt := range []wiki.PageType{wiki.PageTypeConcept, wiki.PageTypeEntity, wiki.PageTypeSource} {
 		entries := inventory[pt]
@@ -39,19 +42,19 @@ func BuildPrompt(schema, source, content string, inventory map[wiki.PageType][]w
 			fmt.Fprintf(&sb, "%s — (none yet)\n\n", label)
 			continue
 		}
-		fmt.Fprintf(&sb, "%s — link ONLY under the matching section:\n", label)
+		fmt.Fprintf(&sb, "%s:\n", label)
 		for _, e := range entries {
-			fmt.Fprintf(&sb, "  - [[%s|%s]]\n", e.Slug, e.Title)
+			fmt.Fprintf(&sb, "  - %s\n", e.Title)
 		}
 		sb.WriteString("\n")
 	}
 	sb.WriteString("## Non-negotiable rules\n\n")
 	sb.WriteString("1. Output ONLY a valid JSON array — no prose, no fences.\n")
-	sb.WriteString("2. Slugs are kebab-case: lowercase, spaces→hyphens, no special chars.\n")
+	sb.WriteString("2. Fields: title, type, subtype (if applicable), domain (if applicable), content.\n")
-	sb.WriteString("3. Wikilinks: [[slug|Display Text]] — the pipe is required.\n")
+	sb.WriteString("3. Wikilinks: [[Display Name]] — no slug, no pipe. The pipeline handles slugs.\n")
-	sb.WriteString("4. Section links must match their section type.\n")
+	sb.WriteString("4. Section links must match their section type (Related Concepts → concepts only, etc.).\n")
-	sb.WriteString("5. One source page per book — update it if inventory shows it exists.\n\n")
+	sb.WriteString("5. One source page per book — if inventory shows it exists, return it as an UPDATE.\n\n")
 	fmt.Fprintf(&sb, "## Source: %s\n\n", source)
 	sb.WriteString(content)
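Review note: with this prompt the model emits bare `[[Display Name]]` links, while the rest of the pipeline still parses `[[slug|Display Name]]`. The commit list names `CanonicalizeLinks` for the conversion, but its body is not part of this excerpt. The following is only a minimal sketch under assumptions: `slugify` and `canonicalize` are hypothetical helpers, and the slug rule (lowercase, spaces to hyphens, drop everything that is not a letter or digit) follows the wording of the old schema rather than the real `wiki.Slug`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

// bareLinkRE matches [[Display Name]] links that carry no pipe separator.
var bareLinkRE = regexp.MustCompile(`\[\[([^\]|]+)\]\]`)

// slugify is a hypothetical stand-in for the pipeline's slug function.
func slugify(title string) string {
	var b strings.Builder
	prevHyphen := false
	for _, r := range strings.ToLower(strings.TrimSpace(title)) {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			b.WriteRune(r)
			prevHyphen = false
		case r == ' ' || r == '-':
			// Collapse runs of spaces and hyphens into a single hyphen.
			if !prevHyphen && b.Len() > 0 {
				b.WriteByte('-')
				prevHyphen = true
			}
		}
		// Any other character is dropped.
	}
	return strings.TrimRight(b.String(), "-")
}

// canonicalize rewrites every bare link to [[slug|Display Name]].
// Links that already contain a pipe do not match and are left untouched.
func canonicalize(content string) string {
	return bareLinkRE.ReplaceAllStringFunc(content, func(m string) string {
		name := strings.TrimSpace(m[2 : len(m)-2])
		return "[[" + slugify(name) + "|" + name + "]]"
	})
}

func main() {
	fmt.Println(canonicalize("See [[Shape Up]] by [[Ryan Singer]] and [[ddd|DDD]]."))
}
```

Deriving the slug from the display name in one place keeps link targets and file paths in agreement, which appears to be the motivation for taking slug authority away from the model.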
--- a/ingestion/internal/pipeline/refs.go
+++ b/ingestion/internal/pipeline/refs.go
@@ -0,0 +1,115 @@
 // ingestion/internal/pipeline/refs.go
 package pipeline
 import (
 	"os"
 	"path/filepath"
 	"regexp"
 	"strings"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 var wikilinkRE = regexp.MustCompile(`\[\[([^|\]]+)\|`)
 // injectSourceRefs finds the source page in the proposed batch, extracts its
 // wikilinks, and injects a back-reference into every linked concept or entity page.
 // Pages that exist on disk but are not in the current batch are loaded and
 // appended so they will be updated on write.
 func injectSourceRefs(pages []wiki.Page, inventory map[wiki.PageType][]wiki.Entry, brainDir string) []wiki.Page {
 	sourceSlug, sourceTitle, found := findSourcePage(pages)
 	if !found {
 		return pages
 	}
 	var sourceContent string
 	for _, p := range pages {
 		if strings.HasPrefix(p.Path, "wiki/sources/") &&
 			strings.TrimSuffix(filepath.Base(p.Path), ".md") == sourceSlug {
 			sourceContent = p.Content
 			break
 		}
 	}
 	linkedSlugs := extractWikilinks(sourceContent)
 	sourceRef := "- [[" + sourceSlug + "|" + sourceTitle + "]]"
 	bySlug := make(map[string]int, len(pages))
 	for i, p := range pages {
 		if !strings.HasPrefix(p.Path, "wiki/sources/") {
 			bySlug[strings.TrimSuffix(filepath.Base(p.Path), ".md")] = i
 		}
 	}
 	for slug := range linkedSlugs {
 		if slug == sourceSlug {
 			continue
 		}
 		if idx, ok := bySlug[slug]; ok {
 			pages[idx] = addSourceRef(pages[idx], sourceRef)
 			continue
 		}
 		pt, ok := findInInventory(slug, inventory)
 		if !ok {
 			continue
 		}
 		diskPath := filepath.Join(brainDir, "wiki", string(pt), slug+".md")
 		b, err := os.ReadFile(diskPath)
 		if err != nil {
 			continue
 		}
 		page := wiki.Page{
 			Path:    "wiki/" + string(pt) + "/" + slug + ".md",
 			Content: string(b),
 		}
 		pages = append(pages, addSourceRef(page, sourceRef))
 	}
 	return pages
 }
 // addSourceRef injects sourceRef into the ## Sources bullet section of page
 // using wiki.Merge, which deduplicates bullets automatically.
 func addSourceRef(page wiki.Page, sourceRef string) wiki.Page {
 	patch := wiki.Page{
 		Path:    page.Path,
 		Content: "\n## Sources\n\n" + sourceRef + "\n",
 	}
 	return wiki.Merge(page, patch)
 }
 // extractWikilinks returns the set of slugs referenced as [[slug|...]] in content.
 func extractWikilinks(content string) map[string]bool {
 	slugs := make(map[string]bool)
 	for _, m := range wikilinkRE.FindAllStringSubmatch(content, -1) {
 		slugs[m[1]] = true
 	}
 	return slugs
 }
 // findSourcePage returns the slug and title of the first wiki/sources/ page in pages.
 func findSourcePage(pages []wiki.Page) (slug, title string, found bool) {
 	for _, p := range pages {
 		if strings.HasPrefix(p.Path, "wiki/sources/") {
 			slug = strings.TrimSuffix(filepath.Base(p.Path), ".md")
 			title = extractTitle(p.Content)
 			if title == "" {
 				title = slug
 			}
 			return slug, title, true
 		}
 	}
 	return "", "", false
 }
 // findInInventory returns the PageType for a slug if it appears in the inventory.
 func findInInventory(slug string, inventory map[wiki.PageType][]wiki.Entry) (wiki.PageType, bool) {
 	for pt, entries := range inventory {
 		for _, e := range entries {
 			if e.Slug == slug {
 				return pt, true
 			}
 		}
 	}
 	return "", false
 }
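Review note: `wikilinkRE` only captures links that contain a pipe, so bare `[[Display Name]]` links are invisible to `extractWikilinks`. The standalone check below copies the regexp and the function from refs.go; the sample strings are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as in refs.go: capture the slug that precedes the pipe.
var wikilinkRE = regexp.MustCompile(`\[\[([^|\]]+)\|`)

// extractWikilinks returns the set of slugs referenced as [[slug|...]].
func extractWikilinks(content string) map[string]bool {
	slugs := make(map[string]bool)
	for _, m := range wikilinkRE.FindAllStringSubmatch(content, -1) {
		slugs[m[1]] = true
	}
	return slugs
}

func main() {
	piped := extractWikilinks("See [[shape-up|Shape Up]] and [[ddd|DDD]].")
	bare := extractWikilinks("See [[Shape Up]] and [[DDD]].")
	// Piped links are found; bare links are not.
	fmt.Println(len(piped), len(bare))
}
```

Back-reference injection therefore presumably has to run after links have been canonicalized; the ordering inside `Run` is not visible in this excerpt.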
--- a/ingestion/internal/pipeline/refs_test.go
+++ b/ingestion/internal/pipeline/refs_test.go
@@ -0,0 +1,172 @@
 // ingestion/internal/pipeline/refs_test.go
 package pipeline
 import (
 	"os"
 	"path/filepath"
 	"testing"
 	"github.com/stretchr/testify/assert"
 	"github.com/stretchr/testify/require"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 func makeInventory(concepts, entities []string) map[wiki.PageType][]wiki.Entry {
 	inv := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeConcept: {},
 		wiki.PageTypeEntity:  {},
 		wiki.PageTypeSource:  {},
 	}
 	for _, slug := range concepts {
 		inv[wiki.PageTypeConcept] = append(inv[wiki.PageTypeConcept], wiki.Entry{Slug: slug, Title: slug})
 	}
 	for _, slug := range entities {
 		inv[wiki.PageTypeEntity] = append(inv[wiki.PageTypeEntity], wiki.Entry{Slug: slug, Title: slug})
 	}
 	return inv
 }
 func TestInjectSourceRefs_NoSourcePage(t *testing.T) {
 	pages := []wiki.Page{
 		{Path: "wiki/concepts/foo.md", Content: "---\ntitle: Foo\n---\n\n## Definition\n\nFoo.\n"},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	assert.Equal(t, pages, got)
 }
 func TestInjectSourceRefs_InjectsIntoProposedConcept(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/my-article.md",
 			Content: "---\ntitle: My Article\n---\n\n## Summary\n\nSee [[domain-driven-design|Domain Driven Design]].\n",
 		},
 		{
 			Path:    "wiki/concepts/domain-driven-design.md",
 			Content: "---\ntitle: Domain Driven Design\n---\n\n## Definition\n\nA methodology.\n",
 		},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	require.Len(t, got, 2)
 	assert.Contains(t, got[1].Content, "## Sources")
 	assert.Contains(t, got[1].Content, "[[my-article|My Article]]")
 }
 func TestInjectSourceRefs_LoadsConceptFromDisk(t *testing.T) {
 	brainDir := t.TempDir()
 	conceptDir := filepath.Join(brainDir, "wiki", "concepts")
 	require.NoError(t, os.MkdirAll(conceptDir, 0o755))
 	require.NoError(t, os.WriteFile(
 		filepath.Join(conceptDir, "shape-up.md"),
 		[]byte("---\ntitle: Shape Up\n---\n\n## Definition\n\nA methodology.\n"),
 		0o644,
 	))
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/my-article.md",
 			Content: "---\ntitle: My Article\n---\n\n## Summary\n\nSee [[shape-up|Shape Up]].\n",
 		},
 	}
 	inv := makeInventory([]string{"shape-up"}, nil)
 	got := injectSourceRefs(pages, inv, brainDir)
 	require.Len(t, got, 2)
 	var conceptPage wiki.Page
 	for _, p := range got {
 		if p.Path == "wiki/concepts/shape-up.md" {
 			conceptPage = p
 		}
 	}
 	assert.Contains(t, conceptPage.Content, "## Sources")
 	assert.Contains(t, conceptPage.Content, "[[my-article|My Article]]")
 	assert.Contains(t, conceptPage.Content, "## Definition")
 }
 func TestInjectSourceRefs_NoSelfReference(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/my-article.md",
 			Content: "---\ntitle: My Article\n---\n\n## Summary\n\nSelf-link [[my-article|My Article]].\n",
 		},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	assert.Len(t, got, 1)
 }
 func TestInjectSourceRefs_DeduplicatesOnReingestion(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/my-article.md",
 			Content: "---\ntitle: My Article\n---\n\n## Summary\n\nSee [[ddd|DDD]].\n",
 		},
 		{
 			Path:    "wiki/concepts/ddd.md",
 			Content: "---\ntitle: DDD\n---\n\n## Definition\n\nA thing.\n\n## Sources\n\n- [[my-article|My Article]]\n",
 		},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	require.Len(t, got, 2)
 	count := 0
 	for _, line := range splitLines(got[1].Content) {
 		if line == "- [[my-article|My Article]]" {
 			count++
 		}
 	}
 	assert.Equal(t, 1, count, "source ref should appear exactly once")
 }
 func TestInjectSourceRefs_InjectsIntoEntity(t *testing.T) {
 	pages := []wiki.Page{
 		{
 			Path:    "wiki/sources/book.md",
 			Content: "---\ntitle: Book\n---\n\n## Summary\n\nBy [[ryan-singer|Ryan Singer]].\n",
 		},
 		{
 			Path:    "wiki/entities/ryan-singer.md",
 			Content: "---\ntitle: Ryan Singer\n---\n\n## Description\n\nA designer.\n",
 		},
 	}
 	got := injectSourceRefs(pages, makeInventory(nil, nil), t.TempDir())
 	require.Len(t, got, 2)
 	var entity wiki.Page
 	for _, p := range got {
 		if p.Path == "wiki/entities/ryan-singer.md" {
 			entity = p
 		}
 	}
 	assert.Contains(t, entity.Content, "[[book|Book]]")
 }
 func TestExtractWikilinks(t *testing.T) {
 	content := "See [[foo|Foo]] and [[bar|Bar]] and [[foo|Foo again]]."
 	got := extractWikilinks(content)
 	assert.True(t, got["foo"])
 	assert.True(t, got["bar"])
 	assert.Len(t, got, 2, "duplicate slugs should be deduplicated")
 }
 func splitLines(s string) []string {
 	var out []string
 	start := 0
 	for i := 0; i < len(s); i++ {
 		if s[i] == '\n' {
 			if line := s[start:i]; line != "" {
 				out = append(out, line)
 			}
 			start = i + 1
 		}
 	}
 	if last := s[start:]; last != "" {
 		out = append(out, last)
 	}
 	return out
 }
--- a/ingestion/internal/pipeline/resolve.go
+++ b/ingestion/internal/pipeline/resolve.go
@@ -0,0 +1,88 @@
 // ingestion/internal/pipeline/resolve.go
 package pipeline
 import (
 	"path/filepath"
 	"strings"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 // Resolve remaps proposed pages to existing slugs when a fuzzy title match is found.
 // It only matches within the same page type (entities→entities, concepts→concepts).
 // Pages with no inventory match are returned unchanged.
 func Resolve(proposed []wiki.Page, inventory map[wiki.PageType][]wiki.Entry) []wiki.Page {
 	type key struct {
 		pt         wiki.PageType
 		normalized string
 	}
 	lookup := make(map[key]string) // key → canonical slug
 	for pt, entries := range inventory {
 		for _, e := range entries {
 			k := key{pt: pt, normalized: normalizeTitle(e.Title)}
 			lookup[k] = e.Slug
 			for _, alias := range e.Aliases {
 				ak := key{pt: pt, normalized: normalizeTitle(alias)}
 				if _, exists := lookup[ak]; !exists {
 					lookup[ak] = e.Slug
 				}
 			}
 		}
 	}
 	out := make([]wiki.Page, 0, len(proposed))
 	for _, page := range proposed {
 		pt := pageTypeFromPath(page.Path)
 		title := extractTitle(page.Content)
 		k := key{pt: pt, normalized: normalizeTitle(title)}
 		if canonicalSlug, ok := lookup[k]; ok {
 			dir := filepath.Dir(page.Path)
 			page.Path = dir + "/" + canonicalSlug + ".md"
 		}
 		out = append(out, page)
 	}
 	return out
 }
 // normalizeTitle lowercases, removes leading articles, collapses whitespace.
 // "The Shape Up Method" → "shape up method"
 func normalizeTitle(s string) string {
 	s = strings.ToLower(strings.TrimSpace(s))
 	for _, article := range []string{"the ", "a ", "an "} {
 		s = strings.TrimPrefix(s, article)
 	}
 	s = strings.ReplaceAll(s, "-", " ")
 	return strings.Join(strings.Fields(s), " ")
 }
 // pageTypeFromPath extracts the wiki.PageType from a path like "wiki/entities/foo.md".
 func pageTypeFromPath(path string) wiki.PageType {
 	parts := strings.Split(filepath.ToSlash(path), "/")
 	if len(parts) >= 2 {
 		return wiki.PageType(parts[1])
 	}
 	return ""
 }
 // extractTitle reads the title field from YAML frontmatter in content.
 // Falls back to empty string if not found.
 func extractTitle(content string) string {
 	lines := strings.SplitN(content, "\n", 30)
 	inFM := false
 	for _, line := range lines {
 		if strings.TrimSpace(line) == "---" {
 			if !inFM {
 				inFM = true
 				continue
 			}
 			break
 		}
 		if inFM {
 			key, val, ok := strings.Cut(line, ":")
 			if ok && strings.TrimSpace(key) == "title" {
 				return strings.Trim(strings.TrimSpace(val), `"'`)
 			}
 		}
 	}
 	return ""
 }
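Review note: `normalizeTitle` applies the three `TrimPrefix` calls one after another, so more than one leading article can be removed, and a hyphenated first word is not treated as an article because hyphens are replaced only afterwards. The sketch below copies the function unchanged to make those edge cases visible; the sample titles are illustrative only.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeTitle is copied from resolve.go without modification.
func normalizeTitle(s string) string {
	s = strings.ToLower(strings.TrimSpace(s))
	for _, article := range []string{"the ", "a ", "an "} {
		s = strings.TrimPrefix(s, article)
	}
	s = strings.ReplaceAll(s, "-", " ")
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	for _, title := range []string{
		"The Shape Up Method", // one leading article
		"shape-up",            // hyphens become spaces
		"The A Team",          // "the " and then "a " are both stripped
		"An-Example",          // no article match: the hyphen is still present
	} {
		fmt.Printf("%q -> %q\n", title, normalizeTitle(title))
	}
}
```

If stripping two articles in a row is not intended, breaking out of the loop after the first successful trim would be one way to limit it.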
--- a/ingestion/internal/pipeline/resolve_test.go
+++ b/ingestion/internal/pipeline/resolve_test.go
@@ -0,0 +1,90 @@
 // ingestion/internal/pipeline/resolve_test.go
 package pipeline
 import (
 	"testing"
 	"github.com/stretchr/testify/assert"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
 func TestResolve_NoMatch(t *testing.T) {
 	proposed := []wiki.Page{
 		{Path: "wiki/entities/new-person.md", Content: "---\ntitle: New Person\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeEntity: {
 			{Slug: "ryan-singer", Title: "Ryan Singer", Aliases: []string{"Singer"}},
 		},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	assert.Equal(t, "wiki/entities/new-person.md", got[0].Path)
 }
 func TestResolve_TitleMatchRedirectsSlug(t *testing.T) {
 	proposed := []wiki.Page{
 		{Path: "wiki/entities/ryan-singer-the-designer.md", Content: "---\ntitle: Ryan Singer\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeEntity: {
 			{Slug: "ryan-singer", Title: "Ryan Singer", Aliases: nil},
 		},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	assert.Equal(t, "wiki/entities/ryan-singer.md", got[0].Path)
 }
 func TestResolve_AliasMatchRedirectsSlug(t *testing.T) {
 	proposed := []wiki.Page{
 		{Path: "wiki/entities/singer.md", Content: "---\ntitle: Singer\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeEntity: {
 			{Slug: "ryan-singer", Title: "Ryan Singer", Aliases: []string{"Singer", "R. Singer"}},
 		},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	assert.Equal(t, "wiki/entities/ryan-singer.md", got[0].Path)
 }
 func TestResolve_NormalizationCaseAndArticles(t *testing.T) {
 	proposed := []wiki.Page{
 		{Path: "wiki/concepts/the-shape-up-method.md", Content: "---\ntitle: The Shape Up Method\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeConcept: {
 			{Slug: "shape-up-method", Title: "Shape Up Method", Aliases: nil},
 		},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	assert.Equal(t, "wiki/concepts/shape-up-method.md", got[0].Path)
 }
 func TestResolve_OnlyMatchesSamePageType(t *testing.T) {
 	proposed := []wiki.Page{
 		{Path: "wiki/concepts/ryan-singer.md", Content: "---\ntitle: Ryan Singer\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{
 		wiki.PageTypeEntity: {
 			{Slug: "ryan-singer", Title: "Ryan Singer", Aliases: nil},
 		},
 		wiki.PageTypeConcept: {},
 	}
 	got := Resolve(proposed, inventory)
 	assert.Len(t, got, 1)
 	assert.Equal(t, "wiki/concepts/ryan-singer.md", got[0].Path)
 }
 func TestResolve_EmptyInventory(t *testing.T) {
 	proposed := []wiki.Page{
 		{Path: "wiki/entities/first.md", Content: "---\ntitle: First\n---\n"},
 	}
 	inventory := map[wiki.PageType][]wiki.Entry{}
 	got := Resolve(proposed, inventory)
 	assert.Equal(t, proposed, got)
 }
--- a/ingestion/internal/watcher/watcher.go
+++ b/ingestion/internal/watcher/watcher.go
@@ -4,6 +4,7 @@ package watcher
 import (
 	"context"
 	"fmt"
 	"io"
 	"log/slog"
 	"os"
 	"path/filepath"
@@ -11,6 +12,7 @@ import (
 	"time"
 	"unicode"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/extract"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
 )
@@ -72,6 +74,14 @@ func processDir(ctx context.Context, cfg Config, date string) []error {
 			return nil
 		}
 		// Skip files that have already been processed or permanently failed.
 		if _, err := os.Stat(path + ".processed"); err == nil {
 			return nil
 		}
 		if _, err := os.Stat(path + ".failed"); err == nil {
 			return nil
 		}
 		if err := processFile(ctx, cfg, path, date); err != nil {
 			errs = append(errs, fmt.Errorf("process %s: %w", filepath.Base(path), err))
 		}
@@ -83,58 +93,85 @@ func processDir(ctx context.Context, cfg Config, date string) []error {
 	return errs
 }
-// processFile reads a file, calls pipeline.Run, moves it to processed/ or failed/.
+// processFile reads a file, calls pipeline.Run, copies it to processed/ or failed/,
+// and writes a marker file next to the original so the watcher skips it next poll.
+// The original file is never deleted, keeping Syncthing-connected vaults (e.g. Obsidian) intact.
 func processFile(ctx context.Context, cfg Config, path, date string) error {
 	filename := filepath.Base(path)
 	source := deriveSource(filename)
-	content, err := os.ReadFile(path)
+	content, err := extract.Text(path)
 	if err != nil {
-		return fmt.Errorf("read file: %w", err)
+		return fmt.Errorf("extract text: %w", err)
 	}
-	_, runErr := pipeline.Run(ctx, cfg.Pipeline, cfg.BrainDir, string(content), source, false)
+	_, runErr := pipeline.Run(ctx, cfg.Pipeline, cfg.BrainDir, content, source, false)
 	if runErr != nil {
-		// Move to failed/.
+		// Copy to failed/ and leave a .failed marker so we don't retry.
 		failedDir := filepath.Join(cfg.BrainDir, "raw", "failed")
 		if mkErr := os.MkdirAll(failedDir, 0o755); mkErr != nil {
 			return fmt.Errorf("mkdir failed dir: %w", mkErr)
 		}
 		dest := filepath.Join(failedDir, filename)
-		if mvErr := os.Rename(path, dest); mvErr != nil {
+		if cpErr := copyFile(path, dest); cpErr != nil {
-			return fmt.Errorf("move to failed: %w", mvErr)
+			return fmt.Errorf("copy to failed: %w", cpErr)
 		}
 		if mkErr := os.WriteFile(path+".failed", []byte(runErr.Error()), 0o644); mkErr != nil {
 			slog.Error("watcher: failed to write .failed marker", "error", mkErr)
 		}
-		slog.Warn("watcher: file failed, moved to failed/", "file", filename, "error", runErr)
+		slog.Warn("watcher: file failed", "file", filename, "error", runErr)
 		if logErr := appendWatcherLog(cfg.BrainDir, filename, runErr, date); logErr != nil {
 			slog.Error("watcher: failed to write log entry", "error", logErr)
 		}
-		// Return nil: the file was quarantined successfully; the error was already
+		// Return nil: quarantine succeeded; the error was already
 		// logged. Returning runErr would cause processDir to log it again at Error level.
 		return nil
 	}
-	// Move to processed/YYYY-MM-DD/.
+	// Copy to processed/YYYY-MM-DD/ and leave a .processed marker so we don't re-ingest.
 	processedDir := filepath.Join(cfg.BrainDir, "raw", "processed", date)
 	if err := os.MkdirAll(processedDir, 0o755); err != nil {
 		return fmt.Errorf("mkdir processed dir: %w", err)
 	}
 	dest := filepath.Join(processedDir, filename)
 	if _, err := os.Stat(dest); err == nil {
-		// File already exists in processed; append timestamp to avoid overwriting the archive.
+		// Archive copy already exists; append timestamp to avoid overwriting.
 		ext := filepath.Ext(filename)
 		base := strings.TrimSuffix(filename, ext)
 		dest = filepath.Join(processedDir, base+"-"+time.Now().UTC().Format("150405")+ext)
 	}
-	if err := os.Rename(path, dest); err != nil {
+	if err := copyFile(path, dest); err != nil {
-		return fmt.Errorf("move to processed: %w", err)
+		return fmt.Errorf("copy to processed: %w", err)
 	}
 	if err := os.WriteFile(path+".processed", []byte(date), 0o644); err != nil {
 		slog.Error("watcher: failed to write .processed marker", "error", err)
 	}
 	slog.Info("watcher: file processed", "file", filename, "source", source)
 	return nil
 }
 // copyFile copies src to dst, creating dst if it doesn't exist.
 func copyFile(src, dst string) error {
 	in, err := os.Open(src)
 	if err != nil {
 		return fmt.Errorf("open src: %w", err)
 	}
 	defer in.Close() //nolint:errcheck
 	out, err := os.Create(dst)
 	if err != nil {
 		return fmt.Errorf("create dst: %w", err)
 	}
 	if _, err := io.Copy(out, in); err != nil {
 		out.Close() //nolint:errcheck
 		return fmt.Errorf("copy: %w", err)
 	}
 	return out.Close()
 }
 // deriveSource turns a filename into a human-readable source name.
 // "shape-up-book.md" → "Shape Up Book"
 func deriveSource(filename string) string {
@@ -164,10 +201,10 @@ func appendWatcherLog(brainDir, filename string, runErr error, date string) erro
 	if err != nil {
 		return fmt.Errorf("open log: %w", err)
 	}
-	defer f.Close()
 	if _, err = f.WriteString(entry); err != nil {
+		f.Close() //nolint:errcheck
 		return fmt.Errorf("write log: %w", err)
 	}
-	return nil
+	return f.Close()
 }
--- a/ingestion/internal/watcher/watcher_test.go
+++ b/ingestion/internal/watcher/watcher_test.go
@@ -14,13 +14,12 @@ import (
 	"github.com/stretchr/testify/require"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/wiki"
 )
-// successComplete returns a valid JSON-encoded page array for any call.
+// successComplete returns a valid JSON-encoded RawPage array for any call.
-func successComplete(page wiki.Page) pipeline.CompleteFunc {
+func successComplete(raw pipeline.RawPage) pipeline.CompleteFunc {
 	return func(ctx context.Context, system, user string) (string, error) {
-		b, err := json.Marshal([]wiki.Page{page})
+		b, err := json.Marshal([]pipeline.RawPage{raw})
 		if err != nil {
 			return "", err
 		}
@@ -50,16 +49,19 @@ func TestStart_ProcessesFile(t *testing.T) {
 	require.NoError(t, os.WriteFile(rawFile, []byte("Content about Shape Up."), 0o644))
 	date := time.Now().UTC().Format("2006-01-02")
-	wikiPage := wiki.Page{
+	rawPage := pipeline.RawPage{
-		Path:    "wiki/sources/shape-up-book.md",
+		Title:   "Shape Up Book",
-		Content: "---\ntitle: Shape Up Book\ntype: article\ndomain: product-management\ndate_ingested: " + date + "\nlast_updated: " + date + "\naliases:\n  - Shape Up Book\n---\n\n## Summary\n\nA book about Shape Up.\n",
+		Type:    "source",
+		Subtype: "article",
+		Domain:  "product-management",
+		Content: "## Summary\n\nA book about Shape Up.\n",
 	}
 	cfg := Config{
 		BrainDir: brainDir,
 		Interval: 50 * time.Millisecond,
 		Pipeline: pipeline.Config{
-			Complete:  successComplete(wikiPage),
+			Complete:  successComplete(rawPage),
 			ChunkSize: 0,
 			Schema:    "# Schema\nThree page types.",
 		},
@@ -81,11 +83,15 @@ func TestStart_ProcessesFile(t *testing.T) {
 		}
 		time.Sleep(20 * time.Millisecond)
 	}
-	require.True(t, found, "file should be moved to processed/")
+	require.True(t, found, "file should be copied to processed/")
-	// Original file should be gone.
+	// Original file should still exist (copy, not move — keeps Obsidian vault intact).
 	_, err := os.Stat(rawFile)
-	assert.True(t, os.IsNotExist(err), "original file should be gone from raw/")
+	assert.NoError(t, err, "original file should remain in raw/")
 	// A .processed marker should exist next to the original.
 	_, err = os.Stat(rawFile + ".processed")
 	assert.NoError(t, err, ".processed marker should be written")
 	// Wiki page should exist.
 	wikiPath := filepath.Join(brainDir, "wiki", "sources", "shape-up-book.md")
@@ -130,11 +136,15 @@ func TestStart_MovesToFailedOnError(t *testing.T) {
 		}
 		time.Sleep(20 * time.Millisecond)
 	}
-	require.True(t, found, "file should be moved to failed/")
+	require.True(t, found, "file should be copied to failed/")
-	// Original file should be gone from raw/.
+	// Original file should still exist (copy, not move — keeps Obsidian vault intact).
 	_, err := os.Stat(rawFile)
-	assert.True(t, os.IsNotExist(err), "original file should be gone from raw/")
+	assert.NoError(t, err, "original file should remain in raw/")
 	// A .failed marker should exist next to the original.
 	_, err = os.Stat(rawFile + ".failed")
 	assert.NoError(t, err, ".failed marker should be written")
 	// log.md should contain a watcher error entry.
 	logContent, err := os.ReadFile(filepath.Join(brainDir, "log.md"))
@@ -185,12 +195,14 @@ func TestProcessDir_SkipsSubdirs(t *testing.T) {
 	// Track which sources were passed to Complete.
 	var processedSources []string
 	completeFn := func(ctx context.Context, system, user string) (string, error) {
-		// Record that this was called; return a minimal valid page.
+		// Record that this was called; return a minimal valid RawPage.
-		page := wiki.Page{
+		raw := pipeline.RawPage{
-			Path:    "wiki/sources/valid.md",
+			Title:   "Valid",
-			Content: "---\ntitle: Valid\n---\n\n## Summary\n\nValid.\n",
+			Type:    "source",
+			Subtype: "article",
+			Content: "## Summary\n\nValid.\n",
 		}
-		b, _ := json.Marshal([]wiki.Page{page})
+		b, _ := json.Marshal([]pipeline.RawPage{raw})
 		processedSources = append(processedSources, "called")
 		return string(b), nil
 	}
--- a/ingestion/internal/wiki/inventory.go
+++ b/ingestion/internal/wiki/inventory.go
@@ -32,23 +32,26 @@ func LoadInventory(brainDir string) (map[PageType][]Entry, error) {
 			}
 			slug := strings.TrimSuffix(e.Name(), ".md")
 			path := filepath.Join(dir, e.Name())
-			title := readTitle(path, slug)
+			title, aliases := readFrontmatter(path, slug)
-			result[pt] = append(result[pt], Entry{Slug: slug, Title: title, Type: pt})
+			result[pt] = append(result[pt], Entry{Slug: slug, Title: title, Aliases: aliases, Type: pt})
 		}
 	}
 	return result, nil
 }
-// readTitle extracts the title from YAML frontmatter, falling back to slug.
+// readFrontmatter extracts title and aliases from YAML frontmatter.
-func readTitle(path, fallback string) string {
+// Falls back to slug for title and empty aliases on any error.
+func readFrontmatter(path, fallbackSlug string) (title string, aliases []string) {
+	title = fallbackSlug
 	f, err := os.Open(path)
 	if err != nil {
-		return fallback
+		return
 	}
-	defer f.Close()
+	defer f.Close() //nolint:errcheck
 	scanner := bufio.NewScanner(f)
 	inFM := false
 	inAliases := false
 	for scanner.Scan() {
 		line := scanner.Text()
 		if strings.TrimSpace(line) == "---" {
@@ -56,14 +59,32 @@ func readTitle(path, fallback string) string {
 				inFM = true
 				continue
 			}
-			break
+			break // end of frontmatter
 		}
-		if inFM {
+		if !inFM {
-			key, val, ok := strings.Cut(line, ":")
+			continue
-			if ok && strings.TrimSpace(key) == "title" {
+		}
-				return strings.Trim(strings.TrimSpace(val), `"'`)
+
 		// Detect alias list items (lines starting with "  - ").
 		if inAliases {
 			trimmed := strings.TrimSpace(line)
 			if strings.HasPrefix(trimmed, "- ") {
 				aliases = append(aliases, strings.TrimPrefix(trimmed, "- "))
 				continue
 			}
 			inAliases = false // end of alias block
 		}
 		key, val, ok := strings.Cut(line, ":")
 		if !ok {
 			continue
 		}
 		switch strings.TrimSpace(key) {
 		case "title":
 			title = strings.Trim(strings.TrimSpace(val), `"'`)
 		case "aliases":
 			inAliases = true
 		}
 	}
-	return fallback
+	return
 }
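Review note: the alias parser is line-based and only recognises block-style list items under `aliases:`. The sketch below restates the same logic over a string so it can be run without touching disk; `parseFrontmatter` is a hypothetical name and the sample frontmatter is illustrative.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseFrontmatter mirrors readFrontmatter but reads from a string.
func parseFrontmatter(doc, fallback string) (title string, aliases []string) {
	title = fallback
	sc := bufio.NewScanner(strings.NewReader(doc))
	inFM, inAliases := false, false
	for sc.Scan() {
		line := sc.Text()
		if strings.TrimSpace(line) == "---" {
			if !inFM {
				inFM = true
				continue
			}
			break // end of frontmatter
		}
		if !inFM {
			continue
		}
		if inAliases {
			trimmed := strings.TrimSpace(line)
			if strings.HasPrefix(trimmed, "- ") {
				aliases = append(aliases, strings.TrimPrefix(trimmed, "- "))
				continue
			}
			inAliases = false // end of alias block
		}
		key, val, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		switch strings.TrimSpace(key) {
		case "title":
			title = strings.Trim(strings.TrimSpace(val), `"'`)
		case "aliases":
			inAliases = true
		}
	}
	return title, aliases
}

func main() {
	block := "---\ntitle: Ryan Singer\naliases:\n  - Singer\n  - R. Singer\n---\n"
	flow := "---\ntitle: Ryan Singer\naliases: [Singer, R. Singer]\n---\n"

	title, aliases := parseFrontmatter(block, "ryan-singer")
	fmt.Println(title, len(aliases)) // block list: both aliases are collected

	title, aliases = parseFrontmatter(flow, "ryan-singer")
	fmt.Println(title, len(aliases)) // flow list: no aliases are collected
}
```

Two consequences follow from the code as written: a flow-style list such as `aliases: [Singer, R. Singer]` yields no aliases, and a quoted item such as `- "Singer"` keeps its quotes, because only the title value is trimmed of quote characters.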
--- a/ingestion/internal/wiki/inventory_test.go
+++ b/ingestion/internal/wiki/inventory_test.go
@@ -60,3 +60,24 @@ func TestLoadInventory_MissingDirsOk(t *testing.T) {
 	require.NoError(t, err)
 	assert.NotNil(t, inv)
 }
 func TestLoadInventory_ReadsAliases(t *testing.T) {
 	dir := t.TempDir()
 	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki", "entities"), 0o755))
 	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki", "concepts"), 0o755))
 	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki", "sources"), 0o755))
 	require.NoError(t, os.WriteFile(
 		filepath.Join(dir, "wiki", "entities", "ryan-singer.md"),
 		[]byte("---\ntitle: Ryan Singer\naliases:\n  - Singer\n  - R. Singer\n---\n\n## Description\n\nDesigner.\n"),
 		0o644,
 	))
 	inv, err := LoadInventory(dir)
 	require.NoError(t, err)
 	require.Len(t, inv[PageTypeEntity], 1)
 	e := inv[PageTypeEntity][0]
 	assert.Equal(t, "Ryan Singer", e.Title)
 	assert.Equal(t, []string{"Singer", "R. Singer"}, e.Aliases)
 }
--- a/ingestion/internal/wiki/log.go
+++ b/ingestion/internal/wiki/log.go
@@ -32,9 +32,9 @@ func AppendLog(brainDir, source string, pages, warnings []string, date string) e
 	if err != nil {
 		return fmt.Errorf("open log: %w", err)
 	}
-	defer f.Close()
 	if _, err = f.WriteString(sb.String()); err != nil {
+		f.Close() //nolint:errcheck
 		return fmt.Errorf("write log: %w", err)
 	}
-	return nil
+	return f.Close()
 }
--- a/ingestion/internal/wiki/slug.go
+++ b/ingestion/internal/wiki/slug.go
@@ -21,7 +21,7 @@ func Slug(title string) string {
 		case unicode.IsLetter(r) || unicode.IsDigit(r):
 			b.WriteRune(r)
 			prevHyphen = false
-		// all other characters (apostrophes, colons, dots, etc.) are dropped
+			// all other characters (apostrophes, colons, dots, etc.) are dropped
 		}
 	}
 	return strings.TrimRight(b.String(), "-")
--- a/ingestion/internal/wiki/types.go
+++ b/ingestion/internal/wiki/types.go
@@ -18,7 +18,8 @@ type Page struct {
 // Entry is a summary of an existing wiki page used to build the inventory.
 type Entry struct {
-	Slug  string
+	Slug    string
-	Title string
+	Title   string
-	Type  PageType
+	Aliases []string
+	Type    PageType
 }
Author	SHA1	Message	Date
Mathias Bergqvist	923a665365	fix(pipeline): skip RawPages with empty title in BuildPages instead of producing broken paths [CI: all checks successful; mirror skipped] Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 19:55:37 +02:00
Mathias Bergqvist	537aebc302	feat(pipeline): update system prompt for new LLM JSON contract (no slugs) - Change prompt to reflect new output format: title, type, subtype, domain, content - Remove slug/path generation responsibility from LLM — pipeline now handles it - Wikilinks change from [[slug\|Display Name]] to [[Display Name]] only - LLM no longer includes frontmatter or paths in output docs(schema): update LLM output format and wikilink convention for Level 3 - Specify JSON schema: title, type, subtype, domain, content fields - Remove frontmatter requirements from schema output (handled by pipeline) - Simplify wikilink format to [[Display Name]] — no slug or pipe - Pipeline now responsible for slug generation and frontmatter construction These changes shift slug/frontmatter generation from LLM to pipeline, reducing cognitive load on the model and improving control over output. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 19:45:21 +02:00
Mathias Bergqvist	de35d4dbb0	feat(pipeline): wire ParseRawPages+BuildPages+CanonicalizeLinks into Run Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 19:07:33 +02:00
Mathias Bergqvist	26855f69b0	feat(pipeline): add CanonicalizeLinks — convert [[Display Name]] to [[slug\|Display Name]]	2026-04-23 18:59:10 +02:00
Mathias Bergqvist	a7b363d589	fix(pipeline): quote YAML scalar fields in buildFrontmatter to prevent injection	2026-04-23 18:56:39 +02:00
Mathias Bergqvist	7b57051af8	feat(pipeline): add BuildPages — compute slugs/paths/frontmatter from RawPage	2026-04-23 18:50:37 +02:00
Mathias Bergqvist	a620f6cb01	fix(pipeline): guard empty-title bridge + skip stale integration tests until task4 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 18:46:07 +02:00
Mathias Bergqvist	26b5636b43	feat(pipeline): replace ParsePages with ParseRawPages + RawPage type Strips slug authority from the LLM. The new RawPage type carries only {title, type, subtype, domain, content} — no paths or frontmatter. Pipeline will derive slugs deterministically (Task 4). pipeline.go gets a temporary bridge stub (TODO task4) to keep the package compiling between tasks. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 18:41:33 +02:00
Mathias Bergqvist	989f375aec	docs: add Level 3 implementation plan	2026-04-23 17:37:45 +02:00
Mathias Bergqvist	6403d5e444	docs: add Level 3 slug authority design spec	2026-04-23 17:23:22 +02:00
Mathias Bergqvist	ab19968ae2	feat: POST /backfill-refs — retroactive source back-reference injection [CI: all checks successful] Walks wiki/sources/, extracts wikilinks from each source page, and injects ## Sources back-refs into all linked concept and entity pages. All refs from all sources are accumulated in memory before writing, so multiple sources referencing the same concept are merged in a single write. Running the endpoint multiple times is safe — wiki.Merge deduplicates bullet items.	2026-04-23 16:50:11 +02:00
Mathias Bergqvist	1605624668	feat(pipeline): add POST /backfill-refs endpoint to retroactively inject source back-references	2026-04-23 16:50:00 +02:00
Mathias Bergqvist	55fa0b503a	feat: source back-references on concept and entity pages After each ingestion, every concept and entity page linked from the source page gains a ## Sources entry pointing back to that source. Pages already on disk (from prior ingestions) are loaded and updated, so re-ingesting a new source accumulates references over time. Deduplication is handled by wiki.Merge's existing bullet-section logic.	2026-04-23 16:36:40 +02:00
Mathias Bergqvist	3c2bd9268c	feat(pipeline): wire source back-reference injection into Run	2026-04-23 16:36:22 +02:00
Mathias Bergqvist	29727ec2a5	feat(pipeline): inject source back-references into concept and entity pages	2026-04-23 16:35:47 +02:00
Mathias Bergqvist	0a075088b2	docs: add source back-references implementation plan	2026-04-23 16:33:41 +02:00
Mathias Bergqvist	1bfe501d09	fix(cd): only deploy when CI passes on main	2026-04-23 16:24:59 +02:00
Mathias Bergqvist	3607920601	fix(lint): resolve all errcheck violations in ingestion module	2026-04-23 16:20:59 +02:00
Mathias Bergqvist	a6c39e8691	feat: PDF extraction and fuzzy entity resolution - New extract package: Text() dispatcher for .md/.txt passthrough and PDF extraction via pdftotext subprocess - wiki.Entry gains Aliases []string, loaded from YAML frontmatter - Fuzzy entity resolution in pipeline: normalizes titles (lowercase, strip articles, collapse hyphens) and matches proposed pages against existing inventory slugs and aliases to prevent proliferation - Watcher and API handler now use extract.Text() instead of os.ReadFile - Dockerfile: apk add poppler-utils in Alpine runtime stage	2026-04-23 16:03:02 +02:00
Mathias Bergqvist	a37d18bf7a	chore(docker): add poppler-utils for PDF text extraction	2026-04-23 16:02:12 +02:00
Mathias Bergqvist	2975eadc87	feat(watcher,api): use extract.Text() for file reading — fixes PDF ingestion	2026-04-23 16:01:36 +02:00
Mathias Bergqvist	53e46781b1	feat(pipeline): resolve proposed pages against inventory before writing	2026-04-23 16:00:31 +02:00
Mathias Bergqvist	e9b5cc401c	feat(pipeline): add fuzzy entity resolution to prevent slug proliferation	2026-04-23 15:59:36 +02:00
Mathias Bergqvist	bf6f497d9d	feat(wiki): add Aliases to Entry and read from YAML frontmatter	2026-04-23 15:57:16 +02:00
Mathias Bergqvist	9cc6c2d053	feat(extract): implement PDF extraction via pdftotext	2026-04-23 15:53:46 +02:00
Mathias Bergqvist	43a46d07e5	feat(extract): add Text() dispatcher with md/txt passthrough	2026-04-23 15:45:20 +02:00
Mathias Bergqvist	820d1c93a7	docs: add implementation plan for PDF extraction and entity resolution	2026-04-23 15:44:13 +02:00
Mathias Bergqvist	6928907d79	fix(watcher): copy files instead of moving them, leave originals for Obsidian Files dropped into brain/raw/ are now copied to processed/ or failed/ rather than moved. A .processed or .failed marker is written next to the original so the watcher skips it on subsequent polls without deleting it. This keeps Syncthing-synced Obsidian vaults intact after ingestion. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-23 14:47:50 +02:00