merge: client-name scrubber rule (refs hyperguild#27)

feat(claudewatcher): client-name guard via RegisterRule + env
Pre-rollout guard. Source code stays clean: client identities come from the CLAUDE_INGEST_CLIENT_BLOCK env var (sourced from a SOPS-encrypted k8s secret in the infra repo).

The env value is a regex alternation; main wraps it with `(?i)\b(...)\b` so that word-boundary matching avoids false hits inside longer identifiers (e.g. a "SEB" entry does not trigger on "Sebastian").

DefaultRules (credential shapes) still take precedence, so any leak that is BOTH a client mention AND a credential shape logs as the credential: that is strictly more dangerous and points triage at the right thing.

Tests cover precedence, case variations, word-boundary respect, and invalid-pattern rejection.

Refs: infra#73 Track E.1 pre-rollout grill (option B).
Bump-Type: minor
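A minimal sketch of the wrapping step described above, not the code from this change. The helper name `compileClientBlock`, the empty-value check, and the placeholder alternation `acme|seb` are assumptions for illustration; only the env var name and the `(?i)\b(...)\b` wrapper come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"regexp"
)

// compileClientBlock is a hypothetical helper: it wraps a raw regex
// alternation (for example "acme|seb") in (?i)\b(...)\b so that matching is
// case-insensitive and anchored on word boundaries.
func compileClientBlock(alternation string) (*regexp.Regexp, error) {
	// Assumption: an empty value is rejected, because (?i)\b()\b would
	// match at every word boundary and flag almost any text.
	if alternation == "" {
		return nil, errors.New("client block pattern is empty")
	}
	// regexp.Compile returns an error for invalid patterns instead of
	// panicking, so a bad secret value can be rejected at startup.
	return regexp.Compile(`(?i)\b(` + alternation + `)\b`)
}

func main() {
	raw := os.Getenv("CLAUDE_INGEST_CLIENT_BLOCK")
	if raw == "" {
		raw = "acme|seb" // placeholder value, not a real client list
	}

	re, err := compileClientBlock(raw)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid client block pattern:", err)
		os.Exit(1)
	}

	// A standalone mention versus the same letters inside a longer word.
	fmt.Println(re.MatchString("call with SEB tomorrow"))
	fmt.Println(re.MatchString("Sebastian joined the call"))
}
```

With the placeholder alternation, the first input matches and the second does not, because no word boundary falls between "Seb" and "astian". Go's `\b` is an ASCII word boundary, so names containing non-ASCII letters may need separate handling.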
2026-05-26 07:10:05 +02:00 · 2026-05-26 07:10:05 +02:00 · 2026-05-25 19:59:13 +02:00 · 2026-05-25 19:59:07 +02:00 · 2026-05-25 19:58:58 +02:00 · 2026-05-25 18:53:14 +02:00
51 changed files with 4981 additions and 680 deletions
--- a/brain/eval/baseline-pre-fix.txt
+++ b/brain/eval/baseline-pre-fix.txt
@@ -0,0 +1,167 @@
+# baseline-pre-fix — 20 questions, k=5
+
+top-1 hit rate: 4/20 = 20%
+top-3 hit rate: 13/20 = 65%
+
+## per-question detail
+
+· rank=3  expected=dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+     q: how do I stop dex from logging users out on every pod restart?
+     1. homelab-network-perimeter-model
+     2. 2026-05-12-koala-machine-state
+     3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart  <-- expected
+     4. infra-litellm-absorption-2026-05-16
+     5. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+
+★ rank=1  expected=postgres-least-privilege-migration-tenant-grant-bypass-2026-05
+     q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
+     1. postgres-least-privilege-migration-tenant-grant-bypass-2026-05  <-- expected
+     2. infra-litellm-absorption-2026-05-16
+     3. brain-mcp-activation-runbook
+     4. extension-version-lags-platform-major-upgrade
+     5. ntfy-deny-all-rollout-ordering-keep-alert-pipeline-live-during-auth-flip
+
+★ rank=1  expected=homelab-network-perimeter-model
+     q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
+     1. homelab-network-perimeter-model  <-- expected
+     2. qwen3-thinking-model-empty-content-trap
+     3. mcpclient-empty-token-silent-401-envfrom-missing-key
+     4. 2026-05-12-koala-machine-state
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=3  expected=exit-255-unknown-reason-not-oom
+     q: what does container exit code 255 with reason Unknown mean?
+     1. qwen3-thinking-model-empty-content-trap
+     2. infra-litellm-absorption-2026-05-16
+     3. exit-255-unknown-reason-not-oom  <-- expected
+     4. mcpclient-empty-token-silent-401-envfrom-missing-key
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=3  expected=gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+     q: can gitea push-mirror create the github repo automatically?
+     1. infra-litellm-absorption-2026-05-16
+     2. Autoresearch
+     3. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo  <-- expected
+     4. adr-new-project-gitea-first-github-mirror
+     5. adr-github-as-primary-remote
+
+✗ rank=0  expected=flux-healthcheck-stale-on-resource-removal
+     q: a flux kustomization is stuck after I removed a resource — why?
+     1. qwen3-thinking-model-empty-content-trap
+     2. 2026-05-12-koala-machine-state
+     3. homelab-architecture-principles-2026-05
+     4. gitea-mcp: full stack shipped end-to-end (2026-05-05)
+     5. k8s-configmap-mount-no-reload-needs-pod-restart
+
+· rank=2  expected=go-bytes-buffer-bytes-reset-aliasing-trap
+     q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
+     1. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+     2. go-bytes-buffer-bytes-reset-aliasing-trap  <-- expected
+     3. homelab-security-chains-not-bugs
+     4. training-on-rtx-5070-pretraining-vs-finetuning
+     5. Hash Encoding
+
+★ rank=1  expected=homelab-architecture-principles-2026-05
+     q: what are the homelab architecture principles from may 2026?
+     1. homelab-architecture-principles-2026-05  <-- expected
+     2. homelab-network-perimeter-model
+     3. Claude Managed Agents — architecture notes relevant to homelab agent platform
+     4. homelab-core-glossary
+     5. 2026-05-12-koala-machine-state
+
+✗ rank=0  expected=2026-05-04-sops-age-key-from-flux-cluster
+     q: where does the sops age private key live in the cluster?
+     1. 2026-05-12-koala-machine-state
+     2. homelab-network-perimeter-model
+     3. postgres-least-privilege-migration-tenant-grant-bypass-2026-05
+     4. brain-mcp-activation-runbook
+     5. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+
+✗ rank=0  expected=grafana-dashboards-as-code-not-ui-state
+     q: why do my grafana dashboards disappear after a pod restart?
+     1. infra-litellm-absorption-2026-05-16
+     2. 2026-05-12-koala-machine-state
+     3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+     4. brain-mcp-activation-runbook
+     5. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+
+· rank=2  expected=double-diamond-methodology
+     q: what is the double diamond methodology?
+     1. Harnessing the Power of Hash Encoding for Categorical Data in Data Science
+     2. double-diamond-methodology  <-- expected
+     3. unified-methodology-diamond-futures-autoresearch
+     4. futures-thinking-extended-double-diamond
+     5. insight-exploration-as-diamond-1
+
+· rank=3  expected=2026-05-04-mcp-transport-version-claude-ai-strict
+     q: my MCP server works from claude code but fails on claude.ai — what's different?
+     1. qwen3-thinking-model-empty-content-trap
+     2. mcp-resource-url-empty-breaks-claude-ai-discovery-silently
+     3. 2026-05-04-mcp-transport-version-claude-ai-strict  <-- expected
+     4. 2026-05-04-claude-ai-custom-mcp-connectors
+     5. finding-github-mcp-claudeai-vs-claudecode
+
+· rank=2  expected=homelab-security-chains-not-bugs
+     q: how should I rate security findings — isolated bugs or exploit chains?
+     1. homelab-network-perimeter-model
+     2. homelab-security-chains-not-bugs  <-- expected
+     3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+     4. policy-audit-mode-blocks-nothing
+     5. homelab-document-accepted-risk-to-break-audit-cycle
+
+· rank=2  expected=2026-05-03-canonical-vs-derived-context-flow
+     q: how should canonical context files relate to derived adapter files?
+     1. qwen3-thinking-model-empty-content-trap
+     2. 2026-05-03-canonical-vs-derived-context-flow  <-- expected
+     3. 2026-05-12-koala-machine-state
+     4. 2026-05-04-claude-ai-custom-mcp-connectors
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=2  expected=homelab-core-glossary
+     q: what is the homelab core vocabulary glossary?
+     1. homelab-architecture-principles-2026-05
+     2. homelab-core-glossary  <-- expected
+     3. Claude Managed Agents — architecture notes relevant to homelab agent platform
+     4. 2026-05-12-koala-machine-state
+     5. Autoresearch
+
+★ rank=1  expected=koala-llama-swap-native-tool-calls-survey-2026-05
+     q: which models on koala llama-swap actually emit native tool_calls correctly?
+     1. koala-llama-swap-native-tool-calls-survey-2026-05  <-- expected
+     2. 2026-05-12-koala-machine-state
+     3. infra-litellm-absorption-2026-05-16
+     4. training-on-rtx-5070-pretraining-vs-finetuning
+     5. qwen3-thinking-model-empty-content-trap
+
+✗ rank=0  expected=qwen35-9b-fast
+     q: what is qwen35-9b-fast and what's it used for?
+     1. koala-llama-swap-native-tool-calls-survey-2026-05
+     2. qwen3-thinking-model-empty-content-trap
+     3. Qwen35-9b-fast
+     4. infra-litellm-absorption-2026-05-16
+     5. 2026-05-12-koala-machine-state
+
+✗ rank=0  expected=go-defer-errcheck-body-close
+     q: in go, how do I prevent defer body close from silently dropping errors?
+     1. infra-litellm-absorption-2026-05-16
+     2. homelab-network-perimeter-model
+     3. go-bytes-buffer-bytes-reset-aliasing-trap
+     4. mcpclient-empty-token-silent-401-envfrom-missing-key
+     5. brain-mcp-activation-runbook
+
+✗ rank=0  expected=hyperguild-level3-pipeline-rewrite
+     q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
+     1. 2026-05-12-koala-machine-state
+     2. homelab-core-glossary
+     3. brain-mcp-activation-runbook
+     4. koala-llama-swap-native-tool-calls-survey-2026-05
+     5. infra-litellm-absorption-2026-05-16
+
+? rank=4  expected=adr-new-project-gitea-first-github-mirror
+     q: what's the new-project ADR — is it gitea-first or github-first?
+     1. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+     2. gitea-mcp: full stack shipped end-to-end (2026-05-05)
+     3. mcp-tool-design-get-needs-list-partner
+     4. adr-new-project-gitea-first-github-mirror  <-- expected
+     5. 2026-05-04-gitea-mcp-build-session
+
--- a/brain/eval/post-fix.txt
+++ b/brain/eval/post-fix.txt
@@ -0,0 +1,167 @@
+# post-fix — 20 questions, k=5
+
+top-1 hit rate: 4/20 = 20%
+top-3 hit rate: 14/20 = 70%
+
+## per-question detail
+
+· rank=3  expected=dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+     q: how do I stop dex from logging users out on every pod restart?
+     1. homelab-network-perimeter-model
+     2. 2026-05-12-koala-machine-state
+     3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart  <-- expected
+     4. infra-litellm-absorption-2026-05-16
+     5. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+
+★ rank=1  expected=postgres-least-privilege-migration-tenant-grant-bypass-2026-05
+     q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
+     1. postgres-least-privilege-migration-tenant-grant-bypass-2026-05  <-- expected
+     2. infra-litellm-absorption-2026-05-16
+     3. brain-mcp-activation-runbook
+     4. extension-version-lags-platform-major-upgrade
+     5. ntfy-deny-all-rollout-ordering-keep-alert-pipeline-live-during-auth-flip
+
+★ rank=1  expected=homelab-network-perimeter-model
+     q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
+     1. homelab-network-perimeter-model  <-- expected
+     2. qwen3-thinking-model-empty-content-trap
+     3. mcpclient-empty-token-silent-401-envfrom-missing-key
+     4. 2026-05-12-koala-machine-state
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=3  expected=exit-255-unknown-reason-not-oom
+     q: what does container exit code 255 with reason Unknown mean?
+     1. qwen3-thinking-model-empty-content-trap
+     2. infra-litellm-absorption-2026-05-16
+     3. exit-255-unknown-reason-not-oom  <-- expected
+     4. mcpclient-empty-token-silent-401-envfrom-missing-key
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=3  expected=gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+     q: can gitea push-mirror create the github repo automatically?
+     1. infra-litellm-absorption-2026-05-16
+     2. Autoresearch
+     3. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo  <-- expected
+     4. adr-new-project-gitea-first-github-mirror
+     5. adr-github-as-primary-remote
+
+✗ rank=0  expected=flux-healthcheck-stale-on-resource-removal
+     q: a flux kustomization is stuck after I removed a resource — why?
+     1. qwen3-thinking-model-empty-content-trap
+     2. 2026-05-12-koala-machine-state
+     3. homelab-architecture-principles-2026-05
+     4. gitea-mcp: full stack shipped end-to-end (2026-05-05)
+     5. k8s-configmap-mount-no-reload-needs-pod-restart
+
+· rank=2  expected=go-bytes-buffer-bytes-reset-aliasing-trap
+     q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
+     1. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+     2. go-bytes-buffer-bytes-reset-aliasing-trap  <-- expected
+     3. homelab-security-chains-not-bugs
+     4. training-on-rtx-5070-pretraining-vs-finetuning
+     5. Hash Encoding
+
+★ rank=1  expected=homelab-architecture-principles-2026-05
+     q: what are the homelab architecture principles from may 2026?
+     1. homelab-architecture-principles-2026-05  <-- expected
+     2. homelab-network-perimeter-model
+     3. Claude Managed Agents — architecture notes relevant to homelab agent platform
+     4. homelab-core-glossary
+     5. 2026-05-12-koala-machine-state
+
+✗ rank=0  expected=2026-05-04-sops-age-key-from-flux-cluster
+     q: where does the sops age private key live in the cluster?
+     1. 2026-05-12-koala-machine-state
+     2. homelab-network-perimeter-model
+     3. postgres-least-privilege-migration-tenant-grant-bypass-2026-05
+     4. brain-mcp-activation-runbook
+     5. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+
+✗ rank=0  expected=grafana-dashboards-as-code-not-ui-state
+     q: why do my grafana dashboards disappear after a pod restart?
+     1. infra-litellm-absorption-2026-05-16
+     2. 2026-05-12-koala-machine-state
+     3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+     4. brain-mcp-activation-runbook
+     5. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+
+· rank=2  expected=double-diamond-methodology
+     q: what is the double diamond methodology?
+     1. Harnessing the Power of Hash Encoding for Categorical Data in Data Science
+     2. double-diamond-methodology  <-- expected
+     3. unified-methodology-diamond-futures-autoresearch
+     4. futures-thinking-extended-double-diamond
+     5. insight-exploration-as-diamond-1
+
+· rank=3  expected=2026-05-04-mcp-transport-version-claude-ai-strict
+     q: my MCP server works from claude code but fails on claude.ai — what's different?
+     1. qwen3-thinking-model-empty-content-trap
+     2. mcp-resource-url-empty-breaks-claude-ai-discovery-silently
+     3. 2026-05-04-mcp-transport-version-claude-ai-strict  <-- expected
+     4. 2026-05-04-claude-ai-custom-mcp-connectors
+     5. finding-github-mcp-claudeai-vs-claudecode
+
+· rank=2  expected=homelab-security-chains-not-bugs
+     q: how should I rate security findings — isolated bugs or exploit chains?
+     1. homelab-network-perimeter-model
+     2. homelab-security-chains-not-bugs  <-- expected
+     3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+     4. policy-audit-mode-blocks-nothing
+     5. homelab-document-accepted-risk-to-break-audit-cycle
+
+· rank=2  expected=2026-05-03-canonical-vs-derived-context-flow
+     q: how should canonical context files relate to derived adapter files?
+     1. qwen3-thinking-model-empty-content-trap
+     2. 2026-05-03-canonical-vs-derived-context-flow  <-- expected
+     3. 2026-05-12-koala-machine-state
+     4. 2026-05-04-claude-ai-custom-mcp-connectors
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=2  expected=homelab-core-glossary
+     q: what is the homelab core vocabulary glossary?
+     1. homelab-architecture-principles-2026-05
+     2. homelab-core-glossary  <-- expected
+     3. Claude Managed Agents — architecture notes relevant to homelab agent platform
+     4. 2026-05-12-koala-machine-state
+     5. Autoresearch
+
+★ rank=1  expected=koala-llama-swap-native-tool-calls-survey-2026-05
+     q: which models on koala llama-swap actually emit native tool_calls correctly?
+     1. koala-llama-swap-native-tool-calls-survey-2026-05  <-- expected
+     2. 2026-05-12-koala-machine-state
+     3. infra-litellm-absorption-2026-05-16
+     4. training-on-rtx-5070-pretraining-vs-finetuning
+     5. qwen3-thinking-model-empty-content-trap
+
+· rank=2  expected=qwen35-9b-fast
+     q: what is qwen35-9b-fast and what's it used for?
+     1. koala-llama-swap-native-tool-calls-survey-2026-05
+     2. qwen35-9b-fast  <-- expected
+     3. qwen3-thinking-model-empty-content-trap
+     4. infra-litellm-absorption-2026-05-16
+     5. 2026-05-12-koala-machine-state
+
+✗ rank=0  expected=go-defer-errcheck-body-close
+     q: in go, how do I prevent defer body close from silently dropping errors?
+     1. infra-litellm-absorption-2026-05-16
+     2. homelab-network-perimeter-model
+     3. go-bytes-buffer-bytes-reset-aliasing-trap
+     4. mcpclient-empty-token-silent-401-envfrom-missing-key
+     5. brain-mcp-activation-runbook
+
+✗ rank=0  expected=hyperguild-level3-pipeline-rewrite
+     q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
+     1. 2026-05-12-koala-machine-state
+     2. homelab-core-glossary
+     3. brain-mcp-activation-runbook
+     4. koala-llama-swap-native-tool-calls-survey-2026-05
+     5. infra-litellm-absorption-2026-05-16
+
+? rank=4  expected=adr-new-project-gitea-first-github-mirror
+     q: what's the new-project ADR — is it gitea-first or github-first?
+     1. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+     2. gitea-mcp: full stack shipped end-to-end (2026-05-05)
+     3. mcp-tool-design-get-needs-list-partner
+     4. adr-new-project-gitea-first-github-mirror  <-- expected
+     5. 2026-05-04-gitea-mcp-build-session
+
--- a/brain/eval/post-m4.txt
+++ b/brain/eval/post-m4.txt
@@ -0,0 +1,167 @@
+# post-m4-tier-weighting — 20 questions, k=5
+
+top-1 hit rate: 6/20 = 30%
+top-3 hit rate: 15/20 = 75%
+
+## per-question detail
+
+· rank=3  expected=dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+     q: how do I stop dex from logging users out on every pod restart?
+     1. homelab-network-perimeter-model
+     2. 2026-05-12-koala-machine-state
+     3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart  <-- expected
+     4. infra-litellm-absorption-2026-05-16
+     5. k8s-configmap-mount-no-reload-needs-pod-restart
+
+· rank=2  expected=postgres-least-privilege-migration-tenant-grant-bypass-2026-05
+     q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
+     1. infra-litellm-absorption-2026-05-16
+     2. postgres-least-privilege-migration-tenant-grant-bypass-2026-05  <-- expected
+     3. extension-version-lags-platform-major-upgrade
+     4. ntfy-deny-all-rollout-ordering-keep-alert-pipeline-live-during-auth-flip
+     5. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+
+★ rank=1  expected=homelab-network-perimeter-model
+     q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
+     1. homelab-network-perimeter-model  <-- expected
+     2. qwen3-thinking-model-empty-content-trap
+     3. mcpclient-empty-token-silent-401-envfrom-missing-key
+     4. 2026-05-12-koala-machine-state
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=3  expected=exit-255-unknown-reason-not-oom
+     q: what does container exit code 255 with reason Unknown mean?
+     1. qwen3-thinking-model-empty-content-trap
+     2. infra-litellm-absorption-2026-05-16
+     3. exit-255-unknown-reason-not-oom  <-- expected
+     4. mcpclient-empty-token-silent-401-envfrom-missing-key
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=2  expected=gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+     q: can gitea push-mirror create the github repo automatically?
+     1. infra-litellm-absorption-2026-05-16
+     2. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo  <-- expected
+     3. adr-new-project-gitea-first-github-mirror
+     4. adr-github-as-primary-remote
+     5. 2026-05-12-koala-machine-state
+
+✗ rank=0  expected=flux-healthcheck-stale-on-resource-removal
+     q: a flux kustomization is stuck after I removed a resource — why?
+     1. qwen3-thinking-model-empty-content-trap
+     2. 2026-05-12-koala-machine-state
+     3. homelab-architecture-principles-2026-05
+     4. k8s-configmap-mount-no-reload-needs-pod-restart
+     5. training-on-rtx-5070-pretraining-vs-finetuning
+
+★ rank=1  expected=go-bytes-buffer-bytes-reset-aliasing-trap
+     q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
+     1. go-bytes-buffer-bytes-reset-aliasing-trap  <-- expected
+     2. homelab-security-chains-not-bugs
+     3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+     4. training-on-rtx-5070-pretraining-vs-finetuning
+     5. flux-healthcheck-stale-on-resource-removal
+
+★ rank=1  expected=homelab-architecture-principles-2026-05
+     q: what are the homelab architecture principles from may 2026?
+     1. homelab-architecture-principles-2026-05  <-- expected
+     2. homelab-network-perimeter-model
+     3. homelab-core-glossary
+     4. 2026-05-12-koala-machine-state
+     5. pattern-reddit-tmux-multiagent-conductor
+
+? rank=4  expected=2026-05-04-sops-age-key-from-flux-cluster
+     q: where does the sops age private key live in the cluster?
+     1. 2026-05-12-koala-machine-state
+     2. homelab-network-perimeter-model
+     3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+     4. 2026-05-04-sops-age-key-from-flux-cluster  <-- expected
+     5. homelab-security-chains-not-bugs
+
+★ rank=1  expected=grafana-dashboards-as-code-not-ui-state
+     q: why do my grafana dashboards disappear after a pod restart?
+     1. grafana-dashboards-as-code-not-ui-state  <-- expected
+     2. infra-litellm-absorption-2026-05-16
+     3. 2026-05-12-koala-machine-state
+     4. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+     5. k8s-configmap-mount-no-reload-needs-pod-restart
+
+★ rank=1  expected=double-diamond-methodology
+     q: what is the double diamond methodology?
+     1. double-diamond-methodology  <-- expected
+     2. unified-methodology-diamond-futures-autoresearch
+     3. futures-thinking-extended-double-diamond
+     4. insight-exploration-as-diamond-1
+     5. workflow-idea-to-running-service
+
+· rank=3  expected=2026-05-04-mcp-transport-version-claude-ai-strict
+     q: my MCP server works from claude code but fails on claude.ai — what's different?
+     1. qwen3-thinking-model-empty-content-trap
+     2. mcp-resource-url-empty-breaks-claude-ai-discovery-silently
+     3. 2026-05-04-mcp-transport-version-claude-ai-strict  <-- expected
+     4. 2026-05-04-claude-ai-custom-mcp-connectors
+     5. finding-github-mcp-claudeai-vs-claudecode
+
+· rank=2  expected=homelab-security-chains-not-bugs
+     q: how should I rate security findings — isolated bugs or exploit chains?
+     1. homelab-network-perimeter-model
+     2. homelab-security-chains-not-bugs  <-- expected
+     3. policy-audit-mode-blocks-nothing
+     4. homelab-document-accepted-risk-to-break-audit-cycle
+     5. audit-shortcut-tls-blocks-zero-equals-edge-only
+
+· rank=2  expected=2026-05-03-canonical-vs-derived-context-flow
+     q: how should canonical context files relate to derived adapter files?
+     1. qwen3-thinking-model-empty-content-trap
+     2. 2026-05-03-canonical-vs-derived-context-flow  <-- expected
+     3. 2026-05-12-koala-machine-state
+     4. 2026-05-04-claude-ai-custom-mcp-connectors
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=2  expected=homelab-core-glossary
+     q: what is the homelab core vocabulary glossary?
+     1. homelab-architecture-principles-2026-05
+     2. homelab-core-glossary  <-- expected
+     3. 2026-05-12-koala-machine-state
+     4. flux-kustomization-depends-on-bootstrap-ordering
+     5. brain-ingest-ntfy-service
+
+★ rank=1  expected=koala-llama-swap-native-tool-calls-survey-2026-05
+     q: which models on koala llama-swap actually emit native tool_calls correctly?
+     1. koala-llama-swap-native-tool-calls-survey-2026-05  <-- expected
+     2. 2026-05-12-koala-machine-state
+     3. infra-litellm-absorption-2026-05-16
+     4. training-on-rtx-5070-pretraining-vs-finetuning
+     5. qwen3-thinking-model-empty-content-trap
+
+✗ rank=0  expected=qwen35-9b-fast
+     q: what is qwen35-9b-fast and what's it used for?
+     1. koala-llama-swap-native-tool-calls-survey-2026-05
+     2. qwen3-thinking-model-empty-content-trap
+     3. infra-litellm-absorption-2026-05-16
+     4. 2026-05-12-koala-machine-state
+     5. index
+
+✗ rank=0  expected=go-defer-errcheck-body-close
+     q: in go, how do I prevent defer body close from silently dropping errors?
+     1. homelab-network-perimeter-model
+     2. infra-litellm-absorption-2026-05-16
+     3. go-bytes-buffer-bytes-reset-aliasing-trap
+     4. mcpclient-empty-token-silent-401-envfrom-missing-key
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+✗ rank=0  expected=hyperguild-level3-pipeline-rewrite
+     q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
+     1. 2026-05-12-koala-machine-state
+     2. homelab-core-glossary
+     3. koala-llama-swap-native-tool-calls-survey-2026-05
+     4. infra-litellm-absorption-2026-05-16
+     5. homelab-architecture-principles-2026-05
+
+· rank=3  expected=adr-new-project-gitea-first-github-mirror
+     q: what's the new-project ADR — is it gitea-first or github-first?
+     1. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+     2. mcp-tool-design-get-needs-list-partner
+     3. adr-new-project-gitea-first-github-mirror  <-- expected
+     4. 2026-05-04-gitea-mcp-build-session
+     5. adr-local-dev-vs-hyperguild-new-project
+
--- a/brain/eval/post-m4b.txt
+++ b/brain/eval/post-m4b.txt
@@ -0,0 +1,167 @@
+# post-m4b-entities-promoted — 20 questions, k=5
+
+top-1 hit rate: 7/20 = 35%
+top-3 hit rate: 16/20 = 80%
+
+## per-question detail
+
+· rank=3  expected=dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+     q: how do I stop dex from logging users out on every pod restart?
+     1. homelab-network-perimeter-model
+     2. 2026-05-12-koala-machine-state
+     3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart  <-- expected
+     4. infra-litellm-absorption-2026-05-16
+     5. k8s-configmap-mount-no-reload-needs-pod-restart
+
+· rank=2  expected=postgres-least-privilege-migration-tenant-grant-bypass-2026-05
+     q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
+     1. infra-litellm-absorption-2026-05-16
+     2. postgres-least-privilege-migration-tenant-grant-bypass-2026-05  <-- expected
+     3. extension-version-lags-platform-major-upgrade
+     4. ntfy-deny-all-rollout-ordering-keep-alert-pipeline-live-during-auth-flip
+     5. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+
+★ rank=1  expected=homelab-network-perimeter-model
+     q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
+     1. homelab-network-perimeter-model  <-- expected
+     2. qwen3-thinking-model-empty-content-trap
+     3. mcpclient-empty-token-silent-401-envfrom-missing-key
+     4. 2026-05-12-koala-machine-state
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=3  expected=exit-255-unknown-reason-not-oom
+     q: what does container exit code 255 with reason Unknown mean?
+     1. qwen3-thinking-model-empty-content-trap
+     2. infra-litellm-absorption-2026-05-16
+     3. exit-255-unknown-reason-not-oom  <-- expected
+     4. mcpclient-empty-token-silent-401-envfrom-missing-key
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=2  expected=gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+     q: can gitea push-mirror create the github repo automatically?
+     1. infra-litellm-absorption-2026-05-16
+     2. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo  <-- expected
+     3. adr-new-project-gitea-first-github-mirror
+     4. adr-github-as-primary-remote
+     5. 2026-05-12-koala-machine-state
+
+✗ rank=0  expected=flux-healthcheck-stale-on-resource-removal
+     q: a flux kustomization is stuck after I removed a resource — why?
+     1. qwen3-thinking-model-empty-content-trap
+     2. 2026-05-12-koala-machine-state
+     3. homelab-architecture-principles-2026-05
+     4. k8s-configmap-mount-no-reload-needs-pod-restart
+     5. training-on-rtx-5070-pretraining-vs-finetuning
+
+★ rank=1  expected=go-bytes-buffer-bytes-reset-aliasing-trap
+     q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
+     1. go-bytes-buffer-bytes-reset-aliasing-trap  <-- expected
+     2. homelab-security-chains-not-bugs
+     3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
+     4. training-on-rtx-5070-pretraining-vs-finetuning
+     5. flux-healthcheck-stale-on-resource-removal
+
+★ rank=1  expected=homelab-architecture-principles-2026-05
+     q: what are the homelab architecture principles from may 2026?
+     1. homelab-architecture-principles-2026-05  <-- expected
+     2. homelab-network-perimeter-model
+     3. homelab-core-glossary
+     4. 2026-05-12-koala-machine-state
+     5. pattern-reddit-tmux-multiagent-conductor
+
+? rank=4  expected=2026-05-04-sops-age-key-from-flux-cluster
+     q: where does the sops age private key live in the cluster?
+     1. 2026-05-12-koala-machine-state
+     2. homelab-network-perimeter-model
+     3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+     4. 2026-05-04-sops-age-key-from-flux-cluster  <-- expected
+     5. homelab-security-chains-not-bugs
+
+★ rank=1  expected=grafana-dashboards-as-code-not-ui-state
+     q: why do my grafana dashboards disappear after a pod restart?
+     1. grafana-dashboards-as-code-not-ui-state  <-- expected
+     2. infra-litellm-absorption-2026-05-16
+     3. 2026-05-12-koala-machine-state
+     4. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+     5. k8s-configmap-mount-no-reload-needs-pod-restart
+
+★ rank=1  expected=double-diamond-methodology
+     q: what is the double diamond methodology?
+     1. double-diamond-methodology  <-- expected
+     2. unified-methodology-diamond-futures-autoresearch
+     3. futures-thinking-extended-double-diamond
+     4. insight-exploration-as-diamond-1
+     5. workflow-idea-to-running-service
+
+· rank=3  expected=2026-05-04-mcp-transport-version-claude-ai-strict
+     q: my MCP server works from claude code but fails on claude.ai — what's different?
+     1. qwen3-thinking-model-empty-content-trap
+     2. mcp-resource-url-empty-breaks-claude-ai-discovery-silently
+     3. 2026-05-04-mcp-transport-version-claude-ai-strict  <-- expected
+     4. 2026-05-04-claude-ai-custom-mcp-connectors
+     5. finding-github-mcp-claudeai-vs-claudecode
+
+· rank=2  expected=homelab-security-chains-not-bugs
+     q: how should I rate security findings — isolated bugs or exploit chains?
+     1. homelab-network-perimeter-model
+     2. homelab-security-chains-not-bugs  <-- expected
+     3. policy-audit-mode-blocks-nothing
+     4. homelab-document-accepted-risk-to-break-audit-cycle
+     5. audit-shortcut-tls-blocks-zero-equals-edge-only
+
+· rank=2  expected=2026-05-03-canonical-vs-derived-context-flow
+     q: how should canonical context files relate to derived adapter files?
+     1. qwen3-thinking-model-empty-content-trap
+     2. 2026-05-03-canonical-vs-derived-context-flow  <-- expected
+     3. 2026-05-12-koala-machine-state
+     4. 2026-05-04-claude-ai-custom-mcp-connectors
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+· rank=2  expected=homelab-core-glossary
+     q: what is the homelab core vocabulary glossary?
+     1. homelab-architecture-principles-2026-05
+     2. homelab-core-glossary  <-- expected
+     3. 2026-05-12-koala-machine-state
+     4. qwen35-9b-fast
+     5. flux-kustomization-depends-on-bootstrap-ordering
+
+★ rank=1  expected=koala-llama-swap-native-tool-calls-survey-2026-05
+     q: which models on koala llama-swap actually emit native tool_calls correctly?
+     1. koala-llama-swap-native-tool-calls-survey-2026-05  <-- expected
+     2. 2026-05-12-koala-machine-state
+     3. infra-litellm-absorption-2026-05-16
+     4. training-on-rtx-5070-pretraining-vs-finetuning
+     5. qwen3-thinking-model-empty-content-trap
+
+★ rank=1  expected=qwen35-9b-fast
+     q: what is qwen35-9b-fast and what's it used for?
+     1. qwen35-9b-fast  <-- expected
+     2. koala-llama-swap-native-tool-calls-survey-2026-05
+     3. qwen3-thinking-model-empty-content-trap
+     4. infra-litellm-absorption-2026-05-16
+     5. 2026-05-12-koala-machine-state
+
+✗ rank=0  expected=go-defer-errcheck-body-close
+     q: in go, how do I prevent defer body close from silently dropping errors?
+     1. homelab-network-perimeter-model
+     2. infra-litellm-absorption-2026-05-16
+     3. go-bytes-buffer-bytes-reset-aliasing-trap
+     4. mcpclient-empty-token-silent-401-envfrom-missing-key
+     5. koala-llama-swap-native-tool-calls-survey-2026-05
+
+✗ rank=0  expected=hyperguild-level3-pipeline-rewrite
+     q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
+     1. 2026-05-12-koala-machine-state
+     2. homelab-core-glossary
+     3. koala-llama-swap-native-tool-calls-survey-2026-05
+     4. infra-litellm-absorption-2026-05-16
+     5. homelab-architecture-principles-2026-05
+
+· rank=3  expected=adr-new-project-gitea-first-github-mirror
+     q: what's the new-project ADR — is it gitea-first or github-first?
+     1. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+     2. mcp-tool-design-get-needs-list-partner
+     3. adr-new-project-gitea-first-github-mirror  <-- expected
+     4. 2026-05-04-gitea-mcp-build-session
+     5. adr-local-dev-vs-hyperguild-new-project
+
--- a/brain/eval/qa-2026-05.md
+++ b/brain/eval/qa-2026-05.md
@@ -0,0 +1,76 @@
+# Brain retrieval eval set — 2026-05-24
+
+20 hand-authored Q→expected-top-1-slug pairs. Used by `score.py` to
+measure brain_query top-1 + top-3 hit rate against the live brain.
+
+Authoring rules:
+- Each question maps to **one** clear-best entry. Avoid ambiguous
+  questions where multiple slugs could be the right answer.
+- Questions are phrased the way a future-me would actually ask, not
+  the way the entry's title reads. Some lexical distance is the point.
+- `expected` is the slug as stored in `brain_entities.slug`. Update
+  if the slug renames.
+
+## Pairs
+
+```
+q: how do I stop dex from logging users out on every pod restart?
+expected: dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
+
+q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
+expected: postgres-least-privilege-migration-tenant-grant-bypass-2026-05
+
+q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
+expected: homelab-network-perimeter-model
+
+q: what does container exit code 255 with reason Unknown mean?
+expected: exit-255-unknown-reason-not-oom
+
+q: can gitea push-mirror create the github repo automatically?
+expected: gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
+
+q: a flux kustomization is stuck after I removed a resource — why?
+expected: flux-healthcheck-stale-on-resource-removal
+
+q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
+expected: go-bytes-buffer-bytes-reset-aliasing-trap
+
+q: what are the homelab architecture principles from may 2026?
+expected: homelab-architecture-principles-2026-05
+
+q: where does the sops age private key live in the cluster?
+expected: 2026-05-04-sops-age-key-from-flux-cluster
+
+q: why do my grafana dashboards disappear after a pod restart?
+expected: grafana-dashboards-as-code-not-ui-state
+
+q: what is the double diamond methodology?
+expected: double-diamond-methodology
+
+q: my MCP server works from claude code but fails on claude.ai — what's different?
+expected: 2026-05-04-mcp-transport-version-claude-ai-strict
+
+q: how should I rate security findings — isolated bugs or exploit chains?
+expected: homelab-security-chains-not-bugs
+
+q: how should canonical context files relate to derived adapter files?
+expected: 2026-05-03-canonical-vs-derived-context-flow
+
+q: what is the homelab core vocabulary glossary?
+expected: homelab-core-glossary
+
+q: which models on koala llama-swap actually emit native tool_calls correctly?
+expected: koala-llama-swap-native-tool-calls-survey-2026-05
+
+q: what is qwen35-9b-fast and what's it used for?
+expected: qwen35-9b-fast
+
+q: in go, how do I prevent defer body close from silently dropping errors?
+expected: go-defer-errcheck-body-close
+
+q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
+expected: hyperguild-level3-pipeline-rewrite
+
+q: what's the new-project ADR — is it gitea-first or github-first?
+expected: adr-new-project-gitea-first-github-mirror
+```
--- a/brain/eval/score.py
+++ b/brain/eval/score.py
@@ -0,0 +1,131 @@
+#!/usr/bin/env python3
+"""Score brain_query against the qa-2026-05.md eval set.
+
+Reads `q:` / `expected:` pairs, calls brain_query MCP for each, records
+top-1 + top-3 hit rate. Run:
+
+    BRAIN_MCP_TOKEN=$(grep '^export BRAIN_MCP_TOKEN=' ~/.llmkeys | cut -d= -f2-) \\
+      python3 score.py qa-2026-05.md
+
+Optionally pass --baseline <name> to save the result as a labeled run.
+"""
+import argparse
+import json
+import os
+import re
+import sys
+import time
+import urllib.request
+
+ENDPOINT = "https://brain-mcp.d-ma.be/mcp"
+
+
+def load_pairs(path):
+    pairs = []
+    q = None
+    with open(path) as f:
+        for line in f:
+            line = line.rstrip()
+            if line.startswith("q:"):
+                q = line[2:].strip()
+            elif line.startswith("expected:") and q is not None:
+                expected = line[len("expected:"):].strip()
+                pairs.append((q, expected))
+                q = None
+    return pairs
+
+
+def brain_query(token, query, k=5):
+    body = json.dumps({
+        "jsonrpc": "2.0",
+        "id": 1,
+        "method": "tools/call",
+        "params": {"name": "brain_query", "arguments": {"query": query, "k": k}},
+    }).encode()
+    req = urllib.request.Request(
+        ENDPOINT,
+        data=body,
+        headers={
+            "Authorization": f"Bearer {token}",
+            "Content-Type": "application/json",
+            "Accept": "application/json, text/event-stream",
+        },
+        method="POST",
+    )
+    with urllib.request.urlopen(req, timeout=30) as r:
+        raw = r.read().decode()
+    for line in raw.splitlines():
+        if line.startswith("data:"):
+            raw = line[5:].strip()
+            break
+    d = json.loads(raw)
+    if "error" in d:
+        raise RuntimeError(d["error"])
+    text = d["result"]["content"][0]["text"]
+    return json.loads(text).get("results", [])
+
+
+def slug_of(result):
+    # `title` mirrors the slug in brain_entities for normal entries.
+    # Fall back to basename(path) if title is missing.
+    t = result.get("title", "")
+    if t:
+        return t
+    p = result.get("path", "")
+    return re.sub(r"\.md$", "", os.path.basename(p))
+
+
+def main():
+    ap = argparse.ArgumentParser()
+    ap.add_argument("evalset")
+    ap.add_argument("--baseline", default="run")
+    ap.add_argument("--k", type=int, default=5)
+    args = ap.parse_args()
+
+    token = os.environ.get("BRAIN_MCP_TOKEN")
+    if not token:
+        sys.exit("BRAIN_MCP_TOKEN not set")
+
+    pairs = load_pairs(args.evalset)
+    if not pairs:
+        sys.exit(f"no pairs in {args.evalset}")
+
+    print(f"# {args.baseline} — {len(pairs)} questions, k={args.k}")
+    print()
+    hits1 = 0
+    hits3 = 0
+    detail = []
+    for q, expected in pairs:
+        try:
+            results = brain_query(token, q, k=args.k)
+        except Exception as e:
+            detail.append((q, expected, [], f"ERR {e}"))
+            continue
+        slugs = [slug_of(r) for r in results]
+        rank = slugs.index(expected) + 1 if expected in slugs else 0
+        h1 = 1 if rank == 1 else 0
+        h3 = 1 if 0 < rank <= 3 else 0
+        hits1 += h1
+        hits3 += h3
+        detail.append((q, expected, slugs, rank))
+
+    total = len(pairs)
+    print(f"top-1 hit rate: {hits1}/{total} = {100*hits1/total:.0f}%")
+    print(f"top-3 hit rate: {hits3}/{total} = {100*hits3/total:.0f}%")
+    print()
+    print("## per-question detail")
+    print()
+    for q, expected, slugs, rank in detail:
+        marker = {0: "✗", 1: "★"}.get(rank, "·")
+        if isinstance(rank, str):
+            marker = "!"
+        print(f"{marker} rank={rank}  expected={expected}")
+        print(f"     q: {q}")
+        for i, s in enumerate(slugs[:args.k], 1):
+            mark = "  <-- expected" if s == expected else ""
+            print(f"     {i}. {s}{mark}")
+        print()
+
+
+if __name__ == "__main__":
+    main()
--- a/ingestion/Dockerfile
+++ b/ingestion/Dockerfile
@@ -5,6 +5,15 @@ FROM golang:1.26-bookworm AS builder
 ARG VERSION=dev
 WORKDIR /src

+# Fetch internal gitea-hosted Go modules (mcp-chassis) without going through
+# proxy.golang.org and without HTTP→HTTPS surprises. The Gitea server returns
+# http:// in its go-import meta tag (config-level limitation), so rewrite to
+# https here and bypass the module proxy + sumdb.
+RUN git config --global url."https://gitea.d-ma.be/".insteadOf "http://gitea.d-ma.be/"
+ENV GOPRIVATE=gitea.d-ma.be
+ENV GOPROXY=direct
+ENV GOSUMDB=off
+
 COPY go.mod go.sum ./
 RUN go mod download

--- a/ingestion/cmd/server/main.go
+++ b/ingestion/cmd/server/main.go
@@ -12,11 +12,16 @@ import (
 	"strings"
 	"time"

+	chassisauth "gitea.d-ma.be/mathias/mcp-chassis/auth"
+
 	"github.com/mathiasbq/hyperguild/ingestion/internal/api"
-	"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/claudewatcher"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/embed"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graphsync"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/llm"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
-	"github.com/mathiasbq/hyperguild/ingestion/internal/embed"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/metrics"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/oauth"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/reranker"
@@ -25,6 +30,50 @@ import (
 	"github.com/mathiasbq/hyperguild/ingestion/internal/watcher"
 )

+// claudeSink converts each claudewatcher.Batch into one wiki note under
+// brain/wiki/claude-sessions/facts/. v1 emits one note per session
+// keyed by host + session id; classifier-driven hall routing is a
+// follow-up (hyperguild#27 v2).
+type claudeSink struct {
+	brainDir string
+	logger   *slog.Logger
+}
+
+func (s *claudeSink) Ingest(ctx context.Context, b claudewatcher.Batch) error {
+	if len(b.Turns) == 0 {
+		return nil
+	}
+	var sb strings.Builder
+	fmt.Fprintf(&sb, "# Claude session %s (%s)\n\n", b.SessionID, b.Host)
+	fmt.Fprintf(&sb, "_Project: `%s`. File: `%s`. Turns: %d._\n\n", b.ProjectID, b.FilePath, len(b.Turns))
+	for _, t := range b.Turns {
+		fmt.Fprintf(&sb, "## %s — %s\n\n", t.Type, t.Timestamp.UTC().Format(time.RFC3339))
+		if t.ToolName != "" {
+			fmt.Fprintf(&sb, "_tool: `%s`_\n\n", t.ToolName)
+		}
+		// Cap per-turn excerpt to keep page size bounded; the full
+		// transcript lives on disk under ~/.claude/projects/ already.
+		content := t.Content
+		if len(content) > 2000 {
+			content = content[:2000] + "…"
+		}
+		sb.WriteString(content)
+		sb.WriteString("\n\n")
+	}
+	slug := "session-" + b.Host + "-" + b.SessionID
+	if _, err := api.WriteNote(s.brainDir, api.WriteNoteOptions{
+		Filename: slug,
+		Wing:     "claude-sessions",
+		Hall:     "facts",
+		Type:     "source",
+		Domain:   b.ProjectID,
+		Content:  sb.String(),
+	}); err != nil {
+		return fmt.Errorf("write claude session note: %w", err)
+	}
+	return nil
+}
+
 // redactDSN parses a Postgres URL and replaces its password with `***`
 // for safe inclusion in logs. Falls back to a non-leaking placeholder
 // if parsing fails — we never log a raw DSN.
@@ -69,6 +118,16 @@ func envInt(key string, fallback int) int {
 	return fallback
 }

+// systemHostname returns os.Hostname() with an "unknown" fallback so the
+// caller never has to handle the rare error path.
+func systemHostname() string {
+	h, err := os.Hostname()
+	if err != nil || h == "" {
+		return "unknown"
+	}
+	return h
+}
+
 func main() {
 	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

@@ -140,6 +199,32 @@ func main() {
 		logger.Info("brain hybrid retrieval enabled",
 			"pg", redactDSN(pgDSN),
 			"embed_url", embedURL, "embed_model", embedModel)
+
+		// Graph store shares the same postgres18 DSN as the vector
+		// store and is opt-in via BRAIN_GRAPH_ENABLED=true. Defaults
+		// to off so first rollout doesn't surprise — flip on after
+		// the migration completes and the backfill finishes.
+		if envOr("BRAIN_GRAPH_ENABLED", "false") == "true" {
+			gstore, gerr := graphstore.New(context.Background(), pgDSN)
+			if gerr != nil {
+				logger.Error("graph store init", "err", gerr)
+				os.Exit(1)
+			}
+			if gerr := gstore.Init(context.Background()); gerr != nil {
+				logger.Error("graph store migrate", "err", gerr)
+				os.Exit(1)
+			}
+			mcpSrv = mcpSrv.WithGraph(gstore)
+			if envOr("BRAIN_GRAPH_BACKFILL", "false") == "true" {
+				n, berr := graphsync.BackfillFromBrainDir(context.Background(), gstore, brainDir)
+				if berr != nil {
+					logger.Warn("graph backfill incomplete", "indexed", n, "err", berr)
+				} else {
+					logger.Info("graph backfill complete", "indexed", n)
+				}
+			}
+			logger.Info("brain graph enabled", "pg", redactDSN(pgDSN))
+		}
 	case pgDSN == "" && embedURL == "":
 		// disabled — fine
 	default:
@@ -161,6 +246,56 @@ func main() {
 			Pipeline: pipelineCfg,
 		})
 	}
+
+	// Claude Code session ingestion (hyperguild#27 / infra#73 Track E.1).
+	// Off by default — explicitly opt in by setting CLAUDE_SESSIONS_DIR
+	// to the ~/.claude/projects path. Requires BRAIN_PG_DSN for the
+	// cursor table (resumable offsets across restarts).
+	if claudeDir := os.Getenv("CLAUDE_SESSIONS_DIR"); claudeDir != "" {
+		if pgDSN == "" {
+			logger.Error("CLAUDE_SESSIONS_DIR set but BRAIN_PG_DSN missing — claudewatcher needs the cursor table")
+			os.Exit(1)
+		}
+		// Client-name guard. The env value is a regex alternation
+		// (e.g. "acme|globex"); we wrap it with word boundaries
+		// and case-insensitive flag so substrings inside longer
+		// identifiers don't false-match. Sourced from a SOPS secret
+		// so client identities never live in source.
+		if clientBlock := os.Getenv("CLAUDE_INGEST_CLIENT_BLOCK"); clientBlock != "" {
+			pattern := `(?i)\b(` + clientBlock + `)\b`
+			if err := claudewatcher.RegisterRule("client-name", pattern); err != nil {
+				logger.Error("claudewatcher client-block rule invalid", "err", err)
+				os.Exit(1)
+			}
+			logger.Info("claudewatcher client-block guard registered")
+		}
+		cursorStore, cerr := claudewatcher.NewCursorStore(ctx, pgDSN)
+		if cerr != nil {
+			logger.Error("claudewatcher cursor init", "err", cerr)
+			os.Exit(1)
+		}
+		if cerr := cursorStore.Init(ctx); cerr != nil {
+			logger.Error("claudewatcher cursor migrate", "err", cerr)
+			os.Exit(1)
+		}
+		host := envOr("CLAUDE_INGEST_HOST", systemHostname())
+		interval := time.Duration(envInt("CLAUDE_INGEST_INTERVAL", 60)) * time.Second
+		sink := &claudeSink{brainDir: brainDir, logger: logger}
+		go func() {
+			if err := claudewatcher.Watch(ctx, claudewatcher.Config{
+				SessionsDir: claudeDir,
+				Host:        host,
+				Interval:    interval,
+				Sink:        sink,
+				Cursors:     cursorStore,
+				Logger:      logger,
+			}); err != nil && err != context.Canceled {
+				logger.Error("claudewatcher exited", "err", err)
+			}
+		}()
+		logger.Info("claudewatcher started",
+			"sessions_dir", claudeDir, "host", host, "interval", interval)
+	}
 	if vectorStore != nil {
 		embedSyncInterval := envInt("BRAIN_EMBED_SYNC_INTERVAL", 300)
 		vectorstore.StartSync(ctx, brainDir, vectorStore,
@@ -180,16 +315,13 @@ func main() {
 	mux.HandleFunc("POST /backfill-refs", h.BackfillRefs)
 	mux.HandleFunc("POST /backfill-embeddings", h.BackfillEmbeddings)
 	mux.HandleFunc("GET /pass-rate", h.PassRate)
-	var jwtValidator *auth.Validator
-	if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
-		audience := os.Getenv("MCP_AUDIENCE")
-		v, err := auth.NewValidator(dexURL, audience)
-		if err != nil {
-			logger.Error("build jwt validator", "err", err)
-			os.Exit(1)
-		}
-		jwtValidator = v
-		logger.Info("jwt auth enabled", "issuer", dexURL)
+	jwtValidator, err := chassisauth.NewJWTValidator(ctx, os.Getenv("DEX_ISSUER_URL"), os.Getenv("MCP_AUDIENCE"))
+	if err != nil {
+		logger.Error("build jwt validator", "err", err)
+		os.Exit(1)
+	}
+	if jwtValidator != nil {
+		logger.Info("jwt auth enabled", "issuer", os.Getenv("DEX_ISSUER_URL"))
 	}

 	// Resource-metadata URL is only emitted on 401 when Dex OAuth is
@@ -199,13 +331,13 @@ func main() {
 	if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
 		resourceURL := os.Getenv("MCP_RESOURCE_URL")
 		mux.HandleFunc("GET /.well-known/oauth-protected-resource",
-			auth.ProtectedResourceHandler(resourceURL, dexURL))
+			chassisauth.ProtectedResourceHandler(resourceURL, dexURL))
 		if resourceURL != "" {
 			resourceMetadataURL = strings.TrimRight(resourceURL, "/") + "/.well-known/oauth-protected-resource"
 		}
 	}

-	mux.Handle("/mcp", mcp.BearerAuth(mcpToken, jwtValidator, resourceMetadataURL, mcpSrv))
+	mux.Handle("/mcp", chassisauth.BearerMiddleware(mcpToken, jwtValidator, "brain", resourceMetadataURL, mcpSrv))

 	// Opt-in OAuth 2.0 client_credentials flow for claude.ai's custom-MCP
 	// integration UI, which has no static-Bearer field. Setting both
@@ -235,6 +367,15 @@ func main() {
 		os.Exit(1)
 	}

+	// /metrics — unauthenticated Prometheus endpoint. kube-prometheus-stack
+	// scrapes it via the ServiceMonitor in k3s/apps/supervisor/. The metrics
+	// middleware below wraps every other registered handler so it observes
+	// real request latency. /metrics itself is excluded from its own
+	// observation by registering it on the outer mux (post-wrap).
+	reg := metrics.New()
+	mux.HandleFunc("GET /metrics", reg.Handler())
+	logger.Info("metrics endpoint registered", "path", "/metrics")
+
 	addr := ":" + port
 	watchIntervalLog := "disabled"
 	if watchInterval > 0 {
@@ -249,7 +390,7 @@ func main() {
 		"watch_interval", watchIntervalLog,
 		"mcp_enabled", true,
 	)
-	if err := http.ListenAndServe(addr, mux); err != nil {
+	if err := http.ListenAndServe(addr, reg.Middleware(mux)); err != nil {
 		logger.Error("server stopped", "err", err)
 		os.Exit(1)
 	}
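The client-name guard above builds its pattern as `(?i)\b(` + env value + `)\b`. A minimal sketch of that wrapping, assuming only the standard `regexp` package: `buildClientPattern`, `mentionsClient`, and the names `acme|globex` are hypothetical stand-ins for illustration, and the real `claudewatcher.RegisterRule` is not reproduced here.

```go
package main

import (
	"fmt"
	"regexp"
)

// buildClientPattern is a hypothetical helper that mirrors the wrapping done
// in main.go: a case-insensitive, word-bounded alternation. An invalid
// alternation is reported as an error rather than a panic, so the caller can
// refuse to start.
func buildClientPattern(alternation string) (*regexp.Regexp, error) {
	if alternation == "" {
		return nil, fmt.Errorf("empty client block")
	}
	return regexp.Compile(`(?i)\b(` + alternation + `)\b`)
}

// mentionsClient reports whether text contains any blocked name as a whole
// word. Go's \b is an ASCII word boundary, so a name embedded in a longer
// identifier does not match.
func mentionsClient(re *regexp.Regexp, text string) bool {
	return re.MatchString(text)
}
```

Called from a `main` function, `buildClientPattern("acme|globex")` would flag "call with ACME tomorrow" but not "the Acmeville office". Because the env value is spliced in unescaped, any regex metacharacters in it stay live, which is why a compile error here is worth treating as fatal at startup.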
--- a/ingestion/go.mod
+++ b/ingestion/go.mod
@@ -8,6 +8,7 @@ require (
 )

 require (
+	gitea.d-ma.be/mathias/mcp-chassis v0.1.0
 	github.com/davecgh/go-spew v1.1.1 // indirect
 	github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 // indirect
 	github.com/goccy/go-json v0.10.3 // indirect
--- a/ingestion/go.sum
+++ b/ingestion/go.sum
@@ -1,3 +1,5 @@
+gitea.d-ma.be/mathias/mcp-chassis v0.1.0 h1:8RXO34+n7Vu8HnUMagars6fc4oemqRpMu7MVtjaj4qY=
+gitea.d-ma.be/mathias/mcp-chassis v0.1.0/go.mod h1:ajbLlwr2L7FAN3TBU39KucZkKJM02wTbKbDKDEW2YvE=
 github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
 github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
 github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
--- a/ingestion/internal/auth/jwt.go
+++ b/ingestion/internal/auth/jwt.go
@@ -1,84 +0,0 @@
-package auth
-
-import (
-	"context"
-	"encoding/json"
-	"fmt"
-	"net/http"
-	"time"
-
-	"github.com/lestrrat-go/jwx/v2/jwk"
-	"github.com/lestrrat-go/jwx/v2/jwt"
-)
-
-// Validator validates Bearer JWTs issued by a Dex (OIDC) authorization server.
-// Audience is optional; leave empty to skip audience validation.
-type Validator struct {
-	issuer   string
-	audience string
-	jwksURI  string
-	cache    *jwk.Cache
-}
-
-// NewValidator fetches the OIDC discovery document from issuerURL, extracts
-// jwks_uri, seeds the JWKS cache, and returns a ready Validator.
-// If DEX_ISSUER_URL is not set the caller should pass "" and skip construction.
-func NewValidator(issuerURL, audience string) (*Validator, error) {
-	resp, err := http.Get(issuerURL + "/.well-known/openid-configuration") //nolint:noctx
-	if err != nil {
-		return nil, fmt.Errorf("fetch oidc discovery: %w", err)
-	}
-	defer resp.Body.Close() //nolint:errcheck
-	if resp.StatusCode != http.StatusOK {
-		return nil, fmt.Errorf("oidc discovery: status %d", resp.StatusCode)
-	}
-
-	var doc struct {
-		JWKSURI string `json:"jwks_uri"`
-	}
-	if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
-		return nil, fmt.Errorf("decode oidc discovery: %w", err)
-	}
-	if doc.JWKSURI == "" {
-		return nil, fmt.Errorf("oidc discovery: empty jwks_uri")
-	}
-
-	ctx := context.Background()
-	cache := jwk.NewCache(ctx)
-	if err := cache.Register(doc.JWKSURI, jwk.WithMinRefreshInterval(time.Hour)); err != nil {
-		return nil, fmt.Errorf("register jwks cache: %w", err)
-	}
-	if _, err := cache.Refresh(ctx, doc.JWKSURI); err != nil {
-		return nil, fmt.Errorf("initial jwks fetch: %w", err)
-	}
-
-	return &Validator{
-		issuer:   issuerURL,
-		audience: audience,
-		jwksURI:  doc.JWKSURI,
-		cache:    cache,
-	}, nil
-}
-
-// Validate parses and validates rawToken. Returns the subject claim on success.
-func (v *Validator) Validate(ctx context.Context, rawToken string) (string, error) {
-	keySet, err := v.cache.Get(ctx, v.jwksURI)
-	if err != nil {
-		return "", fmt.Errorf("get jwks: %w", err)
-	}
-
-	opts := []jwt.ParseOption{
-		jwt.WithKeySet(keySet),
-		jwt.WithValidate(true),
-		jwt.WithIssuer(v.issuer),
-	}
-	if v.audience != "" {
-		opts = append(opts, jwt.WithAudience(v.audience))
-	}
-
-	tok, err := jwt.ParseString(rawToken, opts...)
-	if err != nil {
-		return "", fmt.Errorf("validate jwt: %w", err)
-	}
-	return tok.Subject(), nil
-}
--- a/ingestion/internal/auth/jwt_test.go
+++ b/ingestion/internal/auth/jwt_test.go
@@ -1,169 +0,0 @@
-package auth_test
-
-import (
-	"context"
-	"crypto/rand"
-	"crypto/rsa"
-	"encoding/json"
-	"net/http"
-	"net/http/httptest"
-	"testing"
-	"time"
-
-	"github.com/lestrrat-go/jwx/v2/jwa"
-	"github.com/lestrrat-go/jwx/v2/jwk"
-	"github.com/lestrrat-go/jwx/v2/jwt"
-	"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
-	"github.com/stretchr/testify/assert"
-	"github.com/stretchr/testify/require"
-)
-
-type testKeys struct {
-	priv jwk.Key
-	pub  jwk.Key
-}
-
-func generateRSAKeys(t *testing.T) testKeys {
-	t.Helper()
-	raw, err := rsa.GenerateKey(rand.Reader, 2048)
-	require.NoError(t, err)
-
-	priv, err := jwk.FromRaw(raw)
-	require.NoError(t, err)
-	require.NoError(t, priv.Set(jwk.KeyIDKey, "test-kid"))
-	require.NoError(t, priv.Set(jwk.AlgorithmKey, jwa.RS256))
-
-	pub, err := jwk.PublicKeyOf(priv)
-	require.NoError(t, err)
-
-	return testKeys{priv: priv, pub: pub}
-}
-
-func mockOIDCServer(t *testing.T, keys testKeys) *httptest.Server {
-	t.Helper()
-	set := jwk.NewSet()
-	require.NoError(t, set.AddKey(keys.pub))
-	jwksBytes, err := json.Marshal(set)
-	require.NoError(t, err)
-
-	mux := http.NewServeMux()
-	var srv *httptest.Server
-	mux.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, _ *http.Request) {
-		w.Header().Set("Content-Type", "application/json")
-		_ = json.NewEncoder(w).Encode(map[string]string{
-			"issuer":   srv.URL,
-			"jwks_uri": srv.URL + "/jwks",
-		})
-	})
-	mux.HandleFunc("/jwks", func(w http.ResponseWriter, _ *http.Request) {
-		w.Header().Set("Content-Type", "application/json")
-		_, _ = w.Write(jwksBytes)
-	})
-	srv = httptest.NewServer(mux)
-	t.Cleanup(srv.Close)
-	return srv
-}
-
-func signToken(t *testing.T, keys testKeys, issuer, audience, subject string, exp time.Time) string {
-	t.Helper()
-	b := jwt.NewBuilder().
-		Issuer(issuer).
-		Subject(subject).
-		Expiration(exp)
-	if audience != "" {
-		b = b.Audience([]string{audience})
-	}
-	tok, err := b.Build()
-	require.NoError(t, err)
-	signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
-	require.NoError(t, err)
-	return string(signed)
-}
-
-func TestValidator(t *testing.T) {
-	keys := generateRSAKeys(t)
-	srv := mockOIDCServer(t, keys)
-	ctx := context.Background()
-
-	v, err := auth.NewValidator(srv.URL, "brain")
-	require.NoError(t, err)
-
-	tests := []struct {
-		name      string
-		token     string
-		wantSub   string
-		wantErr   bool
-	}{
-		{
-			name:    "valid jwt",
-			token:   signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)),
-			wantSub: "test-user",
-		},
-		{
-			name:    "expired jwt",
-			token:   signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(-time.Hour)),
-			wantErr: true,
-		},
-		{
-			name:    "wrong issuer",
-			token:   signToken(t, keys, "https://evil.example.com", "brain", "test-user", time.Now().Add(time.Hour)),
-			wantErr: true,
-		},
-		{
-			name:    "wrong audience",
-			token:   signToken(t, keys, srv.URL, "other-service", "test-user", time.Now().Add(time.Hour)),
-			wantErr: true,
-		},
-		{
-			name:    "tampered token",
-			token:   signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)) + "tampered",
-			wantErr: true,
-		},
-		{
-			name:    "not a jwt",
-			token:   "not-a-jwt",
-			wantErr: true,
-		},
-	}
-
-	for _, tc := range tests {
-		t.Run(tc.name, func(t *testing.T) {
-			sub, err := v.Validate(ctx, tc.token)
-			if tc.wantErr {
-				assert.Error(t, err)
-				assert.Empty(t, sub)
-			} else {
-				require.NoError(t, err)
-				assert.Equal(t, tc.wantSub, sub)
-			}
-		})
-	}
-}
-
-func TestNewValidator_NoAudience(t *testing.T) {
-	keys := generateRSAKeys(t)
-	srv := mockOIDCServer(t, keys)
-	ctx := context.Background()
-
-	v, err := auth.NewValidator(srv.URL, "")
-	require.NoError(t, err)
-
-	// Token without audience passes when audience validation is disabled.
-	tok, err := jwt.NewBuilder().
-		Issuer(srv.URL).
-		Subject("sub").
-		Expiration(time.Now().Add(time.Hour)).
-		Build()
-	require.NoError(t, err)
-	signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
-	require.NoError(t, err)
-
-	sub, err := v.Validate(ctx, string(signed))
-	require.NoError(t, err)
-	assert.Equal(t, "sub", sub)
-}
-
-func TestNewValidator_BadDiscoveryURL(t *testing.T) {
-	_, err := auth.NewValidator("http://127.0.0.1:1", "brain")
-	assert.Error(t, err)
-}
--- a/ingestion/internal/auth/protected_resource.go
+++ b/ingestion/internal/auth/protected_resource.go
@@ -1,23 +0,0 @@
-package auth
-
-import (
-	"encoding/json"
-	"net/http"
-)
-
-// ProtectedResourceHandler returns an RFC 9728 oauth-protected-resource metadata
-// handler. Mount at GET /.well-known/oauth-protected-resource (no auth required).
-func ProtectedResourceHandler(resourceURL, issuerURL string) http.HandlerFunc {
-	type metadata struct {
-		Resource             string   `json:"resource"`
-		AuthorizationServers []string `json:"authorization_servers"`
-	}
-	body, _ := json.Marshal(metadata{
-		Resource:             resourceURL,
-		AuthorizationServers: []string{issuerURL},
-	})
-	return func(w http.ResponseWriter, _ *http.Request) {
-		w.Header().Set("Content-Type", "application/json")
-		_, _ = w.Write(body)
-	}
-}
--- a/ingestion/internal/auth/protected_resource_test.go
+++ b/ingestion/internal/auth/protected_resource_test.go
@@ -1,28 +0,0 @@
-package auth_test
-
-import (
-	"encoding/json"
-	"net/http"
-	"net/http/httptest"
-	"testing"
-
-	"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
-	"github.com/stretchr/testify/assert"
-	"github.com/stretchr/testify/require"
-)
-
-func TestProtectedResourceHandler(t *testing.T) {
-	h := auth.ProtectedResourceHandler("https://brain-mcp.d-ma.be", "https://auth.d-ma.be")
-	req := httptest.NewRequest(http.MethodGet, "/.well-known/oauth-protected-resource", nil)
-	rr := httptest.NewRecorder()
-	h(rr, req)
-
-	assert.Equal(t, http.StatusOK, rr.Code)
-	assert.Equal(t, "application/json", rr.Header().Get("Content-Type"))
-
-	var body map[string]any
-	require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &body))
-	assert.Equal(t, "https://brain-mcp.d-ma.be", body["resource"])
-	servers := body["authorization_servers"].([]any)
-	assert.Equal(t, "https://auth.d-ma.be", servers[0])
-}
--- a/ingestion/internal/claudewatcher/cursor.go
+++ b/ingestion/internal/claudewatcher/cursor.go
@@ -0,0 +1,110 @@
+package claudewatcher
+
+import (
+	"context"
+	"errors"
+	"fmt"
+
+	"github.com/jackc/pgx/v5"
+	"github.com/jackc/pgx/v5/pgxpool"
+)
+
+// CursorStore tracks how far the watcher has ingested into each
+// session JSONL file. Keyed by (host, file_path) so the same `~/.claude`
+// path on different hosts doesn't collide and resumability survives
+// pod restarts. Idempotent Init lives alongside the rest of the
+// claudewatcher schema; no separate migration framework.
+type CursorStore struct {
+	pool *pgxpool.Pool
+}
+
+// NewCursorStore opens a pool against dsn. Caller closes the store.
+func NewCursorStore(ctx context.Context, dsn string) (*CursorStore, error) {
+	pool, err := pgxpool.New(ctx, dsn)
+	if err != nil {
+		return nil, fmt.Errorf("pgxpool: %w", err)
+	}
+	if err := pool.Ping(ctx); err != nil {
+		pool.Close()
+		return nil, fmt.Errorf("ping: %w", err)
+	}
+	return &CursorStore{pool: pool}, nil
+}
+
+// NewCursorStoreFromPool wraps an existing pool (so the watcher can
+// share the brain DSN pool with vectorstore/graphstore without a
+// second connection set). Caller must NOT close the wrapped pool via
+// the store — close the pool directly.
+func NewCursorStoreFromPool(pool *pgxpool.Pool) *CursorStore {
+	return &CursorStore{pool: pool}
+}
+
+// Close releases the underlying connection pool. Call it only on a
+// store built with NewCursorStore; a pool injected via
+// NewCursorStoreFromPool is shared and must be closed by its owner.
+func (s *CursorStore) Close() {
+	if s.pool != nil {
+		s.pool.Close()
+	}
+}
+
+// Init creates the claude_session_cursors table when missing.
+func (s *CursorStore) Init(ctx context.Context) error {
+	const ddl = `
+CREATE TABLE IF NOT EXISTS claude_session_cursors (
+    host         TEXT NOT NULL,
+    file_path    TEXT NOT NULL,
+    byte_offset  BIGINT NOT NULL DEFAULT 0,
+    last_seen_at TIMESTAMPTZ NOT NULL DEFAULT now(),
+    PRIMARY KEY (host, file_path)
+);
+CREATE INDEX IF NOT EXISTS claude_session_cursors_host_idx
+    ON claude_session_cursors (host);
+`
+	_, err := s.pool.Exec(ctx, ddl)
+	return err
+}
+
+// GetOffset returns the last recorded byte offset for (host, filePath).
+// Missing rows are reported as offset=0, ok=false so the caller can
+// distinguish "never ingested" from "ingested at the start of the
+// file" (both produce identical behaviour but the metric is useful).
+func (s *CursorStore) GetOffset(ctx context.Context, host, filePath string) (int64, bool, error) {
+	if host == "" || filePath == "" {
+		return 0, false, errors.New("host and file_path are required")
+	}
+	var offset int64
+	err := s.pool.QueryRow(ctx, `
+        SELECT byte_offset FROM claude_session_cursors WHERE host = $1 AND file_path = $2
+    `, host, filePath).Scan(&offset)
+	if errors.Is(err, pgx.ErrNoRows) {
+		return 0, false, nil
+	}
+	if err != nil {
+		return 0, false, fmt.Errorf("query: %w", err)
+	}
+	return offset, true, nil
+}
+
+// SetOffset writes the new offset for (host, filePath). Used after
+// every successful parse + ingest batch so a crash mid-file rewinds
+// only to the last committed checkpoint.
+func (s *CursorStore) SetOffset(ctx context.Context, host, filePath string, offset int64) error {
+	if host == "" || filePath == "" {
+		return errors.New("host and file_path are required")
+	}
+	if offset < 0 {
+		return errors.New("offset must be >= 0")
+	}
+	_, err := s.pool.Exec(ctx, `
+        INSERT INTO claude_session_cursors (host, file_path, byte_offset, last_seen_at)
+        VALUES ($1, $2, $3, now())
+        ON CONFLICT (host, file_path) DO UPDATE
+        SET byte_offset = EXCLUDED.byte_offset,
+            last_seen_at = now()
+    `, host, filePath, offset)
+	if err != nil {
+		return fmt.Errorf("upsert offset: %w", err)
+	}
+	return nil
+}
--- a/ingestion/internal/claudewatcher/parser.go
+++ b/ingestion/internal/claudewatcher/parser.go
@@ -0,0 +1,305 @@
+// Package claudewatcher ingests Claude Code session transcripts
+// (`~/.claude/projects/*/<uuid>.jsonl`) into the brain corpus.
+//
+// Schema (observed 2026-05-25 across ~30 session files on koala):
+//
+//	type=user            — user prompts + tool results
+//	type=assistant       — model turns; tool_use blocks live in message.content
+//	type=attachment      — hook outputs, ingested files
+//	type=system          — turn-boundary metadata
+//	type=file-history-snapshot — git-style snapshot of edited files
+//	type=queue-operation, last-prompt, permission-mode, ai-title,
+//	     bridge-session — internal bookkeeping, ignored
+//
+// The parser is intentionally tolerant: malformed lines are skipped
+// (caller logs and advances), missing optional fields default to "",
+// and unknown `type` values are returned as Turn entries with
+// `Skip=true` so callers can filter cheaply.
+package claudewatcher
+
+import (
+	"bufio"
+	"encoding/json"
+	"errors"
+	"fmt"
+	"io"
+	"strings"
+	"time"
+)
+
+// Turn is one parsed JSONL entry from a Claude Code session log.
+//
+// Skip is true for entry types we never want to ingest (queue
+// bookkeeping, snapshots, etc.). Callers fast-path these without
+// running the scrubber or classifier.
+type Turn struct {
+	SessionID    string
+	Type         string
+	ParentUUID   string
+	Timestamp    time.Time
+	Cwd          string
+	GitBranch    string
+	Content      string // plain-text projection of the entry, ready for the scrubber/classifier
+	ToolName     string // populated when an assistant turn invokes a tool
+	OffsetAfter  int64  // byte offset in the file just past this entry
+	Skip         bool
+	ParseWarning string // non-empty when the entry parsed but had a sub-field we couldn't normalise
+}
+
+// ParseStream reads JSONL lines from r starting at startOffset and
+// invokes emit for each parsed entry. emit may return ErrStop to
+// terminate the scan cleanly. Other emit errors propagate.
+//
+// startOffset is informational — the caller is expected to have already
+// seeked the underlying reader to that offset. ParseStream adds the
+// number of bytes consumed per line to it to compute Turn.OffsetAfter.
+//
+// Lines that fail to unmarshal are reported via warnf and skipped. No
+// Turn is emitted for them, but the running offset still advances past
+// the malformed line, so later OffsetAfter values and the returned
+// offset stay aligned with the file.
+func ParseStream(
+	r io.Reader,
+	startOffset int64,
+	warnf func(format string, args ...any),
+	emit func(Turn) error,
+) (int64, error) {
+	scanner := bufio.NewScanner(r)
+	scanner.Buffer(make([]byte, 0, 64*1024), 8*1024*1024) // some lines are big (tool outputs)
+
+	offset := startOffset
+	for scanner.Scan() {
+		raw := scanner.Bytes()
+		lineLen := int64(len(raw)) + 1 // +1 for the newline; assumes LF endings and a newline-terminated final line
+		t, err := parseTurn(raw)
+		if err != nil {
+			if warnf != nil {
+				warnf("parse: %v (%d bytes)", err, len(raw))
+			}
+			offset += lineLen
+			continue
+		}
+		t.OffsetAfter = offset + lineLen
+		if err := emit(t); err != nil {
+			if errors.Is(err, ErrStop) {
+				return t.OffsetAfter, nil
+			}
+			return offset, fmt.Errorf("emit: %w", err)
+		}
+		offset = t.OffsetAfter
+	}
+	if err := scanner.Err(); err != nil {
+		return offset, fmt.Errorf("scan: %w", err)
+	}
+	return offset, nil
+}
+
+// ErrStop terminates a ParseStream loop without surfacing an error.
+var ErrStop = errors.New("claudewatcher: stop")
+
+// rawEntry is a permissive shape that covers every type observed in
+// the JSONL files. Fields we don't care about are intentionally
+// omitted to keep the unmarshal cheap.
+type rawEntry struct {
+	Type       string          `json:"type"`
+	SessionID  string          `json:"sessionId"`
+	ParentUUID string          `json:"parentUuid"`
+	Timestamp  string          `json:"timestamp"`
+	Cwd        string          `json:"cwd"`
+	GitBranch  string          `json:"gitBranch"`
+	Message    json.RawMessage `json:"message"`
+	Attachment json.RawMessage `json:"attachment"`
+	Content    string          `json:"content"`    // queue-operation
+	LastPrompt string          `json:"lastPrompt"` // last-prompt
+	Subtype    string          `json:"subtype"`    // system
+}
+
+// skipTypes lists every entry type we want to never ingest. Marked Skip
+// at parse time so the caller's filter is a single boolean check.
+var skipTypes = map[string]struct{}{
+	"queue-operation":       {},
+	"last-prompt":           {},
+	"permission-mode":       {},
+	"ai-title":              {},
+	"bridge-session":        {},
+	"file-history-snapshot": {},
+}
+
+func parseTurn(raw []byte) (Turn, error) {
+	var e rawEntry
+	if err := json.Unmarshal(raw, &e); err != nil {
+		return Turn{}, fmt.Errorf("unmarshal: %w", err)
+	}
+	t := Turn{
+		Type:       e.Type,
+		SessionID:  e.SessionID,
+		ParentUUID: e.ParentUUID,
+		Cwd:        e.Cwd,
+		GitBranch:  e.GitBranch,
+	}
+	if _, skip := skipTypes[e.Type]; skip {
+		t.Skip = true
+		return t, nil
+	}
+	if e.Timestamp != "" {
+		if ts, err := time.Parse(time.RFC3339Nano, e.Timestamp); err == nil {
+			t.Timestamp = ts
+		} else {
+			t.ParseWarning = "timestamp"
+		}
+	}
+
+	switch e.Type {
+	case "user":
+		t.Content = extractMessageText(e.Message)
+	case "assistant":
+		t.Content, t.ToolName = extractAssistantTurn(e.Message)
+	case "attachment":
+		t.Content = extractAttachmentText(e.Attachment)
+	case "system":
+		t.Content = "[system " + e.Subtype + "]"
+	default:
+		// Unknown type — keep the row but mark Skip so callers ignore.
+		t.Skip = true
+	}
+	return t, nil
+}
+
+// extractMessageText pulls the textual projection out of a user/assistant
+// message field. The shape is the Anthropic Messages API content-block
+// array (an array of {type, text|tool_use|tool_result, ...}). We
+// concatenate every text-bearing block and ignore the rest.
+func extractMessageText(raw json.RawMessage) string {
+	if len(raw) == 0 {
+		return ""
+	}
+	var msg struct {
+		Role    string            `json:"role"`
+		Content json.RawMessage   `json:"content"`
+		Stop    string            `json:"stop_reason"`
+		Model   string            `json:"model"`
+		Usage   map[string]any    `json:"usage"`
+		Meta    map[string]string `json:"meta"`
+	}
+	if err := json.Unmarshal(raw, &msg); err != nil {
+		// Some user turns have message as plain string.
+		var s string
+		if err2 := json.Unmarshal(raw, &s); err2 == nil {
+			return s
+		}
+		return ""
+	}
+	// Content can be a string OR an array.
+	var asString string
+	if err := json.Unmarshal(msg.Content, &asString); err == nil {
+		return asString
+	}
+	var blocks []struct {
+		Type    string          `json:"type"`
+		Text    string          `json:"text"`
+		Content json.RawMessage `json:"content"`
+	}
+	if err := json.Unmarshal(msg.Content, &blocks); err != nil {
+		return ""
+	}
+	var sb strings.Builder
+	for _, b := range blocks {
+		switch b.Type {
+		case "text":
+			sb.WriteString(b.Text)
+			sb.WriteByte('\n')
+		case "tool_result":
+			// Tool result content may itself be a string or array of blocks.
+			var s string
+			if err := json.Unmarshal(b.Content, &s); err == nil {
+				sb.WriteString("[tool_result] ")
+				sb.WriteString(s)
+				sb.WriteByte('\n')
+				continue
+			}
+			var sub []struct {
+				Type string `json:"type"`
+				Text string `json:"text"`
+			}
+			if err := json.Unmarshal(b.Content, &sub); err == nil {
+				for _, s := range sub {
+					if s.Type == "text" {
+						sb.WriteString("[tool_result] ")
+						sb.WriteString(s.Text)
+						sb.WriteByte('\n')
+					}
+				}
+			}
+		}
+	}
+	return strings.TrimRight(sb.String(), "\n")
+}
+
+// extractAssistantTurn pulls text + the first tool name (if any) from
+// an assistant content-block array. Multi-tool turns report only the
+// first name in ToolName, though every call keeps its [tool_use:...] tag in the text.
+func extractAssistantTurn(raw json.RawMessage) (string, string) {
+	if len(raw) == 0 {
+		return "", ""
+	}
+	var msg struct {
+		Content json.RawMessage `json:"content"`
+	}
+	if err := json.Unmarshal(raw, &msg); err != nil {
+		return "", ""
+	}
+	var blocks []struct {
+		Type string          `json:"type"`
+		Text string          `json:"text"`
+		Name string          `json:"name"`
+		Tool json.RawMessage `json:"input"`
+	}
+	if err := json.Unmarshal(msg.Content, &blocks); err != nil {
+		return "", ""
+	}
+	var sb strings.Builder
+	var firstTool string
+	for _, b := range blocks {
+		switch b.Type {
+		case "text":
+			sb.WriteString(b.Text)
+			sb.WriteByte('\n')
+		case "tool_use":
+			if firstTool == "" {
+				firstTool = b.Name
+			}
+			sb.WriteString("[tool_use:")
+			sb.WriteString(b.Name)
+			sb.WriteString("]\n")
+		}
+	}
+	return strings.TrimRight(sb.String(), "\n"), firstTool
+}
+
+// extractAttachmentText pulls text content from an attachment payload,
+// or returns a short tag when the attachment is a hook event.
+func extractAttachmentText(raw json.RawMessage) string {
+	if len(raw) == 0 {
+		return ""
+	}
+	var a struct {
+		Type      string `json:"type"`
+		HookName  string `json:"hookName"`
+		HookEvent string `json:"hookEvent"`
+		Content   string `json:"content"`
+		Text      string `json:"text"`
+	}
+	if err := json.Unmarshal(raw, &a); err != nil {
+		return ""
+	}
+	if a.Content != "" {
+		return a.Content
+	}
+	if a.Text != "" {
+		return a.Text
+	}
+	if a.HookName != "" {
+		return "[hook " + a.HookEvent + ":" + a.HookName + "]"
+	}
+	return ""
+}
--- a/ingestion/internal/claudewatcher/parser_test.go
+++ b/ingestion/internal/claudewatcher/parser_test.go
@@ -0,0 +1,157 @@
+package claudewatcher
+
+import (
+	"errors"
+	"strings"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+func collect(t *testing.T, body string) ([]Turn, int64, error) {
+	t.Helper()
+	var out []Turn
+	end, err := ParseStream(strings.NewReader(body), 0, nil, func(tr Turn) error {
+		out = append(out, tr)
+		return nil
+	})
+	return out, end, err
+}
+
+func TestParseStream_UserTurnStringContent(t *testing.T) {
+	body := `{"type":"user","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","message":"hello world"}
+`
+	turns, end, err := collect(t, body)
+	require.NoError(t, err)
+	require.Len(t, turns, 1)
+	assert.Equal(t, "user", turns[0].Type)
+	assert.Equal(t, "S", turns[0].SessionID)
+	assert.Equal(t, "hello world", turns[0].Content)
+	assert.False(t, turns[0].Skip)
+	assert.Equal(t, int64(len(body)), end)
+}
+
+func TestParseStream_UserTurnContentBlocks(t *testing.T) {
+	body := `{"type":"user","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","message":{"role":"user","content":[{"type":"text","text":"line 1"},{"type":"text","text":"line 2"}]}}
+`
+	turns, _, err := collect(t, body)
+	require.NoError(t, err)
+	require.Len(t, turns, 1)
+	assert.Equal(t, "line 1\nline 2", turns[0].Content)
+}
+
+func TestParseStream_AssistantToolUse(t *testing.T) {
+	body := `{"type":"assistant","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","message":{"content":[{"type":"text","text":"calling now"},{"type":"tool_use","name":"Edit","input":{}}]}}
+`
+	turns, _, err := collect(t, body)
+	require.NoError(t, err)
+	require.Len(t, turns, 1)
+	assert.Equal(t, "Edit", turns[0].ToolName)
+	assert.Contains(t, turns[0].Content, "calling now")
+	assert.Contains(t, turns[0].Content, "[tool_use:Edit]")
+}
+
+func TestParseStream_UserToolResult(t *testing.T) {
+	body := `{"type":"user","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","message":{"content":[{"type":"tool_result","content":"output of cmd"}]}}
+`
+	turns, _, err := collect(t, body)
+	require.NoError(t, err)
+	require.Len(t, turns, 1)
+	assert.Contains(t, turns[0].Content, "[tool_result] output of cmd")
+}
+
+func TestParseStream_SkipsBookkeepingTypes(t *testing.T) {
+	body := strings.Join([]string{
+		`{"type":"queue-operation","sessionId":"S","content":"x"}`,
+		`{"type":"last-prompt","sessionId":"S","lastPrompt":"y"}`,
+		`{"type":"permission-mode","sessionId":"S","permissionMode":"auto"}`,
+		`{"type":"ai-title","sessionId":"S","aiTitle":"My session"}`,
+		`{"type":"file-history-snapshot","messageId":"abc"}`,
+	}, "\n") + "\n"
+	turns, _, err := collect(t, body)
+	require.NoError(t, err)
+	require.Len(t, turns, 5)
+	for _, tr := range turns {
+		assert.True(t, tr.Skip, "expected Skip=true for %q", tr.Type)
+	}
+}
+
+func TestParseStream_UnknownTypeIsSkip(t *testing.T) {
+	body := `{"type":"future-thing","sessionId":"S"}` + "\n"
+	turns, _, err := collect(t, body)
+	require.NoError(t, err)
+	require.Len(t, turns, 1)
+	assert.True(t, turns[0].Skip)
+}
+
+func TestParseStream_MalformedLineIsSkippedNotFatal(t *testing.T) {
+	body := strings.Join([]string{
+		`{"type":"user","sessionId":"S","message":"first"}`,
+		`{not valid json`,
+		`{"type":"user","sessionId":"S","message":"third"}`,
+	}, "\n") + "\n"
+	var warnings int
+	var turns []Turn
+	_, err := ParseStream(strings.NewReader(body), 0, func(format string, args ...any) {
+		warnings++
+	}, func(tr Turn) error {
+		turns = append(turns, tr)
+		return nil
+	})
+	require.NoError(t, err)
+	require.Len(t, turns, 2, "first + third should make it through")
+	assert.Equal(t, 1, warnings)
+}
+
+func TestParseStream_EmitErrStopHaltsCleanly(t *testing.T) {
+	body := strings.Join([]string{
+		`{"type":"user","sessionId":"S","message":"a"}`,
+		`{"type":"user","sessionId":"S","message":"b"}`,
+		`{"type":"user","sessionId":"S","message":"c"}`,
+	}, "\n") + "\n"
+	count := 0
+	end, err := ParseStream(strings.NewReader(body), 0, nil, func(tr Turn) error {
+		count++
+		if count == 2 {
+			return ErrStop
+		}
+		return nil
+	})
+	require.NoError(t, err)
+	assert.Equal(t, 2, count)
+	assert.Greater(t, end, int64(0))
+}
+
+func TestParseStream_EmitOtherErrorPropagates(t *testing.T) {
+	body := `{"type":"user","sessionId":"S","message":"a"}` + "\n"
+	want := errors.New("boom")
+	_, err := ParseStream(strings.NewReader(body), 0, nil, func(tr Turn) error {
+		return want
+	})
+	require.Error(t, err)
+	assert.Contains(t, err.Error(), "boom")
+}
+
+func TestParseStream_AttachmentHookEvent(t *testing.T) {
+	body := `{"type":"attachment","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","attachment":{"type":"hook_success","hookName":"SessionStart:startup","hookEvent":"SessionStart","content":"hook body"}}
+`
+	turns, _, err := collect(t, body)
+	require.NoError(t, err)
+	require.Len(t, turns, 1)
+	assert.Equal(t, "hook body", turns[0].Content)
+}
+
+func TestParseStream_OffsetAdvances(t *testing.T) {
+	body := `{"type":"user","sessionId":"S","message":"a"}` + "\n" +
+		`{"type":"user","sessionId":"S","message":"b"}` + "\n"
+	var offsets []int64
+	_, err := ParseStream(strings.NewReader(body), 100, nil, func(tr Turn) error {
+		offsets = append(offsets, tr.OffsetAfter)
+		return nil
+	})
+	require.NoError(t, err)
+	require.Len(t, offsets, 2)
+	assert.Greater(t, offsets[0], int64(100))
+	assert.Greater(t, offsets[1], offsets[0])
+}
--- a/ingestion/internal/claudewatcher/scrubber.go
+++ b/ingestion/internal/claudewatcher/scrubber.go
@@ -0,0 +1,114 @@
+package claudewatcher
+
+import (
+	"fmt"
+	"regexp"
+	"sync"
+)
+
+// Scrubber drops any turn whose content matches a known-bad pattern.
+// Fail-closed by design: we'd rather lose signal than ingest credentials
+// into a public-readable brain. The caller logs the drop reason.
+//
+// Rules cover the credential shapes most common to leak through Claude
+// Code sessions: bearer tokens, postgres URIs with embedded auth, OAuth
+// secret values, SOPS-encrypted secret blobs (we don't want the
+// ciphertext either — it's a marker that the original message contained
+// secret state), PEM-encoded private keys, and the explicit env-var
+// naming conventions used in the homelab.
+//
+// Pattern philosophy: match by shape, not by content. A 40-char hex
+// string in isolation is fine; the same string after `Authorization:
+// Bearer ` is not. Tuned to catch known leak vectors from prior
+// secret-hygiene incidents (POSTGRES_PASSWORD via kubectl exec env,
+// INFRA_MCP_TOKEN via sops -d output) without dropping every Edit on a
+// config file.
+
+// Rule is a single named regex; Name is the label reported in the warn log.
+type Rule struct {
+	Name string
+	RE   *regexp.Regexp
+}
+
+// DefaultRules is the regex set applied by Scrub. Mutable for tests but
+// callers should treat it as read-only at runtime.
+var DefaultRules = []Rule{
+	// authorization-header is checked before the bare bearer rule so
+	// contextual hits ("Authorization: Bearer X") report the more
+	// specific match name in logs.
+	{Name: "authorization-header", RE: regexp.MustCompile(`(?i)Authorization\s*:\s*[A-Za-z]+\s+\S{8,}`)},
+	{Name: "bearer-token", RE: regexp.MustCompile(`(?i)Bearer\s+[A-Za-z0-9._\-]{16,}`)},
+	{Name: "postgres-uri-with-password", RE: regexp.MustCompile(`postgres(?:ql)?://[^:\s/]+:[^@\s/]+@`)},
+	{Name: "private-key", RE: regexp.MustCompile(`-----BEGIN[^-]*PRIVATE KEY-----`)},
+	{Name: "ssh-key", RE: regexp.MustCompile(`(?:ssh-(?:rsa|dss|ed25519)|ecdsa-sha2-nistp(?:256|384|521))\s+[A-Za-z0-9+/=]{40,}`)},
+	{Name: "github-pat", RE: regexp.MustCompile(`\b(?:ghp|gho|ghu|ghr|gha)_[A-Za-z0-9]{30,}\b`)},
+	{Name: "openai-sk", RE: regexp.MustCompile(`\bsk-(?:proj-)?[A-Za-z0-9]{32,}\b`)},
+	{Name: "anthropic-sk", RE: regexp.MustCompile(`\bsk-ant-[A-Za-z0-9_\-]{32,}\b`)},
+	{Name: "aws-access-key", RE: regexp.MustCompile(`\bAKIA[0-9A-Z]{16}\b`)},
+	{Name: "homelab-env-token", RE: regexp.MustCompile(`(?i)(?:_TOKEN|_PASSWORD|_API_KEY|_SECRET)\s*[:=]\s*['"]?[A-Za-z0-9._/+\-]{12,}`)},
+	{Name: "sops-encrypted-marker", RE: regexp.MustCompile(`ENC\[AES256_GCM,data:[A-Za-z0-9+/=]{8,}`)},
+}
+
+// extraRules holds the rules added at process startup via RegisterRule;
+// Scrub checks them after DefaultRules. The mutex guards RegisterRule
+// calls (rare) against concurrent Scrub reads (hot path). Scrub takes
+// the read lock only after DefaultRules find no match, so a clean turn
+// costs one uncontended RLock even when no client-name guard is set.
+var (
+	extraRulesMu sync.RWMutex
+	extraRules   []Rule
+)
+
+// RegisterRule appends a runtime-configured regex to the scrubber's
+// rule set. Used by main to inject client-name guards from
+// CLAUDE_INGEST_CLIENT_BLOCK env var (or equivalent SOPS-encrypted
+// secret) without baking client identities into source code.
+//
+// pattern is compiled as-is — callers wrap with `\b...\b` and case
+// flags as needed. Duplicate names are accepted (rules are positional);
+// the second registration just fires after the first.
+func RegisterRule(name, pattern string) error {
+	re, err := regexp.Compile(pattern)
+	if err != nil {
+		return fmt.Errorf("compile rule %q: %w", name, err)
+	}
+	extraRulesMu.Lock()
+	extraRules = append(extraRules, Rule{Name: name, RE: re})
+	extraRulesMu.Unlock()
+	return nil
+}
+
+// ResetExtraRules clears every RegisterRule-added rule. Test-only.
+func ResetExtraRules() {
+	extraRulesMu.Lock()
+	extraRules = nil
+	extraRulesMu.Unlock()
+}
+
+// Scrub reports the first matching rule, or empty when content is clean.
+// Empty string is treated as clean. Caller decides what to do on a hit;
+// the convention in claudewatcher is to drop the turn entirely and emit
+// a slog.Warn naming the rule.
+//
+// Rule order: DefaultRules first (credential shapes), then runtime
+// RegisterRule additions (client-name guards). Credential leaks
+// outrank client-name hits in the log because they're strictly more
+// dangerous.
+func Scrub(content string) string {
+	if content == "" {
+		return ""
+	}
+	for _, r := range DefaultRules {
+		if r.RE.MatchString(content) {
+			return r.Name
+		}
+	}
+	extraRulesMu.RLock()
+	defer extraRulesMu.RUnlock()
+	for _, r := range extraRules {
+		if r.RE.MatchString(content) {
+			return r.Name
+		}
+	}
+	return ""
+}
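The commit message says main wraps the CLAUDE_INGEST_CLIENT_BLOCK value with `(?i)\b(...)\b` before handing it to RegisterRule. A minimal sketch of that wrapping step follows; `clientGuard` and its `enabled` return value are hypothetical names, and the empty-value check is an assumption added here because an empty alternation would otherwise match at every word boundary and drop nearly every turn.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// clientGuard is a hypothetical helper (not part of this change). It wraps a
// raw alternation such as "SEB|Mastercard" in a case-insensitive,
// word-bounded regex. enabled is false when the value is empty, in which
// case no rule should be registered at all.
func clientGuard(alternation string) (re *regexp.Regexp, enabled bool, err error) {
	alternation = strings.TrimSpace(alternation)
	if alternation == "" {
		return nil, false, nil
	}
	re, err = regexp.Compile(`(?i)\b(` + alternation + `)\b`)
	if err != nil {
		return nil, false, fmt.Errorf("compile client guard: %w", err)
	}
	return re, true, nil
}

func main() {
	re, enabled, err := clientGuard("SEB|Mastercard")
	if err != nil || !enabled {
		panic(fmt.Sprintf("guard not built: enabled=%v err=%v", enabled, err))
	}
	for _, s := range []string{
		"SEB internal review",      // whole word: hit
		"Sebastian wrote the docs", // longer identifier: miss
		"search?seb=1",             // punctuation is a word boundary: hit
	} {
		fmt.Printf("%q -> %v\n", s, re.MatchString(s))
	}
}
```

In the real binary the wrapped pattern string would go to RegisterRule rather than being compiled locally; compiling here only keeps the sketch self-contained.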
--- a/ingestion/internal/claudewatcher/scrubber_test.go
+++ b/ingestion/internal/claudewatcher/scrubber_test.go
@@ -0,0 +1,117 @@
+package claudewatcher
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+)
+
+func TestScrub_PoisonedFixtures(t *testing.T) {
+	// One representative bad-string per rule. If a rule fires for the
+	// wrong content shape later, this table localises the regression.
+	cases := []struct {
+		name    string
+		content string
+		want    string
+	}{
+		{"authorization-header", "curl -H 'Authorization: Bearer abcdef1234567890ghijklmnop'", "authorization-header"},
+		{"bearer-no-header", "header = Bearer eyJhbGciOiJIUzI1NiJ9.payload.sig", "bearer-token"},
+		{"postgres-uri", "DATABASE_URL=postgres://user:s3cret@10.0.1.20:5432/brain", "postgres-uri-with-password"},
+		{"private-key", "-----BEGIN OPENSSH PRIVATE KEY-----\nb3BlbnNzaC1rZXktdjEAAAAA", "private-key"},
+		{"ssh-public", "deploy: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIK1234567890abcdefghij user@host", "ssh-key"},
+		{"github-pat-classic", "GH_TOKEN=ghp_aBcD1234EfGh5678IjKl9012MnOp3456QrSt", "github-pat"},
+		{"openai-key", "OPENAI_API_KEY=sk-proj-AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII", "openai-sk"},
+		{"anthropic-key", "ANTHROPIC_API_KEY=sk-ant-api03-aaaaBBBBccccDDDDeeeeFFFFggggHHHHiiiiJJJJkkkk", "anthropic-sk"},
+		{"aws-access-key", "AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE", "aws-access-key"},
+		{"homelab-env", "POSTGRES_PASSWORD=hunter2supersecretvalue", "homelab-env-token"},
+		{"sops-marker", "value: ENC[AES256_GCM,data:abc123def456,iv:zzz]", "sops-encrypted-marker"},
+	}
+	for _, tc := range cases {
+		t.Run(tc.name, func(t *testing.T) {
+			got := Scrub(tc.content)
+			assert.Equal(t, tc.want, got)
+		})
+	}
+}
+
+func TestScrub_CleanContentPassesThrough(t *testing.T) {
+	cases := []string{
+		"",
+		"plain text with no credentials",
+		"a 40 char hex string aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa is fine in isolation",
+		"`Bearer` token mentioned in docs without an actual value",
+		"file at ~/.ssh/id_ed25519",
+		"the function Authorization() takes no args",
+		"comment: see API key in 1Password",
+	}
+	for _, c := range cases {
+		assert.Empty(t, Scrub(c), "expected clean for %q", c)
+	}
+}
+
+func TestScrub_FirstMatchWins(t *testing.T) {
+	// Content matching multiple rules: report the first matching rule
+	// in DefaultRules order. Stability matters for log triage.
+	content := "Authorization: Bearer ghp_aBcD1234EfGh5678IjKl9012MnOp3456QrSt"
+	assert.Equal(t, "authorization-header", Scrub(content))
+}
+
+func TestRegisterRule_ClientNameGuard(t *testing.T) {
+	t.Cleanup(ResetExtraRules)
+	require := func(err error) {
+		if err != nil {
+			t.Fatalf("unexpected err: %v", err)
+		}
+	}
+	require(RegisterRule("client-name", `(?i)\b(SEB|Mastercard)\b`))
+
+	// Hits — case variations + word-boundary respect.
+	for _, hit := range []string{
+		"mentioned SEB in this commit",
+		"the Mastercard project deadline",
+		"working on mastercard scope",
+		"SEB internal review",
+	} {
+		assert.Equal(t, "client-name", Scrub(hit), "should match %q", hit)
+	}
+
+	// Misses: a substring within a longer word should NOT match
+	// thanks to \b. "Sebastian" contains "seb" but \b prevents a hit.
+	for _, miss := range []string{
+		"Sebastian wrote the docs",
+		"unrelated text",
+		"researcher",
+	} {
+		assert.Empty(t, Scrub(miss), "should NOT match %q", miss)
+	}
+
+	// Known tradeoff: \b treats punctuation as a word boundary, so a
+	// bare "seb" token in a URL query string still counts as a hit
+	// ('?' and '=' are non-word characters). The fail-closed scrubber
+	// accepts this false positive; pin it here so a pattern change
+	// shows up as a test failure.
+	urlHit := "https://example.com/search?seb=1"
+	assert.Equal(t, "client-name", Scrub(urlHit), "should match %q", urlHit)
+}
+
+func TestRegisterRule_CredentialsTakePrecedence(t *testing.T) {
+	t.Cleanup(ResetExtraRules)
+	require := func(err error) {
+		if err != nil {
+			t.Fatalf("unexpected err: %v", err)
+		}
+	}
+	require(RegisterRule("client-name", `\b(SEB)\b`))
+
+	// Content matches both a credential rule AND a client rule —
+	// credential rule wins by ordering, so log triage points at the
+	// strictly more dangerous leak.
+	content := "SEB project uses OPENAI_API_KEY=sk-proj-AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII"
+	assert.Equal(t, "openai-sk", Scrub(content))
+}
+
+func TestRegisterRule_RejectsInvalidPattern(t *testing.T) {
+	t.Cleanup(ResetExtraRules)
+	err := RegisterRule("bad", "[unclosed")
+	assert.Error(t, err)
+}
--- a/ingestion/internal/claudewatcher/watcher.go
+++ b/ingestion/internal/claudewatcher/watcher.go
@@ -0,0 +1,234 @@
+package claudewatcher
+
+import (
+	"context"
+	"fmt"
+	"log/slog"
+	"os"
+	"path/filepath"
+	"strings"
+	"time"
+)
+
+// Sink consumes batches of ingest-ready turns from the watcher. The
+// production implementation builds wiki pages and calls pipeline.RunRaw
+// against the brain. Tests substitute a counter.
+//
+// A Batch represents the turns ingested from one session file between
+// two cursor checkpoints. Implementations must be idempotent — the
+// watcher only advances the cursor on a nil return.
+type Sink interface {
+	Ingest(ctx context.Context, b Batch) error
+}
+
+// Batch is a per-file slice of turns plus identifying metadata.
+type Batch struct {
+	Host      string // origin host, e.g. "koala"
+	FilePath  string // absolute path to the source .jsonl file
+	SessionID string // first session_id seen in the batch
+	ProjectID string // basename of the parent dir, e.g. "-home-mathias-dev"
+	Turns     []Turn // never empty; Skip turns and scrubber matches are already filtered out
+}
+
+// Config drives one Watch loop. SessionsDir is the absolute path to the
+// Claude Code projects directory (~/.claude/projects). Host is the
+// label written into cursors and ingested page frontmatter. Interval
+// is the poll cadence; Watch rejects a zero or negative value.
+//
+// Sink is required. Cursors is optional — when nil the watcher
+// re-reads from byte 0 on every tick (useful for first-run testing
+// without a postgres dependency).
+type Config struct {
+	SessionsDir string
+	Host        string
+	Interval    time.Duration
+	Sink        Sink
+	Cursors     *CursorStore
+	Logger      *slog.Logger
+}
+
+// Watch runs the polling loop until ctx is cancelled. Returns ctx.Err()
+// on shutdown. Each tick walks SessionsDir for *.jsonl files, advances
+// each file's cursor, and emits one Batch per file with new turns.
+// Errors during a single file's parse or ingest are logged but do not
+// abort the loop — a single bad file shouldn't block the others.
+func Watch(ctx context.Context, cfg Config) error {
+	if cfg.SessionsDir == "" {
+		return fmt.Errorf("sessions dir is required")
+	}
+	if cfg.Sink == nil {
+		return fmt.Errorf("sink is required")
+	}
+	if cfg.Interval <= 0 {
+		return fmt.Errorf("interval must be positive")
+	}
+	if cfg.Host == "" {
+		cfg.Host = "unknown"
+	}
+	if cfg.Logger == nil {
+		cfg.Logger = slog.Default()
+	}
+	cfg.Logger.Info("claudewatcher: started",
+		"sessions_dir", cfg.SessionsDir,
+		"host", cfg.Host,
+		"interval", cfg.Interval)
+
+	ticker := time.NewTicker(cfg.Interval)
+	defer ticker.Stop()
+	// Run an immediate first sweep so first-launch users don't wait one
+	// tick before anything happens.
+	runTick(ctx, cfg)
+	for {
+		select {
+		case <-ctx.Done():
+			return ctx.Err()
+		case <-ticker.C:
+			runTick(ctx, cfg)
+		}
+	}
+}
+
+// runTick is one polling pass. It is unexported; tests and one-off
+// callers reach it through TickOnce.
+func runTick(ctx context.Context, cfg Config) {
+	files, err := listSessionFiles(cfg.SessionsDir)
+	if err != nil {
+		cfg.Logger.Warn("claudewatcher: list session files", "err", err)
+		return
+	}
+	for _, f := range files {
+		if ctx.Err() != nil {
+			return
+		}
+		if err := processFile(ctx, cfg, f); err != nil {
+			cfg.Logger.Warn("claudewatcher: file failed",
+				"path", f, "err", err)
+		}
+	}
+}
+
+// TickOnce runs one sweep synchronously and returns. Used by tests and
+// by ad-hoc CLI invocations.
+func TickOnce(ctx context.Context, cfg Config) error {
+	if cfg.SessionsDir == "" || cfg.Sink == nil {
+		return fmt.Errorf("config invalid")
+	}
+	if cfg.Host == "" {
+		cfg.Host = "unknown"
+	}
+	if cfg.Logger == nil {
+		cfg.Logger = slog.Default()
+	}
+	runTick(ctx, cfg)
+	return nil
+}
+
+func listSessionFiles(root string) ([]string, error) {
+	var out []string
+	err := filepath.WalkDir(root, func(path string, d os.DirEntry, walkErr error) error {
+		if walkErr != nil {
+			return walkErr
+		}
+		if d.IsDir() {
+			return nil
+		}
+		if !strings.HasSuffix(path, ".jsonl") {
+			return nil
+		}
+		out = append(out, path)
+		return nil
+	})
+	if err != nil {
+		return nil, fmt.Errorf("walk %s: %w", root, err)
+	}
+	return out, nil
+}
+
+func processFile(ctx context.Context, cfg Config, path string) error {
+	startOffset := int64(0)
+	if cfg.Cursors != nil {
+		off, _, err := cfg.Cursors.GetOffset(ctx, cfg.Host, path)
+		if err != nil {
+			return fmt.Errorf("get cursor: %w", err)
+		}
+		startOffset = off
+	}
+
+	stat, err := os.Stat(path)
+	if err != nil {
+		return fmt.Errorf("stat: %w", err)
+	}
+	if stat.Size() <= startOffset {
+		return nil // nothing new; a file truncated below its cursor also stops here
+	}
+
+	f, err := os.Open(path)
+	if err != nil {
+		return fmt.Errorf("open: %w", err)
+	}
+	defer func() { _ = f.Close() }()
+	if _, err := f.Seek(startOffset, 0); err != nil {
+		return fmt.Errorf("seek: %w", err)
+	}
+
+	var keep []Turn
+	var sessionID string
+	var droppedScrub int
+	endOffset, err := ParseStream(f, startOffset,
+		func(format string, args ...any) {
+			cfg.Logger.Warn(fmt.Sprintf("claudewatcher: parse: "+format, args...))
+		},
+		func(t Turn) error {
+			if t.Skip || t.Content == "" {
+				return nil
+			}
+			if rule := Scrub(t.Content); rule != "" {
+				droppedScrub++
+				cfg.Logger.Warn("claudewatcher: turn dropped by scrubber",
+					"rule", rule, "path", path, "session_id", t.SessionID)
+				return nil
+			}
+			if sessionID == "" {
+				sessionID = t.SessionID
+			}
+			keep = append(keep, t)
+			return nil
+		})
+	if err != nil {
+		return fmt.Errorf("parse stream: %w", err)
+	}
+
+	if len(keep) == 0 {
+		if cfg.Cursors != nil {
+			if err := cfg.Cursors.SetOffset(ctx, cfg.Host, path, endOffset); err != nil {
+				return fmt.Errorf("advance cursor (no-turns): %w", err)
+			}
+		}
+		if droppedScrub > 0 {
+			cfg.Logger.Info("claudewatcher: only scrubbed turns this tick",
+				"path", path, "dropped", droppedScrub)
+		}
+		return nil
+	}
+
+	batch := Batch{
+		Host:      cfg.Host,
+		FilePath:  path,
+		SessionID: sessionID,
+		ProjectID: filepath.Base(filepath.Dir(path)),
+		Turns:     keep,
+	}
+	if err := cfg.Sink.Ingest(ctx, batch); err != nil {
+		return fmt.Errorf("sink ingest: %w", err)
+	}
+	if cfg.Cursors != nil {
+		if err := cfg.Cursors.SetOffset(ctx, cfg.Host, path, endOffset); err != nil {
+			return fmt.Errorf("advance cursor: %w", err)
+		}
+	}
+	cfg.Logger.Info("claudewatcher: ingested batch",
+		"path", path, "session_id", sessionID,
+		"turns_kept", len(keep), "dropped_scrub", droppedScrub,
+		"new_offset", endOffset)
+	return nil
+}
--- a/ingestion/internal/claudewatcher/watcher_test.go
+++ b/ingestion/internal/claudewatcher/watcher_test.go
@@ -0,0 +1,174 @@
+package claudewatcher
+
+import (
+	"context"
+	"os"
+	"path/filepath"
+	"strings"
+	"sync"
+	"testing"
+	"time"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+// memSink captures batches without touching postgres. Thread-safe so
+// TickOnce can run from any goroutine in concurrent tests.
+type memSink struct {
+	mu      sync.Mutex
+	batches []Batch
+	failOn  string // substring of Batch.FilePath to error on
+}
+
+func (m *memSink) Ingest(_ context.Context, b Batch) error {
+	m.mu.Lock()
+	defer m.mu.Unlock()
+	if m.failOn != "" && strings.Contains(b.FilePath, m.failOn) {
+		return assert.AnError
+	}
+	m.batches = append(m.batches, b)
+	return nil
+}
+
+func writeSession(t *testing.T, dir, sessionID string, lines []string) string {
+	t.Helper()
+	path := filepath.Join(dir, sessionID+".jsonl")
+	body := strings.Join(lines, "\n") + "\n"
+	require.NoError(t, os.WriteFile(path, []byte(body), 0o644))
+	return path
+}
+
+func TestTickOnce_NoCursorReingestsEverythingEveryTick(t *testing.T) {
+	tmp := t.TempDir()
+	projectDir := filepath.Join(tmp, "-home-mathias-dev")
+	require.NoError(t, os.MkdirAll(projectDir, 0o755))
+	writeSession(t, projectDir, "sess1", []string{
+		`{"type":"user","sessionId":"sess1","message":"first prompt"}`,
+		`{"type":"assistant","sessionId":"sess1","message":{"content":[{"type":"text","text":"first answer"}]}}`,
+	})
+
+	sink := &memSink{}
+	cfg := Config{
+		SessionsDir: tmp,
+		Host:        "koala",
+		Sink:        sink,
+	}
+	require.NoError(t, TickOnce(context.Background(), cfg))
+	require.NoError(t, TickOnce(context.Background(), cfg))
+
+	require.Len(t, sink.batches, 2, "no cursor => re-emits same batch every tick")
+	assert.Equal(t, "sess1", sink.batches[0].SessionID)
+	assert.Equal(t, "koala", sink.batches[0].Host)
+	assert.Equal(t, "-home-mathias-dev", sink.batches[0].ProjectID)
+	assert.Len(t, sink.batches[0].Turns, 2)
+}
+
+func TestTickOnce_FiltersSkipTurnsAndScrubberMatches(t *testing.T) {
+	tmp := t.TempDir()
+	proj := filepath.Join(tmp, "-home-mathias-dev")
+	require.NoError(t, os.MkdirAll(proj, 0o755))
+	writeSession(t, proj, "sess-scrub", []string{
+		`{"type":"queue-operation","sessionId":"sess-scrub","content":"x"}`, // Skip
+		`{"type":"user","sessionId":"sess-scrub","message":"normal prompt"}`,
+		`{"type":"assistant","sessionId":"sess-scrub","message":{"content":[{"type":"text","text":"value POSTGRES_PASSWORD=hunter2supersecretvalue"}]}}`, // scrubbed
+	})
+	sink := &memSink{}
+	require.NoError(t, TickOnce(context.Background(), Config{
+		SessionsDir: tmp, Host: "koala", Sink: sink,
+	}))
+	require.Len(t, sink.batches, 1)
+	turns := sink.batches[0].Turns
+	require.Len(t, turns, 1, "skip + scrubbed turns must not reach the sink")
+	assert.Equal(t, "user", turns[0].Type)
+}
+
+func TestTickOnce_AllScrubbedNoBatchEmitted(t *testing.T) {
+	tmp := t.TempDir()
+	proj := filepath.Join(tmp, "-home-mathias-dev")
+	require.NoError(t, os.MkdirAll(proj, 0o755))
+	writeSession(t, proj, "all-bad", []string{
+		`{"type":"user","sessionId":"all-bad","message":"Authorization: Bearer abcdef1234567890ghijklmnop"}`,
+	})
+	sink := &memSink{}
+	require.NoError(t, TickOnce(context.Background(), Config{
+		SessionsDir: tmp, Host: "koala", Sink: sink,
+	}))
+	assert.Empty(t, sink.batches, "no usable turns => no batch")
+}
+
+func TestTickOnce_IgnoresNonJsonlFiles(t *testing.T) {
+	tmp := t.TempDir()
+	proj := filepath.Join(tmp, "-home-mathias-dev")
+	require.NoError(t, os.MkdirAll(proj, 0o755))
+	require.NoError(t, os.WriteFile(filepath.Join(proj, "README.md"), []byte("ignore me"), 0o644))
+	require.NoError(t, os.WriteFile(filepath.Join(proj, "config.json"), []byte("{}"), 0o644))
+	sink := &memSink{}
+	require.NoError(t, TickOnce(context.Background(), Config{
+		SessionsDir: tmp, Host: "koala", Sink: sink,
+	}))
+	assert.Empty(t, sink.batches)
+}
+
+func TestTickOnce_HandlesMultipleProjectsAndSessions(t *testing.T) {
+	tmp := t.TempDir()
+	projA := filepath.Join(tmp, "-home-mathias-dev")
+	projB := filepath.Join(tmp, "-home-mathias-AI-infra")
+	require.NoError(t, os.MkdirAll(projA, 0o755))
+	require.NoError(t, os.MkdirAll(projB, 0o755))
+	writeSession(t, projA, "a1", []string{`{"type":"user","sessionId":"a1","message":"q1"}`})
+	writeSession(t, projA, "a2", []string{`{"type":"user","sessionId":"a2","message":"q2"}`})
+	writeSession(t, projB, "b1", []string{`{"type":"user","sessionId":"b1","message":"q3"}`})
+
+	sink := &memSink{}
+	require.NoError(t, TickOnce(context.Background(), Config{
+		SessionsDir: tmp, Host: "koala", Sink: sink,
+	}))
+	require.Len(t, sink.batches, 3)
+
+	projects := map[string]int{}
+	for _, b := range sink.batches {
+		projects[b.ProjectID]++
+	}
+	assert.Equal(t, 2, projects["-home-mathias-dev"])
+	assert.Equal(t, 1, projects["-home-mathias-AI-infra"])
+}
+
+func TestTickOnce_SinkErrorDoesNotKillOtherFiles(t *testing.T) {
+	tmp := t.TempDir()
+	proj := filepath.Join(tmp, "-home-mathias-dev")
+	require.NoError(t, os.MkdirAll(proj, 0o755))
+	writeSession(t, proj, "good", []string{`{"type":"user","sessionId":"good","message":"q"}`})
+	writeSession(t, proj, "bad-session", []string{`{"type":"user","sessionId":"bad-session","message":"q"}`})
+
+	sink := &memSink{failOn: "bad-session"}
+	require.NoError(t, TickOnce(context.Background(), Config{
+		SessionsDir: tmp, Host: "koala", Sink: sink,
+	}))
+	require.Len(t, sink.batches, 1, "good session still ingested")
+	assert.Equal(t, "good", sink.batches[0].SessionID)
+}
+
+func TestWatch_RespectsContextCancel(t *testing.T) {
+	tmp := t.TempDir()
+	require.NoError(t, os.MkdirAll(filepath.Join(tmp, "-home-mathias-dev"), 0o755))
+	sink := &memSink{}
+	ctx, cancel := context.WithCancel(context.Background())
+	done := make(chan error, 1)
+	go func() {
+		done <- Watch(ctx, Config{
+			SessionsDir: tmp,
+			Host:        "koala",
+			Interval:    10 * time.Millisecond,
+			Sink:        sink,
+		})
+	}()
+	time.Sleep(50 * time.Millisecond)
+	cancel()
+	select {
+	case err := <-done:
+		assert.ErrorIs(t, err, context.Canceled)
+	case <-time.After(2 * time.Second):
+		t.Fatal("Watch did not return after cancel")
+	}
+}
--- a/ingestion/internal/graph/extract.go
+++ b/ingestion/internal/graph/extract.go
@@ -0,0 +1,263 @@
+// Package graph extracts entity + edge records from brain markdown
+// documents for the brain_entities / brain_edges relational graph.
+//
+// The extractor is pure: it takes markdown bytes and a document path and
+// returns the entity (one per doc) and the wikilink edges (zero or more)
+// it found, with source line numbers so the graph store can record
+// provenance.
+//
+// Edge types in v1: only "wikilink" — derived from [[slug]] and
+// [[slug|Display]] occurrences in the body. Section-header edges are
+// deferred (see infra#62 grill addendum).
+package graph
+
+import (
+	"bufio"
+	"bytes"
+	"path/filepath"
+	"regexp"
+	"strings"
+)
+
+// Entity represents one brain document for graph indexing.
+//
+// Slug is the basename without ".md" — the same identity used by
+// wiki canonicalization and the wikilink target syntax.
+//
+// Type categorises the doc into a coarse bucket so callers can filter
+// graph traversals (e.g. "only entity nodes"). When the doc lives
+// under brain/wiki/<wing>/<hall>/, Wing and Hall capture the
+// taxonomy; otherwise they're empty (legacy brain/knowledge/ docs).
+type Entity struct {
+	DocPath string // forward-slash, relative to brainDir
+	Slug    string
+	Type    string // "concept" | "entity" | "source" | "hall" | "knowledge"
+	Wing    string // optional; from frontmatter or path
+	Hall    string // optional; from frontmatter or path
+	Title   string // optional; from frontmatter
+	// DIKW tier — infra#72. Empty until M3 migration writes `tier:`
+	// frontmatter to every entry. Path-inferred tier kicks in as a
+	// fallback so the column populates immediately on backfill even
+	// for entries that haven't had their frontmatter rewritten yet.
+	Tier  string // "inbox" | "note" | "knowledge"
+	Topic string // kebab-slug; the thing the entry is about
+}
+
+// Edge represents a directed relationship between two slugs.
+//
+// SrcLine is the 1-indexed line in the source document where the link
+// was found, so callers can re-find the linking text after an edit.
+type Edge struct {
+	SrcDoc   string // forward-slash, relative to brainDir
+	SrcSlug  string // == Entity.Slug for SrcDoc
+	DstSlug  string
+	EdgeType string // "wikilink" in v1
+	SrcLine  int    // 1-indexed
+}
+
+// linkRE matches both [[slug]] and [[slug|Display Name]] wikilinks.
+// Group 1 is the slug; group 2 (if present) is the display.
+var linkRE = regexp.MustCompile(`\[\[([^\]|]+)(?:\|([^\]]+))?\]\]`)
+
+// Extract parses one markdown document and returns its Entity plus the
+// outgoing wikilink Edges. docPath is forward-slash, relative to
+// brainDir; content is the raw markdown bytes.
+//
+// Returns ok=false when docPath does not yield a usable slug (e.g.
+// non-markdown file slipped through).
+func Extract(docPath string, content []byte) (Entity, []Edge, bool) {
+	slug := slugFromPath(docPath)
+	if slug == "" {
+		return Entity{}, nil, false
+	}
+	ent := Entity{DocPath: docPath, Slug: slug}
+	classifyByPath(&ent, docPath)
+	readFrontmatter(&ent, content)
+	inferTierFromPath(&ent, docPath)
+
+	edges := extractEdges(docPath, slug, content)
+	return ent, edges, true
+}
+
+// inferTierFromPath fills Tier when frontmatter didn't already set it.
+// The new layout has dedicated subtrees per tier; pre-migration paths
+// (knowledge/, wiki/, raw/, sessions/, clips/) get their best-guess mapping so
+// the column populates on backfill before the M3 file moves run.
+func inferTierFromPath(e *Entity, docPath string) {
+	if e.Tier != "" {
+		return
+	}
+	parts := strings.Split(docPath, "/")
+	if len(parts) == 0 {
+		return
+	}
+	switch parts[0] {
+	case "inbox":
+		e.Tier = "inbox"
+	case "notes":
+		e.Tier = "note"
+	case "knowledge":
+		e.Tier = "knowledge"
+	case "wiki":
+		// Pre-M3 wiki layout. Most subdirs are I-level:
+		//   wiki/sources/  — synth summaries of raw inbox material
+		//   wiki/concepts/ — definitions, not lessons
+		// One exception: wiki/entities/ holds anchor facts about
+		// concrete things (models, services, people) that the eval
+		// expects to surface when queried directly. Those map to K
+		// to match the post-M3 layout target (knowledge/facts/).
+		if len(parts) >= 2 && parts[1] == "entities" {
+			e.Tier = "knowledge"
+		} else {
+			e.Tier = "note"
+		}
+	case "raw", "sessions", "clips":
+		e.Tier = "inbox"
+	}
+}
+
+func slugFromPath(docPath string) string {
+	base := filepath.Base(docPath)
+	if !strings.HasSuffix(base, ".md") {
+		return ""
+	}
+	return strings.TrimSuffix(base, ".md")
+}
+
+// classifyByPath fills Type / Wing / Hall from the path layout when the
+// doc lives under brain/wiki/. Layout: wiki/<wing>/<hall>/<slug>.md
+// or wiki/<bucket>/<slug>.md for the legacy concept/entity/source dirs.
+//
+// Files directly under wiki/ (no subdirectory — e.g. wiki/index.md) used
+// to incorrectly land Type="hall" Wing="index.md" because the path's
+// second segment was the file itself. Now they fall through to Type
+// "knowledge" and leave wing/hall to frontmatter.
+func classifyByPath(e *Entity, docPath string) {
+	parts := strings.Split(docPath, "/")
+	if len(parts) < 2 || parts[0] != "wiki" {
+		e.Type = "knowledge"
+		return
+	}
+	if len(parts) < 3 {
+		// wiki/<slug>.md — no subdirectory. Treat as plain knowledge
+		// and let frontmatter set wing/hall if they're present.
+		e.Type = "knowledge"
+		return
+	}
+	switch parts[1] {
+	case "concepts":
+		e.Type = "concept"
+	case "entities":
+		e.Type = "entity"
+	case "sources":
+		e.Type = "source"
+	default:
+		// wiki/<wing>/<hall>/<slug>.md
+		e.Type = "hall"
+		e.Wing = parts[1]
+		if len(parts) >= 4 {
+			e.Hall = parts[2]
+		}
+	}
+}
+
+// readFrontmatter pulls title/wing/hall/tier/topic from a leading YAML
+// frontmatter block, if any. It only fills fields that are still empty.
+func readFrontmatter(e *Entity, content []byte) {
+	scanner := bufio.NewScanner(bytes.NewReader(content))
+	inFM := false
+	for scanner.Scan() {
+		line := scanner.Text()
+		if strings.TrimSpace(line) == "---" {
+			if !inFM {
+				inFM = true
+				continue
+			}
+			return
+		}
+		if !inFM {
+			return
+		}
+		key, val, ok := strings.Cut(line, ":")
+		if !ok {
+			continue
+		}
+		v := strings.Trim(strings.TrimSpace(val), `"'`)
+		switch strings.TrimSpace(key) {
+		case "title":
+			if e.Title == "" {
+				e.Title = v
+			}
+		case "wing":
+			if e.Wing == "" {
+				e.Wing = v
+			}
+		case "hall":
+			if e.Hall == "" {
+				e.Hall = v
+			}
+		case "tier":
+			if e.Tier == "" {
+				e.Tier = v
+			}
+		case "topic":
+			if e.Topic == "" {
+				e.Topic = v
+			}
+		}
+	}
+}
+
+func extractEdges(docPath, srcSlug string, content []byte) []Edge {
+	var edges []Edge
+	seen := make(map[string]struct{}) // dedupe (dst, line)
+	scanner := bufio.NewScanner(bytes.NewReader(content))
+	line := 0
+	for scanner.Scan() {
+		line++
+		matches := linkRE.FindAllStringSubmatch(scanner.Text(), -1)
+		for _, m := range matches {
+			dst := strings.TrimSpace(m[1])
+			if dst == "" || dst == srcSlug {
+				continue
+			}
+			key := dst + "|" + itoa(line)
+			if _, dup := seen[key]; dup {
+				continue
+			}
+			seen[key] = struct{}{}
+			edges = append(edges, Edge{
+				SrcDoc:   docPath,
+				SrcSlug:  srcSlug,
+				DstSlug:  dst,
+				EdgeType: "wikilink",
+				SrcLine:  line,
+			})
+		}
+	}
+	return edges
+}
+
+// itoa avoids the fmt dependency on a hot path. Zero is special-cased;
+// other values are formatted digit by digit into a fixed-size buffer.
+func itoa(n int) string {
+	if n == 0 {
+		return "0"
+	}
+	var buf [20]byte
+	i := len(buf)
+	neg := n < 0
+	if neg {
+		n = -n
+	}
+	for n > 0 {
+		i--
+		buf[i] = byte('0' + n%10)
+		n /= 10
+	}
+	if neg {
+		i--
+		buf[i] = '-'
+	}
+	return string(buf[i:])
+}
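
Reviewer note on extractEdges: the sketch below restates its rules (skip self-links, dedupe on target plus line, 1-indexed lines) as a standalone program. It reuses the linkRE pattern from extract.go, but the `link` type and `wikilinks` function are hypothetical stand-ins for illustration. It splits on "\n" instead of using bufio.Scanner and keys the dedupe map on a struct rather than a string.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// linkRE is copied from extract.go: [[slug]] or [[slug|Display]].
var linkRE = regexp.MustCompile(`\[\[([^\]|]+)(?:\|([^\]]+))?\]\]`)

// link is a hypothetical, trimmed-down stand-in for graph.Edge.
type link struct {
	Dst  string
	Line int // 1-indexed
}

// wikilinks returns the outgoing links of a document body, skipping
// self-links and repeats of the same target on the same line.
func wikilinks(selfSlug, body string) []link {
	var out []link
	seen := map[link]struct{}{}
	for i, text := range strings.Split(body, "\n") {
		for _, m := range linkRE.FindAllStringSubmatch(text, -1) {
			l := link{Dst: strings.TrimSpace(m[1]), Line: i + 1}
			if l.Dst == "" || l.Dst == selfSlug {
				continue
			}
			if _, dup := seen[l]; dup {
				continue
			}
			seen[l] = struct{}{}
			out = append(out, l)
		}
	}
	return out
}

func main() {
	body := "See [[foo]] and [[foo|Foo]].\n[[self]] then [[bar|Bar]] and [[foo]].\n"
	for _, l := range wikilinks("self", body) {
		fmt.Printf("%s@%d\n", l.Dst, l.Line)
	}
}
```

The same target on a later line is kept as a separate link, which matches TestExtract_KeepsMultipleEdgesOnDifferentLines.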
--- a/ingestion/internal/graph/extract_test.go
+++ b/ingestion/internal/graph/extract_test.go
@@ -0,0 +1,179 @@
+package graph
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+func TestExtract_HallDoc(t *testing.T) {
+	content := []byte(`---
+wing: jepa-fx
+hall: decisions
+title: Val Vol Decision
+---
+# Val Vol
+
+See also [[other-decision]] and [[parent-concept|Parent Concept]].
+
+Linking to [[unrelated]].
+`)
+
+	ent, edges, ok := Extract("wiki/jepa-fx/decisions/val-vol.md", content)
+	require.True(t, ok)
+	assert.Equal(t, "val-vol", ent.Slug)
+	assert.Equal(t, "hall", ent.Type)
+	assert.Equal(t, "jepa-fx", ent.Wing)
+	assert.Equal(t, "decisions", ent.Hall)
+	assert.Equal(t, "Val Vol Decision", ent.Title)
+
+	require.Len(t, edges, 3)
+	assert.Equal(t, "other-decision", edges[0].DstSlug)
+	assert.Equal(t, "parent-concept", edges[1].DstSlug)
+	assert.Equal(t, "unrelated", edges[2].DstSlug)
+	for _, e := range edges {
+		assert.Equal(t, "wikilink", e.EdgeType)
+		assert.Equal(t, "val-vol", e.SrcSlug)
+		assert.Equal(t, "wiki/jepa-fx/decisions/val-vol.md", e.SrcDoc)
+		assert.Greater(t, e.SrcLine, 0)
+	}
+}
+
+func TestExtract_LegacyConceptDoc(t *testing.T) {
+	content := []byte(`---
+title: Hash Encoding
+---
+# Hash Encoding
+
+Linked to [[financial-sentiment-analysis|FSA]].
+`)
+	ent, edges, ok := Extract("wiki/concepts/hash-encoding.md", content)
+	require.True(t, ok)
+	assert.Equal(t, "hash-encoding", ent.Slug)
+	assert.Equal(t, "concept", ent.Type)
+	assert.Empty(t, ent.Wing)
+	assert.Empty(t, ent.Hall)
+	assert.Equal(t, "Hash Encoding", ent.Title)
+
+	require.Len(t, edges, 1)
+	assert.Equal(t, "financial-sentiment-analysis", edges[0].DstSlug)
+}
+
+func TestExtract_KnowledgeDoc(t *testing.T) {
+	content := []byte("# No frontmatter, no links here.\n")
+	ent, edges, ok := Extract("knowledge/some-note.md", content)
+	require.True(t, ok)
+	assert.Equal(t, "some-note", ent.Slug)
+	assert.Equal(t, "knowledge", ent.Type)
+	assert.Empty(t, edges)
+}
+
+func TestExtract_DedupesRepeatedLinkOnSameLine(t *testing.T) {
+	content := []byte("See [[foo]] and [[foo]] again on the same line.\n")
+	_, edges, ok := Extract("knowledge/dup.md", content)
+	require.True(t, ok)
+	require.Len(t, edges, 1)
+	assert.Equal(t, "foo", edges[0].DstSlug)
+}
+
+func TestExtract_KeepsMultipleEdgesOnDifferentLines(t *testing.T) {
+	content := []byte("First mention [[foo]].\n\nSecond mention [[foo]].\n")
+	_, edges, ok := Extract("knowledge/multi.md", content)
+	require.True(t, ok)
+	require.Len(t, edges, 2)
+	assert.NotEqual(t, edges[0].SrcLine, edges[1].SrcLine)
+}
+
+func TestExtract_IgnoresSelfLinks(t *testing.T) {
+	content := []byte("Self-reference [[self]] should be ignored.\n")
+	_, edges, ok := Extract("knowledge/self.md", content)
+	require.True(t, ok)
+	assert.Empty(t, edges)
+}
+
+func TestExtract_RejectsNonMarkdown(t *testing.T) {
+	_, _, ok := Extract("wiki/concepts/not-markdown.txt", []byte("anything"))
+	assert.False(t, ok)
+}
+
+func TestExtract_LineNumbersAre1Indexed(t *testing.T) {
+	content := []byte("line 1\nline 2 [[bar]]\n")
+	_, edges, ok := Extract("knowledge/lines.md", content)
+	require.True(t, ok)
+	require.Len(t, edges, 1)
+	assert.Equal(t, 2, edges[0].SrcLine)
+}
+
+// Files directly under wiki/ (no subdirectory) used to land
+// Type="hall" Wing="<filename>.md" because the path's second segment
+// was the file itself. The fix routes them to Type="knowledge" with
+// empty Wing/Hall and lets frontmatter set them if present.
+func TestExtract_WikiRootFileIsKnowledgeNotHall(t *testing.T) {
+	content := []byte("# Index\n\n- [[foo]]\n")
+	ent, _, ok := Extract("wiki/index.md", content)
+	require.True(t, ok)
+	assert.Equal(t, "index", ent.Slug)
+	assert.Equal(t, "knowledge", ent.Type)
+	assert.Empty(t, ent.Wing)
+	assert.Empty(t, ent.Hall)
+}
+
+func TestExtract_TierFromFrontmatter(t *testing.T) {
+	content := []byte(`---
+tier: knowledge
+topic: postgres-roles
+title: Least-privilege migration trap
+---
+# body
+`)
+	ent, _, ok := Extract("knowledge/some-lesson.md", content)
+	require.True(t, ok)
+	assert.Equal(t, "knowledge", ent.Tier)
+	assert.Equal(t, "postgres-roles", ent.Topic)
+}
+
+func TestExtract_TierInferredFromPath(t *testing.T) {
+	cases := []struct {
+		path string
+		want string
+	}{
+		{"knowledge/foo.md", "knowledge"},
+		{"wiki/sources/x.md", "note"},
+		{"wiki/concepts/x.md", "note"},
+		{"wiki/x.md", "note"},
+		{"inbox/clips/x.md", "inbox"},
+		{"notes/x.md", "note"},
+		{"raw/x.md", "inbox"},
+		{"sessions/x.md", "inbox"},
+	}
+	for _, tc := range cases {
+		ent, _, ok := Extract(tc.path, []byte("# x\n"))
+		require.True(t, ok, tc.path)
+		assert.Equal(t, tc.want, ent.Tier, tc.path)
+	}
+}
+
+func TestExtract_FrontmatterTierBeatsPathInference(t *testing.T) {
+	// A clip explicitly promoted via frontmatter wins over the path's
+	// inbox inference. Covers the case where frontmatter has been
+	// updated but the file has not yet been moved to its new location.
+	content := []byte("---\ntier: knowledge\n---\n# x\n")
+	ent, _, ok := Extract("inbox/clips/x.md", content)
+	require.True(t, ok)
+	assert.Equal(t, "knowledge", ent.Tier)
+}
+
+func TestExtract_WikiRootFileWithFrontmatterWingHall(t *testing.T) {
+	content := []byte(`---
+wing: homelab
+hall: facts
+---
+# Some root note
+`)
+	ent, _, ok := Extract("wiki/some-note.md", content)
+	require.True(t, ok)
+	assert.Equal(t, "knowledge", ent.Type)
+	assert.Equal(t, "homelab", ent.Wing)
+	assert.Equal(t, "facts", ent.Hall)
+}
--- a/ingestion/internal/graphstore/pg.go
+++ b/ingestion/internal/graphstore/pg.go
@@ -0,0 +1,365 @@
+// Package graphstore stores the brain knowledge graph (entities +
+// directed edges) in PostgreSQL on the shared postgres18 instance,
+// alongside the pgvector embeddings in [vectorstore].
+//
+// Schema (created idempotently by Init):
+//
+//	brain_entities(slug PK, type, wing, hall, doc_path, title, tier, topic, updated_at)
+//	brain_edges(id PK, src_slug, dst_slug, edge_type, src_doc, src_line,
+//	            weight, updated_at)
+//
+// Edges fan-out from a source document; calling [PGStore.ReplaceEdgesForDoc]
+// replaces every edge previously emitted from that document so re-ingest is
+// idempotent without bookkeeping.
+//
+// All slug strings are stored verbatim — callers are expected to canonicalise
+// before persisting. Dst slugs may reference entities that don't yet exist
+// (dangling edges); resolution is deferred to query time so ingestion order
+// doesn't matter.
+package graphstore
+
+import (
+	"context"
+	"errors"
+	"fmt"
+
+	"github.com/jackc/pgx/v5"
+	"github.com/jackc/pgx/v5/pgxpool"
+
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graph"
+)
+
+// PGStore is the postgres-backed brain knowledge-graph store. Construct
+// with New + call Init once to create tables and indexes. Use Close to
+// release the pool.
+type PGStore struct {
+	pool *pgxpool.Pool
+}
+
+// New opens a pgxpool against dsn and pings to verify connectivity. The
+// caller owns the resulting PGStore and must invoke Close.
+func New(ctx context.Context, dsn string) (*PGStore, error) {
+	pool, err := pgxpool.New(ctx, dsn)
+	if err != nil {
+		return nil, fmt.Errorf("pgxpool: %w", err)
+	}
+	if err := pool.Ping(ctx); err != nil {
+		pool.Close()
+		return nil, fmt.Errorf("ping: %w", err)
+	}
+	return &PGStore{pool: pool}, nil
+}
+
+// Close releases the underlying connection pool.
+func (s *PGStore) Close() {
+	if s.pool != nil {
+		s.pool.Close()
+	}
+}
+
+// Init creates brain_entities + brain_edges tables and their indexes if
+// they don't yet exist. Safe to call on every startup. No-op when the
+// schema already matches.
+func (s *PGStore) Init(ctx context.Context) error {
+	const ddl = `
+CREATE TABLE IF NOT EXISTS brain_entities (
+    slug       TEXT PRIMARY KEY,
+    type       TEXT NOT NULL DEFAULT 'knowledge',
+    wing       TEXT NOT NULL DEFAULT '',
+    hall       TEXT NOT NULL DEFAULT '',
+    doc_path   TEXT NOT NULL,
+    title      TEXT NOT NULL DEFAULT '',
+    tier       TEXT NOT NULL DEFAULT '',
+    topic      TEXT NOT NULL DEFAULT '',
+    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
+);
+-- Idempotent migration for clusters created before the DIKW tier
+-- redesign (infra#72). ADD COLUMN IF NOT EXISTS is safe across
+-- repeated startups.
+ALTER TABLE brain_entities
+    ADD COLUMN IF NOT EXISTS tier  TEXT NOT NULL DEFAULT '',
+    ADD COLUMN IF NOT EXISTS topic TEXT NOT NULL DEFAULT '';
+CREATE INDEX IF NOT EXISTS brain_entities_wing_idx
+    ON brain_entities (wing) WHERE wing <> '';
+CREATE INDEX IF NOT EXISTS brain_entities_type_idx
+    ON brain_entities (type);
+CREATE INDEX IF NOT EXISTS brain_entities_tier_idx
+    ON brain_entities (tier) WHERE tier <> '';
+CREATE INDEX IF NOT EXISTS brain_entities_topic_idx
+    ON brain_entities (topic) WHERE topic <> '';
+
+CREATE TABLE IF NOT EXISTS brain_edges (
+    id         BIGSERIAL PRIMARY KEY,
+    src_slug   TEXT NOT NULL,
+    dst_slug   TEXT NOT NULL,
+    edge_type  TEXT NOT NULL DEFAULT 'wikilink',
+    src_doc    TEXT NOT NULL,
+    src_line   INTEGER NOT NULL DEFAULT 0,
+    weight     REAL NOT NULL DEFAULT 1.0,
+    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
+);
+CREATE INDEX IF NOT EXISTS brain_edges_src_idx
+    ON brain_edges (src_slug, edge_type);
+CREATE INDEX IF NOT EXISTS brain_edges_dst_idx
+    ON brain_edges (dst_slug, edge_type);
+CREATE INDEX IF NOT EXISTS brain_edges_src_doc_idx
+    ON brain_edges (src_doc);
+`
+	_, err := s.pool.Exec(ctx, ddl)
+	return err
+}
+
+// UpsertEntity inserts or updates one entity by slug.
+func (s *PGStore) UpsertEntity(ctx context.Context, e graph.Entity) error {
+	if e.Slug == "" {
+		return errors.New("entity slug is required")
+	}
+	if e.Type == "" {
+		e.Type = "knowledge"
+	}
+	_, err := s.pool.Exec(ctx, `
+        INSERT INTO brain_entities (slug, type, wing, hall, doc_path, title, tier, topic, updated_at)
+        VALUES ($1, $2, $3, $4, $5, $6, $7, $8, now())
+        ON CONFLICT (slug) DO UPDATE
+        SET type       = EXCLUDED.type,
+            wing       = EXCLUDED.wing,
+            hall       = EXCLUDED.hall,
+            doc_path   = EXCLUDED.doc_path,
+            title      = EXCLUDED.title,
+            tier       = EXCLUDED.tier,
+            topic      = EXCLUDED.topic,
+            updated_at = now()
+    `, e.Slug, e.Type, e.Wing, e.Hall, e.DocPath, e.Title, e.Tier, e.Topic)
+	if err != nil {
+		return fmt.Errorf("upsert entity %q: %w", e.Slug, err)
+	}
+	return nil
+}
+
+// ReplaceEdgesForDoc deletes every edge previously emitted from docPath
+// and inserts the new set in one transaction. Caller should pass the
+// complete edge set for the doc — partial updates are not supported.
+func (s *PGStore) ReplaceEdgesForDoc(ctx context.Context, docPath string, edges []graph.Edge) error {
+	if docPath == "" {
+		return errors.New("doc path is required")
+	}
+	tx, err := s.pool.BeginTx(ctx, pgx.TxOptions{})
+	if err != nil {
+		return fmt.Errorf("begin: %w", err)
+	}
+	defer func() { _ = tx.Rollback(ctx) }()
+
+	if _, err := tx.Exec(ctx, `DELETE FROM brain_edges WHERE src_doc = $1`, docPath); err != nil {
+		return fmt.Errorf("delete prior edges for %q: %w", docPath, err)
+	}
+	for _, e := range edges {
+		if e.SrcSlug == "" || e.DstSlug == "" {
+			continue
+		}
+		if _, err := tx.Exec(ctx, `
+            INSERT INTO brain_edges (src_slug, dst_slug, edge_type, src_doc, src_line, weight)
+            VALUES ($1, $2, $3, $4, $5, 1.0)
+        `, e.SrcSlug, e.DstSlug, e.EdgeType, e.SrcDoc, e.SrcLine); err != nil {
+			return fmt.Errorf("insert edge %s->%s: %w", e.SrcSlug, e.DstSlug, err)
+		}
+	}
+	if err := tx.Commit(ctx); err != nil {
+		return fmt.Errorf("commit: %w", err)
+	}
+	return nil
+}
+
+// DeleteByDoc removes the entity at docPath and every edge it sourced.
+// Use when a wiki page is deleted on disk.
+func (s *PGStore) DeleteByDoc(ctx context.Context, docPath string) error {
+	if docPath == "" {
+		return errors.New("doc path is required")
+	}
+	tx, err := s.pool.BeginTx(ctx, pgx.TxOptions{})
+	if err != nil {
+		return fmt.Errorf("begin: %w", err)
+	}
+	defer func() { _ = tx.Rollback(ctx) }()
+
+	if _, err := tx.Exec(ctx, `DELETE FROM brain_edges WHERE src_doc = $1`, docPath); err != nil {
+		return fmt.Errorf("delete edges: %w", err)
+	}
+	if _, err := tx.Exec(ctx, `DELETE FROM brain_entities WHERE doc_path = $1`, docPath); err != nil {
+		return fmt.Errorf("delete entity: %w", err)
+	}
+	return tx.Commit(ctx)
+}
+
+// Neighbor is one row in a Neighbors / Subgraph response.
+type Neighbor struct {
+	Slug     string
+	Type     string
+	Wing     string
+	Hall     string
+	DocPath  string
+	Title    string
+	EdgeType string
+	Distance int // hop count from origin; 1 for direct neighbors
+}
+
+// Neighbors returns the direct (1-hop) outgoing neighbours of slug.
+// edgeType filters by relationship kind; "" returns all kinds.
+// limit defaults to 25 when <= 0.
+func (s *PGStore) Neighbors(ctx context.Context, slug, edgeType string, limit int) ([]Neighbor, error) {
+	if slug == "" {
+		return nil, errors.New("slug is required")
+	}
+	if limit <= 0 {
+		limit = 25
+	}
+	q := `
+SELECT e.dst_slug, COALESCE(t.type,''), COALESCE(t.wing,''), COALESCE(t.hall,''),
+       COALESCE(t.doc_path,''), COALESCE(t.title,''), e.edge_type, 1
+FROM brain_edges e
+LEFT JOIN brain_entities t ON t.slug = e.dst_slug
+WHERE e.src_slug = $1
+  AND ($2 = '' OR e.edge_type = $2)
+ORDER BY e.updated_at DESC
+LIMIT $3
+`
+	rows, err := s.pool.Query(ctx, q, slug, edgeType, limit)
+	if err != nil {
+		return nil, fmt.Errorf("query neighbors: %w", err)
+	}
+	defer rows.Close()
+	return scanNeighbors(rows)
+}
+
+// Subgraph returns every slug reachable from origin within depth
+// outgoing hops, one row per (slug, edge type), annotated with the
+// shortest hop distance. The origin itself is omitted. depth defaults
+// to 2 when <= 0; values above 6 are clamped to 6 to bound traversal cost.
+func (s *PGStore) Subgraph(ctx context.Context, origin string, depth int) ([]Neighbor, error) {
+	if origin == "" {
+		return nil, errors.New("origin slug is required")
+	}
+	if depth <= 0 {
+		depth = 2
+	}
+	if depth > 6 {
+		depth = 6
+	}
+	q := `
+WITH RECURSIVE walk(slug, edge_type, distance) AS (
+    SELECT e.dst_slug, e.edge_type, 1
+    FROM brain_edges e
+    WHERE e.src_slug = $1
+  UNION
+    SELECT e.dst_slug, e.edge_type, w.distance + 1
+    FROM walk w
+    JOIN brain_edges e ON e.src_slug = w.slug
+    WHERE w.distance < $2
+)
+SELECT w.slug, COALESCE(t.type,''), COALESCE(t.wing,''), COALESCE(t.hall,''),
+       COALESCE(t.doc_path,''), COALESCE(t.title,''), w.edge_type, MIN(w.distance)
+FROM walk w
+LEFT JOIN brain_entities t ON t.slug = w.slug
+WHERE w.slug <> $1
+GROUP BY w.slug, t.type, t.wing, t.hall, t.doc_path, t.title, w.edge_type
+ORDER BY MIN(w.distance), w.slug
+`
+	rows, err := s.pool.Query(ctx, q, origin, depth)
+	if err != nil {
+		return nil, fmt.Errorf("query subgraph: %w", err)
+	}
+	defer rows.Close()
+	return scanNeighbors(rows)
+}
+
+// PathStep is one hop in a Path response.
+type PathStep struct {
+	FromSlug string
+	ToSlug   string
+	EdgeType string
+}
+
+// Path returns the shortest directed path from src to dst within
+// maxDepth hops, as an ordered list of edges. Empty slice means no
+// path exists. maxDepth defaults to 4 when <= 0; values above 8 are
+// clamped to 8.
+func (s *PGStore) Path(ctx context.Context, src, dst string, maxDepth int) ([]PathStep, error) {
+	if src == "" || dst == "" {
+		return nil, errors.New("src and dst are required")
+	}
+	if maxDepth <= 0 {
+		maxDepth = 4
+	}
+	if maxDepth > 8 {
+		maxDepth = 8
+	}
+	q := `
+WITH RECURSIVE walk(cur, path_slugs, path_edges, distance) AS (
+    SELECT e.dst_slug,
+           ARRAY[e.src_slug, e.dst_slug]::TEXT[],
+           ARRAY[e.edge_type]::TEXT[],
+           1
+    FROM brain_edges e
+    WHERE e.src_slug = $1
+  UNION ALL
+    SELECT e.dst_slug,
+           w.path_slugs || e.dst_slug,
+           w.path_edges || e.edge_type,
+           w.distance + 1
+    FROM walk w
+    JOIN brain_edges e ON e.src_slug = w.cur
+    WHERE w.distance < $3
+      AND NOT (e.dst_slug = ANY(w.path_slugs))
+)
+SELECT path_slugs, path_edges
+FROM walk
+WHERE cur = $2
+ORDER BY distance ASC
+LIMIT 1
+`
+	row := s.pool.QueryRow(ctx, q, src, dst, maxDepth)
+	var (
+		slugs []string
+		kinds []string
+	)
+	if err := row.Scan(&slugs, &kinds); err != nil {
+		if errors.Is(err, pgx.ErrNoRows) {
+			return nil, nil
+		}
+		return nil, fmt.Errorf("scan path: %w", err)
+	}
+	if len(slugs) < 2 || len(kinds) == 0 {
+		return nil, nil
+	}
+	steps := make([]PathStep, 0, len(kinds))
+	for i := 0; i < len(kinds) && i+1 < len(slugs); i++ {
+		steps = append(steps, PathStep{
+			FromSlug: slugs[i],
+			ToSlug:   slugs[i+1],
+			EdgeType: kinds[i],
+		})
+	}
+	return steps, nil
+}
+
+// CountEdges is a debug helper — returns the total edges currently stored.
+// Used by tests and by the volume-gate diagnostic.
+func (s *PGStore) CountEdges(ctx context.Context) (int64, error) {
+	var n int64
+	err := s.pool.QueryRow(ctx, `SELECT count(*) FROM brain_edges`).Scan(&n)
+	return n, err
+}
+
+func scanNeighbors(rows pgx.Rows) ([]Neighbor, error) {
+	var out []Neighbor
+	for rows.Next() {
+		var n Neighbor
+		if err := rows.Scan(
+			&n.Slug, &n.Type, &n.Wing, &n.Hall,
+			&n.DocPath, &n.Title, &n.EdgeType, &n.Distance,
+		); err != nil {
+			return nil, fmt.Errorf("scan: %w", err)
+		}
+		out = append(out, n)
+	}
+	return out, rows.Err()
+}
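
Reviewer note on PGStore.Path: the recursive CTE enumerates cycle-free walks and keeps the shortest one that ends at dst. A rough in-memory analogue, assuming the graph fits in an adjacency map, is a depth-bounded breadth-first search. The `shortestPath` and `rebuild` names below are hypothetical and not part of graphstore. Edge types are omitted, and when several paths tie on length the SQL and this sketch may pick different ones.

```go
package main

import "fmt"

// shortestPath returns the slugs on a shortest directed path from src to
// dst using at most maxDepth hops, or nil when no such path exists.
// The default (4) and clamp (8) mirror the values used by PGStore.Path.
func shortestPath(adj map[string][]string, src, dst string, maxDepth int) []string {
	if maxDepth <= 0 {
		maxDepth = 4
	}
	if maxDepth > 8 {
		maxDepth = 8
	}
	prev := map[string]string{} // node -> predecessor on first visit
	visited := map[string]bool{src: true}
	frontier := []string{src}
	for depth := 1; depth <= maxDepth && len(frontier) > 0; depth++ {
		var next []string
		for _, cur := range frontier {
			for _, n := range adj[cur] {
				if visited[n] {
					continue
				}
				visited[n] = true
				prev[n] = cur
				if n == dst {
					return rebuild(prev, src, dst)
				}
				next = append(next, n)
			}
		}
		frontier = next
	}
	return nil
}

// rebuild walks the predecessor map back from dst to src.
func rebuild(prev map[string]string, src, dst string) []string {
	path := []string{dst}
	for cur := dst; cur != src; {
		cur = prev[cur]
		path = append([]string{cur}, path...)
	}
	return path
}

func main() {
	adj := map[string][]string{
		"a": {"b", "c"},
		"b": {"d"},
		"c": {"d"},
		"d": {"e"},
	}
	fmt.Println(shortestPath(adj, "a", "e", 4)) // [a b d e]
	fmt.Println(shortestPath(adj, "a", "e", 2)) // []
}
```

Breadth-first search visits each node once, whereas the CTE uses UNION ALL and carries a full path array per row, so the SQL can do noticeably more work on densely linked graphs. The maxDepth clamp is what bounds that cost.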
--- a/ingestion/internal/graphsync/graphsync.go
+++ b/ingestion/internal/graphsync/graphsync.go
@@ -0,0 +1,112 @@
+// Package graphsync glues the disk-resident brain markdown documents to
+// the relational graph in [graphstore]. It is a tiny seam so that the
+// MCP handlers can call one function after every successful write or
+// ingest without having to know either the parser or the postgres
+// schema.
+//
+// Every operation is best-effort from the caller's perspective: if the
+// graph store is unconfigured or the doc parses to nothing usable, the
+// helpers return nil. Real database errors are surfaced so the caller
+// can log them.
+package graphsync
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graph"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
+)
+
+// Store is the subset of graphstore.PGStore that graphsync requires.
+// Tests can substitute a fake by satisfying this interface.
+type Store interface {
+	UpsertEntity(ctx context.Context, e graph.Entity) error
+	ReplaceEdgesForDoc(ctx context.Context, docPath string, edges []graph.Edge) error
+	DeleteByDoc(ctx context.Context, docPath string) error
+}
+
+// Compile-time assertion that *graphstore.PGStore satisfies Store.
+var _ Store = (*graphstore.PGStore)(nil)
+
+// IndexDoc reads relPath under brainDir and pushes one Entity + its
+// outgoing wikilink Edges into store. relPath must be the
+// forward-slash path relative to brainDir (the same shape returned by
+// api.WriteNote).
+//
+// nil store is a valid no-op so callers can wire the helper
+// unconditionally and let configuration decide whether the graph is
+// populated.
+func IndexDoc(ctx context.Context, store Store, brainDir, relPath string) error {
+	if store == nil {
+		return nil
+	}
+	if relPath == "" {
+		return nil
+	}
+	abs := filepath.Join(brainDir, filepath.FromSlash(relPath))
+	content, err := os.ReadFile(abs)
+	if err != nil {
+		return fmt.Errorf("read %q: %w", relPath, err)
+	}
+	ent, edges, ok := graph.Extract(relPath, content)
+	if !ok {
+		return nil
+	}
+	if err := store.UpsertEntity(ctx, ent); err != nil {
+		return fmt.Errorf("upsert entity: %w", err)
+	}
+	if err := store.ReplaceEdgesForDoc(ctx, relPath, edges); err != nil {
+		return fmt.Errorf("replace edges: %w", err)
+	}
+	return nil
+}
+
+// BackfillFromBrainDir walks every markdown file under brainDir/wiki/
+// and brainDir/knowledge/, parses each, and upserts the resulting
+// Entity + Edges. Existing rows are overwritten; orphan rows for
+// already-deleted files are NOT cleaned up — call this only on a
+// fresh store, or follow with a separate prune pass.
+//
+// Intended for one-shot startup runs against a populated brain dir.
+// Cost scales linearly with corpus size; ~30 wiki pages plus the
+// knowledge corpus is a few hundred ms.
+func BackfillFromBrainDir(ctx context.Context, store Store, brainDir string) (indexed int, _ error) {
+	if store == nil {
+		return 0, nil
+	}
+	roots := []string{"wiki", "knowledge"}
+	for _, root := range roots {
+		base := filepath.Join(brainDir, root)
+		if _, err := os.Stat(base); os.IsNotExist(err) {
+			continue
+		}
+		err := filepath.WalkDir(base, func(path string, d os.DirEntry, walkErr error) error {
+			if walkErr != nil {
+				return walkErr
+			}
+			if d.IsDir() {
+				return nil
+			}
+			if filepath.Ext(path) != ".md" {
+				return nil
+			}
+			rel, relErr := filepath.Rel(brainDir, path)
+			if relErr != nil {
+				return fmt.Errorf("rel %q: %w", path, relErr)
+			}
+			rel = filepath.ToSlash(rel)
+			if err := IndexDoc(ctx, store, brainDir, rel); err != nil {
+				return fmt.Errorf("index %q: %w", rel, err)
+			}
+			indexed++
+			return nil
+		})
+		if err != nil {
+			return indexed, fmt.Errorf("walk %s: %w", root, err)
+		}
+	}
+	return indexed, nil
+}
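
Note on the prune pass mentioned in the BackfillFromBrainDir comment: the diff does not include one. A minimal sketch of what it could look like follows, assuming the caller can obtain the list of stored doc paths somehow (the Store interface above has no such method, so `stored` is a hypothetical input). `findOrphans` and `onDisk` are illustrative names, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// findOrphans returns the stored doc paths for which exists reports false.
// stored holds forward-slash paths relative to the brain dir.
func findOrphans(stored []string, exists func(relPath string) bool) []string {
	var orphans []string
	for _, p := range stored {
		if !exists(p) {
			orphans = append(orphans, p)
		}
	}
	return orphans
}

// onDisk builds an exists check rooted at brainDir. Only a definite
// "not found" counts as missing; other stat errors (e.g. permissions)
// are treated as "exists" so rows are not pruned by mistake.
func onDisk(brainDir string) func(string) bool {
	return func(relPath string) bool {
		_, err := os.Stat(filepath.Join(brainDir, filepath.FromSlash(relPath)))
		return !errors.Is(err, fs.ErrNotExist)
	}
}

func main() {
	// Hypothetical input: a real pass would read these paths from the store.
	stored := []string{"wiki/concepts/foo.md", "wiki/concepts/gone.md"}
	for _, p := range findOrphans(stored, onDisk("brain")) {
		fmt.Println("would call DeleteByDoc for", p)
	}
}
```

Each returned path would then be handed to `DeleteByDoc`. Taking the existence check as a function keeps the filtering logic testable without touching the filesystem.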
--- a/ingestion/internal/graphsync/graphsync_test.go
+++ b/ingestion/internal/graphsync/graphsync_test.go
@@ -0,0 +1,134 @@
+package graphsync
+
+import (
+	"context"
+	"errors"
+	"os"
+	"path/filepath"
+	"sync"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graph"
+)
+
+// fakeStore captures the calls IndexDoc / BackfillFromBrainDir made.
+type fakeStore struct {
+	mu       sync.Mutex
+	upserts  []graph.Entity
+	replaces map[string][]graph.Edge
+	deletes  []string
+	failOn   string // upsert fails when entity slug == failOn
+}
+
+func newFakeStore() *fakeStore {
+	return &fakeStore{replaces: make(map[string][]graph.Edge)}
+}
+
+func (f *fakeStore) UpsertEntity(_ context.Context, e graph.Entity) error {
+	f.mu.Lock()
+	defer f.mu.Unlock()
+	if f.failOn != "" && e.Slug == f.failOn {
+		return errors.New("synthetic failure")
+	}
+	f.upserts = append(f.upserts, e)
+	return nil
+}
+
+func (f *fakeStore) ReplaceEdgesForDoc(_ context.Context, docPath string, edges []graph.Edge) error {
+	f.mu.Lock()
+	defer f.mu.Unlock()
+	f.replaces[docPath] = append([]graph.Edge(nil), edges...)
+	return nil
+}
+
+func (f *fakeStore) DeleteByDoc(_ context.Context, docPath string) error {
+	f.mu.Lock()
+	defer f.mu.Unlock()
+	f.deletes = append(f.deletes, docPath)
+	return nil
+}
+
+func writeBrain(t *testing.T, brainDir, relPath, body string) {
+	t.Helper()
+	full := filepath.Join(brainDir, filepath.FromSlash(relPath))
+	require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
+	require.NoError(t, os.WriteFile(full, []byte(body), 0o644))
+}
+
+func TestIndexDoc_UpsertsEntityAndEdges(t *testing.T) {
+	tmp := t.TempDir()
+	writeBrain(t, tmp, "wiki/concepts/foo.md", `---
+title: Foo
+---
+# Foo
+Linking to [[bar]] and [[baz|Baz]].
+`)
+	fs := newFakeStore()
+	require.NoError(t, IndexDoc(context.Background(), fs, tmp, "wiki/concepts/foo.md"))
+
+	require.Len(t, fs.upserts, 1)
+	assert.Equal(t, "foo", fs.upserts[0].Slug)
+	assert.Equal(t, "concept", fs.upserts[0].Type)
+
+	edges := fs.replaces["wiki/concepts/foo.md"]
+	require.Len(t, edges, 2)
+	assert.Equal(t, "bar", edges[0].DstSlug)
+	assert.Equal(t, "baz", edges[1].DstSlug)
+}
+
+func TestIndexDoc_NoopOnNilStore(t *testing.T) {
+	require.NoError(t, IndexDoc(context.Background(), nil, "anywhere", "foo.md"))
+}
+
+func TestIndexDoc_NoopOnEmptyRelPath(t *testing.T) {
+	fs := newFakeStore()
+	require.NoError(t, IndexDoc(context.Background(), fs, "anywhere", ""))
+	assert.Empty(t, fs.upserts)
+}
+
+func TestIndexDoc_ErrorsOnMissingFile(t *testing.T) {
+	fs := newFakeStore()
+	err := IndexDoc(context.Background(), fs, t.TempDir(), "wiki/nope.md")
+	require.Error(t, err)
+}
+
+func TestIndexDoc_SurfacesStoreFailure(t *testing.T) {
+	tmp := t.TempDir()
+	writeBrain(t, tmp, "wiki/concepts/boom.md", "# Boom\n")
+	fs := newFakeStore()
+	fs.failOn = "boom"
+	err := IndexDoc(context.Background(), fs, tmp, "wiki/concepts/boom.md")
+	require.Error(t, err)
+}
+
+func TestBackfillFromBrainDir_WalksWikiAndKnowledge(t *testing.T) {
+	tmp := t.TempDir()
+	writeBrain(t, tmp, "wiki/concepts/foo.md", "# Foo\n[[bar]]\n")
+	writeBrain(t, tmp, "wiki/entities/bar.md", "# Bar\n")
+	writeBrain(t, tmp, "knowledge/legacy.md", "# Legacy [[foo]]\n")
+	// non-markdown file should be skipped
+	writeBrain(t, tmp, "wiki/concepts/skip.txt", "ignore me")
+
+	fs := newFakeStore()
+	n, err := BackfillFromBrainDir(context.Background(), fs, tmp)
+	require.NoError(t, err)
+	assert.Equal(t, 3, n)
+	assert.Len(t, fs.upserts, 3)
+}
+
+func TestBackfillFromBrainDir_TolerantOfMissingDirs(t *testing.T) {
+	tmp := t.TempDir()
+	fs := newFakeStore()
+	n, err := BackfillFromBrainDir(context.Background(), fs, tmp)
+	require.NoError(t, err)
+	assert.Equal(t, 0, n)
+}
+
+func TestBackfillFromBrainDir_NilStoreNoop(t *testing.T) {
+	n, err := BackfillFromBrainDir(context.Background(), nil, t.TempDir())
+	require.NoError(t, err)
+	assert.Equal(t, 0, n)
+}
--- a/ingestion/internal/mcp/auth.go
+++ b/ingestion/internal/mcp/auth.go
@@ -1,65 +0,0 @@
-package mcp
-
-import (
-	"crypto/subtle"
-	"net/http"
-	"strings"
-
-	"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
-)
-
-// BearerAuth gates an HTTP handler behind dual-mode authentication.
-//
-// Auth precedence:
-//
-//  1. Static Bearer match (constant-time compare against staticToken).
-//     Wins immediately and never emits a WWW-Authenticate header. This is
-//     the path used by internal Tailscale/LAN CLI callers that supply
-//     `Authorization: Bearer $BRAIN_MCP_TOKEN` via `.mcp.json`. Returning
-//     200 without a WWW-Authenticate prevents the MCP client from
-//     speculatively flipping into OAuth-discovery mode.
-//  2. Dex JWT validation (when validator is non-nil). Used by claude.ai
-//     custom MCP connectors that finished the OAuth handshake.
-//  3. Otherwise 401. When resourceMetadataURL is non-empty, a
-//     `WWW-Authenticate: Bearer resource_metadata="…"` header is emitted
-//     per RFC 9728 §6.2 so claude.ai's OAuth discovery flow can find the
-//     server's protected-resource metadata document.
-//
-// The order matters: a valid static Bearer must short-circuit BEFORE any
-// JWT path runs, because a non-empty WWW-Authenticate emitted on the
-// fall-through 401 confuses static-Bearer-only clients into discarding
-// their header and starting an OAuth handshake instead.
-func BearerAuth(staticToken string, validator *auth.Validator, resourceMetadataURL string, next http.Handler) http.Handler {
-	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
-		rawToken, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
-		if !ok {
-			unauthorized(w, resourceMetadataURL)
-			return
-		}
-
-		// 1. Static Bearer wins first — never emits a challenge.
-		if staticToken != "" && subtle.ConstantTimeCompare([]byte(rawToken), []byte(staticToken)) == 1 {
-			next.ServeHTTP(w, r)
-			return
-		}
-
-		// 2. Then Dex JWT, if configured.
-		if validator != nil {
-			if _, err := validator.Validate(r.Context(), rawToken); err == nil {
-				next.ServeHTTP(w, r)
-				return
-			}
-		}
-
-		// 3. Reject with an OAuth resource-metadata challenge if configured.
-		unauthorized(w, resourceMetadataURL)
-	})
-}
-
-func unauthorized(w http.ResponseWriter, resourceMetadataURL string) {
-	if resourceMetadataURL != "" {
-		w.Header().Set("WWW-Authenticate",
-			`Bearer realm="brain", resource_metadata="`+resourceMetadataURL+`"`)
-	}
-	http.Error(w, "unauthorized", http.StatusUnauthorized)
-}
--- a/ingestion/internal/mcp/auth_test.go
+++ b/ingestion/internal/mcp/auth_test.go
@@ -1,202 +0,0 @@
-package mcp_test
-
-import (
-	"context"
-	"crypto/rand"
-	"crypto/rsa"
-	"encoding/json"
-	"net/http"
-	"net/http/httptest"
-	"testing"
-	"time"
-
-	"github.com/lestrrat-go/jwx/v2/jwa"
-	"github.com/lestrrat-go/jwx/v2/jwk"
-	"github.com/lestrrat-go/jwx/v2/jwt"
-	"github.com/mathiasbq/hyperguild/ingestion/internal/auth"
-	"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
-	"github.com/stretchr/testify/assert"
-	"github.com/stretchr/testify/require"
-)
-
-const testResourceMetadataURL = "https://brain-mcp.d-ma.be/.well-known/oauth-protected-resource"
-
-func okHandler() http.Handler {
-	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
-		w.WriteHeader(http.StatusOK)
-	})
-}
-
-func TestBearerAuth_MissingHeader(t *testing.T) {
-	handler := mcp.BearerAuth("secret", nil, "", okHandler())
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusUnauthorized, rr.Code)
-}
-
-func TestBearerAuth_WrongToken(t *testing.T) {
-	handler := mcp.BearerAuth("secret", nil, "", okHandler())
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	req.Header.Set("Authorization", "Bearer wrong")
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusUnauthorized, rr.Code)
-}
-
-func TestBearerAuth_CorrectToken(t *testing.T) {
-	called := false
-	handler := mcp.BearerAuth("secret", nil, "", http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
-		called = true
-		w.WriteHeader(http.StatusOK)
-	}))
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	req.Header.Set("Authorization", "Bearer secret")
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusOK, rr.Code)
-	assert.True(t, called)
-}
-
-func TestBearerAuth_EmptyConfiguredToken(t *testing.T) {
-	handler := mcp.BearerAuth("", nil, "", okHandler())
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusUnauthorized, rr.Code)
-}
-
-// Issue #9: a valid static Bearer must never emit a WWW-Authenticate header,
-// even when a resource-metadata URL is configured. The presence of that
-// header on a 200 response would flip MCP CLI clients into OAuth-discovery
-// mode and break static-Bearer auth from `.mcp.json` on Tailscale/LAN.
-func TestBearerAuth_ValidStaticBearer_NoWWWAuthenticate(t *testing.T) {
-	handler := mcp.BearerAuth("secret", nil, testResourceMetadataURL, okHandler())
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	req.Header.Set("Authorization", "Bearer secret")
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusOK, rr.Code)
-	assert.Empty(t, rr.Header().Get("WWW-Authenticate"), "static-Bearer 200 must not advertise OAuth")
-}
-
-// Issue #9: a 401 with resource-metadata configured must emit a
-// WWW-Authenticate header so claude.ai discovers the protected-resource
-// metadata document and continues the OAuth dance.
-func TestBearerAuth_Unauthorized_EmitsResourceMetadataChallenge(t *testing.T) {
-	handler := mcp.BearerAuth("secret", nil, testResourceMetadataURL, okHandler())
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusUnauthorized, rr.Code)
-	got := rr.Header().Get("WWW-Authenticate")
-	assert.Contains(t, got, `Bearer realm="brain"`)
-	assert.Contains(t, got, `resource_metadata="`+testResourceMetadataURL+`"`)
-}
-
-// Static-Bearer-only deployment: no resource-metadata URL, no challenge
-// header on 401 — matches pre-#9 behaviour for tests without Dex wired.
-func TestBearerAuth_Unauthorized_NoChallengeWhenResourceUnset(t *testing.T) {
-	handler := mcp.BearerAuth("secret", nil, "", okHandler())
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusUnauthorized, rr.Code)
-	assert.Empty(t, rr.Header().Get("WWW-Authenticate"))
-}
-
-// JWT auth tests
-
-func buildOIDCServer(t *testing.T) (*httptest.Server, jwk.Key) {
-	t.Helper()
-	raw, err := rsa.GenerateKey(rand.Reader, 2048)
-	require.NoError(t, err)
-	priv, err := jwk.FromRaw(raw)
-	require.NoError(t, err)
-	require.NoError(t, priv.Set(jwk.KeyIDKey, "k1"))
-	require.NoError(t, priv.Set(jwk.AlgorithmKey, jwa.RS256))
-	pub, err := jwk.PublicKeyOf(priv)
-	require.NoError(t, err)
-
-	set := jwk.NewSet()
-	require.NoError(t, set.AddKey(pub))
-	jwksBytes, err := json.Marshal(set)
-	require.NoError(t, err)
-
-	muxSrv := http.NewServeMux()
-	var srv *httptest.Server
-	muxSrv.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, _ *http.Request) {
-		_ = json.NewEncoder(w).Encode(map[string]string{
-			"issuer":   srv.URL,
-			"jwks_uri": srv.URL + "/jwks",
-		})
-	})
-	muxSrv.HandleFunc("/jwks", func(w http.ResponseWriter, _ *http.Request) {
-		_, _ = w.Write(jwksBytes)
-	})
-	srv = httptest.NewServer(muxSrv)
-	t.Cleanup(srv.Close)
-	return srv, priv
-}
-
-func signJWT(t *testing.T, priv jwk.Key, issuer, audience string, exp time.Time) string {
-	t.Helper()
-	tok, err := jwt.NewBuilder().
-		Issuer(issuer).Audience([]string{audience}).
-		Subject("s").Expiration(exp).
-		Build()
-	require.NoError(t, err)
-	signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, priv))
-	require.NoError(t, err)
-	return string(signed)
-}
-
-func TestBearerAuth_ValidJWT(t *testing.T) {
-	oidcSrv, priv := buildOIDCServer(t)
-	v, err := auth.NewValidator(oidcSrv.URL, "brain")
-	require.NoError(t, err)
-
-	called := false
-	handler := mcp.BearerAuth("static-secret", v, "", http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
-		called = true
-		w.WriteHeader(http.StatusOK)
-	}))
-
-	token := signJWT(t, priv, oidcSrv.URL, "brain", time.Now().Add(time.Hour))
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	req.Header.Set("Authorization", "Bearer "+token)
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusOK, rr.Code)
-	assert.True(t, called)
-}
-
-func TestBearerAuth_InvalidJWT_FallsBackToStaticToken(t *testing.T) {
-	oidcSrv, _ := buildOIDCServer(t)
-	v, err := auth.NewValidator(oidcSrv.URL, "brain")
-	require.NoError(t, err)
-
-	handler := mcp.BearerAuth("static-secret", v, "", okHandler())
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	req.Header.Set("Authorization", "Bearer static-secret")
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusOK, rr.Code)
-}
-
-func TestBearerAuth_InvalidJWT_WrongStaticToken(t *testing.T) {
-	oidcSrv, priv := buildOIDCServer(t)
-	v, err := auth.NewValidator(oidcSrv.URL, "brain")
-	require.NoError(t, err)
-
-	handler := mcp.BearerAuth("static-secret", v, "", okHandler())
-	// Expired JWT — JWT fails, static token doesn't match either
-	token := signJWT(t, priv, oidcSrv.URL, "brain", time.Now().Add(-time.Hour))
-	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
-	req.Header.Set("Authorization", "Bearer "+token)
-
-	_ = context.Background() // satisfies import
-	rr := httptest.NewRecorder()
-	handler.ServeHTTP(rr, req)
-	assert.Equal(t, http.StatusUnauthorized, rr.Code)
-}
--- a/ingestion/internal/mcp/handlers.go
+++ b/ingestion/internal/mcp/handlers.go
@@ -12,6 +12,7 @@ import (
 	"github.com/mathiasbq/hyperguild/ingestion/internal/api"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/brain"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/extract"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graphsync"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/search"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/session"
@@ -108,6 +109,32 @@ func (s *Server) tools() []map[string]any {
 				"text": str("raw document text to classify (first 3000 chars used)"),
 			}),
 		},
+		{
+			"name":        "brain_graph",
+			"description": "Query the brain knowledge graph (entities + wikilink edges). Op selects the traversal: neighbors (1-hop outgoing from slug), subgraph (every reachable slug within depth hops), or path (shortest directed path src→dst). Returns slug + entity metadata + edge_type + hop distance.",
+			"inputSchema": schema([]string{"op"}, map[string]any{
+				"op":        enum("traversal kind", "neighbors", "subgraph", "path"),
+				"slug":      str("origin slug for op=neighbors or op=subgraph"),
+				"src":       str("source slug for op=path"),
+				"dst":       str("destination slug for op=path"),
+				"edge_type": str("optional edge type filter for op=neighbors (e.g. wikilink); empty matches all"),
+				"limit":     int_("max neighbors to return for op=neighbors, default 25"),
+				"depth":     int_("max traversal depth for op=subgraph (default 2, clamped to 6) and op=path (default 4, clamped to 8)"),
+			}),
+		},
+		{
+			"name":        "brain_context",
+			"description": "Return top-N relevant brain entries for a project context. Use at session start or before a complex task to load prior decisions, corrections, and surprises.",
+			"inputSchema": schema([]string{"project_root"}, map[string]any{
+				"project_root": str("absolute path to the project root"),
+				"recent_files": map[string]any{
+					"type":        "array",
+					"items":       map[string]any{"type": "string"},
+					"description": "optional: recent file paths in the project to bias relevance",
+				},
+				"limit": int_("max entries to return, default 10"),
+			}),
+		},
 		{
 			"name":        "session_log",
 			"description": "Append a structured entry to brain/sessions/<session_id>.jsonl.",
@@ -194,9 +221,23 @@ func (s *Server) brainWrite(ctx context.Context, args json.RawMessage) (json.Raw
 			slog.Warn("brain_write: auto-tunnel failed", "src", relPath, "err", err)
 		}
 	}
+	s.indexInGraph(ctx, "brain_write", relPath)
 	return json.Marshal(map[string]string{"path": relPath})
 }

+// indexInGraph is a best-effort wrapper around graphsync.IndexDoc that
+// logs failures but never propagates them — the underlying write/ingest
+// has already succeeded and the graph is an augmentation, not a
+// correctness invariant.
+func (s *Server) indexInGraph(ctx context.Context, op, relPath string) {
+	if s.graph == nil || relPath == "" {
+		return
+	}
+	if err := graphsync.IndexDoc(ctx, s.graph, s.brainDir, relPath); err != nil {
+		slog.Warn(op+": graph index failed", "path", relPath, "err", err)
+	}
+}
+
 type brainTunnelArgs struct {
 	Source string `json:"source"`
 	Target string `json:"target"`
@@ -213,6 +254,8 @@ func (s *Server) brainTunnel(ctx context.Context, args json.RawMessage) (json.Ra
 	if err := brain.WriteTunnel(s.brainDir, a.Source, a.Target); err != nil {
 		return nil, fmt.Errorf("tunnel: %w", err)
 	}
+	s.indexInGraph(ctx, "brain_tunnel", a.Source)
+	s.indexInGraph(ctx, "brain_tunnel", a.Target)
 	return json.Marshal(map[string]string{"status": "ok"})
 }

@@ -268,6 +311,11 @@ func (s *Server) brainIngestRaw(ctx context.Context, args json.RawMessage) (json
 	if warnings == nil {
 		warnings = []string{}
 	}
+	if !a.DryRun {
+		for _, p := range pages {
+			s.indexInGraph(ctx, "brain_ingest_raw", p)
+		}
+	}
 	return json.Marshal(map[string]any{"pages": pages, "warnings": warnings})
 }

@@ -358,6 +406,11 @@ func (s *Server) runIngest(ctx context.Context, content, source string, dryRun b
 	if pages == nil {
 		pages = []string{}
 	}
+	if !dryRun {
+		for _, p := range pages {
+			s.indexInGraph(ctx, "brain_ingest", p)
+		}
+	}
 	warnings := result.Warnings
 	if warnings == nil {
 		warnings = []string{}
--- a/ingestion/internal/mcp/server.go
+++ b/ingestion/internal/mcp/server.go
@@ -1,6 +1,7 @@
 // Package mcp implements an MCP HTTP handler for the ingestion service.
 // Exposed tools: brain_query, brain_write, brain_index, brain_tunnel,
-// brain_ingest, brain_ingest_raw, brain_answer, brain_classify, session_log.
+// brain_ingest, brain_ingest_raw, brain_answer, brain_classify,
+// brain_graph, brain_context, session_log.
 package mcp

 import (
@@ -9,6 +10,8 @@ import (
 	"fmt"
 	"net/http"

+	"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graphsync"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/reranker"
 	"github.com/mathiasbq/hyperguild/ingestion/internal/search"
@@ -42,6 +45,7 @@ type Server struct {
 	reranker  *reranker.Client      // nil = no rerank, BM25 top-10 → LLM
 	vector    search.VectorSearcher // nil = BM25-only retrieval
 	embedder  search.Embedder       // nil = BM25-only retrieval
+	graph     graphsync.Store       // nil = brain_graph and GraphRAG augmentation disabled
 }

 // NewServer constructs a Server bound to brainDir. pipelineCfg supplies the
@@ -73,6 +77,19 @@ func (s *Server) WithHybridRetrieval(v search.VectorSearcher, e search.Embedder)
 	return s
 }

+// WithGraph wires the brain entities + edges store so every successful
+// brain_write / brain_ingest / brain_tunnel re-indexes its written docs
+// into the graph, and so brain_graph + GraphRAG-augmented brain_answer
+// are available. nil disables graph features and is the legacy default.
+func (s *Server) WithGraph(g *graphstore.PGStore) *Server {
+	if g == nil {
+		s.graph = nil
+		return s
+	}
+	s.graph = g
+	return s
+}
+
 func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
 	// MCP streamable HTTP: GET establishes the SSE stream for server-to-client events.
 	if r.Method == http.MethodGet {
@@ -174,6 +191,10 @@ func (s *Server) handleCall(ctx context.Context, name string, args json.RawMessa
 		return s.brainAnswer(ctx, args)
 	case "brain_classify":
 		return s.brainClassify(ctx, args)
+	case "brain_graph":
+		return s.brainGraph(ctx, args)
+	case "brain_context":
+		return s.brainContext(ctx, args)
 	default:
 		return nil, fmt.Errorf("unknown tool: %s", name)
 	}
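
Note on the nil branch in WithGraph: it looks redundant but is not. In Go, storing a nil `*graphstore.PGStore` directly in the `graphsync.Store` interface field yields a non-nil interface value, so the `s.graph == nil` guard in indexInGraph would no longer fire. The sketch below reproduces the effect with stand-in types (`store`, `pgStore`, and `server` are placeholders, not the real ones).

```go
package main

import "fmt"

// store stands in for graphsync.Store.
type store interface{ Name() string }

// pgStore stands in for graphstore.PGStore.
type pgStore struct{}

func (p *pgStore) Name() string { return "pg" }

type server struct{ graph store }

// withGraphNaive assigns unconditionally: a nil *pgStore becomes a
// non-nil interface value holding a nil pointer.
func (s *server) withGraphNaive(g *pgStore) *server {
	s.graph = g
	return s
}

// withGraphGuarded mirrors the diff: nil pointers are mapped to a
// genuinely nil interface.
func (s *server) withGraphGuarded(g *pgStore) *server {
	if g == nil {
		s.graph = nil
		return s
	}
	s.graph = g
	return s
}

// graphEnabled is the check the handlers rely on.
func graphEnabled(s *server) bool { return s.graph != nil }

func main() {
	fmt.Println(graphEnabled((&server{}).withGraphNaive(nil)))   // true
	fmt.Println(graphEnabled((&server{}).withGraphGuarded(nil))) // false
}
```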
--- a/ingestion/internal/mcp/server_test.go
+++ b/ingestion/internal/mcp/server_test.go
@@ -57,7 +57,8 @@ func TestServerToolsList(t *testing.T) {
 	assert.ElementsMatch(t, []string{
 		"brain_query", "brain_write", "brain_index", "brain_tunnel",
 		"brain_ingest_raw", "brain_ingest",
-		"brain_answer", "brain_classify", "session_log",
+		"brain_answer", "brain_classify", "brain_graph", "brain_context",
+		"session_log",
 	}, names)
 }

--- a/ingestion/internal/mcp/tools_answer.go
+++ b/ingestion/internal/mcp/tools_answer.go
@@ -96,6 +96,29 @@ func (s *Server) brainAnswer(ctx context.Context, args json.RawMessage) (json.Ra
 		sources = append(sources, r.Path)
 	}

+	// GraphRAG augmentation: when the graph is wired, attach the 1-hop
+	// outgoing neighbourhood of the top BM25/rerank hit as an extra
+	// context block. The LLM can ignore it when irrelevant; when the
+	// neighbour adds signal we don't need a second retrieval pass.
+	// Failures are silently skipped — graph is augmentation, not
+	// correctness.
+	if reader, ok := s.graph.(graphReader); ok && len(results) > 0 {
+		topSlug := slugFromPath(results[0].Path)
+		if topSlug != "" {
+			if ns, gerr := reader.Subgraph(ctx, topSlug, 1); gerr == nil && len(ns) > 0 {
+				sb.WriteString("<related>\n")
+				for _, n := range ns {
+					label := n.Title
+					if label == "" {
+						label = n.Slug
+					}
+					fmt.Fprintf(&sb, "- %s (%s) at %s\n", label, n.EdgeType, n.DocPath)
+				}
+				sb.WriteString("</related>\n\n")
+			}
+		}
+	}
+
 	answer, err := s.answerLLM(ctx, answerSystemPrompt, sb.String()+"Question: "+a.Query)
 	if err != nil {
 		return nil, fmt.Errorf("llm: %w", err)
@@ -107,6 +130,25 @@ func (s *Server) brainAnswer(ctx context.Context, args json.RawMessage) (json.Ra
 	})
 }

+// slugFromPath converts "wiki/concepts/foo.md" → "foo".
+// Returns "" when path has no .md suffix or empty basename.
+func slugFromPath(path string) string {
+	if path == "" {
+		return ""
+	}
+	// strip directory
+	for i := len(path) - 1; i >= 0; i-- {
+		if path[i] == '/' {
+			path = path[i+1:]
+			break
+		}
+	}
+	if !strings.HasSuffix(path, ".md") {
+		return ""
+	}
+	return strings.TrimSuffix(path, ".md")
+}
+
 type brainClassifyArgs struct {
 	Text string `json:"text"`
 }
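
Note on slugFromPath: the hand-rolled loop could probably be expressed with `strings.LastIndexByte` and `strings.CutSuffix` (Go 1.20+, which the codebase already requires for `strings.CutPrefix`). This is a sketch of an equivalent, not the code in the diff. It also makes one property of the mapping visible: only the basename is kept, so two docs with the same filename in different directories share a slug.

```go
package main

import (
	"fmt"
	"strings"
)

// slugFromPath converts "wiki/concepts/foo.md" to "foo". It returns ""
// when the basename is empty or does not end in ".md".
func slugFromPath(p string) string {
	// LastIndexByte returns -1 when there is no slash, so +1 yields 0
	// and the whole string is treated as the basename.
	base := p[strings.LastIndexByte(p, '/')+1:]
	slug, ok := strings.CutSuffix(base, ".md")
	if !ok {
		return ""
	}
	return slug
}

func main() {
	for _, p := range []string{
		"wiki/concepts/foo.md",
		"knowledge/foo.md", // same slug as the line above
		"notes.txt",
		"wiki/",
		"",
	} {
		fmt.Printf("%q -> %q\n", p, slugFromPath(p))
	}
}
```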
--- a/ingestion/internal/mcp/tools_context.go
+++ b/ingestion/internal/mcp/tools_context.go
@@ -0,0 +1,202 @@
+package mcp
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"os"
+	"path/filepath"
+	"sort"
+	"strings"
+
+	"github.com/mathiasbq/hyperguild/ingestion/internal/search"
+)
+
+// brainContextArgs is the input shape of brain_context. project_root is
+// required; recent_files biases ranking when provided; limit caps the
+// returned set (default 10).
+type brainContextArgs struct {
+	ProjectRoot string   `json:"project_root"`
+	RecentFiles []string `json:"recent_files,omitempty"`
+	Limit       int      `json:"limit,omitempty"`
+}
+
+// contextEntry is one returned brain entry: the slug, its title, doc
+// path, frontmatter-stripped excerpt, source (edge_type: bm25|graph),
+// and the final score used for ranking before truncation to Limit.
+type contextEntry struct {
+	Slug     string  `json:"slug"`
+	Title    string  `json:"title"`
+	DocPath  string  `json:"doc_path"`
+	Excerpt  string  `json:"excerpt"`
+	EdgeType string  `json:"edge_type"`
+	Score    float64 `json:"score"`
+}
+
+// brainContext returns top-N brain entries relevant to a project context.
+// It runs a BM25 query against the project name, takes the top-3 hits as
+// seeds, expands each seed 2 hops in the brain graph (when configured),
+// then merges and deduplicates by slug. recent_files optionally boosts
+// entries whose doc_path matches a recent file basename.
+func (s *Server) brainContext(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
+	var a brainContextArgs
+	if err := json.Unmarshal(args, &a); err != nil {
+		return nil, fmt.Errorf("parse args: %w", err)
+	}
+	if a.ProjectRoot == "" {
+		return nil, fmt.Errorf("project_root is required")
+	}
+	limit := a.Limit
+	if limit <= 0 {
+		limit = 10
+	}
+
+	projectName := filepath.Base(strings.TrimRight(a.ProjectRoot, "/"))
+	if projectName == "" || projectName == "." || projectName == "/" {
+		return nil, fmt.Errorf("project_root has no usable basename: %q", a.ProjectRoot)
+	}
+
+	// Seed BM25 hits on the project name. Take top-3 as graph expansion seeds.
+	bm25, err := search.QueryContext(ctx, s.brainDir, search.QueryOptions{
+		Query:    projectName,
+		Limit:    3,
+		Vector:   s.vector,
+		Embedder: s.embedder,
+	})
+	if err != nil {
+		return nil, fmt.Errorf("search: %w", err)
+	}
+
+	// Dedup by slug while merging BM25 hits and graph neighbours.
+	bySlug := make(map[string]*contextEntry)
+	// BM25 score: the top hit gets the largest score, decaying linearly
+	// with rank. With three hits, ranks 0/1/2 score 3.0/2.0/1.0.
+	for i, r := range bm25 {
+		slug := slugFromPath(r.Path)
+		if slug == "" {
+			continue
+		}
+		score := float64(len(bm25) - i)
+		bySlug[slug] = &contextEntry{
+			Slug:     slug,
+			Title:    r.Title,
+			DocPath:  r.Path,
+			Excerpt:  truncateExcerpt(r.Excerpt, 200),
+			EdgeType: "bm25",
+			Score:    score,
+		}
+	}
+
+	// Graph expansion: for each BM25 hit, fetch its 2-hop subgraph and
+	// merge those neighbours in with a graph score that decays with hop
+	// distance. Failures are silently dropped — graph augmentation is
+	// best-effort.
+	if reader, ok := s.graph.(graphReader); ok {
+		for _, r := range bm25 {
+			seed := slugFromPath(r.Path)
+			if seed == "" {
+				continue
+			}
+			ns, gerr := reader.Subgraph(ctx, seed, 2)
+			if gerr != nil {
+				continue
+			}
+			for _, n := range ns {
+				if n.Slug == "" || n.Slug == seed {
+					continue
+				}
+				// Graph score: closer hops carry more signal. Distance 1
+				// scores 0.6, distance 2 scores 0.3.
+				gscore := 0.6 / float64(max1(n.Distance))
+				if existing, ok := bySlug[n.Slug]; ok {
+					// Already surfaced via BM25 — bump its score so that
+					// BM25 + graph evidence outranks BM25-only hits.
+					existing.Score += gscore
+					continue
+				}
+				bySlug[n.Slug] = &contextEntry{
+					Slug:     n.Slug,
+					Title:    n.Title,
+					DocPath:  n.DocPath,
+					Excerpt:  readExcerpt(s.brainDir, n.DocPath, 200),
+					EdgeType: "graph",
+					Score:    gscore,
+				}
+			}
+		}
+	}
+
+	// Optional recent_files boost: +1 to entries whose doc_path basename
+	// matches any recent file basename. v1 is intentionally simple.
+	if len(a.RecentFiles) > 0 {
+		recent := make(map[string]struct{}, len(a.RecentFiles))
+		for _, f := range a.RecentFiles {
+			recent[filepath.Base(f)] = struct{}{}
+		}
+		for _, e := range bySlug {
+			if _, hit := recent[filepath.Base(e.DocPath)]; hit {
+				e.Score += 1.0
+			}
+		}
+	}
+
+	// Flatten and sort by score desc, slug asc as a stable tiebreaker.
+	entries := make([]contextEntry, 0, len(bySlug))
+	for _, e := range bySlug {
+		entries = append(entries, *e)
+	}
+	sort.SliceStable(entries, func(i, j int) bool {
+		if entries[i].Score != entries[j].Score {
+			return entries[i].Score > entries[j].Score
+		}
+		return entries[i].Slug < entries[j].Slug
+	})
+	if len(entries) > limit {
+		entries = entries[:limit]
+	}
+
+	return json.Marshal(map[string]any{"entries": entries})
+}
+
+// truncateExcerpt clamps an already-stripped excerpt to maxLen bytes (a
+// multi-byte rune at the cut may be split) without re-running the
+// frontmatter parser. The ellipsis matches search.excerpt's convention.
+func truncateExcerpt(s string, maxLen int) string {
+	if len(s) <= maxLen {
+		return s
+	}
+	return s[:maxLen] + "…"
+}
+
+// readExcerpt loads a doc relative to brainDir, strips its frontmatter,
+// and returns the first maxLen bytes of the body. Returns "" on any
+// error: the excerpt is informational, not load-bearing for correctness.
+func readExcerpt(brainDir, relPath string, maxLen int) string {
+	if relPath == "" {
+		return ""
+	}
+	full := filepath.Join(brainDir, filepath.FromSlash(relPath))
+	content, err := os.ReadFile(full)
+	if err != nil {
+		return ""
+	}
+	parts := strings.SplitN(string(content), "---", 3)
+	body := string(content)
+	if len(parts) == 3 {
+		body = strings.TrimSpace(parts[2])
+	}
+	if len(body) > maxLen {
+		return body[:maxLen] + "…"
+	}
+	return body
+}
+
+// max1 returns the maximum of n and 1, used to guard against divide-by-zero
+// on graph distance and to give self-references (distance 0) a sensible
+// score instead of an infinity.
+func max1(n int) int {
+	if n < 1 {
+		return 1
+	}
+	return n
+}
--- a/ingestion/internal/mcp/tools_context_test.go
+++ b/ingestion/internal/mcp/tools_context_test.go
@@ -0,0 +1,212 @@
+package mcp
+
+import (
+	"context"
+	"encoding/json"
+	"os"
+	"path/filepath"
+	"sort"
+	"testing"
+
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graph"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+// fakeGraph implements graphsync.Store + graphReader so it can be
+// assigned to Server.graph and downcast by brainContext. Only Subgraph
+// is exercised by brain_context today; the rest are no-op satisfiers.
+type fakeGraph struct {
+	subgraph map[string][]graphstore.Neighbor
+}
+
+func (f *fakeGraph) UpsertEntity(_ context.Context, _ graph.Entity) error { return nil }
+func (f *fakeGraph) ReplaceEdgesForDoc(_ context.Context, _ string, _ []graph.Edge) error {
+	return nil
+}
+func (f *fakeGraph) DeleteByDoc(_ context.Context, _ string) error { return nil }
+
+func (f *fakeGraph) Neighbors(_ context.Context, slug, _ string, _ int) ([]graphstore.Neighbor, error) {
+	return f.subgraph[slug], nil
+}
+
+func (f *fakeGraph) Subgraph(_ context.Context, origin string, _ int) ([]graphstore.Neighbor, error) {
+	return f.subgraph[origin], nil
+}
+
+func (f *fakeGraph) Path(_ context.Context, _, _ string, _ int) ([]graphstore.PathStep, error) {
+	return nil, nil
+}
+
+func writeNote(t *testing.T, brainDir, relPath, title, body string) {
+	t.Helper()
+	full := filepath.Join(brainDir, filepath.FromSlash(relPath))
+	require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
+	content := "---\ntitle: " + title + "\n---\n\n" + body
+	require.NoError(t, os.WriteFile(full, []byte(content), 0o644))
+}
+
+// callContext runs brainContext directly and decodes the JSON response.
+func callContext(t *testing.T, s *Server, args map[string]any) map[string]any {
+	t.Helper()
+	raw, err := json.Marshal(args)
+	require.NoError(t, err)
+	out, err := s.brainContext(context.Background(), raw)
+	require.NoError(t, err)
+	var resp map[string]any
+	require.NoError(t, json.Unmarshal(out, &resp))
+	return resp
+}
+
+func sortedSlugs(entries []any) []string {
+	slugs := make([]string, 0, len(entries))
+	for _, e := range entries {
+		slugs = append(slugs, e.(map[string]any)["slug"].(string))
+	}
+	sort.Strings(slugs)
+	return slugs
+}
+
+func TestBrainContext_RejectsMissingProjectRoot(t *testing.T) {
+	s := NewServer(t.TempDir(), nil, nil, nil)
+	_, err := s.brainContext(context.Background(), json.RawMessage(`{}`))
+	assert.Error(t, err)
+}
+
+func TestBrainContext_RejectsUnusableBasename(t *testing.T) {
+	s := NewServer(t.TempDir(), nil, nil, nil)
+	_, err := s.brainContext(context.Background(), json.RawMessage(`{"project_root":"/"}`))
+	assert.Error(t, err)
+}
+
+func TestBrainContext_BM25Only_NoGraph(t *testing.T) {
+	brainDir := t.TempDir()
+	// Two notes whose body contains the hyphenated project name. BM25
+	// uses literal substring matching after whitespace tokenisation, so
+	// the bodies must carry "azure-tiger" verbatim, not "Azure tiger".
+	writeNote(t, brainDir, "wiki/finance/decisions/azure-tiger-routing.md",
+		"Azure Tiger Routing", "azure-tiger payment routing decisions.")
+	writeNote(t, brainDir, "wiki/finance/facts/iso20022.md",
+		"Azure Tiger ISO 20022 fields", "azure-tiger maps invoice fields to ISO 20022.")
+
+	s := NewServer(brainDir, nil, nil, nil)
+	// graph is nil — only BM25 hits should appear.
+
+	resp := callContext(t, s, map[string]any{
+		"project_root": "/home/mathias/dev/QKX/azure-tiger",
+	})
+	entries := resp["entries"].([]any)
+	require.NotEmpty(t, entries, "expected at least one BM25 hit on project name")
+
+	for _, e := range entries {
+		entry := e.(map[string]any)
+		assert.Equal(t, "bm25", entry["edge_type"], "no graph configured, every entry must be BM25")
+		assert.NotEmpty(t, entry["slug"])
+		assert.NotEmpty(t, entry["doc_path"])
+	}
+}
+
+func TestBrainContext_BM25PlusGraphExpansion(t *testing.T) {
+	brainDir := t.TempDir()
+	// BM25 seed — body carries the hyphenated project name verbatim.
+	writeNote(t, brainDir, "wiki/finance/decisions/azure-tiger-routing.md",
+		"Azure Tiger Routing", "azure-tiger payment routing decisions.")
+	// Graph neighbour — does NOT match BM25 on "azure-tiger" so it can
+	// only arrive via the graph subgraph traversal.
+	writeNote(t, brainDir, "wiki/finance/facts/sepa-clearing.md",
+		"SEPA Clearing", "SEPA payment clearing rules and timing windows.")
+
+	graphFake := &fakeGraph{
+		subgraph: map[string][]graphstore.Neighbor{
+			"azure-tiger-routing": {
+				{
+					Slug:     "sepa-clearing",
+					Title:    "SEPA Clearing",
+					DocPath:  "wiki/finance/facts/sepa-clearing.md",
+					EdgeType: "wikilink",
+					Distance: 1,
+				},
+			},
+		},
+	}
+	s := NewServer(brainDir, nil, nil, nil)
+	s.graph = graphFake
+
+	resp := callContext(t, s, map[string]any{
+		"project_root": "/home/mathias/dev/QKX/azure-tiger",
+	})
+	entries := resp["entries"].([]any)
+	require.GreaterOrEqual(t, len(entries), 2, "expected BM25 seed plus graph neighbour")
+
+	slugs := sortedSlugs(entries)
+	assert.Contains(t, slugs, "azure-tiger-routing", "BM25 seed must appear")
+	assert.Contains(t, slugs, "sepa-clearing", "graph neighbour must appear")
+
+	// Verify the graph-only entry carries edge_type="graph".
+	var sepaEntry map[string]any
+	for _, e := range entries {
+		m := e.(map[string]any)
+		if m["slug"] == "sepa-clearing" {
+			sepaEntry = m
+			break
+		}
+	}
+	require.NotNil(t, sepaEntry)
+	assert.Equal(t, "graph", sepaEntry["edge_type"])
+	assert.NotEmpty(t, sepaEntry["excerpt"], "excerpt should be loaded from disk for graph neighbours")
+}
+
+func TestBrainContext_LimitClamps(t *testing.T) {
+	brainDir := t.TempDir()
+	// Five notes all matching "azure-tiger".
+	for i, name := range []string{"a", "b", "c", "d", "e"} {
+		writeNote(t, brainDir,
+			"wiki/finance/decisions/azure-tiger-"+name+".md",
+			"Azure Tiger "+name,
+			"azure-tiger note "+name+" with index "+string(rune('0'+i)))
+	}
+	s := NewServer(brainDir, nil, nil, nil)
+	resp := callContext(t, s, map[string]any{
+		"project_root": "/home/mathias/dev/QKX/azure-tiger",
+		"limit":        2,
+	})
+	entries := resp["entries"].([]any)
+	assert.LessOrEqual(t, len(entries), 2)
+}
+
+func TestBrainContext_RecentFilesBoost(t *testing.T) {
+	brainDir := t.TempDir()
+	// Both notes BM25-match the project name, but azure-tiger-z has
+	// twice the term frequency so it naturally ranks above azure-tiger-a.
+	// The recent_files boost on azure-tiger-a should pull it level on
+	// score; the alphabetical slug tiebreaker (a < z) then promotes it
+	// to the top — exercising both the boost and the deterministic
+	// tiebreak.
+	writeNote(t, brainDir, "wiki/finance/decisions/azure-tiger-a.md",
+		"A", "azure-tiger note about a.")
+	writeNote(t, brainDir, "wiki/finance/decisions/azure-tiger-z.md",
+		"Z", "azure-tiger azure-tiger note about z.")
+
+	s := NewServer(brainDir, nil, nil, nil)
+
+	// Baseline ranking: azure-tiger-z must lead (higher term frequency).
+	baseline := callContext(t, s, map[string]any{
+		"project_root": "/home/mathias/dev/QKX/azure-tiger",
+	})
+	baselineEntries := baseline["entries"].([]any)
+	require.GreaterOrEqual(t, len(baselineEntries), 2)
+	baselineTop := baselineEntries[0].(map[string]any)
+	require.Equal(t, "azure-tiger-z", baselineTop["slug"],
+		"sanity: higher tf must rank first without a boost")
+
+	// With boost on azure-tiger-a — boosted entry must now lead.
+	boosted := callContext(t, s, map[string]any{
+		"project_root": "/home/mathias/dev/QKX/azure-tiger",
+		"recent_files": []string{"/some/where/azure-tiger-a.md"},
+	})
+	entries := boosted["entries"].([]any)
+	require.GreaterOrEqual(t, len(entries), 2)
+	top := entries[0].(map[string]any)
+	assert.Equal(t, "azure-tiger-a", top["slug"], "recent_files boost must promote the matching doc")
+}
--- a/ingestion/internal/mcp/tools_graph.go
+++ b/ingestion/internal/mcp/tools_graph.go
@@ -0,0 +1,116 @@
+package mcp
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+
+	"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
+)
+
+// graphReader is the read-side surface of graphstore.PGStore the
+// brain_graph handler needs. Splitting it out (vs. depending on the
+// concrete *PGStore) lets tests inject a fake without standing up
+// postgres, and keeps the write-side graphsync.Store interface free
+// of query concerns.
+type graphReader interface {
+	Neighbors(ctx context.Context, slug, edgeType string, limit int) ([]graphstore.Neighbor, error)
+	Subgraph(ctx context.Context, origin string, depth int) ([]graphstore.Neighbor, error)
+	Path(ctx context.Context, src, dst string, maxDepth int) ([]graphstore.PathStep, error)
+}
+
+// Compile-time check that *graphstore.PGStore satisfies graphReader.
+var _ graphReader = (*graphstore.PGStore)(nil)
+
+type brainGraphArgs struct {
+	Op       string `json:"op"`
+	Slug     string `json:"slug,omitempty"`
+	Src      string `json:"src,omitempty"`
+	Dst      string `json:"dst,omitempty"`
+	EdgeType string `json:"edge_type,omitempty"`
+	Limit    int    `json:"limit,omitempty"`
+	Depth    int    `json:"depth,omitempty"`
+}
+
+func (s *Server) brainGraph(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
+	reader, ok := s.graph.(graphReader)
+	if s.graph == nil || !ok {
+		return nil, fmt.Errorf("brain graph not configured: set BRAIN_GRAPH_ENABLED=true")
+	}
+	var a brainGraphArgs
+	if err := json.Unmarshal(args, &a); err != nil {
+		return nil, fmt.Errorf("parse args: %w", err)
+	}
+
+	switch a.Op {
+	case "neighbors":
+		if a.Slug == "" {
+			return nil, fmt.Errorf("slug is required for op=neighbors")
+		}
+		ns, err := reader.Neighbors(ctx, a.Slug, a.EdgeType, a.Limit)
+		if err != nil {
+			return nil, fmt.Errorf("neighbors: %w", err)
+		}
+		return json.Marshal(map[string]any{"results": neighborsView(ns)})
+
+	case "subgraph":
+		if a.Slug == "" {
+			return nil, fmt.Errorf("slug is required for op=subgraph")
+		}
+		ns, err := reader.Subgraph(ctx, a.Slug, a.Depth)
+		if err != nil {
+			return nil, fmt.Errorf("subgraph: %w", err)
+		}
+		return json.Marshal(map[string]any{"results": neighborsView(ns)})
+
+	case "path":
+		if a.Src == "" || a.Dst == "" {
+			return nil, fmt.Errorf("src and dst are required for op=path")
+		}
+		steps, err := reader.Path(ctx, a.Src, a.Dst, a.Depth)
+		if err != nil {
+			return nil, fmt.Errorf("path: %w", err)
+		}
+		return json.Marshal(map[string]any{"steps": pathView(steps)})
+
+	default:
+		return nil, fmt.Errorf("unknown op %q (want neighbors|subgraph|path)", a.Op)
+	}
+}
+
+type neighborView struct {
+	Slug     string `json:"slug"`
+	Type     string `json:"type,omitempty"`
+	Wing     string `json:"wing,omitempty"`
+	Hall     string `json:"hall,omitempty"`
+	DocPath  string `json:"doc_path,omitempty"`
+	Title    string `json:"title,omitempty"`
+	EdgeType string `json:"edge_type"`
+	Distance int    `json:"distance"`
+}
+
+func neighborsView(ns []graphstore.Neighbor) []neighborView {
+	out := make([]neighborView, 0, len(ns))
+	for _, n := range ns {
+		out = append(out, neighborView{
+			Slug: n.Slug, Type: n.Type, Wing: n.Wing, Hall: n.Hall,
+			DocPath: n.DocPath, Title: n.Title,
+			EdgeType: n.EdgeType, Distance: n.Distance,
+		})
+	}
+	return out
+}
+
+type pathStepView struct {
+	From     string `json:"from"`
+	To       string `json:"to"`
+	EdgeType string `json:"edge_type"`
+}
+
+func pathView(steps []graphstore.PathStep) []pathStepView {
+	out := make([]pathStepView, 0, len(steps))
+	for _, s := range steps {
+		out = append(out, pathStepView{From: s.FromSlug, To: s.ToSlug, EdgeType: s.EdgeType})
+	}
+	return out
+}
--- a/ingestion/internal/metrics/metrics.go
+++ b/ingestion/internal/metrics/metrics.go
@@ -0,0 +1,194 @@
+// Package metrics is a tiny Prometheus exposition layer.
+//
+// Hand-rolled rather than pulling in github.com/prometheus/client_golang
+// to keep ingestion's dependency surface minimal (stdlib + jwx + testify
+// per the repo CLAUDE.md). The one histogram it emits (its _count series is
+// the request counter) covers the canary alert in k3s/apps/monitoring/; see infra#50.
+//
+// Wire format is the Prometheus text exposition format (version 0.0.4),
+// which kube-prometheus-stack scrapes by default.
+package metrics
+
+import (
+	"fmt"
+	"net/http"
+	"sort"
+	"strings"
+	"sync"
+	"sync/atomic"
+	"time"
+)
+
+// histogram buckets in seconds. Tuned for in-cluster HTTP API
+// latencies: BM25 query is sub-10ms, hybrid retrieval + LLM-synthesis
+// can run into seconds. +Inf catch-all is implicit.
+var defaultBuckets = []float64{
+	0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10,
+}
+
+// Registry holds one latency histogram labeled by path + status; its
+// _count series serves as the request-total counter. Concurrent-safe.
+type Registry struct {
+	mu      sync.RWMutex
+	series  map[labelKey]*series
+	buckets []float64
+}
+
+type labelKey struct{ path, status string }
+
+type series struct {
+	// One atomic counter per bucket (counts of observations ≤ bucket).
+	// counts[len(buckets)] = +Inf bucket (== total observations).
+	counts []atomic.Uint64
+	sumNs  atomic.Uint64 // sum of durations in nanoseconds
+}
+
+// New returns an empty Registry; the first observation per
+// (path, status) lazily creates its series.
+func New() *Registry {
+	return &Registry{
+		series:  make(map[labelKey]*series),
+		buckets: defaultBuckets,
+	}
+}
+
+// Observe records a single request duration for the given path + status.
+func (r *Registry) Observe(path, status string, d time.Duration) {
+	key := labelKey{path: path, status: status}
+
+	r.mu.RLock()
+	s := r.series[key]
+	r.mu.RUnlock()
+
+	if s == nil {
+		r.mu.Lock()
+		s = r.series[key]
+		if s == nil {
+			s = &series{counts: make([]atomic.Uint64, len(r.buckets)+1)}
+			r.series[key] = s
+		}
+		r.mu.Unlock()
+	}
+
+	secs := d.Seconds()
+	for i, b := range r.buckets {
+		if secs <= b {
+			s.counts[i].Add(1)
+		}
+	}
+	// +Inf bucket always increments.
+	s.counts[len(r.buckets)].Add(1)
+	s.sumNs.Add(uint64(d.Nanoseconds()))
+}
+
+// Middleware wraps next, observing every request's duration + status.
+// The metric label `path` uses the request's Pattern (set by ServeMux,
+// Go 1.23+), falling back to the URL path if no Pattern is set. Pattern
+// keeps cardinality bounded (one series per route, not one per unique URL).
+func (r *Registry) Middleware(next http.Handler) http.Handler {
+	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
+		rec := &statusRecorder{ResponseWriter: w, code: http.StatusOK}
+		start := time.Now()
+		next.ServeHTTP(rec, req)
+		path := req.Pattern
+		if path == "" {
+			path = req.URL.Path
+		}
+		r.Observe(path, statusClass(rec.code), time.Since(start))
+	})
+}
+
+// Handler exposes /metrics in the Prometheus text exposition format (version 0.0.4).
+func (r *Registry) Handler() http.HandlerFunc {
+	return func(w http.ResponseWriter, req *http.Request) {
+		w.Header().Set("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
+		r.write(w)
+	}
+}
+
+func (r *Registry) write(w http.ResponseWriter) {
+	r.mu.RLock()
+	defer r.mu.RUnlock()
+
+	_, _ = fmt.Fprintln(w, "# HELP brain_query_duration_seconds Brain HTTP API request latency in seconds.")
+	_, _ = fmt.Fprintln(w, "# TYPE brain_query_duration_seconds histogram")
+
+	// Sort keys for stable output (helps diffing scrape responses).
+	keys := make([]labelKey, 0, len(r.series))
+	for k := range r.series {
+		keys = append(keys, k)
+	}
+	sort.Slice(keys, func(i, j int) bool {
+		if keys[i].path != keys[j].path {
+			return keys[i].path < keys[j].path
+		}
+		return keys[i].status < keys[j].status
+	})
+
+	for _, k := range keys {
+		s := r.series[k]
+		labels := fmt.Sprintf(`path=%q,status=%q`, k.path, k.status)
+		for i, b := range r.buckets {
+			_, _ = fmt.Fprintf(w, "brain_query_duration_seconds_bucket{%s,le=%q} %d\n",
+				labels, formatBucket(b), s.counts[i].Load())
+		}
+		// +Inf bucket
+		inf := s.counts[len(r.buckets)].Load()
+		_, _ = fmt.Fprintf(w, "brain_query_duration_seconds_bucket{%s,le=\"+Inf\"} %d\n", labels, inf)
+		_, _ = fmt.Fprintf(w, "brain_query_duration_seconds_sum{%s} %s\n",
+			labels, formatSeconds(s.sumNs.Load()))
+		_, _ = fmt.Fprintf(w, "brain_query_duration_seconds_count{%s} %d\n", labels, inf)
+	}
+}
+
+func formatBucket(b float64) string {
+	// %g drops trailing zeros; integral values then get ".0" appended (1 becomes "1.0").
+	s := fmt.Sprintf("%g", b)
+	if !strings.ContainsAny(s, ".e") {
+		s = s + ".0"
+	}
+	return s
+}
+
+func formatSeconds(ns uint64) string {
+	return fmt.Sprintf("%g", float64(ns)/1e9)
+}
+
+func statusClass(code int) string {
+	switch {
+	case code >= 200 && code < 300:
+		return "2xx"
+	case code >= 300 && code < 400:
+		return "3xx"
+	case code >= 400 && code < 500:
+		return "4xx"
+	case code >= 500 && code < 600:
+		return "5xx"
+	default:
+		return "xxx"
+	}
+}
+
+// statusRecorder captures the response code so middleware can label
+// the histogram by status class without buffering the body.
+type statusRecorder struct {
+	http.ResponseWriter
+	code        int
+	wroteHeader bool
+}
+
+func (r *statusRecorder) WriteHeader(code int) {
+	if r.wroteHeader {
+		return
+	}
+	r.code = code
+	r.wroteHeader = true
+	r.ResponseWriter.WriteHeader(code)
+}
+
+func (r *statusRecorder) Write(b []byte) (int, error) {
+	if !r.wroteHeader {
+		r.WriteHeader(http.StatusOK)
+	}
+	return r.ResponseWriter.Write(b)
+}
--- a/ingestion/internal/metrics/metrics_test.go
+++ b/ingestion/internal/metrics/metrics_test.go
@@ -0,0 +1,119 @@
+package metrics
+
+import (
+	"io"
+	"net/http"
+	"net/http/httptest"
+	"strings"
+	"testing"
+	"time"
+)
+
+func TestRegistry_ObserveAndExpose(t *testing.T) {
+	t.Parallel()
+
+	r := New()
+	// Three observations on the same series; one falls into each
+	// representative band.
+	r.Observe("/query", "2xx", 4*time.Millisecond)   // ≤ 5ms
+	r.Observe("/query", "2xx", 20*time.Millisecond)  // ≤ 25ms
+	r.Observe("/query", "2xx", 600*time.Millisecond) // ≤ 1s
+
+	req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
+	rec := httptest.NewRecorder()
+	r.Handler().ServeHTTP(rec, req)
+
+	body := rec.Body.String()
+
+	mustContain := []string{
+		`# TYPE brain_query_duration_seconds histogram`,
+		`brain_query_duration_seconds_bucket{path="/query",status="2xx",le="0.005"} 1`,
+		`brain_query_duration_seconds_bucket{path="/query",status="2xx",le="0.025"} 2`,
+		`brain_query_duration_seconds_bucket{path="/query",status="2xx",le="1.0"} 3`,
+		`brain_query_duration_seconds_bucket{path="/query",status="2xx",le="+Inf"} 3`,
+		`brain_query_duration_seconds_count{path="/query",status="2xx"} 3`,
+	}
+	for _, want := range mustContain {
+		if !strings.Contains(body, want) {
+			t.Errorf("missing line: %q\n--- body ---\n%s", want, body)
+		}
+	}
+
+	if got := rec.Header().Get("Content-Type"); !strings.HasPrefix(got, "text/plain") {
+		t.Errorf("content-type = %q, want text/plain prefix", got)
+	}
+}
+
+func TestRegistry_LabelsByStatus(t *testing.T) {
+	t.Parallel()
+
+	r := New()
+	r.Observe("/query", "2xx", time.Millisecond)
+	r.Observe("/query", "5xx", time.Millisecond)
+	r.Observe("/write", "2xx", time.Millisecond)
+
+	rec := httptest.NewRecorder()
+	r.Handler().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
+	body := rec.Body.String()
+
+	for _, want := range []string{
+		`brain_query_duration_seconds_count{path="/query",status="2xx"} 1`,
+		`brain_query_duration_seconds_count{path="/query",status="5xx"} 1`,
+		`brain_query_duration_seconds_count{path="/write",status="2xx"} 1`,
+	} {
+		if !strings.Contains(body, want) {
+			t.Errorf("missing %q in body:\n%s", want, body)
+		}
+	}
+}
+
+func TestMiddleware_RecordsTiming(t *testing.T) {
+	t.Parallel()
+
+	r := New()
+	handler := r.Middleware(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
+		time.Sleep(2 * time.Millisecond)
+		w.WriteHeader(http.StatusOK)
+		_, _ = io.WriteString(w, "ok")
+	}))
+
+	srv := httptest.NewServer(handler)
+	defer srv.Close()
+
+	resp, err := http.Get(srv.URL + "/query")
+	if err != nil {
+		t.Fatalf("get: %v", err)
+	}
+	_ = resp.Body.Close()
+
+	if resp.StatusCode != http.StatusOK {
+		t.Fatalf("status %d, want 200", resp.StatusCode)
+	}
+
+	// Exposition should now include /query.
+	rec := httptest.NewRecorder()
+	r.Handler().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
+	body := rec.Body.String()
+	if !strings.Contains(body, `path="/query"`) {
+		t.Errorf("expected /query series, got body:\n%s", body)
+	}
+	if !strings.Contains(body, `status="2xx"`) {
+		t.Errorf("expected 2xx status class, got body:\n%s", body)
+	}
+}
+
+func TestStatusRecorder_DefaultsTo200(t *testing.T) {
+	t.Parallel()
+
+	r := New()
+	handler := r.Middleware(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
+		_, _ = w.Write([]byte("hello"))
+	}))
+
+	rec := httptest.NewRecorder()
+	handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/x", nil))
+
+	if rec.Code != http.StatusOK {
+		t.Errorf("code %d, want 200", rec.Code)
+	}
+}
--- a/ingestion/internal/search/search.go
+++ b/ingestion/internal/search/search.go
@@ -12,6 +12,7 @@ import (
 	"strings"

 	"github.com/mathiasbq/hyperguild/ingestion/internal/brain"
+	"github.com/mathiasbq/hyperguild/ingestion/internal/vectorstore"
 )

 // VectorSearcher returns the top-limit nearest paths by cosine
@@ -42,6 +43,30 @@ type Result struct {
 	Score   int    `json:"score"`
 	Wing    string `json:"wing,omitempty"`
 	Hall    string `json:"hall,omitempty"`
+	// Tier is the DIKW classification used for retrieval weighting
+	// (infra#72). Read from frontmatter when present, otherwise
+	// inferred from the path prefix (top-level directory).
+	Tier string `json:"tier,omitempty"`
+}
+
+// tierWeight maps the DIKW tier to a score multiplier applied right
+// before the final truncation. Knowledge entries (focused lessons that
+// age well) get boosted; inbox entries (raw captures, sessions, clips)
+// get demoted. Empty / unknown tiers keep the original BM25 score
+// (multiplier 1.0). See infra#72 for the failure mode this addresses:
+// short focused entries lose to long aggregate dump-files under
+// raw BM25 ranking.
+func tierWeight(tier string) float64 {
+	switch tier {
+	case "knowledge":
+		return 1.5
+	case "note":
+		return 1.0
+	case "inbox":
+		return 0.3
+	default:
+		return 1.0
+	}
 }

 // QueryOptions configures a search.
@@ -119,6 +144,7 @@ func QueryContext(ctx context.Context, brainDir string, opts QueryOptions) ([]Re
 			}
 			rel = filepath.ToSlash(rel)
 			wing, hall := extractWingHall(string(content), rel)
+			tier := extractTier(string(content), rel)
 			results = append(results, Result{
 				Path:    rel,
 				Title:   extractTitle(string(content), d.Name()),
@@ -126,6 +152,7 @@ func QueryContext(ctx context.Context, brainDir string, opts QueryOptions) ([]Re
 				Score:   score,
 				Wing:    wing,
 				Hall:    hall,
+				Tier:    tier,
 			})
 			return nil
 		})
@@ -149,6 +176,15 @@ func QueryContext(ctx context.Context, brainDir string, opts QueryOptions) ([]Re
 		}
 	}

+	// Tier-weighted final re-rank (infra#72). Knowledge tier entries
+	// boost ×1.5, inbox demote ×0.3, note stays at ×1.0. Applied after
+	// hybridMerge so RRF ranking still drives candidate generation;
+	// the tier weight only re-orders the merged set.
+	sort.SliceStable(results, func(i, j int) bool {
+		return float64(results[i].Score)*tierWeight(results[i].Tier) >
+			float64(results[j].Score)*tierWeight(results[j].Tier)
+	})
+
 	if len(results) > opts.Limit {
 		results = results[:opts.Limit]
 	}
@@ -186,17 +222,21 @@ func hybridMerge(ctx context.Context, brainDir string, opts QueryOptions, bm25 [
 		byPath[r.Path] = r
 	}
 	for rank, h := range hits {
-		if opts.Wing != "" && !pathInScope(h.Path, opts.Wing, opts.Hall) {
+		// Vector store keys are chunk paths ("wiki/foo.md#0001"); collapse
+		// back to the parent so multiple chunk hits from the same file
+		// score against a single result row.
+		parent := vectorstore.ParentPath(h.Path)
+		if opts.Wing != "" && !pathInScope(parent, opts.Wing, opts.Hall) {
 			continue
 		}
-		rrf[h.Path] += 1.0 / (rrfK + float64(rank+1))
-		if _, seen := byPath[h.Path]; !seen {
-			r, err := hydrate(brainDir, h.Path)
+		rrf[parent] += 1.0 / (rrfK + float64(rank+1))
+		if _, seen := byPath[parent]; !seen {
+			r, err := hydrate(brainDir, parent)
 			if err != nil {
-				slog.Warn("search: hydrate failed for vector hit", "path", h.Path, "err", err)
+				slog.Warn("search: hydrate failed for vector hit", "path", parent, "err", err)
 				continue
 			}
-			byPath[h.Path] = r
+			byPath[parent] = r
 		}
 	}

@@ -230,12 +270,14 @@ func hydrate(brainDir, relPath string) (Result, error) {
 		return Result{}, err
 	}
 	wing, hall := extractWingHall(string(content), relPath)
+	tier := extractTier(string(content), relPath)
 	return Result{
 		Path:    relPath,
 		Title:   extractTitle(string(content), filepath.Base(relPath)),
 		Excerpt: excerpt(string(content), 300),
 		Wing:    wing,
 		Hall:    hall,
+		Tier:    tier,
 	}, nil
 }

@@ -264,6 +306,55 @@ func resolveRoots(brainDir, wing, hall string) ([]string, error) {
 	}, nil
 }

+// extractTier reads the DIKW tier from frontmatter first, falling back
+// to the path prefix mapping (infra#72). Mirrors graph.inferTierFromPath
+// so the two callers stay in lockstep — frontmatter is canonical,
+// path inference is the migration-window fallback.
+func extractTier(content, relPath string) string {
+	scanner := bufio.NewScanner(strings.NewReader(content))
+	inFrontmatter := false
+	for scanner.Scan() {
+		line := scanner.Text()
+		if strings.TrimSpace(line) == "---" {
+			if !inFrontmatter {
+				inFrontmatter = true
+				continue
+			}
+			break
+		}
+		if !inFrontmatter {
+			continue
+		}
+		key, val, ok := strings.Cut(line, ":")
+		if !ok {
+			continue
+		}
+		if strings.TrimSpace(key) == "tier" {
+			return strings.Trim(strings.TrimSpace(val), `"'`)
+		}
+	}
+	parts := strings.Split(relPath, "/")
+	if len(parts) == 0 {
+		return ""
+	}
+	switch parts[0] {
+	case "inbox", "raw", "sessions", "clips":
+		return "inbox"
+	case "notes":
+		return "note"
+	case "wiki":
+		// wiki/entities/ anchor pages map to knowledge (see
+		// graph.inferTierFromPath for the rationale).
+		if len(parts) >= 2 && parts[1] == "entities" {
+			return "knowledge"
+		}
+		return "note"
+	case "knowledge":
+		return "knowledge"
+	}
+	return ""
+}
+
 // extractWingHall reads wing/hall from frontmatter first, falling back to
 // path segments brain/wiki/<wing>/<hall>/.
 func extractWingHall(content, relPath string) (wing, hall string) {
--- a/ingestion/internal/search/search_test.go
+++ b/ingestion/internal/search/search_test.go
@@ -6,6 +6,7 @@ import (
 	"fmt"
 	"os"
 	"path/filepath"
+	"strings"
 	"testing"

 	"github.com/mathiasbq/hyperguild/ingestion/internal/search"
@@ -55,6 +56,36 @@ func TestSearch_HybridRRFPromotesVectorOnlyHit(t *testing.T) {
 	assert.Contains(t, paths, "wiki/jepa-fx/facts/semantic.md")
 }

+func TestSearch_HybridDedupesChunkPathsToParent(t *testing.T) {
+	dir := t.TempDir()
+	full := filepath.Join(dir, "knowledge", "long.md")
+	require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
+	// Body contains the BM25 keyword "alpaca" so hybridMerge actually runs
+	// (it only kicks in when BM25 returns at least one candidate).
+	require.NoError(t, os.WriteFile(full, []byte("---\ntitle: Long\n---\nalpaca content.\n"), 0o644))
+
+	embedder := stubEmbedder{vec: []float32{0.1}}
+	// Vector store returns three chunk-path hits all pointing at the same
+	// parent file. The merged result must surface ONE row per parent — not
+	// three rows with chunk-suffixed paths.
+	vector := stubVector{hits: []search.VectorHit{
+		{Path: "knowledge/long.md#0001", Distance: 0.05},
+		{Path: "knowledge/long.md#0002", Distance: 0.07},
+		{Path: "knowledge/long.md#0003", Distance: 0.09},
+	}}
+
+	got, err := search.Query(dir, search.QueryOptions{
+		Query:    "alpaca",
+		Limit:    5,
+		Vector:   vector,
+		Embedder: embedder,
+	})
+	require.NoError(t, err)
+	require.Len(t, got, 1, "three chunk hits for one parent must merge to one result")
+	assert.Equal(t, "knowledge/long.md", got[0].Path)
+	assert.Equal(t, "Long", got[0].Title)
+}
+
 func TestSearch_HybridFallsBackOnEmbedderError(t *testing.T) {
 	dir := t.TempDir()
 	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki"), 0o755))
@@ -100,6 +131,29 @@ func TestSearch_ReturnsMatchingPages(t *testing.T) {
 	assert.Contains(t, results[0].Excerpt, "Retry")
 }

+func TestSearch_TierWeightingReordersResults(t *testing.T) {
+	dir := t.TempDir()
+	// A longer note-tier dump repeats the query phrase five times (higher
+	// raw BM25 score); a shorter knowledge entry repeats it four times.
+	// Raw BM25 prefers the dump; tier weighting (knowledge ×1.5 vs
+	// note ×1.0) flips the order if the score gap is within reach.
+	// note raw = 5 × 2 terms = 10 hits, weight 1.0 → 10
+	// knowledge raw = 4 × 2 terms = 8 hits, weight 1.5 → 12 (overtakes)
+	noteBody := "---\ntier: note\n---\n" + strings.Repeat("scram trap. ", 5)
+	knowledgeBody := "---\ntier: knowledge\n---\n" + strings.Repeat("scram trap. ", 4)
+	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki", "sources"), 0o755))
+	require.NoError(t, os.MkdirAll(filepath.Join(dir, "knowledge"), 0o755))
+	require.NoError(t, os.WriteFile(filepath.Join(dir, "wiki", "sources", "dump.md"), []byte(noteBody), 0o644))
+	require.NoError(t, os.WriteFile(filepath.Join(dir, "knowledge", "trap.md"), []byte(knowledgeBody), 0o644))
+
+	results, err := search.Query(dir, search.QueryOptions{Query: "scram trap", Limit: 5})
+	require.NoError(t, err)
+	require.GreaterOrEqual(t, len(results), 2)
+	assert.Equal(t, "knowledge/trap.md", results[0].Path, "knowledge tier weight should beat note tier")
+	assert.Equal(t, "knowledge", results[0].Tier)
+	assert.Equal(t, "note", results[1].Tier)
+}
+
 func TestSearch_WingHallScoping(t *testing.T) {
 	dir := t.TempDir()
 	for _, p := range []struct{ rel, body string }{
--- a/ingestion/internal/vectorstore/chunk.go
+++ b/ingestion/internal/vectorstore/chunk.go
@@ -0,0 +1,137 @@
+package vectorstore
+
+import (
+	"fmt"
+	"strings"
+)
+
+// NumberedChunk pairs a chunk's body with the storage path it will use
+// in brain_embeddings. Path format: "<parent>#NNNN" where NNNN is the
+// 1-based chunk index zero-padded to 4 digits.
+type NumberedChunk struct {
+	Path    string
+	Content string
+}
+
+// ParentPath returns the file path with any "#NNNN" chunk suffix removed.
+// Inputs without a "#" are returned unchanged. Used by search to dedupe
+// chunk-level hits back to a single document per result.
+func ParentPath(p string) string {
+	if i := strings.LastIndex(p, "#"); i >= 0 {
+		return p[:i]
+	}
+	return p
+}
+
+// NumberChunks assigns "<parent>#NNNN" storage paths to a slice of chunk
+// bodies, indexed from 0001. Empty or whitespace-only chunks are dropped.
+func NumberChunks(parent string, chunks []string) []NumberedChunk {
+	out := make([]NumberedChunk, 0, len(chunks))
+	idx := 1
+	for _, c := range chunks {
+		if strings.TrimSpace(c) == "" {
+			continue
+		}
+		out = append(out, NumberedChunk{
+			Path:    fmt.Sprintf("%s#%04d", parent, idx),
+			Content: c,
+		})
+		idx++
+	}
+	return out
+}
+
+// ChunkMarkdown splits a markdown document into embedding-sized pieces.
+// Strategy:
+//  1. Split at H1/H2 headings (top-of-line "#" or "##"). The intro before
+//     the first heading is its own chunk.
+//  2. Any section larger than maxBytes is further split at paragraph
+//     boundaries (blank lines), packing paragraphs greedily under the
+//     byte budget.
+//
+// The function aims for "fits comfortably under nomic-embed-text's 2048-
+// token context" — at ~4 chars/token for English markdown, maxBytes ≈ 4000
+// is a safe call-site default.
+func ChunkMarkdown(content string, maxBytes int) []string {
+	if maxBytes <= 0 {
+		maxBytes = 4000
+	}
+	sections := splitAtHeadings(content)
+
+	out := make([]string, 0, len(sections))
+	for _, s := range sections {
+		if len(s) <= maxBytes {
+			out = append(out, s)
+			continue
+		}
+		out = append(out, splitAtParagraphs(s, maxBytes)...)
+	}
+	return out
+}
+
+// splitAtHeadings cuts content into sections that each start with a
+// "# " or "## " line (intro before any heading is the leading section).
+func splitAtHeadings(content string) []string {
+	lines := strings.Split(content, "\n")
+	var sections []string
+	var cur strings.Builder
+	flush := func() {
+		if cur.Len() == 0 {
+			return
+		}
+		// Trim trailing newlines then re-add a single newline so a
+		// single-paragraph file round-trips to its original content rather
+		// than accumulating extra newlines from the empty-line split.
+		s := strings.TrimRight(cur.String(), "\n")
+		sections = append(sections, s+"\n")
+		cur.Reset()
+	}
+	for _, ln := range lines {
+		trimmed := strings.TrimLeft(ln, " ")
+		isH := strings.HasPrefix(trimmed, "# ") || strings.HasPrefix(trimmed, "## ")
+		if isH && cur.Len() > 0 {
+			flush()
+		}
+		cur.WriteString(ln)
+		cur.WriteByte('\n')
+	}
+	flush()
+	// Drop empty / whitespace-only trailing section (common when content
+	// itself ends with a "\n" — Split leaves a final empty element).
+	if n := len(sections); n > 0 && strings.TrimSpace(sections[n-1]) == "" {
+		sections = sections[:n-1]
+	}
+	return sections
+}
+
+// splitAtParagraphs packs paragraphs (blank-line separated blocks) into
+// sub-chunks of at most maxBytes. A single paragraph that itself exceeds
+// maxBytes is emitted as one over-budget chunk rather than being split
+// mid-sentence — better to over-spend a little than truncate prose.
+func splitAtParagraphs(section string, maxBytes int) []string {
+	paras := strings.Split(section, "\n\n")
+	var out []string
+	var cur strings.Builder
+	for _, p := range paras {
+		if p == "" {
+			continue
+		}
+		// +2 for the "\n\n" rejoin if cur isn't empty
+		need := len(p)
+		if cur.Len() > 0 {
+			need += 2
+		}
+		if cur.Len() > 0 && cur.Len()+need > maxBytes {
+			out = append(out, cur.String())
+			cur.Reset()
+		}
+		if cur.Len() > 0 {
+			cur.WriteString("\n\n")
+		}
+		cur.WriteString(p)
+	}
+	if cur.Len() > 0 {
+		out = append(out, cur.String())
+	}
+	return out
+}
--- a/ingestion/internal/vectorstore/chunk_test.go
+++ b/ingestion/internal/vectorstore/chunk_test.go
@@ -0,0 +1,72 @@
+package vectorstore_test
+
+import (
+	"strings"
+	"testing"
+
+	"github.com/mathiasbq/hyperguild/ingestion/internal/vectorstore"
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+func TestChunkMarkdown_ShortFileFitsInOne(t *testing.T) {
+	out := vectorstore.ChunkMarkdown("Just a short paragraph.\n", 4000)
+	require.Len(t, out, 1)
+	assert.Equal(t, "Just a short paragraph.\n", out[0])
+}
+
+func TestChunkMarkdown_SplitsAtHeadings(t *testing.T) {
+	src := "# Top\n\nintro\n\n## A\n\nbody a\n\n## B\n\nbody b\n"
+	out := vectorstore.ChunkMarkdown(src, 50) // every section fits under the limit; splits come from headings
+
+	assert.GreaterOrEqual(t, len(out), 2, "should split at H2 boundaries")
+	// Each chunk should start with a heading (top-level intro chunk OK without one)
+	for i, c := range out {
+		if i == 0 {
+			continue
+		}
+		assert.True(t, strings.HasPrefix(strings.TrimSpace(c), "#"),
+			"non-first chunk %d should start with heading: %q", i, c)
+	}
+}
+
+func TestChunkMarkdown_FurtherSplitsOversizedSection(t *testing.T) {
+	// One H2 section with 4 paragraphs of ~35 bytes each, limit 100.
+	src := "## big\n\n" +
+		strings.Repeat("paragraph one is moderately long.\n\n", 1) +
+		strings.Repeat("paragraph two also moderately long.\n\n", 1) +
+		strings.Repeat("paragraph three is moderately long.\n\n", 1) +
+		strings.Repeat("paragraph four is moderately long.\n\n", 1)
+	out := vectorstore.ChunkMarkdown(src, 100)
+
+	assert.Greater(t, len(out), 1, "oversized section should sub-split at paragraph boundaries")
+	for i, c := range out {
+		assert.LessOrEqual(t, len(c), 200,
+			"chunk %d exceeds 2x maxBytes: %d", i, len(c))
+	}
+}
+
+func TestChunkMarkdown_PreservesContent(t *testing.T) {
+	src := "# H1\n\nfirst section body.\n\n## H2a\n\nsecond section body.\n\n## H2b\n\nthird section body.\n"
+	out := vectorstore.ChunkMarkdown(src, 50)
+	joined := strings.Join(out, "")
+	// A representative token from each heading and body must survive chunking
+	for _, token := range []string{"H1", "first", "H2a", "second", "H2b", "third"} {
+		assert.Contains(t, joined, token, "token %q missing after chunking", token)
+	}
+}
+
+func TestNumberChunks_NumberedSuffix(t *testing.T) {
+	out := vectorstore.NumberChunks("knowledge/foo.md", []string{"a", "b", "c"})
+	require.Len(t, out, 3)
+	assert.Equal(t, "knowledge/foo.md#0001", out[0].Path)
+	assert.Equal(t, "knowledge/foo.md#0002", out[1].Path)
+	assert.Equal(t, "knowledge/foo.md#0003", out[2].Path)
+	assert.Equal(t, "a", out[0].Content)
+}
+
+func TestParentPath_StripsChunkSuffix(t *testing.T) {
+	assert.Equal(t, "knowledge/foo.md", vectorstore.ParentPath("knowledge/foo.md#0001"))
+	assert.Equal(t, "knowledge/foo.md", vectorstore.ParentPath("knowledge/foo.md"))
+	assert.Equal(t, "wiki/a/b.md", vectorstore.ParentPath("wiki/a/b.md#9999"))
+}
--- a/ingestion/internal/vectorstore/pg.go
+++ b/ingestion/internal/vectorstore/pg.go
@@ -8,6 +8,7 @@ import (
 	"errors"
 	"fmt"
 	"strings"
+	"time"

 	"github.com/jackc/pgx/v5"
 	"github.com/jackc/pgx/v5/pgxpool"
@@ -120,21 +121,26 @@ func (s *PGStore) Search(ctx context.Context, query []float32, limit int) ([]Hit
 	return hits, nil
 }

-// KnownPaths returns the path set already present in the store. Used by
-// the watcher to diff against the wiki/ tree and decide what to upsert.
-func (s *PGStore) KnownPaths(ctx context.Context) (map[string]struct{}, error) {
-	rows, err := s.pool.Query(ctx, `SELECT path FROM brain_embeddings`)
+// KnownPathsWithTime returns every embedded chunk path paired with the
+// row's updated_at. Sync uses the timestamps to decide whether a file
+// has been edited since its chunks were last embedded — when the file's
+// mtime exceeds the oldest chunk's updated_at, the file is re-embedded.
+func (s *PGStore) KnownPathsWithTime(ctx context.Context) (map[string]time.Time, error) {
+	rows, err := s.pool.Query(ctx, `SELECT path, updated_at FROM brain_embeddings`)
 	if err != nil {
 		return nil, fmt.Errorf("query paths: %w", err)
 	}
 	defer rows.Close()
-	out := make(map[string]struct{})
+	out := make(map[string]time.Time)
 	for rows.Next() {
-		var p string
-		if err := rows.Scan(&p); err != nil {
+		var (
+			p string
+			t time.Time
+		)
+		if err := rows.Scan(&p, &t); err != nil {
 			return nil, err
 		}
-		out[p] = struct{}{}
+		out[p] = t
 	}
 	return out, rows.Err()
 }
--- a/ingestion/internal/vectorstore/pg_test.go
+++ b/ingestion/internal/vectorstore/pg_test.go
@@ -36,7 +36,7 @@ func freshStore(t *testing.T) (*vectorstore.PGStore, context.Context) {
 	t.Cleanup(s.Close)
 	require.NoError(t, s.Init(ctx))
 	// Clean slate per test.
-	_, _ = s.KnownPaths(ctx)
+	_, _ = s.KnownPathsWithTime(ctx)
 	require.NoError(t, s.Delete(ctx, "%test-fixture%"))
 	return s, ctx
 }
@@ -67,15 +67,18 @@ func TestIntegration_UpsertAndSearch(t *testing.T) {
 	})
 }

-func TestIntegration_KnownPaths(t *testing.T) {
+func TestIntegration_KnownPathsWithTime(t *testing.T) {
 	s, ctx := freshStore(t)
+	before := time.Now()
 	require.NoError(t, s.Upsert(ctx, "wiki/k.md", vec(768, 0.5)))
 	t.Cleanup(func() { _ = s.Delete(ctx, "wiki/k.md") })

-	paths, err := s.KnownPaths(ctx)
+	paths, err := s.KnownPathsWithTime(ctx)
 	require.NoError(t, err)
-	_, ok := paths["wiki/k.md"]
-	assert.True(t, ok)
+	at, ok := paths["wiki/k.md"]
+	require.True(t, ok)
+	assert.False(t, at.IsZero(), "updated_at must not be zero")
+	assert.WithinDuration(t, before, at, 5*time.Second, "updated_at must be recent")
 }

 func TestUpsert_RejectsWrongDimension(t *testing.T) {
--- a/ingestion/internal/vectorstore/sync.go
+++ b/ingestion/internal/vectorstore/sync.go
@@ -18,7 +18,11 @@ type Embedder interface {

 // Store is the subset of PGStore that Sync needs. Lets tests stub it.
 type Store interface {
-	KnownPaths(ctx context.Context) (map[string]struct{}, error)
+	// KnownPathsWithTime returns every embedded chunk path paired with the
+	// row's updated_at. Sync uses the timestamp to detect edits: a file
+	// whose mtime is newer than its oldest chunk's updated_at is re-embedded
+	// from scratch (old chunks deleted, fresh chunks upserted).
+	KnownPathsWithTime(ctx context.Context) (map[string]time.Time, error)
 	Upsert(ctx context.Context, path string, embedding []float32) error
 	Delete(ctx context.Context, path string) error
 }
@@ -37,6 +41,13 @@ type SyncResult struct {
 // source pages; knowledge/ holds curated hand-written entries.
 var scanDirs = []string{"wiki", "knowledge"}

+// maxChunkBytes is the per-chunk byte budget passed to ChunkMarkdown.
+// Sized to fit comfortably under nomic-embed-text's 2048-token default
+// context (~4 chars/token for English markdown → ~8 KB ceiling; we sit
+// at 4 KB to leave headroom for unicode, code blocks, and tokenizer
+// variance).
+const maxChunkBytes = 4000
+
 // Sync brings the embedding store in line with brain/{wiki,knowledge}/
 // on disk:
 //   - new files (in the tree, not in the store) get embedded + upserted
@@ -51,11 +62,33 @@ func Sync(ctx context.Context, brainDir string, store Store, embedder Embedder)
 		return res, nil
 	}

-	known, err := store.KnownPaths(ctx)
+	known, err := store.KnownPathsWithTime(ctx)
 	if err != nil {
 		return res, fmt.Errorf("known paths: %w", err)
 	}
-	seen := make(map[string]struct{})
+	// Group known chunks by parent path and remember the EARLIEST
+	// updated_at per parent. A file is considered stale if its mtime is
+	// after the oldest of its chunk rows — i.e. at least one chunk hasn't
+	// been refreshed since the last edit. Also keep the full chunk-path
+	// list per parent so we can delete every old chunk before re-embedding
+	// (handles "file shrunk → fewer chunks → orphan rows" cleanly).
+	type parentState struct {
+		minUpdatedAt time.Time
+		chunkPaths   []string
+	}
+	parents := make(map[string]*parentState, len(known))
+	for p, t := range known {
+		parent := ParentPath(p)
+		ps, ok := parents[parent]
+		if !ok {
+			ps = &parentState{minUpdatedAt: t}
+			parents[parent] = ps
+		} else if t.Before(ps.minUpdatedAt) {
+			ps.minUpdatedAt = t
+		}
+		ps.chunkPaths = append(ps.chunkPaths, p)
+	}
+	seenParents := make(map[string]struct{})

 	for _, sub := range scanDirs {
 		root := filepath.Join(brainDir, sub)
@@ -75,12 +108,28 @@ func Sync(ctx context.Context, brainDir string, store Store, embedder Embedder)
 				return err
 			}
 			relSlash := filepath.ToSlash(rel)
-			seen[relSlash] = struct{}{}
+			seenParents[relSlash] = struct{}{}

-			if _, ok := known[relSlash]; ok {
-				// Already embedded — TODO: compare mtime once Store exposes
-				// updated_at so we re-embed on edit. For now, skip.
-				return nil
+			if ps, ok := parents[relSlash]; ok {
+				// File already has chunks in the store. Re-embed only when
+				// the file's mtime is strictly after the oldest chunk's
+				// updated_at; equal timestamps count as up to date.
+				info, statErr := d.Info()
+				if statErr != nil {
+					res.Errors = append(res.Errors, fmt.Errorf("stat %s: %w", relSlash, statErr))
+					return nil
+				}
+				if !info.ModTime().After(ps.minUpdatedAt) {
+					return nil
+				}
+				// Stale: delete old chunks before re-embedding so a shrunk
+				// file doesn't leave orphan rows at higher #NNNN indexes.
+				for _, oldPath := range ps.chunkPaths {
+					if delErr := store.Delete(ctx, oldPath); delErr != nil {
+						res.Errors = append(res.Errors, fmt.Errorf("delete %s for re-embed: %w", oldPath, delErr))
+						return nil
+					}
+				}
 			}

 			content, readErr := os.ReadFile(path)
@@ -88,16 +137,19 @@ func Sync(ctx context.Context, brainDir string, store Store, embedder Embedder)
 				res.Errors = append(res.Errors, fmt.Errorf("read %s: %w", relSlash, readErr))
 				return nil
 			}
-			vec, embErr := embedder.Embed(ctx, string(content))
-			if embErr != nil {
-				res.Errors = append(res.Errors, fmt.Errorf("embed %s: %w", relSlash, embErr))
-				return nil
+			chunks := NumberChunks(relSlash, ChunkMarkdown(string(content), maxChunkBytes))
+			for _, ch := range chunks {
+				vec, embErr := embedder.Embed(ctx, ch.Content)
+				if embErr != nil {
+					res.Errors = append(res.Errors, fmt.Errorf("embed %s: %w", ch.Path, embErr))
+					continue
+				}
+				if upErr := store.Upsert(ctx, ch.Path, vec); upErr != nil {
+					res.Errors = append(res.Errors, fmt.Errorf("upsert %s: %w", ch.Path, upErr))
+					continue
+				}
+				res.Added++
 			}
-			if upErr := store.Upsert(ctx, relSlash, vec); upErr != nil {
-				res.Errors = append(res.Errors, fmt.Errorf("upsert %s: %w", relSlash, upErr))
-				return nil
-			}
-			res.Added++
 			return nil
 		})
 		if err != nil {
@@ -105,9 +157,9 @@ func Sync(ctx context.Context, brainDir string, store Store, embedder Embedder)
 		}
 	}

-	// Drop rows whose file is gone.
+	// Drop chunk rows whose parent file is gone.
 	for path := range known {
-		if _, ok := seen[path]; ok {
+		if _, ok := seenParents[ParentPath(path)]; ok {
 			continue
 		}
 		if err := store.Delete(ctx, path); err != nil {
--- a/ingestion/internal/vectorstore/sync_test.go
+++ b/ingestion/internal/vectorstore/sync_test.go
@@ -5,7 +5,9 @@ import (
 	"errors"
 	"os"
 	"path/filepath"
+	"strings"
 	"testing"
+	"time"

 	"github.com/mathiasbq/hyperguild/ingestion/internal/vectorstore"
 	"github.com/stretchr/testify/assert"
@@ -13,16 +15,27 @@ import (
 )

 type stubStore struct {
-	known    map[string]struct{}
+	// known maps chunk-path → updated_at. Tests that don't care about
+	// re-embed-on-mtime use a far-future time so the Sync skip path
+	// always wins. Tests that do exercise the mtime path set the
+	// updated_at explicitly.
+	known    map[string]time.Time
 	upserts  map[string][]float32
 	deletes  []string
 	failNext error
 }

-func (s *stubStore) KnownPaths(_ context.Context) (map[string]struct{}, error) {
-	out := make(map[string]struct{}, len(s.known))
-	for k := range s.known {
-		out[k] = struct{}{}
+// farFuture is "newer than any file mtime", used as the default
+// updated_at in stubs that don't care about re-embed behavior.
+var farFuture = time.Now().Add(24 * time.Hour)
+
+func (s *stubStore) KnownPathsWithTime(_ context.Context) (map[string]time.Time, error) {
+	out := make(map[string]time.Time, len(s.known))
+	for k, t := range s.known {
+		if t.IsZero() {
+			t = farFuture
+		}
+		out[k] = t
 	}
 	return out, nil
 }
@@ -66,21 +79,21 @@ func TestSync_AddsNewFiles(t *testing.T) {
 	writeNote(t, dir, "wiki/jepa-fx/facts/x.md", "body of x")
 	writeNote(t, dir, "wiki/jepa-fx/facts/y.md", "body of y")

-	store := &stubStore{known: map[string]struct{}{}}
+	store := &stubStore{known: map[string]time.Time{}}
 	emb := stubEmbedder{vec: make([]float32, 768)}
 	res, err := vectorstore.Sync(context.Background(), dir, store, emb)
 	require.NoError(t, err)
 	assert.Equal(t, 2, res.Added)
 	assert.Empty(t, res.Deleted)
-	assert.Contains(t, store.upserts, "wiki/jepa-fx/facts/x.md")
-	assert.Contains(t, store.upserts, "wiki/jepa-fx/facts/y.md")
+	assert.Contains(t, store.upserts, "wiki/jepa-fx/facts/x.md#0001")
+	assert.Contains(t, store.upserts, "wiki/jepa-fx/facts/y.md#0001")
 }

 func TestSync_SkipsAlreadyKnown(t *testing.T) {
 	dir := t.TempDir()
 	writeNote(t, dir, "wiki/a/facts/x.md", "x")

-	store := &stubStore{known: map[string]struct{}{"wiki/a/facts/x.md": {}}}
+	store := &stubStore{known: map[string]time.Time{"wiki/a/facts/x.md#0001": {}}}
 	emb := stubEmbedder{vec: make([]float32, 768)}
 	res, err := vectorstore.Sync(context.Background(), dir, store, emb)
 	require.NoError(t, err)
@@ -92,7 +105,7 @@ func TestSync_DeletesDisappearedFiles(t *testing.T) {
 	dir := t.TempDir()
 	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki"), 0o755))
 	// store has a path that doesn't exist on disk anymore
-	store := &stubStore{known: map[string]struct{}{"wiki/old/facts/ghost.md": {}}}
+	store := &stubStore{known: map[string]time.Time{"wiki/old/facts/ghost.md#0001": {}}}
 	res, err := vectorstore.Sync(context.Background(), dir, &stubStoreWithDelete{stubStore: store}, stubEmbedder{vec: make([]float32, 768)})
 	require.NoError(t, err)
 	assert.Equal(t, 1, res.Deleted)
@@ -110,11 +123,11 @@ func TestSync_SkipsIndexFiles(t *testing.T) {
 	writeNote(t, dir, "wiki/a/_index.md", "moc")
 	writeNote(t, dir, "wiki/a/facts/real.md", "body")

-	store := &stubStore{known: map[string]struct{}{}}
+	store := &stubStore{known: map[string]time.Time{}}
 	res, err := vectorstore.Sync(context.Background(), dir, store, stubEmbedder{vec: make([]float32, 768)})
 	require.NoError(t, err)
 	assert.Equal(t, 1, res.Added)
-	assert.NotContains(t, store.upserts, "wiki/a/_index.md")
+	assert.NotContains(t, store.upserts, "wiki/a/_index.md#0001")
 }

 func TestSync_ScansKnowledgeDir(t *testing.T) {
@@ -122,13 +135,123 @@ func TestSync_ScansKnowledgeDir(t *testing.T) {
 	writeNote(t, dir, "wiki/a/facts/x.md", "x")
 	writeNote(t, dir, "knowledge/2026-05-19-koala-gpu-setup.md", "knowledge body")

-	store := &stubStore{known: map[string]struct{}{}}
+	store := &stubStore{known: map[string]time.Time{}}
 	emb := stubEmbedder{vec: make([]float32, 768)}
 	res, err := vectorstore.Sync(context.Background(), dir, store, emb)
 	require.NoError(t, err)
 	assert.Equal(t, 2, res.Added)
-	assert.Contains(t, store.upserts, "wiki/a/facts/x.md")
-	assert.Contains(t, store.upserts, "knowledge/2026-05-19-koala-gpu-setup.md")
+	assert.Contains(t, store.upserts, "wiki/a/facts/x.md#0001")
+	assert.Contains(t, store.upserts, "knowledge/2026-05-19-koala-gpu-setup.md#0001")
+}
+
+func TestSync_ChunksLongFiles(t *testing.T) {
+	dir := t.TempDir()
+	// Build a file that's well over the chunk byte budget. Multi-section
+	// markdown so the chunker has heading boundaries to cut on.
+	body := "# Doc\n\nintro line.\n\n"
+	for i := 0; i < 10; i++ {
+		body += "## Section " + string(rune('A'+i)) + "\n\n"
+		body += strings.Repeat("This section has a fair amount of content. ", 50) + "\n\n"
+	}
+	writeNote(t, dir, "knowledge/long.md", body)
+
+	store := &stubStore{known: map[string]time.Time{}}
+	emb := stubEmbedder{vec: make([]float32, 768)}
+	res, err := vectorstore.Sync(context.Background(), dir, store, emb)
+	require.NoError(t, err)
+	assert.Greater(t, res.Added, 1, "long file should produce multiple chunk rows")
+	// Every upserted path for this file must be a chunk path.
+	chunkCount := 0
+	for p := range store.upserts {
+		if strings.HasPrefix(p, "knowledge/long.md#") {
+			chunkCount++
+		}
+	}
+	assert.Equal(t, res.Added, chunkCount, "all rows for long file should be chunk-suffixed")
+	// The bare parent path must NOT be upserted directly.
+	assert.NotContains(t, store.upserts, "knowledge/long.md")
+}
+
+func TestSync_ShortFileGetsSingleChunkRow(t *testing.T) {
+	dir := t.TempDir()
+	writeNote(t, dir, "wiki/short.md", "tiny body\n")
+
+	store := &stubStore{known: map[string]time.Time{}}
+	emb := stubEmbedder{vec: make([]float32, 768)}
+	res, err := vectorstore.Sync(context.Background(), dir, store, emb)
+	require.NoError(t, err)
+	assert.Equal(t, 1, res.Added)
+	assert.Contains(t, store.upserts, "wiki/short.md#0001")
+}
+
+func TestSync_SkipsFileIfAnyChunkAlreadyKnown(t *testing.T) {
+	dir := t.TempDir()
+	writeNote(t, dir, "wiki/foo.md", "body\n")
+
+	store := &stubStore{known: map[string]time.Time{
+		"wiki/foo.md#0001": {},
+	}}
+	emb := stubEmbedder{vec: make([]float32, 768)}
+	res, err := vectorstore.Sync(context.Background(), dir, store, emb)
+	require.NoError(t, err)
+	assert.Equal(t, 0, res.Added)
+	assert.Empty(t, store.upserts)
+}
+
+func TestSync_DeletesAllChunksOfDisappearedFile(t *testing.T) {
+	dir := t.TempDir()
+	require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki"), 0o755))
+	store := &stubStore{known: map[string]time.Time{
+		"wiki/ghost.md#0001": {},
+		"wiki/ghost.md#0002": {},
+		"wiki/ghost.md#0003": {},
+	}}
+	res, err := vectorstore.Sync(context.Background(), dir, store, stubEmbedder{vec: make([]float32, 768)})
+	require.NoError(t, err)
+	assert.Equal(t, 3, res.Deleted)
+}
+
+func TestSync_ReembedsFileWhenMtimeNewer(t *testing.T) {
+	dir := t.TempDir()
+	writeNote(t, dir, "wiki/edited.md", "original body\n")
+	// Force the file's mtime ahead of any plausible store updated_at.
+	future := time.Now().Add(1 * time.Hour)
+	require.NoError(t, os.Chtimes(filepath.Join(dir, "wiki/edited.md"), future, future))
+
+	store := &stubStore{
+		known: map[string]time.Time{
+			// Existing chunk row pre-dates the file's mtime.
+			"wiki/edited.md#0001": time.Now().Add(-1 * time.Hour),
+		},
+	}
+	emb := stubEmbedder{vec: make([]float32, 768)}
+	res, err := vectorstore.Sync(context.Background(), dir, store, emb)
+	require.NoError(t, err)
+	assert.Equal(t, 1, res.Added, "file with newer mtime should be re-embedded")
+	assert.Contains(t, store.upserts, "wiki/edited.md#0001")
+	// Old chunks of the same parent must be deleted before re-embed so
+	// shrunk files don't leave orphan rows at higher #NNNN indexes.
+	assert.Contains(t, store.deletes, "wiki/edited.md#0001")
+}
+
+func TestSync_SkipsFileWhenMtimeOlder(t *testing.T) {
+	dir := t.TempDir()
+	writeNote(t, dir, "wiki/stable.md", "body\n")
+	// Backdate mtime to before the store's recorded updated_at.
+	past := time.Now().Add(-2 * time.Hour)
+	require.NoError(t, os.Chtimes(filepath.Join(dir, "wiki/stable.md"), past, past))
+
+	store := &stubStore{
+		known: map[string]time.Time{
+			"wiki/stable.md#0001": time.Now(),
+		},
+	}
+	emb := stubEmbedder{vec: make([]float32, 768)}
+	res, err := vectorstore.Sync(context.Background(), dir, store, emb)
+	require.NoError(t, err)
+	assert.Equal(t, 0, res.Added)
+	assert.Empty(t, store.upserts)
+	assert.Empty(t, store.deletes)
 }

 func TestSync_NoOpWhenComponentsNil(t *testing.T) {
@@ -142,7 +265,7 @@ func TestSync_NoOpWhenComponentsNil(t *testing.T) {
 func TestSync_CollectsEmbedderErrors(t *testing.T) {
 	dir := t.TempDir()
 	writeNote(t, dir, "wiki/a/facts/x.md", "x")
-	store := &stubStore{known: map[string]struct{}{}}
+	store := &stubStore{known: map[string]time.Time{}}
 	emb := stubEmbedder{err: errors.New("upstream down")}
 	res, err := vectorstore.Sync(context.Background(), dir, store, emb)
 	require.NoError(t, err)
--- a/internal/config/routing.go
+++ b/internal/config/routing.go
@@ -11,7 +11,7 @@ import (
 type RoutingConfig struct {
 	Port               string  // ROUTING_PORT, default 3210
 	MCPAuthToken       string  // ROUTING_MCP_TOKEN, optional bearer token
-	LiteLLMBaseURL     string  // LITELLM_BASE_URL, default http://piguard:4000
+	LiteLLMBaseURL     string  // LITELLM_BASE_URL, default https://llm-api.d-ma.be
 	LiteLLMAPIKey      string  // LITELLM_API_KEY
 	BrainURL           string  // BRAIN_URL, default http://ingestion.supervisor:3300
 	FastModel          string  // HYPERGUILD_FAST_MODEL, default koala/qwen35-9b-fast
@@ -41,7 +41,7 @@ func LoadRouting() (RoutingConfig, error) {
 	cfg := RoutingConfig{
 		Port:           envOr("ROUTING_PORT", "3210"),
 		MCPAuthToken:   os.Getenv("ROUTING_MCP_TOKEN"),
-		LiteLLMBaseURL: envOr("LITELLM_BASE_URL", "http://piguard:4000"),
+		LiteLLMBaseURL: envOr("LITELLM_BASE_URL", "https://llm-api.d-ma.be"),
 		LiteLLMAPIKey:  os.Getenv("LITELLM_API_KEY"),
 		BrainURL:       envOr("BRAIN_URL", "http://ingestion.supervisor:3300"),
 		FastModel:      envOr("HYPERGUILD_FAST_MODEL", "koala/qwen35-9b-fast"),
--- a/internal/config/routing_test.go
+++ b/internal/config/routing_test.go
@@ -22,7 +22,7 @@ func TestLoadRoutingDefaults(t *testing.T) {
 	require.NoError(t, err)
 	assert.Equal(t, "3210", cfg.Port)
 	assert.Equal(t, "", cfg.MCPAuthToken)
-	assert.Equal(t, "http://piguard:4000", cfg.LiteLLMBaseURL)
+	assert.Equal(t, "https://llm-api.d-ma.be", cfg.LiteLLMBaseURL)
 	assert.Equal(t, "http://ingestion.supervisor:3300", cfg.BrainURL)
 	assert.Equal(t, "koala/qwen35-9b-fast", cfg.FastModel)
 	assert.Equal(t, "iguana/gemma4-26b", cfg.ThinkingModel)
--- a/internal/skills/project/handlers.go
+++ b/internal/skills/project/handlers.go
@@ -13,12 +13,13 @@ import (
 )

 type createArgs struct {
-	Name        string `json:"name"`
-	Description string `json:"description"`
-	Hypothesis  string `json:"hypothesis"`
-	Folder      string `json:"folder"`
-	Stack       string `json:"stack"`
-	Private     bool   `json:"private"`
+	Name           string `json:"name"`
+	Description    string `json:"description"`
+	Hypothesis     string `json:"hypothesis"`
+	Folder         string `json:"folder"`
+	Stack          string `json:"stack"`
+	Private        bool   `json:"private"`
+	MirrorToGitHub bool   `json:"mirror_to_github,omitempty"`
 }

 type createResult struct {
@@ -59,11 +60,12 @@ func (s *Skill) handleCreate(ctx context.Context, raw json.RawMessage) (json.Raw

 	tmpl := templateFor(args.Stack)
 	giteaURL := fmt.Sprintf("http://gitea.d-ma.be/%s/%s", s.cfg.GiteaOwner, args.Name)
-	githubURL := fmt.Sprintf("https://github.com/%s/%s", s.cfg.GitHubOwner, args.Name)

 	res := createResult{
-		GiteaURL:  giteaURL,
-		GitHubURL: githubURL,
+		GiteaURL: giteaURL,
+	}
+	if args.MirrorToGitHub {
+		res.GitHubURL = fmt.Sprintf("https://github.com/%s/%s", s.cfg.GitHubOwner, args.Name)
 	}

 	// Step 1: create_project_from_template. If the repo already exists,
@@ -75,25 +77,32 @@ func (s *Skill) handleCreate(ctx context.Context, raw json.RawMessage) (json.Raw
 	}
 	res.Reached = append(res.Reached, stepCreateRepo)

-	// Step 2: create empty GitHub repo. Gitea's push-mirror cannot push
-	// to a non-existent remote, so the destination must exist before
-	// step 3 configures the mirror. Skipped when GitHub client is unset
-	// (degraded mode — see Config.GitHub doc).
-	if s.cfg.GitHub != nil {
-		if err := s.callCreateGitHubRepo(ctx, args); err != nil && !errors.Is(err, githubclient.ErrAlreadyExists) {
-			return marshalPartial(res, stepCreateGitHub, err)
+	// Steps 2+3 are skipped when MirrorToGitHub is false. Default per
+	// infra ADR (Gitea as true master, GitHub as optional opt-in): keep
+	// client / business-logic / personal repos Gitea-only. Set
+	// `mirror_to_github: true` for open-source projects that want a
+	// public GitHub mirror (hyperguild, gitea-mcp, template-*).
+	if args.MirrorToGitHub {
+		// Step 2: create empty GitHub repo. Gitea's push-mirror cannot push
+		// to a non-existent remote, so the destination must exist before
+		// step 3 configures the mirror. Skipped when GitHub client is unset
+		// (degraded mode — see Config.GitHub doc).
+		if s.cfg.GitHub != nil {
+			if err := s.callCreateGitHubRepo(ctx, args); err != nil && !errors.Is(err, githubclient.ErrAlreadyExists) {
+				return marshalPartial(res, stepCreateGitHub, err)
+			}
+			res.Reached = append(res.Reached, stepCreateGitHub)
 		}
-		res.Reached = append(res.Reached, stepCreateGitHub)
-	}

-	// Step 3: configure push mirror to GitHub. Idempotent: if a mirror with
-	// the same remote already exists, gitea-mcp returns Conflict; we swallow it.
-	if err := s.callMirror(ctx, args.Name); err != nil {
-		if !isConflict(err) {
-			return marshalPartial(res, stepMirror, err)
+		// Step 3: configure push mirror to GitHub. Idempotent: if a mirror with
+		// the same remote already exists, gitea-mcp returns Conflict; we swallow it.
+		if err := s.callMirror(ctx, args.Name); err != nil {
+			if !isConflict(err) {
+				return marshalPartial(res, stepMirror, err)
+			}
 		}
+		res.Reached = append(res.Reached, stepMirror)
 	}
-	res.Reached = append(res.Reached, stepMirror)

 	// Step 3: commit staging namespace manifest to infra repo. Done before
 	// the issue so the staging env is reconciling by the time the issue lands.
@@ -228,7 +237,11 @@ func experimentBrief(args createArgs, existed bool) string {
 	b.WriteString("- Repo created from `template-")
 	b.WriteString(args.Stack)
 	b.WriteString("` on Gitea.\n")
-	b.WriteString("- Push-mirror configured to GitHub.\n")
+	if args.MirrorToGitHub {
+		b.WriteString("- Push-mirror configured to GitHub.\n")
+	} else {
+		b.WriteString("- Gitea-only (no GitHub mirror — set `mirror_to_github: true` to opt in).\n")
+	}
 	b.WriteString("- Staging namespace manifest committed to infra repo.\n\n")
 	if existed {
 		b.WriteString("> Note: this repo already existed when `project_create` ran — provisioning steps were re-applied idempotently.\n")
--- a/internal/skills/project/handlers_test.go
+++ b/internal/skills/project/handlers_test.go
@@ -158,6 +158,9 @@ func mustClient(t *testing.T, url string) *mcpclient.Client {
 	return c
 }

+// happyArgs returns the minimal valid request. With the Gitea-as-true-master
+// ADR shipped, this defaults to Gitea-only (mirror_to_github omitted = false).
+// Tests that need the full Gitea + GitHub mirror flow use mirroredArgs().
 func happyArgs() json.RawMessage {
 	return json.RawMessage(`{
 		"name":"my-experiment",
@@ -169,6 +172,20 @@ func happyArgs() json.RawMessage {
 	}`)
 }

+// mirroredArgs is happyArgs + mirror_to_github=true — the explicit opt-in
+// path. Equivalent to the pre-ADR default.
+func mirroredArgs() json.RawMessage {
+	return json.RawMessage(`{
+		"name":"my-experiment",
+		"description":"One-line desc",
+		"hypothesis":"We believe X produces Y",
+		"folder":"AGENTS",
+		"stack":"go-agent",
+		"private":true,
+		"mirror_to_github":true
+	}`)
+}
+
 func TestProjectCreate_HappyPath(t *testing.T) {
 	f := &fakeGiteaMCP{
 		Responses: map[string]any{
@@ -177,7 +194,7 @@ func TestProjectCreate_HappyPath(t *testing.T) {
 	}
 	skill, gh := newSkill(t, f)

-	out, err := skill.Handle(context.Background(), "project_create", happyArgs())
+	out, err := skill.Handle(context.Background(), "project_create", mirroredArgs())
 	require.NoError(t, err)

 	var res map[string]any
@@ -228,7 +245,7 @@ func TestProjectCreate_GitHubExists_Idempotent(t *testing.T) {
 	skill, gh := newSkill(t, f)
 	gh.ReturnError = 422 // already exists

-	_, err := skill.Handle(context.Background(), "project_create", happyArgs())
+	_, err := skill.Handle(context.Background(), "project_create", mirroredArgs())
 	require.NoError(t, err, "422 already-exists should be idempotent")
 	require.Len(t, f.Calls, 4, "all gitea steps still run despite github 422")
 }
@@ -238,7 +255,7 @@ func TestProjectCreate_GitHubFails(t *testing.T) {
 	skill, gh := newSkill(t, f)
 	gh.ReturnError = 401 // bad PAT

-	out, err := skill.Handle(context.Background(), "project_create", happyArgs())
+	out, err := skill.Handle(context.Background(), "project_create", mirroredArgs())
 	require.Error(t, err)
 	var res map[string]any
 	require.NoError(t, json.Unmarshal(out, &res))
@@ -255,7 +272,11 @@ func TestProjectCreate_NoGitHubClient_DegradedMode(t *testing.T) {
 	}
 	skill := newSkillNoGitHub(t, f)

-	out, err := skill.Handle(context.Background(), "project_create", happyArgs())
+	// Use mirroredArgs so we exercise the GitHub-mirror path. With the
+	// GitHub client nil, the create_github_repo step is skipped but the
+	// mirror step still attempts to configure the push-mirror remote
+	// (degraded mode preserves the prior contract for opted-in projects).
+	out, err := skill.Handle(context.Background(), "project_create", mirroredArgs())
 	require.NoError(t, err)
 	var res map[string]any
 	require.NoError(t, json.Unmarshal(out, &res))
@@ -275,7 +296,7 @@ func TestProjectCreate_Idempotent_RepoExists(t *testing.T) {
 	}
 	skill, _ := newSkill(t, f)

-	out, err := skill.Handle(context.Background(), "project_create", happyArgs())
+	out, err := skill.Handle(context.Background(), "project_create", mirroredArgs())
 	require.NoError(t, err)

 	var res map[string]any
@@ -295,7 +316,7 @@ func TestProjectCreate_MirrorFails(t *testing.T) {
 	}
 	skill, _ := newSkill(t, f)

-	out, err := skill.Handle(context.Background(), "project_create", happyArgs())
+	out, err := skill.Handle(context.Background(), "project_create", mirroredArgs())
 	require.Error(t, err)
 	assert.Contains(t, err.Error(), `"mirror" failed`)

@@ -317,7 +338,7 @@ func TestProjectCreate_InfraCommitFails(t *testing.T) {
 	}
 	skill, _ := newSkill(t, f)

-	out, err := skill.Handle(context.Background(), "project_create", happyArgs())
+	out, err := skill.Handle(context.Background(), "project_create", mirroredArgs())
 	require.Error(t, err)

 	var res map[string]any
@@ -351,6 +372,45 @@ func TestProjectCreate_ValidationErrors(t *testing.T) {
 	assert.Empty(t, f.Calls, "no upstream calls should occur on validation failure")
 }

+func TestProjectCreate_DefaultSkipsGitHubMirror(t *testing.T) {
+	// Default (mirror_to_github omitted) skips create_github_repo + mirror
+	// per the Gitea-as-true-master ADR. Gitea repo + staging namespace
+	// + issue still run; github_url is empty in the response.
+	f := &fakeGiteaMCP{
+		Responses: map[string]any{
+			"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
+		},
+	}
+	skill, gh := newSkill(t, f)
+
+	out, err := skill.Handle(context.Background(), "project_create", happyArgs())
+	require.NoError(t, err)
+
+	var res map[string]any
+	require.NoError(t, json.Unmarshal(out, &res))
+
+	assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment", res["gitea_url"])
+	assert.Equal(t, "", res["github_url"], "github_url must be empty when mirror not opted in")
+	assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment/issues/1", res["issue_url"])
+
+	// 3 gitea-mcp calls: template create, staging file write, issue. NO mirror call.
+	require.Len(t, f.Calls, 3)
+	assert.Equal(t, "create_project_from_template", f.Calls[0].Tool)
+	assert.Equal(t, "file_write_branch", f.Calls[1].Tool)
+	assert.Equal(t, "issue_create", f.Calls[2].Tool)
+
+	// Zero GitHub API calls.
+	assert.Empty(t, gh.Calls, "no GitHub repo created when mirror_to_github is false")
+
+	// reached lists the Gitea-only path.
+	reached, _ := res["reached"].([]any) // checked assertion: a missing key fails the Equal below instead of panicking
+	assert.Equal(t, []any{"create_repo", "infra_commit", "issue"}, reached)
+
+	// experiment-brief body reflects Gitea-only provisioning.
+	require.Contains(t, f.Calls[2].Args["body"], "Gitea-only")
+	require.NotContains(t, f.Calls[2].Args["body"], "Push-mirror configured")
+}
+
 func TestProjectCreate_UnknownTool(t *testing.T) {
 	f := &fakeGiteaMCP{}
 	skill, _ := newSkill(t, f)
--- a/internal/skills/project/skill.go
+++ b/internal/skills/project/skill.go
@@ -79,13 +79,22 @@ func (s *Skill) Tools() []registry.ToolDef {
 				"description": "Selects template-go-agent or template-go-web.",
 			},
 			"private": map[string]any{"type": "boolean"},
+			"mirror_to_github": map[string]any{
+				"type": "boolean",
+				"description": "Default false. When true, also create an empty GitHub repo " +
+					"and configure a push-mirror from Gitea. Opt-in per the Gitea-as-true-master " +
+					"ADR — only set true for open-source projects (hyperguild, gitea-mcp, template-*). " +
+					"Never set true for client projects, business logic, or personal experiments.",
+			},
 		},
 		"required": []string{"name", "description", "hypothesis", "stack"},
 	})
 	return []registry.ToolDef{
 		{
-			Name:        "project_create",
-			Description: "Bootstrap a new project: Gitea repo from template, GitHub push-mirror, staging namespace manifest, experiment-brief issue. Idempotent — re-running with an existing repo returns the existing URLs.",
+			Name: "project_create",
+			Description: "Bootstrap a new project: Gitea repo from template, staging namespace manifest, " +
+				"experiment-brief issue. Optionally mirrors to GitHub when `mirror_to_github: true` " +
+				"(default false). Idempotent — re-running with an existing repo returns the existing URLs.",
 			InputSchema: schema,
 		},
 	}
--- a/scripts/smoke-routing.sh
+++ b/scripts/smoke-routing.sh
@@ -4,7 +4,7 @@ set -euo pipefail
 # Boot the routing binary and exercise its four tools against live deps.
 # Skipped when LITELLM_BASE_URL or BRAIN_URL is unreachable.

-LITELLM_BASE_URL="${LITELLM_BASE_URL:-http://piguard:4000}"
+LITELLM_BASE_URL="${LITELLM_BASE_URL:-https://llm-api.d-ma.be}"
 BRAIN_URL="${BRAIN_URL:-http://koala:30330}"

 if ! curl -sS --max-time 2 "${LITELLM_BASE_URL}/v1/models" >/dev/null 2>&1; then