102 Commits

Author SHA1 Message Date
Mathias
bad0581623 merge: client-name scrubber rule (refs hyperguild#27)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 4s
2026-05-26 07:10:05 +02:00
Mathias
a94b860c2e feat(claudewatcher): client-name guard via RegisterRule + env
Pre-rollout guard. Source code stays clean — client identities come
from CLAUDE_INGEST_CLIENT_BLOCK env (sourced from a SOPS-encrypted k8s
secret in infra repo). Env value is a regex alternation; main wraps
it with `(?i)\b(...)\b` so word-boundary matching avoids false hits
inside longer identifiers (e.g. a "SEB" pattern doesn't trigger on "Sebastian").

DefaultRules (credential shapes) still take precedence so any leak
that's BOTH a client mention AND a credential shape logs as the
credential — strictly more dangerous, points triage at the right
thing. Tests cover precedence + case variations + word-boundary
respect + invalid-pattern rejection.
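
A minimal sketch of the wrapping described above, assuming the env value is a plain alternation such as `acme|seb`. The function name `compileClientBlock` and the sample client names are hypothetical and not taken from the repository.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileClientBlock wraps a regex alternation (e.g. "acme|seb") in
// (?i)\b(...)\b so it matches case-insensitively and on whole words only.
// Name and signature are illustrative assumptions.
func compileClientBlock(alternation string) (*regexp.Regexp, error) {
	if alternation == "" {
		return nil, fmt.Errorf("empty client block pattern")
	}
	return regexp.Compile(`(?i)\b(` + alternation + `)\b`)
}

func main() {
	re, err := compileClientBlock("acme|seb")
	if err != nil {
		panic(err)
	}
	fmt.Println(re.MatchString("call with SEB today")) // whole word, any case
	fmt.Println(re.MatchString("ask Sebastian"))       // pattern sits inside a longer word
}
```

An invalid alternation (for example an unbalanced parenthesis) surfaces as a compile error, which is the point at which a caller could reject the env value.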

Refs: infra#73 Track E.1 pre-rollout grill (option B).

Bump-Type: minor
2026-05-26 07:10:05 +02:00
Mathias
f8cf27e5de merge: claudewatcher (closes hyperguild#27, refs infra#73)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
2026-05-25 19:59:13 +02:00
Mathias
49b188e9c9 feat(server): wire claudewatcher behind CLAUDE_SESSIONS_DIR
Opt-in by setting CLAUDE_SESSIONS_DIR to the ~/.claude/projects path.
When set, the server starts claudewatcher.Watch in a goroutine that
ticks every CLAUDE_INGEST_INTERVAL seconds (default 60). Requires
BRAIN_PG_DSN for the cursor table — fail-fast if missing.

Each Batch becomes one wiki note at:
  brain/wiki/claude-sessions/facts/session-<host>-<session_id>.md

with frontmatter type=source + domain=<project basename>. Per-turn
content capped at 2000 chars (full transcripts stay in
~/.claude/projects already); the brain entry is a digest, not a
mirror.

CLAUDE_INGEST_HOST overrides the os.Hostname()-derived host label,
useful when multiple ingestion pods consume the same DSN from
different machines.

Closes hyperguild#27.

Bump-Type: minor
2026-05-25 19:59:07 +02:00
Mathias
bc011cc1f0 feat(claudewatcher): ingest Claude Code session transcripts into brain
New package internal/claudewatcher. The volume gate (24 turns/week of
agentsquad logs vs 500/week gate) exposed that the real signal lives
in daily Claude Code usage at ~/.claude/projects/*/<uuid>.jsonl, not
in agentsquad output. This package captures that signal. See infra#73
Track E + hyperguild#27 for the full reframe.

Components:
- parser: tolerant JSONL parser over the observed Claude Code session
  schema (user / assistant / attachment / system + bookkeeping types).
  Skip-flag fast-paths queue-operation, last-prompt, permission-mode,
  ai-title, bridge-session, file-history-snapshot.
- scrubber: 11-rule fail-closed regex set for credential shapes
  (bearer, postgres URIs, PEM, ssh-key, ghp_/sk-/sk-ant-/AKIA, homelab
  env tokens, SOPS markers). Drop turn + log on match.
- cursor: postgres-backed claude_session_cursors table, keyed by
  (host, file_path) with byte_offset. Resumable across pod restarts.
- watcher: poll loop. Walks SessionsDir, processes each .jsonl from
  its cursor offset, runs scrubber, emits a Batch per file to a
  Sink interface, advances cursor on successful Ingest.

No classifier integration in this commit — every kept turn is emitted
in a per-session batch. The cmd/server wiring (next commit) routes
batches to brain/wiki/claude-sessions/facts/. Classifier-driven hall
routing (decisions / failures / hypotheses) is a follow-up.

19 unit tests across parser + scrubber + watcher. task check green.

Refs: infra#73, hyperguild#27
2026-05-25 19:58:58 +02:00
Mathias
2726896079 feat(mcp): wire brain_context tool
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
Returns top-N relevant brain entries for a project context. Combines
BM25 hits on project name with 2-hop graph expansion via Track A's
graphstore (when BRAIN_GRAPH_ENABLED). Closes hyperguild#28.

Notes on implementation choices that deviate slightly from the spec:
- Excerpt length: 200 chars per spec (vs the 300 used by search.Result).
  truncateExcerpt clamps the already-stripped BM25 excerpt; graph-only
  neighbours load their excerpt from disk via a private readExcerpt
  helper (search.hydrate is unexported).
- Graph scoring: 0.6 / max(1, distance) per neighbour, so distance-1
  contributes 0.6 and distance-2 contributes 0.3. BM25 hits decay
  linearly from 3.0 (rank-0) to 1.0 (rank-2), giving BM25 hits a
  natural ceiling above pure-graph hits while still letting a doc
  surfaced via both edge types outrank a BM25-only one.
- Test placement: package mcp (internal) rather than mcp_test, because
  graphReader is unexported and WithGraph only accepts *PGStore; an
  internal test can install a dual-interface fake directly on s.graph
  without spinning up postgres.
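
The scoring arithmetic in the second note can be sketched as follows. The function names are hypothetical, and the generalisation of the BM25 decay to `n` hits is an assumption; the commit message only states the endpoints 3.0 (rank-0) and 1.0 (rank-2).

```go
package main

import "fmt"

// graphScore is the per-neighbour contribution: 0.6 / max(1, distance).
func graphScore(distance int) float64 {
	if distance < 1 {
		distance = 1
	}
	return 0.6 / float64(distance)
}

// bm25RankScore decays linearly from 3.0 at rank 0 to 1.0 at rank n-1.
// Assumption: with n = 3 this reproduces the 3.0 .. 1.0 range in the message.
func bm25RankScore(rank, n int) float64 {
	if n <= 1 {
		return 3.0
	}
	return 3.0 - 2.0*float64(rank)/float64(n-1)
}

func main() {
	fmt.Println(graphScore(1), graphScore(2))
	fmt.Println(bm25RankScore(0, 3), bm25RankScore(1, 3), bm25RankScore(2, 3))
}
```

Under these numbers the weakest BM25 hit (1.0) still sits above the strongest pure-graph hit (0.6), which matches the "natural ceiling" described above.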

Bump-Type: minor

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 18:53:14 +02:00
Mathias
2b7bbe38c7 docs(eval): record M4 + M4b scorer runs — phase 2 gate cleared (infra#72)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Tier-weighted retrieval against the qa-2026-05.md 20-question set:

| run                            | top-1 | top-3 |
|--------------------------------|-------|-------|
| baseline (pre-phase-1)         | 20%   | 65%   |
| post phase 1 (parser+content)  | 20%   | 70%   |
| post M4 (tier weighting)       | 30%   | 75%   |
| post M4b (entities → K tier)   | 35%   | 80%   |

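Net lift over the pre-phase-1 baseline: +15pt top-1, +15pt top-3. Phase 2
alone (M4 + M4b over the post-phase-1 run) contributes +15pt top-1 and
+10pt top-3, comfortably above the ≥10pt close-gate set in infra#72.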

Three remaining misses are content-keyword issues, not structure
issues (the questions don't share enough lexical surface with the
target entries to surface via BM25 alone). Vector search would
help here but the iguana embedder is off-mesh (see infra#64).
2026-05-25 18:51:29 +02:00
Mathias
1b00cbc0ae fix(search,graph): M4b wiki/entities/ → tier=knowledge
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 3s
Initial M4 mapping put wiki/entities/* in tier=note. Post-M4 eval
regressed qwen35-9b-fast from rank 2 → off top-5: knowledge entries
that cite the entity in passing now outscore the entity page itself
(1.5× weight vs 1.0×).

Entity anchor pages are durable facts about concrete things — they
map cleanly to the knowledge/facts/ slot in the post-M3 layout
target. Promote them now so the path inference matches.

Eval re-run after deploy is in infra#72.
2026-05-25 18:49:37 +02:00
Mathias
4f78fecd06 feat(search): M4 tier-weighted BM25 re-rank (infra#72)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 3s
The eval set under brain/eval/qa-2026-05.md showed BM25 top-1 at 20%
with 5 of the missing slugs being short focused knowledge entries
that lost to long aggregate docs on raw term-frequency. Tier weighting
addresses that without touching the BM25 algorithm itself.

How

- Result struct gains a Tier field, populated during the file walk
  via extractTier (frontmatter wins, path prefix as fallback —
  mirrors the graph.inferTierFromPath logic so the two callers stay
  in lockstep).
- After the existing sort (and optional hybridMerge), do a final
  stable re-sort by float64(Score) * tierWeight(Tier). Knowledge
  ×1.5, note ×1.0, inbox ×0.3, unknown ×1.0.
- hydrate() (vector-only hits) also fills Tier so re-ranking covers
  the hybrid path.

Test covers the load-bearing case: a long note-tier doc with raw=10
loses to a short knowledge-tier doc with raw=8 after weighting
(8×1.5=12 vs 10×1.0=10).
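
A minimal sketch of the re-rank step, assuming a trimmed-down `Result` struct and a `float32` score (the message only implies a non-`float64` type via the `float64(Score)` conversion). The helper name `rerankByTier` is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is a reduced stand-in for search.Result; only the fields the
// re-rank needs are shown.
type Result struct {
	Path  string
	Score float32
	Tier  string
}

// tierWeight mirrors the multipliers listed in the commit message.
func tierWeight(tier string) float64 {
	switch tier {
	case "knowledge":
		return 1.5
	case "inbox":
		return 0.3
	default: // "note" and unknown tiers
		return 1.0
	}
}

// rerankByTier stably re-sorts results by raw score times tier weight,
// highest first, so equal weighted scores keep their earlier order.
func rerankByTier(results []Result) {
	sort.SliceStable(results, func(i, j int) bool {
		wi := float64(results[i].Score) * tierWeight(results[i].Tier)
		wj := float64(results[j].Score) * tierWeight(results[j].Tier)
		return wi > wj
	})
}

func main() {
	rs := []Result{
		{Path: "notes/long-aggregate.md", Score: 10, Tier: "note"},
		{Path: "knowledge/short-entry.md", Score: 8, Tier: "knowledge"},
	}
	rerankByTier(rs)
	fmt.Println(rs[0].Path)
}
```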

Measurement gate is in infra#72: re-run brain/eval/score.py against
the live brain after this image lands; close the issue when top-1
hit rate lifts by ≥10 absolute points.
2026-05-25 18:45:20 +02:00
Mathias
d5f112b600 feat(graph,graphstore): M2 parse tier+topic from frontmatter, persist via Upsert (infra#72)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 4s
extract.go now reads `tier:` and `topic:` from YAML frontmatter, with
a path-based fallback when frontmatter is absent (the pre-M3 state on
every existing entry):

  knowledge/* → tier=knowledge
  notes/*     → tier=note
  wiki/**     → tier=note   (sources + concepts + entities are I-level)
  inbox/**, raw/**, sessions/**, clips/** → tier=inbox

Frontmatter wins when present — covers the M3-migrated case where an
entry's path may not match the tier the author chose for it.
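
The precedence above (frontmatter first, path prefix as fallback) might look roughly like this. The function name `inferTier` and the empty-string result for unmatched paths are assumptions; the mapping itself is the one listed in this commit, before the later M4b change to wiki/entities/.

```go
package main

import (
	"fmt"
	"strings"
)

// inferTier returns the frontmatter tier when present, otherwise a tier
// derived from the first path segment below the brain directory.
func inferTier(frontmatterTier, relPath string) string {
	if frontmatterTier != "" {
		return frontmatterTier // frontmatter wins
	}
	top, _, _ := strings.Cut(relPath, "/")
	switch top {
	case "knowledge":
		return "knowledge"
	case "notes", "wiki":
		return "note"
	case "inbox", "raw", "sessions", "clips":
		return "inbox"
	}
	return "" // assumption: unknown layouts stay untiered
}

func main() {
	fmt.Println(inferTier("", "wiki/entities/k3s.md"))
	fmt.Println(inferTier("knowledge", "wiki/entities/k3s.md"))
}
```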

UpsertEntity persists both columns. M1's schema already has them.

Backfill on next pod start populates tier for the whole corpus
without any file moves; M3 will follow up with the actual layout
migration and explicit frontmatter writes.
2026-05-25 12:35:38 +02:00
Mathias
ea9518e712 feat(graphstore): M1 add tier + topic columns to brain_entities (infra#72)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 15s
CI / Mirror to GitHub (push) Successful in 3s
Schema-only change. DDL adds tier + topic on fresh tables and uses
ADD COLUMN IF NOT EXISTS on existing tables (idempotent across pod
restarts). New conditional indexes match the wing/hall pattern.

No behavior change in this commit — UpsertEntity still writes only
the original columns; tier + topic stay '' on every row. M2 plumbs
the parser through. The empty default means existing queries are
untouched until the rest of the chain lands.

Part of infra#72 — brain DIKW tier redesign.
2026-05-25 07:17:39 +02:00
Mathias
e34cd6c12b docs(eval): record post-fix scorer run — phase 1 lift insufficient
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
Top-1 stayed at 20% (4/20), top-3 +5pt (65→70%) after:
- extract.go wing/topic parser fix (commit 3084c41)
- qwen35-9b-fast entity pad (was 239-byte stub → full entity)
- grafana entry: add "pod restart" synonym to lesson body
- dangling refs stripped from index.md + entities/k3s.md

The only retrieval move: qwen35-9b-fast climbed from rank 0 (off top-5)
to rank 2 — the entity pad worked. Other 5 misses are ranker behaviour
on already-keyword-overlapping entries; BM25 doesn't weight the right
slugs to the top.

Per the proposal's gate (≥10pt lift = stop, <10pt = Phase 2 justified),
the DIKW tier redesign earns its cost. Next session: tier column +
file moves + tier-weighted retrieval, then re-measure against this
same eval set.
2026-05-24 22:48:48 +02:00
Mathias
3084c4173d fix(graph): route wiki/<flat>.md to Type=knowledge, not Type=hall with filename-as-wing
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
classifyByPath had a hole: paths like wiki/index.md or wiki/<slug>.md
(direct children of wiki/, no subdirectory) hit the default branch and
wrote Wing=parts[1] — which IS the filename, not a wing. Symptom in
brain_entities: rows like (slug=index, wing=index.md) and
(slug=autobe-..., wing=autobe-evaluation-pattern-....md).

Fix: when len(parts) < 3 (no subdirectory at all), fall through to
Type=knowledge and let frontmatter set wing/hall if present.
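
A simplified sketch of the guard, assuming a two-field `Entity` and covering only the two branches discussed here; the real `classifyByPath` has more cases, and `classifyWikiPath` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// Entity is a reduced stand-in carrying only the fields relevant to the fix.
type Entity struct {
	Type string
	Wing string
}

// classifyWikiPath treats direct children of wiki/ as knowledge entries
// instead of reading the filename as a wing.
func classifyWikiPath(relPath string) Entity {
	parts := strings.Split(relPath, "/")
	if len(parts) < 3 {
		// wiki/<slug>.md: parts[1] is the filename, so leave Wing empty
		// and let frontmatter fill it in later.
		return Entity{Type: "knowledge"}
	}
	return Entity{Type: "hall", Wing: parts[1]}
}

func main() {
	fmt.Printf("%+v\n", classifyWikiPath("wiki/index.md"))
	fmt.Printf("%+v\n", classifyWikiPath("wiki/homelab/decisions/x.md"))
}
```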

Add brain/eval/ artifacts at the same time:
- qa-2026-05.md — 20 hand-authored Q→expected-slug pairs covering the
  homelab knowledge corpus across mcp, dex, gitops, postgres, go,
  models, methodology
- score.py — calls brain_query for each pair, scores top-1 + top-3,
  emits per-question detail. BRAIN_MCP_TOKEN via env.

Pre-fix baseline against the live brain: top-1 = 20% (4/20),
top-3 = 65% (13/20). Six hard misses where the expected slug doesn't
even land in the top-5.

These artifacts gate the phase 2 DIKW redesign (infra#62 follow-up): if
the phase 1 fixes (this parser fix + authoring 20 backlinks on the top
orphans) lift top-1 by <10 absolute points, structure is the
bottleneck and the tier redesign is justified.
2026-05-24 22:33:04 +02:00
Mathias
72be87b4e7 chore(routing): flip LITELLM_BASE_URL default to https://llm-api.d-ma.be
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 3s
Follow-up to infra#70. LiteLLM moved off piguard into k3s and the
public llm-api.d-ma.be hostname now upstreams to koala:30401. The
piguard:4000 default in the source bit-rots — works today because
piguard:4000 is still alive during the 7-day soak, breaks the moment
the compose comes down.

Pointing the default at the public hostname survives the cutover
without needing a follow-up. Production deploys via k3s already
override via env (in-cluster Service DNS) so this only affects local
dev shells without LITELLM_BASE_URL set.

- internal/config/routing.go: comment + envOr fallback
- internal/config/routing_test.go: expected value in defaults test
- scripts/smoke-routing.sh: shell default

task check: clean (tests + vet + govulncheck).
2026-05-24 15:06:23 +02:00
Mathias
153ef6ccac feat(graph): GraphRAG augment brain_answer with top-hit subgraph
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 3s
Commit 4 of Track A — the no-shelfware close-out the grill demanded.
brain_answer now folds the 1-hop outgoing neighbourhood of its top
BM25/rerank hit into the LLM's context as a <related> block when
BRAIN_GRAPH_ENABLED is on. With the flag off the prompt is byte-for-
byte identical to the pre-Track-A behaviour, so existing tests still
pass without modification.

The hop list contains slug, edge_type, doc_path — no extra retrieval
pass, no second LLM call, no file reads. The model can ignore the
block when irrelevant; when it adds signal we get GraphRAG for free.

Refs: docs/superpowers/specs/2026-05-homelab-training-graph-next-step.md
in infra repo + grill addendum item "Track A: GraphRAG wiring into
brain_answer is mandatory in same commit chain (no shelfware risk)".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 15:24:45 +02:00
Mathias
2148565ee6 feat(mcp): expose brain_graph tool — neighbors, subgraph, path
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
Commit 3 of Track A. The MCP server now publishes a new tool that
opens the brain knowledge graph (entities + wikilink edges) for
external consumers (claude.ai connectors, gitea-mcp, agentsquad).

- tools_graph.go: brain_graph handler dispatches by op:
    neighbors  — 1-hop outgoing from slug, optional edge_type filter
    subgraph   — every reachable slug within depth hops (≤6)
    path       — shortest directed path src→dst within depth (≤8)
  Returns slug + entity metadata + edge_type + hop distance.

- server.go: handleCall routes "brain_graph" to brainGraph.

- handlers.go: tool descriptor with the op enum + per-op required
  fields documented in the description.

- server_test.go: TestServerToolsList expects brain_graph in the
  listing.

The tool returns an error when BRAIN_GRAPH_ENABLED is unset — same
shape as brain_answer when the answer LLM is unconfigured.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 15:23:18 +02:00
Mathias
f43e0bccbf feat(graph): wire graphsync into MCP write/ingest/tunnel handlers
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 4s
Commit 2 of Track A. Service stays a no-op until BRAIN_GRAPH_ENABLED=
true; flipping it on creates the schema (idempotent), starts indexing
every successful write, and optionally backfills the existing brain
dir.

- internal/graphsync: best-effort wrapper around graph.Extract +
  graphstore. IndexDoc reads docPath under brainDir, parses, upserts
  entity + replaces edges. BackfillFromBrainDir walks wiki/ +
  knowledge/. Both are no-ops on nil store so callers wire
  unconditionally.

- mcp.Server gains WithGraph builder + graphsync.Store field.
  brain_write, brain_ingest, brain_ingest_raw, brain_tunnel call
  indexInGraph after success — failures slog.Warn but never
  propagate (graph is augmentation, not correctness).

- cmd/server gates the wiring on BRAIN_GRAPH_ENABLED=true (default
  off so first rollout doesn't surprise). BRAIN_GRAPH_BACKFILL=true
  triggers a one-shot walk of the brain dir on boot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 15:21:33 +02:00
Mathias
f53ee18cb6 feat(graph): add brain_entities + brain_edges store and wikilink parser
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 3s
Foundation for Track A (GraphRAG on top of existing wiki). Two new
packages, both unwired — service behaviour unchanged until commit 2
hooks the pipeline.

- internal/graph: pure parser. Extract() walks markdown + frontmatter
  and emits one Entity + N wikilink Edges per doc. Dedupes per (dst,
  line), ignores self-references, classifies hall/concept/entity/
  source/knowledge from path layout.

- internal/graphstore: pgx-backed PGStore mirroring vectorstore's
  shape. Idempotent Init() creates brain_entities + brain_edges with
  indexes on src_slug, dst_slug, src_doc, wing, type. Operations:
  UpsertEntity, ReplaceEdgesForDoc (tx), DeleteByDoc, Neighbors,
  Subgraph (recursive CTE, depth ≤6), Path (shortest path, depth ≤8).

Schema lives on the shared postgres18 instance alongside the
brain_embeddings table — no new datastore. See
docs/superpowers/specs/2026-05-homelab-training-graph-next-step.md
in infra repo + infra#62.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 15:18:08 +02:00
Mathias
c153e9105c ci: retrigger build after chassis repo made public
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 3s
Same reason as gitea-mcp ci retrigger commit — mcp-chassis was created
private; the ingestion port (commit ca22df2) couldn't fetch it in CI.
Chassis is now public; this empty commit retriggers the Build and deploy
pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 12:17:55 +02:00
Mathias
ce96a6a571 fix(ci): allow ingestion Dockerfile to fetch internal gitea modules
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
Same fix as gitea-mcp commit for the same reason — mcp-chassis (added
in commit ca22df2) is hosted at gitea.d-ma.be and Gitea returns http://
in its go-import meta tag, breaking the default go module resolution
inside the Docker build.

GOPRIVATE+GOPROXY=direct+GOSUMDB=off plus a git config insteadOf rewrite
to flip http:// → https:// for gitea.d-ma.be clones.

Without this, hyperguild CI Build and deploy failed on the chassis
port (sha=ca22df2). Re-running CI should now succeed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 12:12:09 +02:00
Mathias
ca22df2d6a feat(ingestion): migrate to gitea.d-ma.be/mathias/mcp-chassis v0.1.0
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 3s
Second port of the MCP chassis (gitea-mcp was first, commit 658f4ba).
Closes the chassis-adoption loop on the two highest-LOC consumers.

Changes:
- Drop ingestion/internal/auth/ entirely (jwt.go + jwt_test.go +
  protected_resource.go + protected_resource_test.go) — chassis provides
  JWTValidator + ProtectedResourceHandler with identical semantics.
- Drop ingestion/internal/mcp/auth.go (BearerAuth function, ~65 LOC)
  and the integration test auth_test.go (~200 LOC) — chassis
  BearerMiddleware replaces it. Static-Bearer-or-Dex-JWT precedence and
  RFC 9728 resource_metadata challenge behavior preserved 1:1.
- cmd/server/main.go: import chassis as `chassisauth`, rewire the three
  call sites. Use realm="brain" in the BearerMiddleware call so a 401
  challenge identifies the resource as the brain MCP.

OAuth client_credentials handler (ingestion/internal/oauth) stays —
chassis v0.1.0 covers only the JWT path; OAuth flow is a candidate for
chassis v0.2.0 once a second MCP needs it (rule of three).

Net delta: -~330 LOC of duplicated auth code; +1 import; +1 GOPRIVATE
env requirement on dev machines (documented in the spike handoff
2026-05-22-mcp-chassis-spike.md).

task check green (lint + test + vet + govulncheck).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 10:43:11 +02:00
Mathias
e49b36e463 feat(ingestion): expose Prometheus /metrics for brain query latency
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 3s
Closes infra#50.

Adds an internal/metrics package with a hand-rolled Prometheus
exposition layer (stdlib + sync/atomic only — no new dep) and wraps the
HTTP mux with a timing middleware. Every request emits one observation
on the `brain_query_duration_seconds` histogram labeled by
`path` (request Pattern, low cardinality) and `status` (2xx/3xx/4xx/5xx).

Dependency choice: hand-rolled rather than github.com/prometheus/client_golang
because the surface needed is small (one histogram + bucket constants)
and the repo CLAUDE.md keeps deps stdlib + jwx + testify only. ~150 LOC
of code + tests is cheaper than the chart of transitive prometheus deps.
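
To illustrate how small such a layer can be, here is a sketch of a fixed-bucket histogram built on `sync/atomic`. It is not the repository's implementation: labels, the sum series, and the text exposition are omitted, and all names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync/atomic"
)

// histogram is a fixed-bucket counter set safe for concurrent Observe calls.
type histogram struct {
	bounds []float64       // ascending upper bounds (the "le" values)
	counts []atomic.Uint64 // one slot per bound plus a final +Inf slot
}

func newHistogram(bounds ...float64) *histogram {
	return &histogram{bounds: bounds, counts: make([]atomic.Uint64, len(bounds)+1)}
}

// Observe increments the first bucket whose upper bound is >= v;
// values above every bound land in the +Inf slot.
func (h *histogram) Observe(v float64) {
	h.counts[sort.SearchFloat64s(h.bounds, v)].Add(1)
}

// Cumulative returns running totals per bucket, the shape Prometheus
// expects for *_bucket series (the last entry is le="+Inf").
func (h *histogram) Cumulative() []uint64 {
	out := make([]uint64, len(h.counts))
	var total uint64
	for i := range h.counts {
		total += h.counts[i].Load()
		out[i] = total
	}
	return out
}

func main() {
	h := newHistogram(0.1, 1)
	for _, v := range []float64{0.05, 0.5, 5} {
		h.Observe(v)
	}
	fmt.Println(h.Cumulative())
}
```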

Endpoints:
- GET /metrics  — OpenMetrics text exposition, no auth (cluster-internal)

Wire format pinned by tests in internal/metrics/metrics_test.go. The
ServiceMonitor that drives the kube-prometheus-stack scrape lives in
infra/k3s/apps/supervisor/ (separate commit on mathias/infra).

After this image deploys, the canary alert from
docs/superpowers/specs/2026-05-homelab-architecture-review.md becomes
wireable:

  histogram_quantile(0.95,
    sum(rate(brain_query_duration_seconds_bucket[5m])) by (le))
    > 1.5 * histogram_quantile(0.95,
        sum(rate(brain_query_duration_seconds_bucket[5m] offset 7d)) by (le))

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 07:13:05 +02:00
Mathias
815739758e feat(vectorstore): re-embed on file mtime > store updated_at (#23)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Has been skipped
Removes the TODO in Sync that left files static after their first embed.
Edits to brain/wiki/ and brain/knowledge/ now surface in subsequent
syncs without manual /backfill-embeddings calls.

Approach
- Store interface: KnownPaths → KnownPathsWithTime returning path →
  updated_at. Callers compare against file mtime to detect edits.
- PGStore: SELECT path, updated_at FROM brain_embeddings.
- Sync groups known chunks by parent path and tracks the EARLIEST
  updated_at per parent. A file is stale when its mtime is after that
  oldest chunk's timestamp — any chunk older than the file means at
  least one chunk hasn't been refreshed since the last edit.
- Stale-path rewrite: delete every old chunk for the parent (handles
  "file shrunk → fewer chunks → orphan rows at higher #NNNN" cleanly),
  then re-chunk + re-embed + re-upsert.
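
The staleness rule can be sketched as below, assuming chunk paths of the form `<parent>#NNNN`. The `Known` type, the `unix` helper, and the function names are illustrative and not the package's actual API; the suffix stripping is deliberately simplified.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Known maps a stored chunk path to its updated_at timestamp.
type Known map[string]time.Time

// unix is a small helper for building timestamps in examples.
func unix(sec int64) time.Time { return time.Unix(sec, 0) }

// parentOf strips everything from the last '#' onward (simplified).
func parentOf(chunkPath string) string {
	if i := strings.LastIndexByte(chunkPath, '#'); i >= 0 {
		return chunkPath[:i]
	}
	return chunkPath
}

// earliestByParent groups chunks by parent file and keeps the oldest
// updated_at seen for each parent.
func earliestByParent(known Known) Known {
	out := make(Known)
	for path, ts := range known {
		parent := parentOf(path)
		if cur, ok := out[parent]; !ok || ts.Before(cur) {
			out[parent] = ts
		}
	}
	return out
}

// isStale reports whether the file changed after its oldest chunk was written.
func isStale(mtime, earliest time.Time) bool { return mtime.After(earliest) }

func main() {
	known := Known{"wiki/a.md#0001": unix(200), "wiki/a.md#0002": unix(100)}
	oldest := earliestByParent(known)["wiki/a.md"]
	fmt.Println(isStale(unix(150), oldest), isStale(unix(50), oldest))
}
```

Comparing against the earliest chunk rather than the latest means one refreshed chunk cannot mask siblings that are still older than the file.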

Tests
- New: TestSync_ReembedsFileWhenMtimeNewer — file mtime forced into the
  future vs store updated_at; Sync deletes old chunk + upserts fresh one.
- New: TestSync_SkipsFileWhenMtimeOlder — file mtime backdated; Sync is
  a no-op (no upserts, no deletes).
- Updated: stubStore.known is now map[string]time.Time. A zero value
  resolves to a far-future sentinel so existing "skip if already known"
  tests keep passing without per-test setup.
- pg_test renamed KnownPaths integration → KnownPathsWithTime; asserts
  updated_at is non-zero and within 5s of insert wall-clock.

Backward compat
- brain_embeddings rows pre-dating this change carry valid updated_at
  values (column was always populated via `DEFAULT now()` + ON CONFLICT
  `updated_at = now()`). No migration needed. Live pod will start
  re-embedding any file whose source has been edited since its chunks
  were originally written.

Closes gitea/mathias/hyperguild#23.
2026-05-20 09:50:45 +02:00
Mathias
6f1cb53295 feat(project_create): mirror_to_github opt-in, default false (infra#34 ADR)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Has been skipped
Per the Gitea-as-true-master ADR (infra#34), GitHub mirror is now an
explicit opt-in via mirror_to_github=true. Default (omit / false) provisions
a Gitea repo + staging namespace + experiment-brief issue only — no GitHub
repo, no push-mirror.

Rationale: US cloud providers (Microsoft/GitHub) are subject to CLOUD Act
and NSL. Client code, business logic, and infra-adjacent repos should
never live on US-owned infrastructure. Only open-source projects intended
for public community (hyperguild, gitea-mcp, template-*) should opt in.

Changes
- internal/skills/project/handlers.go
  - createArgs gains MirrorToGitHub bool (json:"mirror_to_github,omitempty").
  - res.GitHubURL is set only when MirrorToGitHub is true; empty string otherwise.
  - Steps 2 (create_github_repo) + 3 (mirror) are wrapped in `if args.MirrorToGitHub`.
  - experimentBrief renders "Gitea-only" line by default and the existing
    "Push-mirror configured" line only on opt-in.
- internal/skills/project/skill.go
  - Tool schema gains mirror_to_github (boolean, default false) with description
    spelling out when to opt in. Tool Description updated to reflect new default.
- internal/skills/project/handlers_test.go
  - Added mirroredArgs() helper (happyArgs + mirror_to_github:true).
  - Tests that exercise the GitHub flow (HappyPath, GitHubExists_Idempotent,
    GitHubFails, NoGitHubClient_DegradedMode, Idempotent_RepoExists,
    MirrorFails, InfraCommitFails) switched to mirroredArgs.
  - Added TestProjectCreate_DefaultSkipsGitHubMirror covering the Gitea-only
    path: 3 gitea-mcp calls, zero GitHub calls, empty github_url, reached=
    [create_repo, infra_commit, issue], body reflects Gitea-only.

Closes gitea/mathias/hyperguild#17. Moves infra#34 acceptance item
"project_create updated: mirror_to_github defaults to false".
2026-05-20 08:35:02 +02:00
Mathias
37fdd33b2d feat(ingestion): chunk markdown before embedding (#38)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Has been skipped
Long markdown files (>~8KB) silently failed to embed because nomic-embed-text
on iguana has a 2048-token context. embed sync logged errors=1 every cycle
with no useful body until #37 added per-item logging — three files exceed
the ceiling: finbert source (8 KB), koala-machine-state (7.1 KB),
litellm-absorption (8.8 KB). Curated knowledge entries should never be
vector-blind.

Approach: chunk-before-embed, no schema change.

vectorstore/chunk.go (new)
- ChunkMarkdown splits at H1/H2 boundaries; sections over maxBytes are
  further split at paragraph boundaries, packing greedily under budget.
- NumberChunks assigns "<parent>#NNNN" storage paths (1-based, zero-padded
  to 4 digits — handles files with up to ~10k sections in stable sort order).
- ParentPath strips the chunk suffix for retrieval-side dedup.
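
The path scheme above can be sketched as follows. These lowercase helpers are hypothetical stand-ins for `NumberChunks` and `ParentPath`, whose real signatures are not shown in the message.

```go
package main

import (
	"fmt"
	"strings"
)

// numberChunks returns "<parent>#NNNN" storage paths for n chunks,
// 1-based and zero-padded to 4 digits.
func numberChunks(parent string, n int) []string {
	paths := make([]string, 0, n)
	for i := 1; i <= n; i++ {
		paths = append(paths, fmt.Sprintf("%s#%04d", parent, i))
	}
	return paths
}

// parentPath strips a trailing "#NNNN" suffix; bare paths and paths
// whose suffix is not exactly four digits are returned unchanged.
func parentPath(p string) string {
	i := strings.LastIndexByte(p, '#')
	if i < 0 || len(p)-i-1 != 4 {
		return p
	}
	for _, c := range p[i+1:] {
		if c < '0' || c > '9' {
			return p
		}
	}
	return p[:i]
}

func main() {
	fmt.Println(numberChunks("wiki/foo.md", 3))
	fmt.Println(parentPath("wiki/foo.md#0002"), parentPath("wiki/foo.md"))
}
```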

vectorstore/sync.go
- After ChunkMarkdown produces N pieces, each is embedded + upserted as a
  separate brain_embeddings row at "<parent>#NNNN". maxChunkBytes = 4000
  (≈1000 nomic tokens, well under the 2048 ceiling with headroom for
  unicode/code blocks).
- "Already embedded?" check now reduces known paths to parent set via
  ParentPath, so the first chunk hit short-circuits the file.
- Delete walk also reduces via ParentPath; when a parent file disappears,
  every chunk row (and any pre-existing bare-path row, for backward
  compatibility with rows written before this change) gets dropped.

search/search.go
- hybridMerge collapses chunk-path vector hits to parent via ParentPath
  before scope check, RRF accumulation, and hydration. A file with three
  chunk hits returns one result row, not three.

Backward compatibility: pre-existing bare-path rows in brain_embeddings
keep working — ParentPath returns them unchanged, knownParents handles
them as if they were "wiki/foo.md#NNNN" hits, sync skips re-embed, and
search dedup is a no-op for them. No migration required to ship.

Tests:
- chunk_test.go covers short / heading split / oversized section /
  content preservation / chunk numbering / parent-path stripping.
- sync_test.go adds long-file chunking, single-chunk-row short file,
  skip-if-any-chunk-known, delete-all-chunks-of-disappeared-file.
  Existing tests updated for #NNNN paths.
- search_test.go adds chunk-paths-dedupe-to-parent.

Closes gitea/mathias/infra#38.
2026-05-19 21:57:09 +02:00
Mathias
078ec029da fix(ingestion): embed sync also scans brain/knowledge/ + logs per-item errors
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Has been skipped
The embed sync goroutine only walked brain/wiki/. brain/knowledge/ (112
curated entries, per CLAUDE.md the most-important brain content) had zero
coverage in brain_embeddings — vector retrieval was blind to it. Hybrid
BM25 + pgvector retrieval would never surface a curated knowledge entry
via the vector arm.

Extract the per-root walk into a loop over a small subdir list and add
"knowledge" alongside "wiki". scanDirs is package-level so it stays a
single source of truth for what gets embedded.

Also log each failing item's path + error string from StartSync.
Previously only the aggregate count was logged, so a persistent
`errors=1` per cycle was opaque. With per-item warnings, the actual
ollama "input length exceeds the context length" error surfaces immediately.

Refs gitea/mathias/infra#37 (this commit covers the knowledge/ scan
bug; the long-file chunking bug is a separate change.)
2026-05-19 21:27:15 +02:00
Mathias
4af1036423 fix(ingestion): redact password from BRAIN_PG_DSN log line
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
The previous "crude redaction" — pgDSN[:strings.IndexByte(pgDSN+"@", '@')] —
sliced up to the `@` character, which sits *after* the password in a
postgres URL, so the log line included the password in plaintext (caught
on first activation, 2026-05-18 startup log).

Use url.Parse + URL.Redacted() instead. Falls back to "postgres://***"
if parsing fails — we never log a raw DSN.
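
A minimal sketch of the replacement, with a hypothetical helper name `redactDSN` and a made-up DSN. For contrast, the old slice-up-to-`@` expression would have kept everything before the `@`, password included.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactDSN returns the DSN with its password masked, or a fixed
// placeholder when the DSN cannot be parsed, so a raw DSN is never logged.
func redactDSN(dsn string) string {
	u, err := url.Parse(dsn)
	if err != nil {
		return "postgres://***"
	}
	return u.Redacted() // replaces any password with "xxxxx"
}

func main() {
	fmt.Println(redactDSN("postgres://brain_app:s3cret@db.example:5432/brain"))
	fmt.Println(redactDSN("://not-a-url"))
}
```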
2026-05-19 13:04:12 +02:00
Mathias
7a13c75655 fix(scripts): brain-embeddings-init.sql psql-level conditionals
All checks were successful
CI / Lint / Test / Vet (push) Successful in 24s
CI / Mirror to GitHub (push) Successful in 3s
CREATE DATABASE doesn't work inside a DO $$ ... $$ block (transactional
restriction). And psql `:'var'` substitutions resolve client-side, so
they can't reach inside a DO block either.

Replace both DO blocks with psql-native idioms:
- `\gexec` for the conditional CREATE DATABASE
- `\if` + `\gset` for the create-or-rotate-password branch on the
  brain_app role

Verified end-to-end on koala postgres18: brain DB created, vector
0.8.1 extension installed, brain_app role login works.
2026-05-18 23:28:56 +02:00
Mathias
57462b52ff feat(brain): hybrid BM25 + pgvector retrieval (opt-in)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 15s
CI / Mirror to GitHub (push) Successful in 3s
Wires nomic-embed-text (iguana ollama) + pgvector on the shared
postgres18 into brain_query / brain_answer via Reciprocal Rank Fusion.
Pure BM25 stays the default; setting BRAIN_PG_DSN and BRAIN_EMBED_URL
together opts in. Setting one without the other is misconfiguration →
exit 1.

New packages:

- internal/embed
  Client.Embed(ctx, text) → []float32 via POST {URL}/api/embed.
  Defaults to nomic-embed-text:latest (768 dim). nil-on-empty-URL so
  callers gate on a single nil check.

- internal/vectorstore
  PGStore wraps a pgxpool against postgres18. Init creates
  brain_embeddings(path PK, vector(768), updated_at) + HNSW cosine
  index idempotently. Upsert / Delete / Search / KnownPaths.
  Sync(brainDir, store, embedder) diffs brain/wiki/ against the store
  and upserts new files / deletes removed ones; StartSync runs it on
  a ticker (default 300s). Integration tests gated by BRAIN_PG_TEST_DSN.

- scripts/brain-embeddings-init.sql
  One-time DBA setup: brain DB, brain_app role, vector extension,
  GRANTs. Idempotent.

Search layer:

- search.QueryOptions gains Vector + Embedder fields.
- QueryContext is the cancellable variant; Query stays for existing callers.
- When both are set, BM25 (top-N) and pgvector (top-4N) candidates
  merge via Reciprocal Rank Fusion (k=60, Cormack et al. 2009 — no
  tuning knob, robust to scale differences between rankers).
- Vector-only hits are hydrated from disk so callers see uniform
  Result records (path, title, excerpt, wing, hall, score).
- Wing/hall filters still apply to vector candidates via path-prefix.
- On embedder/vector errors the search falls back to BM25 — embedding
  outage degrades quality but doesn't take the brain offline.
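
The fusion step above can be sketched as follows. This is an illustration of Reciprocal Rank Fusion with k=60, not the code in the search package: the function name `rrf`, the use of plain path strings, and the alphabetical tie-break are all assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// rrf merges ranked lists of document paths with Reciprocal Rank Fusion:
// score(d) = sum over lists of 1 / (k + rank(d)), using 1-based ranks.
func rrf(k float64, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for i, path := range list {
			scores[path] += 1.0 / (k + float64(i+1))
		}
	}
	out := make([]string, 0, len(scores))
	for p := range scores {
		out = append(out, p)
	}
	sort.Slice(out, func(i, j int) bool {
		if scores[out[i]] != scores[out[j]] {
			return scores[out[i]] > scores[out[j]]
		}
		return out[i] < out[j] // deterministic tie-break (an assumption)
	})
	return out
}

func main() {
	bm25 := []string{"a.md", "b.md", "c.md"}
	vector := []string{"d.md", "b.md", "a.md"}
	fmt.Println(rrf(60, bm25, vector))
}
```

Because only ranks enter the score, the BM25 and cosine scales never need to be normalised against each other; a vector-only hit such as `d.md` can still outrank a BM25-only hit.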

MCP wiring:

- mcp.Server.WithHybridRetrieval(v, e) opt-in setter, same shape as
  WithReranker.
- brainQuery and brainAnswer pass the wired vector/embedder through
  to search.QueryContext.

REST:

- POST /backfill-embeddings drives Sync synchronously. Returns
  {added, deleted, errors[]}. 503 when feature is unconfigured.

cmd/server/main.go:

- BRAIN_PG_DSN + BRAIN_EMBED_URL together enable hybrid; one alone
  → exit 1.
- vectorAdapter bridges *PGStore (returns []Hit) to
  search.VectorSearcher (which takes []VectorHit) without either
  package importing the other.
- BRAIN_EMBED_SYNC_INTERVAL (default 300s) controls the background
  Sync ticker.
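
The both-or-neither rule for the two env vars might look like this in outline. The function name `hybridEnabled` and the error text are hypothetical; the env var names and the exit code come from the message above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// hybridEnabled reports whether hybrid retrieval should be switched on.
// Setting exactly one of the two values is treated as misconfiguration.
func hybridEnabled(dsn, embedURL string) (bool, error) {
	switch {
	case dsn == "" && embedURL == "":
		return false, nil // pure BM25, the default
	case dsn != "" && embedURL != "":
		return true, nil
	default:
		return false, errors.New("BRAIN_PG_DSN and BRAIN_EMBED_URL must be set together")
	}
}

func main() {
	enabled, err := hybridEnabled(os.Getenv("BRAIN_PG_DSN"), os.Getenv("BRAIN_EMBED_URL"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "config:", err)
		os.Exit(1)
	}
	fmt.Println("hybrid retrieval enabled:", enabled)
}
```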

Backend pivot from Qdrant to pgvector recorded in DECISIONS.md
2026-05-18 (supersedes 2026-04-08): postgres18 already runs in
databases/ ns, Qdrant was never deployed, one engine beats two.

Dependency: github.com/jackc/pgx/v5 — modern, native pgvector via
parametric vector literals.

Tests:
- embed.Client: empty-URL nil, request shape, dimension, upstream
  error propagation, empty-text rejection.
- vectorstore.PGStore: dimension validation (unit); upsert/search/
  KnownPaths (integration, BRAIN_PG_TEST_DSN-gated).
- vectorstore.Sync: adds new files, skips known, deletes
  disappeared, skips _index.md, no-op when nil, collects embedder
  errors.
- search.Query: hybrid promotes vector-only hits via RRF; falls
  back to BM25 on embedder error.

Closes hyperguild#8.
2026-05-18 23:11:25 +02:00
Mathias
a56a4db963 feat(brain_answer): Qwen3-Reranker cross-encoder filter (opt-in)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
Adds an opt-in cross-encoder rerank step between BM25 retrieval and LLM
synthesis. With BRAIN_RERANKER_URL set, brain_answer retrieves BM25
top-20, scores each excerpt against the query via Qwen3-Reranker on
Ollama, drops the "no" answers, and forwards up to 5 surviving sources
to the LLM. Unset, behaviour is unchanged (BM25 top-10 → LLM).

The reranker is a *filter*, not a re-ranker: Qwen3-Reranker emits a
binary yes/no token under its native chat template, and ties within the
"yes" set are broken by BM25 rank — what got retrieved first stays
ahead.
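
A rough sketch of the filter semantics described above, assuming the yes/no verdicts have already been collected per document. `filterYes` and its parameters are hypothetical names, not the reranker package's API.

```go
package main

import "fmt"

// filterYes keeps the documents the reranker answered "yes" for, in their
// original BM25 order, and stops once limit documents have been kept.
func filterYes(docs []string, yes []bool, limit int) []string {
	var kept []string
	for i, d := range docs {
		if len(kept) >= limit {
			break
		}
		if i < len(yes) && yes[i] {
			kept = append(kept, d)
		}
	}
	return kept
}

func main() {
	docs := []string{"relevant.md", "noise.md", "also-relevant.md"}
	verdicts := []bool{true, false, true}
	fmt.Println(filterYes(docs, verdicts, 5))
}
```

Since the input slice is already in BM25 order and nothing is re-sorted, ties inside the "yes" set resolve to BM25 rank without any extra bookkeeping.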

New package ingestion/internal/reranker:
- Client with URL, Model, HTTP fields.
- New(url, model) returns nil on empty url so callers can treat
  "feature disabled" as a single nil check.
- Score(ctx, query, docs) issues one /api/generate call per doc using
  the Qwen3-Reranker yes/no chat template (verbatim, because the model
  was trained on this exact wording). Parses the first non-think token.

Wiring:
- mcp.Server gains a WithReranker fluent setter to keep NewServer
  signature stable.
- brain_answer's BM25 limit jumps to 20 only when a reranker is wired,
  to give the filter something to do.
- cmd/server/main.go reads BRAIN_RERANKER_URL (+ optional
  BRAIN_RERANKER_MODEL, default dengcao/Qwen3-Reranker-0.6B:F16).

Tests cover: nil-on-empty-url, ordered yes/no scoring, request shape
(model, prompt contents, yes/no template), ambiguous response → 0,
empty doc slice, upstream-error propagation, plus an end-to-end
brain_answer integration that proves only the relevant note reaches the
LLM when noise.md is rejected.

Closes hyperguild#7.
2026-05-18 22:55:46 +02:00
Mathias
58c57412a9 feat(brain-mcp): OAuth 2.0 client_credentials flow for claude.ai
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 3s
Adds a minimal RFC 8414 + RFC 6749 client_credentials flow so claude.ai's
custom-MCP integration (no static-Bearer field in the UI) can exchange a
client_id + client_secret pair for the existing BRAIN_MCP_TOKEN and use
it as a Bearer on /mcp. No JWTs, no refresh, no expiry — the rest of
the auth middleware is unchanged.

New package ingestion/internal/oauth:
- MetadataHandler(issuer): serves /.well-known/oauth-authorization-server
  with grant_types=[client_credentials] and both
  token_endpoint_auth_methods (post + basic).
- TokenHandler(cfg): serves /oauth/token. Validates client_id and
  client_secret via constant-time compare; returns BRAIN_MCP_TOKEN as
  access_token. RFC 6749 §5.2 error JSON on bad grant / bad creds.
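
The token endpoint described above could be sketched roughly like this. It is not the oauth package itself: `tokenConfig`, `credsOK`, `writeJSON` and the demo credentials are hypothetical, and the status codes are one reasonable reading of RFC 6749 §5.2.

```go
package main

import (
	"crypto/subtle"
	"encoding/json"
	"net/http"
)

type tokenConfig struct {
	ClientID, ClientSecret, AccessToken string
}

// credsOK compares both values without short-circuiting. Note that
// ConstantTimeCompare returns early when the lengths differ.
func credsOK(cfg tokenConfig, id, secret string) bool {
	idOK := subtle.ConstantTimeCompare([]byte(id), []byte(cfg.ClientID))
	secretOK := subtle.ConstantTimeCompare([]byte(secret), []byte(cfg.ClientSecret))
	return idOK&secretOK == 1
}

func writeJSON(w http.ResponseWriter, status int, body map[string]string) {
	w.Header().Set("Content-Type", "application/json")
	w.Header().Set("Cache-Control", "no-store")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(body)
}

func tokenHandler(cfg tokenConfig) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.Header().Set("Allow", http.MethodPost)
			writeJSON(w, http.StatusMethodNotAllowed, map[string]string{"error": "invalid_request"})
			return
		}
		if r.PostFormValue("grant_type") != "client_credentials" {
			writeJSON(w, http.StatusBadRequest, map[string]string{"error": "unsupported_grant_type"})
			return
		}
		id, secret, ok := r.BasicAuth() // client_secret_basic
		if !ok {                         // fall back to client_secret_post
			id, secret = r.PostFormValue("client_id"), r.PostFormValue("client_secret")
		}
		if !credsOK(cfg, id, secret) {
			writeJSON(w, http.StatusUnauthorized, map[string]string{"error": "invalid_client"})
			return
		}
		writeJSON(w, http.StatusOK, map[string]string{
			"access_token": cfg.AccessToken,
			"token_type":   "Bearer",
		})
	}
}

func main() {
	cfg := tokenConfig{ClientID: "demo", ClientSecret: "demo-secret", AccessToken: "demo-token"}
	http.Handle("/oauth/token", tokenHandler(cfg))
	// http.ListenAndServe is left out so the sketch exits immediately.
}
```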

Wiring in cmd/server/main.go: opt-in by setting both OAUTH_CLIENT_ID and
OAUTH_CLIENT_SECRET. Setting only one is misconfiguration → exit 1.
Mounts both endpoints with no auth; MCP_RESOURCE_URL supplies the
issuer.

Also pivots issue #8's vector backend from Qdrant to pgvector (see
DECISIONS.md 2026-05-18) — Qdrant was never deployed and postgres18 with
pgvector already runs as the project default; supersedes 2026-04-08 for
this use case.

Tests cover post-auth, basic-auth, wrong secret, bad grant, GET
rejection, malformed Basic header, and Basic without colon.

Closes hyperguild#5.
2026-05-18 22:21:54 +02:00
Mathias
ddd07ae7eb feat(brain): cross-wing tunnels — bidirectional wikilinks + auto-detect
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 3s
Adds the `brain_tunnel` MCP tool and auto-tunnel behaviour for
`brain_write`, so concepts that appear in multiple wings become
navigable from any of them.

New surface in package brain:
- WriteTunnel(brainDir, src, tgt) — appends a `## See also` bidirectional
  wikilink between two notes in different wings. Idempotent (link not
  duplicated on re-call) and reuses an existing See also section.
- DetectTunnels(brainDir, content) — walks brain/wiki/, returns
  TunnelCandidates for notes whose title appears in content. Tags
  whole-word case-insensitive hits as Exact=true and substring-only hits
  as Exact=false.
- AutoTunnel(brainDir, src, content) — wraps DetectTunnels: writes
  cross-wing exact matches, stages fuzzy matches into
  brain/raw/tunnel-candidates-<YYYY-MM-DD>.md for human review.
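
The exact-versus-fuzzy distinction above can be illustrated with a small matcher. This is a sketch under assumptions, not DetectTunnels itself: `matchTitle` is a hypothetical helper and the `\b`-based regexp is one way to express "whole-word, case-insensitive".

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchTitle reports whether title occurs in content, and whether the hit is
// a whole-word match (exact) or only a substring match (fuzzy).
func matchTitle(title, content string) (found, exact bool) {
	if title == "" {
		return false, false
	}
	wholeWord := regexp.MustCompile(`(?i)\b` + regexp.QuoteMeta(title) + `\b`)
	if wholeWord.MatchString(content) {
		return true, true
	}
	if strings.Contains(strings.ToLower(content), strings.ToLower(title)) {
		return true, false
	}
	return false, false
}

func main() {
	content := "The reranker filters BM25 candidates."
	for _, title := range []string{"Reranker", "rank", "pgvector"} {
		found, exact := matchTitle(title, content)
		fmt.Printf("%-10s found=%v exact=%v\n", title, found, exact)
	}
}
```

Under this sketch an exact hit would be linked directly, while a substring-only hit would go to the candidates file for review.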

MCP wiring:
- `brain_tunnel` tool: explicit manual link (source, target).
- `brain_write` with wing+hall now triggers AutoTunnel on the new
  content. Failures are logged and never abort the primary write.

readTitleAndCreated also humanises the slug fallback (hyphens → spaces)
so titleless notes participate in content matching.

Closes hyperguild#16.

Tests: idempotency, same-wing rejection, missing-note rejection,
See-also reuse, exact/fuzzy detection, slug fallback, MCP tool happy
path, auto-tunnel hook (cross-wing exact → linked; same-wing → skipped;
fuzzy → candidates file).
2026-05-18 21:32:49 +02:00
Mathias
61b6247df9 fix(brain-mcp): static Bearer short-circuits before OAuth challenge
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 3s
Reorders BearerAuth so a valid BRAIN_MCP_TOKEN match wins instantly and
never emits WWW-Authenticate. Adds RFC 9728 resource_metadata challenge
header on 401 (only when MCP_RESOURCE_URL is configured) so claude.ai's
OAuth-discovery path still works.

Why: claude CLI on koala/flamingo with `.mcp.json` `Authorization: Bearer
$BRAIN_MCP_TOKEN` was being kicked into RFC 7591 dynamic client
registration against Dex (static-only) and dying. Cause was the auth
middleware running JWT validation first and emitting an OAuth challenge
on the fall-through 401 even when the caller had a valid static token.
Inverting the precedence and gating the challenge on resourceMetadataURL
keeps the LAN/Tailscale CLI path silent and only invites OAuth discovery
on actually-unauthenticated requests.
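
The reordered precedence can be sketched as a pure decision function plus a thin middleware. The names `authorize` and `bearerAuth` and the example URL are hypothetical, and JWT validation is deliberately left out of the sketch.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"strings"
)

// authorize returns the HTTP status for a request and the WWW-Authenticate
// value to send with it ("" means no challenge header at all).
func authorize(authHeader, staticToken, resourceMetadataURL string) (int, string) {
	if tok, ok := strings.CutPrefix(authHeader, "Bearer "); ok && staticToken != "" &&
		subtle.ConstantTimeCompare([]byte(tok), []byte(staticToken)) == 1 {
		return http.StatusOK, "" // static token wins; no challenge emitted
	}
	// JWT validation would run here in the real middleware (omitted).
	if resourceMetadataURL != "" {
		return http.StatusUnauthorized, fmt.Sprintf("Bearer resource_metadata=%q", resourceMetadataURL)
	}
	return http.StatusUnauthorized, ""
}

func bearerAuth(staticToken, metadataURL string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		status, challenge := authorize(r.Header.Get("Authorization"), staticToken, metadataURL)
		if status != http.StatusOK {
			if challenge != "" {
				w.Header().Set("WWW-Authenticate", challenge)
			}
			http.Error(w, "unauthorized", status)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	meta := "https://brain.example/.well-known/oauth-protected-resource"
	fmt.Println(authorize("Bearer secret", "secret", meta))
	fmt.Println(authorize("", "secret", meta))
	fmt.Println(authorize("", "secret", ""))
}
```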

Regression guards in the test file:
- valid static Bearer 200 has no WWW-Authenticate
- 401 with resourceMetadataURL set carries the challenge
- 401 with empty resourceMetadataURL emits no challenge

Closes hyperguild#9 in code. Live verification (claude CLI on koala
listing brain tools) blocked on ingestion image rebuild + redeploy.
2026-05-18 21:00:05 +02:00
Mathias
75685e7b67 feat(brain): structured wing/hall taxonomy + obsidian-compatible layout
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Adds a two-dimensional address (wing, hall) to brain notes. A wing is a
topic domain (e.g. jepa-fx, hyperguild); a hall is one of a closed
vocabulary of memory types (facts, decisions, failures, hypotheses,
sources). Notes route to brain/wiki/<wing>/<hall>/<slug>.md with
wing/hall/created_at YAML frontmatter, making the directory a valid
Obsidian vault.
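
As an illustration of the addressing scheme, a path builder along these lines would enforce the closed hall vocabulary. `notePath` and `validHalls` are hypothetical stand-ins for the package's NotePath and ValidHalls, and the sketch assumes wing and slug have already been sanitised.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// validHalls is the closed vocabulary of memory types named above.
var validHalls = map[string]bool{
	"facts": true, "decisions": true, "failures": true,
	"hypotheses": true, "sources": true,
}

// notePath maps a (wing, hall, slug) address to its file under brainDir.
func notePath(brainDir, wing, hall, slug string) (string, error) {
	if wing == "" || slug == "" {
		return "", fmt.Errorf("wing and slug are required")
	}
	if !validHalls[hall] {
		return "", fmt.Errorf("invalid hall %q", hall)
	}
	return filepath.Join(brainDir, "wiki", wing, hall, slug+".md"), nil
}

func main() {
	fmt.Println(notePath("brain", "jepa-fx", "facts", "rrf-notes"))
	fmt.Println(notePath("brain", "jepa-fx", "rumours", "rrf-notes"))
}
```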

Changes:
- new package ingestion/internal/brain (NotePath, ValidHalls, Sanitise,
  BuildWingIndex, BuildAllWingIndexes)
- api.WriteNote refactored to WriteNoteOptions; wing+hall routes to
  brain/wiki/, otherwise falls back to brain/knowledge/ (legacy)
- search.Query → QueryOptions with optional Wing/Hall filtering; Result
  carries wing/hall extracted from frontmatter or path segments
- MCP tools brain_write and brain_query gain optional wing/hall params
  (hall enum-validated); new brain_index tool regenerates _index.md MOC
- POST /index REST endpoint mirrors brain_index
- brain_write auto-rebuilds the wing's _index.md after a wing+hall write
- scripts/migrate-brain-halls.sh migrates flat brain/wiki/{concepts,entities}/
  into the new layout (dry-run by default, --commit applies)

All existing tests pass; new tests cover wing/hall write routing, scope
filtering, invalid hall rejection, _index.md generation, and migration
script paths.

Closes hyperguild#1.
2026-05-18 20:47:08 +02:00
Mathias
fe18e4ee77 test(routing): de-flake TestRoutingPodEndToEnd
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
- Random port via net.Listen(":0") replaces hardcoded 33310 (was the
  primary failure mode under parallel test load).
- Bump waitForPort deadline 5s → 30s — `go build` under -race can exceed
  5s on a loaded machine.
- Replace osPath() (always returned an empty PATH because exec.Command("env").Env
  is the *child's* configured env, which is nil until set, not the parent's) with
  explicit PATH+HOME via os.Getenv. Don't inherit the full env: it would leak
  ROUTING_MCP_TOKEN from the parent shell and flip the routing pod into
  auth-required mode, breaking the test.
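
The random-port technique from the first bullet can be sketched as below. `freePort` is a hypothetical helper; binding to loopback instead of all interfaces is an assumption made for the example.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the kernel for an unused TCP port by binding to port 0,
// then releases the listener and returns the port number it was given.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	fmt.Println("picked port", port)
}
```

There is still a small window between closing the listener and the child process binding the port, so a test using this approach keeps a readiness wait such as waitForPort.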

Closes #15. Verified: 10 cold-cache test runs pass, 3 consecutive task check
runs pass.
2026-05-18 20:00:18 +02:00
Mathias
937355cabe fix(project_create): commit staging namespace directly to infra main
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 3s
Drops the intermediate `staging/<name>` branch so Flux begins reconciling the
namespace within ~60s of `project_create` instead of waiting on a human PR
merge. Consistent with project-wide trunk-based development.

Rationale: ADR 2026-05-18 in DECISIONS.md.

Closes hyperguild#14 (item 1). Item 2 (GITEA_MCP_TOKEN in SOPS) verified
already-present in infra@408a527 secrets.enc.yaml.

Note: TestRoutingPodEndToEnd was already failing on main before this commit
(context deadline exceeded while waiting for port 33310 within 5s). Not caused
by this change; project skill tests pass. To be tracked in a separate issue.
2026-05-18 17:20:53 +02:00
Mathias
5950ef5f0f feat(mcpclient): fail-fast on empty bearer token
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
mcpclient.New previously accepted an empty token and silently omitted
the Authorization header at request time. When the env var sourcing
the token was missing from a Kubernetes Secret (envFrom doesn't warn
on missing keys), this surfaced as an opaque 401 from the upstream
MCP server with no log trail — see hyperguild #13 and brain entry
"mcpclient-empty-token-silent-401-envfrom-missing-key".

mcpclient.New now returns ErrTokenRequired when token is empty.
The routing pod's project_create init checks the error and exits
with a clear message pointing at routing-secrets, turning a runtime
401 storm into a startup crashloop the operator can fix immediately.
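
In outline, the fail-fast constructor might look like the following. The `Client` fields, the error text and the example URL are hypothetical; only the `New` and `ErrTokenRequired` names come from the message above.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTokenRequired is returned when a client is built without a bearer token.
var ErrTokenRequired = errors.New("mcpclient: bearer token required")

type Client struct {
	url, token string
}

// New refuses to build a client that would silently send unauthenticated requests.
func New(url, token string) (*Client, error) {
	if token == "" {
		return nil, ErrTokenRequired
	}
	return &Client{url: url, token: token}, nil
}

func main() {
	if _, err := New("https://gitea-mcp.example/mcp", ""); errors.Is(err, ErrTokenRequired) {
		fmt.Println("refusing to start:", err)
	}
}
```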

Tests pass a dummy "test" token (httptest servers don't enforce
bearer auth, so any non-empty value works). Added a regression
test asserting empty-token construction returns ErrTokenRequired.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 16:28:09 +02:00
Mathias
a220fcaf2b feat(routing): create GitHub destination repo before configuring push-mirror
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Has been skipped
Gitea's push-mirror cannot push to a non-existent remote — it just
runs 'git push' against whatever URL it's given. So a project_create
flow that only configures the mirror leaves the GitHub side as an
unfulfillable URL.

New internal/githubclient package: single-purpose client that POSTs
/user/repos to create an empty private repo (auto_init=false so the
first mirror push doesn't conflict with a generated README). Treats
422 'name already exists' as idempotent success via ErrAlreadyExists.
401/403 are surfaced as 'PAT missing repo scope or invalid' so the
operator sees the real cause instead of a vague upstream error.
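
The status handling described above can be sketched as a mapping from HTTP status to a typed error. This simplifies the real client: it treats every 422 as "already exists" without inspecting the response body, and `classifyCreateStatus` and `createRepoStep` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrAlreadyExists signals that the destination repo is already there.
var ErrAlreadyExists = errors.New("github: repository already exists")

// classifyCreateStatus maps the status of a repo-creation response to an error.
func classifyCreateStatus(status int) error {
	switch status {
	case http.StatusCreated:
		return nil
	case http.StatusUnprocessableEntity:
		return ErrAlreadyExists // simplification: body is not inspected
	case http.StatusUnauthorized, http.StatusForbidden:
		return errors.New("github: PAT missing repo scope or invalid")
	default:
		return fmt.Errorf("github: unexpected status %d", status)
	}
}

// createRepoStep shows how a caller can treat "already exists" as success.
func createRepoStep(status int) error {
	if err := classifyCreateStatus(status); err != nil && !errors.Is(err, ErrAlreadyExists) {
		return err
	}
	return nil
}

func main() {
	for _, s := range []int{201, 422, 401, 500} {
		fmt.Println(s, "->", createRepoStep(s))
	}
}
```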

Skill wiring:
- New stepCreateGitHub between stepCreateRepo and stepMirror in the
  orchestrator.
- Skipped entirely when Config.GitHub is nil (degraded mode — the
  routing pod runs without GITHUB_PAT, mirror config still lands,
  but the actual sync to github fails until the repo exists).
- cmd/routing/main.go constructs githubclient.New(GitHubPAT) only
  when the PAT is set; the skill receives nil otherwise.

Tests:
- happy path: fake github 201 + assertions that the 'reached' array
  is [create_repo, create_github_repo, mirror, infra_commit, issue].
- github 422 already-exists: idempotent, all gitea steps still run.
- github 401: returns failed_step=create_github_repo, no mirror or
  later steps.
- degraded mode (Config.GitHub nil): reached omits create_github_repo,
  rest of the flow runs unchanged.

Updated existing tests to read [skill, gh] from newSkill instead of
just skill, and adjusted reached-array expectations to include the
new step.

Tracks #10.
2026-05-18 13:42:03 +02:00
Mathias
d1c8e3396f fix(cd): drop retired supervisor build, add routing rollout verification
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Plan 7 (2026-05-12) retired the supervisor pod, deleted cmd/supervisor/
and the root Dockerfile, but cd.yml still tried to:

- buildctl a supervisor image using the (non-existent) root Dockerfile
- sed gitea.d-ma.be/mathias/supervisor: in k3s/apps/supervisor/deployment.yaml
  (also non-existent — k3s/apps/supervisor/ only ships ingestion-* files now)
- wait for and rollout-verify a supervisor Deployment that no longer exists

Result: every CD run since the retirement has been failing at 'Build and push
supervisor image', leaving ingestion + routing un-deployed despite the binaries
being built. The routing pod was last deployed at sha 189ff89c (weeks stale).

This commit:
- Removes the supervisor build step and supervisor sed/git add lines.
- Adds 'Wait for Flux to apply new routing image' + 'Verify routing rollout'
  steps that mirror the ingestion equivalents, so failures land loudly rather
  than 5 min later when something tries to call the new tool.
- Updates the chore(deploy) commit message to 'ingestion+routing' to match
  reality.

Unblocks deployment of feat: project_create (#10).
2026-05-18 11:48:57 +02:00
Mathias
3b79311fdd feat(routing): project_create MCP tool — gitea-first new-project pipeline (#10)
All checks were successful
CI / Lint / Test / Vet (push) Successful in 12s
CI / Mirror to GitHub (push) Successful in 4s
Adds the project_create tool to the routing pod that automates the
"new project" bootstrap end-to-end from claude.ai. Gitea-first
architecture: GitHub receives the repo only via push-mirror, never
via a direct GitHub API call from this server.

Four sequential calls to the gitea-mcp server (configured via
GITEA_MCP_URL):

  1. create_project_from_template — Gitea repo from
     template-go-{agent,web} per the 'stack' arg
  2. repo_mirror_push (action=add) — push-mirror to
     github.com/<GITHUB_OWNER>/<name>.git, interval 8h, sync_on_commit
  3. file_write_branch — k3s/staging/<name>/namespace.yaml committed
     on a staging/<name> branch in the infra repo
  4. issue_create — experiment brief (hypothesis + description + stack
     + provisioning log) on the new repo, returns the issue_url

Returns gitea_url, github_url, issue_url, next_steps. The next_steps
string is the exact shell sequence the operator runs locally to
clone, scaffold via local-dev 'task new-project', and push.

Idempotency: create_project_from_template + repo_mirror_push +
file_write_branch all return JSON-RPC code -32003 (Conflict) when
their target already exists; the orchestrator swallows the conflict
and continues. Re-running on an existing repo restates the brief in
a fresh issue.

Error handling: on any non-conflict downstream failure the response
returns {reached: ["<step>",...], failed_step: "<step>"} alongside
a JSON-RPC error. No rollback — partial state stays so the operator
can resume manually.
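
The idempotency and partial-failure rules above might be combined in an orchestrator loop like this one. It is a sketch, not the skill's code: the `rpcError`, `step` and `result` types are hypothetical, and only the -32003 conflict code and the reached/failed_step shape come from the message.

```go
package main

import (
	"errors"
	"fmt"
)

const codeConflict = -32003 // JSON-RPC "Conflict" as used by gitea-mcp above

type rpcError struct {
	Code    int
	Message string
}

func (e *rpcError) Error() string { return fmt.Sprintf("rpc %d: %s", e.Code, e.Message) }

type step struct {
	name string
	run  func() error
}

type result struct {
	Reached    []string
	FailedStep string
}

// runSteps executes steps in order, swallows conflicts, and stops at the
// first real failure without rolling anything back.
func runSteps(steps []step) (result, error) {
	var res result
	for _, s := range steps {
		err := s.run()
		var rpcErr *rpcError
		if errors.As(err, &rpcErr) && rpcErr.Code == codeConflict {
			err = nil // target already exists: treat the step as done
		}
		if err != nil {
			res.FailedStep = s.name
			return res, err
		}
		res.Reached = append(res.Reached, s.name)
	}
	return res, nil
}

func main() {
	res, err := runSteps([]step{
		{"create_repo", func() error { return &rpcError{Code: codeConflict, Message: "exists"} }},
		{"mirror", func() error { return &rpcError{Code: -32000, Message: "upstream down"} }},
		{"infra_commit", func() error { return nil }},
	})
	fmt.Printf("reached=%v failed_step=%q err=%v\n", res.Reached, res.FailedStep, err)
}
```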

New env vars (all optional except GITEA_MCP_URL):
  GITEA_MCP_URL    enables the tool
  GITEA_MCP_TOKEN  bearer auth for gitea-mcp
  GITEA_OWNER      default mathias
  GITHUB_OWNER     default mathiasb
  INFRA_REPO       default infra
  GITHUB_PAT       repo scope, used as mirror remote_password; never logged

Without GITEA_MCP_URL set, the tool is not registered and the
routing pod starts normally (degrades open).

internal/mcpclient/: new minimal JSON-RPC tools/call client with
bearer auth, used by project_create. Unwraps MCP's
content[0].text envelope and surfaces typed errors via mcpclient.Error.

Tests: table-driven against an httptest fake gitea-mcp covering happy
path (4-step success + correct PATCH-style arg shapes), idempotent
repo-exists, mirror failure (partial-success response with reached=
[create_repo] + failed_step=mirror), infra-commit failure (reached up
to mirror + failed_step=infra_commit), and validation errors.

Closes #10
2026-05-18 11:44:39 +02:00
Mathias
7baf8d7e7a chore: re-sync context adapters from updated root AGENT.md
2026-05-18 11:44:02 +02:00
Mathias Bergqvist
a8de04c7b6 docs: update canonical PROJECT.md for completed 7-plan migration
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Updates MCP endpoints section: supervisor retired, brain gets HTTPS
domain + Dex JWT auth + brain_answer/brain_classify. Regenerates all
derived adapter files via context:sync.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:53:46 +02:00
Mathias Bergqvist
87cf9d0afc docs: update CLAUDE.md and DECISIONS.md for completed 7-plan migration
Some checks failed
CI / Mirror to GitHub (push) Has been cancelled
CI / Lint / Test / Vet (push) Has been cancelled
Reflects Plan 7 (supervisor retirement) and brain_answer/brain_classify
addition. Supervisor MCP endpoint removed; brain now exposes HTTPS domain
with Dex JWT auth. Routing decisions documented for LLM berget→iguana chain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 14:53:08 +02:00
Mathias Bergqvist
46adaf2148 chore(mcp): remove supervisor entry from .mcp.json
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Successful in 3s
2026-05-12 14:49:46 +02:00
Mathias Bergqvist
c11763472c feat(plan7): retire supervisor pod — delete cmd/supervisor, tdd/spec skills, Dockerfile
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
Removes the supervisor binary and its two exclusive skill packages (tdd,
spec) now that all functionality is covered by SKILL.md files, the routing
pod, and the brain MCP. The routing pod reuses the review/debug/retrospective/
trainer skill packages, which are intentionally preserved.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 12:18:30 +02:00
Mathias Bergqvist
189ff89c34 feat(brain): add brain_answer and brain_classify MCP tools
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 3s
Adds two new LLM-backed MCP tools to the ingestion service:

- brain_answer(query): BM25 retrieval + LLM synthesis → answer + sources
- brain_classify(text): classifies doc into type/title/tags via LLM

Adds llm.Router for primary→fallback routing (berget.ai → iguana).
Wired via BRAIN_LLM_PRIMARY_URL/BRAIN_LLM_FALLBACK_URL env vars;
no-op when unset so existing deployments are unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 11:06:17 +02:00
Mathias Bergqvist
c7e0192486 feat(auth): add Dex JWT middleware to supervisor, routing pod, and brain MCP
All checks were successful
CI / Lint / Test / Vet (push) Successful in 13s
CI / Mirror to GitHub (push) Successful in 3s
Closes #6 on gitea.d-ma.be/mathias/hyperguild.

Dex is deployed at auth.d-ma.be. All three MCP servers now accept JWTs
issued by Dex in addition to static bearer tokens, enabling claude.ai
OAuth 2.0 integration without abandoning backward-compat CLI auth.

Changes:
- internal/auth/: new Validator (JWKS auto-refresh via lestrrat-go/jwx/v2),
  ProtectedResourceHandler (RFC 9728 /.well-known/oauth-protected-resource)
- internal/mcp/Server: adds optional *auth.Validator; checkAuth tries JWT
  first, then static token fallback; both-nil = auth disabled (unchanged default)
- cmd/supervisor, cmd/routing: construct Validator from DEX_ISSUER_URL +
  MCP_AUDIENCE env vars; register protected-resource handler when set
- ingestion/internal/auth/: same Validator + handler (separate module)
- ingestion/internal/mcp/BearerAuth: same JWT-or-static chain
- ingestion/cmd/server: same wiring pattern

New env vars (all optional; absent = static-token-only, same as before):
  DEX_ISSUER_URL   — Dex issuer URL (e.g. https://auth.d-ma.be)
  MCP_AUDIENCE     — expected aud claim (e.g. brain, supervisor)
  MCP_RESOURCE_URL — resource identifier for RFC 9728 metadata response

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 20:10:05 +02:00
1c3c9de550 Merge pull request 'refactor(routing): rename local/claude to fast/thinking model pair' (#4) from agent/thinking-fast-routing into main
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
2026-05-08 14:43:29 +00:00
d0edc1a725 Merge pull request 'chore(mcp): switch MCP endpoints to HTTPS domain URLs' (#3) from agent/mcp-domain-urls into main
Some checks failed
CI / Lint / Test / Vet (push) Has been cancelled
CI / Mirror to GitHub (push) Has been cancelled
2026-05-08 14:43:18 +00:00
Mathias Bergqvist
5b207425ed refactor(routing): rename local/claude to fast/thinking model pair
All checks were successful
CI / Lint / Test / Vet (pull_request) Successful in 10s
CI / Mirror to GitHub (pull_request) Has been skipped
The routing decision is about reasoning capacity, not cost or provider.
Fast model (koala/qwen35-9b-fast) handles high-pass-rate calls; thinking
model (iguana/gemma4-26b) handles low-pass-rate calls. Removes the
implicit Anthropic dependency from the routing pod — both models go
through LiteLLM.

Renames: HYPERGUILD_LOCAL_MODEL → HYPERGUILD_FAST_MODEL,
HYPERGUILD_CLAUDE_MODEL → HYPERGUILD_THINKING_MODEL,
Router.LocalModel → FastModel, Router.ClaudeModel → ThinkingModel,
log decision "claude_fallback" → "thinking_fallback".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 16:39:42 +02:00
Mathias Bergqvist
cb51ff7ba1 chore(mcp): switch MCP endpoints to HTTPS domain URLs
All checks were successful
CI / Lint / Test / Vet (pull_request) Successful in 10s
CI / Mirror to GitHub (pull_request) Has been skipped
Brain and supervisor now behind NPM with Let's Encrypt. Use canonical
hostnames (brain-mcp.d-ma.be, supervisor-mcp.d-ma.be) over NodePorts so
connections work across networks without Tailscale for DNS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-08 14:10:25 +02:00
Mathias Bergqvist
43a8255272 fix(mcp): add SSE GET handler for streamable HTTP transport
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
claude.ai probes with GET before initialize; without this the supervisor
returned an application/json parse error instead of text/event-stream, causing
"Couldn't reach the MCP server" in the claude.ai connector setup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-07 23:27:56 +02:00
Mathias Bergqvist
78be3d1f9c fix(ingestion): support GET/SSE on /mcp endpoint for claude.ai compatibility
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
2026-05-07 23:20:47 +02:00
Mathias Bergqvist
7139a3ca74 ci: add environment gate and Flux rollout verification to cd pipeline
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Aligns hyperguild's cd.yml with the cobalt-dingo reference pattern:
- Add environment: staging to the deploy job
- Add Flux reconcile trigger after infra repo push
- Add polling wait for supervisor and ingestion image tags to propagate
- Add rollout status verification for both deployments with failure
  diagnostics (pod status, events, describe)
2026-05-07 21:52:52 +02:00
Mathias Bergqvist
c509ae2a5f refactor(ingestion): use strings.CutPrefix for explicit Bearer scheme check
2026-05-07 21:02:14 +02:00
Mathias Bergqvist
228ee57d4c feat(ingestion): add bearer token auth middleware for MCP endpoint
2026-05-07 20:58:16 +02:00
Mathias Bergqvist
bee4bb3c1f chore(routing): pre-merge cleanup — Plan 7 reminders, code_review→review, operator note
All checks were successful
CI / Lint / Test / Vet (push) Successful in 11s
CI / Mirror to GitHub (push) Successful in 4s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 23:22:15 +02:00
Mathias Bergqvist
d72454d929 docs(routing): document Mode 2 routing pod + env vars
Add routing pod to README architecture diagram and env vars table.
Add routing MCP endpoint to .context/PROJECT.md. Regenerate derived
context adapters via task context:sync.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 23:00:48 +02:00
Mathias Bergqvist
cf94d14922 chore(routing): drop unused BIN_PID assignment in smoke script
2026-05-05 22:56:44 +02:00
Mathias Bergqvist
78a43d6a42 test(routing): live-contract smoke target
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 22:52:23 +02:00
Mathias Bergqvist
ca933eef46 build(routing): Dockerfile + CD workflow
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 07:19:18 +02:00
Mathias Bergqvist
88782de07c feat(hyperguild): mode client-local writes routing headers
Plan 6 is now deployed; replace the _routing_pending placeholder in the
routing MCP entry with a real headers block carrying X-Hyperguild-Mode:
client-local. The pod treats absent or unknown values as client-local,
so this is forward-compat for future modes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 07:13:24 +02:00
Mathias Bergqvist
083c2d7db9 feat(routing): cmd/routing binary
Wires Config → LiteLLMExecutor → Router → four skills (review, debug,
retrospective, trainer) → Registry → MCP server with bearer auth and
/healthz. Each skill's CompleteFunc is wrapped so the Router decides
local-vs-Claude per call and logs every decision to the brain /mcp.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 23:43:59 +02:00
Mathias Bergqvist
751f410ca6 test(routing): pin tool-schema parity with supervisor
Captures the four routed skills' (review, debug, retrospective, trainer)
tool definitions as a JSON snapshot and asserts the routing pod's registry
advertises byte-equal schemas. A deliberate schema change fails this test,
requiring an intentional snapshot update in lockstep with consumers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 22:59:06 +02:00
Mathias Bergqvist
3a99d5e20e refactor(routing): surface logger errors via slog.Warn
Replace silent `_ = r.Logger.LogDecision(...)` discards with an
if-err check that emits slog.Warn on failure. A brain outage now
produces a visible warn line instead of swallowing the telemetry
error entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 22:55:35 +02:00
Mathias Bergqvist
9a258ca32a feat(routing): router dispatch wrapper
Composes Fetcher + Policy + Logger + CompleteFunc into a single Run method.
Falls open to Claude on local-model errors; defaults to local when brain is
unreachable. Skill packages will receive Router.Run as their CompleteFunc.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 22:51:01 +02:00
Mathias Bergqvist
2a5a74f7c0 feat(routing): decision logger via brain MCP session_log
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:52:09 +02:00
Mathias Bergqvist
d40a5ac890 test(routing): cover TTL expiry in fetcher
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:50:01 +02:00
Mathias Bergqvist
b77820534a feat(routing): pass-rate fetcher with TTL cache
HTTP client that calls GET /pass-rate?skill=X&window=Y on the brain pod.
Caches *float64 results (including nil) per-skill for the configured TTL
(default 60s). On non-200 or network error returns (nil, err) so the
upstream router can fall through to default-to-local.
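
A rough sketch of the caching behaviour described above, with the HTTP call replaced by an injected function so the TTL logic can be seen in isolation. All names here (`cachedFetcher`, `PassRate`, `demoCalls`) are hypothetical, and holding the lock across the fetch is a simplification.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct {
	rate    *float64
	expires time.Time
}

// cachedFetcher caches per-skill results, nil rates included, for ttl.
type cachedFetcher struct {
	mu    sync.Mutex
	ttl   time.Duration
	now   func() time.Time
	fetch func(skill string) (*float64, error)
	cache map[string]entry
}

func (c *cachedFetcher) PassRate(skill string) (*float64, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if e, ok := c.cache[skill]; ok && c.now().Before(e.expires) {
		return e.rate, nil
	}
	rate, err := c.fetch(skill)
	if err != nil {
		return nil, err // errors are not cached; the caller falls back
	}
	if c.cache == nil {
		c.cache = map[string]entry{}
	}
	c.cache[skill] = entry{rate: rate, expires: c.now().Add(c.ttl)}
	return rate, nil
}

// demoCalls counts upstream fetches: two lookups inside the TTL, one after it.
func demoCalls() (withinTTL, afterTTL int) {
	clock := time.Unix(0, 0)
	calls := 0
	c := &cachedFetcher{
		ttl:   60 * time.Second,
		now:   func() time.Time { return clock },
		fetch: func(string) (*float64, error) { calls++; return nil, nil },
	}
	c.PassRate("review")
	c.PassRate("review")
	withinTTL = calls
	clock = clock.Add(61 * time.Second)
	c.PassRate("review")
	return withinTTL, calls
}

func main() {
	within, after := demoCalls()
	fmt.Println("fetches within TTL:", within, "after expiry:", after)
}
```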

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:46:11 +02:00
Mathias Bergqvist
db64ecb1d9 feat(routing): canonical request hash
SHA-256 of (system, user) joined with 0x00 separator, truncated to
uint64. Drives deterministic sample-band routing: identical prompt pair
→ same hash → same local-vs-Claude decision on every call.
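
The hash construction can be sketched as follows. Taking the first eight bytes of the digest in big-endian order is an assumption; the message only says the digest is truncated to uint64. `requestHash` is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// requestHash hashes system + 0x00 + user with SHA-256 and keeps 64 bits.
// The separator keeps ("ab", "") and ("a", "b") from colliding.
func requestHash(system, user string) uint64 {
	h := sha256.New()
	h.Write([]byte(system))
	h.Write([]byte{0x00})
	h.Write([]byte(user))
	sum := h.Sum(nil)
	return binary.BigEndian.Uint64(sum[:8]) // byte order is an assumption
}

func main() {
	a := requestHash("you are a reviewer", "review this diff")
	b := requestHash("you are a reviewer", "review this diff")
	fmt.Println(a == b, a&1)
}
```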

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:41:42 +02:00
Mathias Bergqvist
ea29e5ebb8 feat(routing): decision policy
Pure-function Policy{Floor,Ceil} with Decide(*float64, uint64) Decision.
Rules in priority order: nil → local; ≥floor → local; <ceil → claude;
sample band → low bit of requestHash.
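
The rule order above translates fairly directly into a switch. This is a sketch: the `Decision` constants and the example thresholds are hypothetical, and which value of the low bit maps to which side is an assumption.

```go
package main

import "fmt"

type Decision string

const (
	Local  Decision = "local"
	Claude Decision = "claude"
)

// Policy holds the two thresholds; rates between Ceil and Floor form the
// sample band.
type Policy struct {
	Floor, Ceil float64
}

// Decide applies the rules in priority order.
func (p Policy) Decide(passRate *float64, requestHash uint64) Decision {
	switch {
	case passRate == nil:
		return Local // no data yet: default to local
	case *passRate >= p.Floor:
		return Local
	case *passRate < p.Ceil:
		return Claude
	case requestHash&1 == 0:
		return Local // sample band, even hash (mapping is an assumption)
	default:
		return Claude // sample band, odd hash
	}
}

func main() {
	p := Policy{Floor: 0.8, Ceil: 0.5} // illustrative thresholds only
	mid := 0.6
	fmt.Println(p.Decide(nil, 0), p.Decide(&mid, 2), p.Decide(&mid, 3))
}
```

Because the function is pure and the hash is deterministic, the same prompt pair lands on the same side of the sample band on every call.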

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:36:59 +02:00
Mathias Bergqvist
ccf080db59 refactor(routing): clarify Floor/Ceil semantics + extend test coverage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:34:22 +02:00
Mathias Bergqvist
69c038478b feat(routing): RoutingConfig + LoadRouting
Typed config struct and env parser for the routing pod. Kept separate
from the supervisor Config to avoid forcing routing fields onto the
supervisor and vice versa. Uses the existing envOr helper from config.go.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-04 15:25:31 +02:00
Mathias Bergqvist
b6bcc93048 docs(plan6): implementation plan for Mode 2 routing pod
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
14 TDD-shaped tasks across two worktrees: hyperguild for code
(internal/routing package, cmd/routing binary, Dockerfile, CD
workflow, mode template, smoke test, docs) and infra for the
k3s manifests (deployment, service, nodeport, SOPS-encrypted
secret). Plan 7 amendment baked in: internal/skills/{review,
debug,retrospective,trainer} survive Plan 6 — Plan 7 only
deletes tdd, spec, and the supervisor binary.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:53:03 +02:00
Mathias Bergqvist
51e01233a4 docs(plan6): spec for Mode 2 routing pod
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
Drafted via superpowers:feature-spec. Plan 6 of 7 in the skill
migration. Surface frozen at 4 cost-routable skills (code_review,
debug, retrospective, trainer); LiteLLM proxies model choice; pass-
rate drives the route decision with default-to-local plus an env
kill switch for the empty-data window. Plan 7 amendment baked in:
internal/skills/{review,debug,retrospective,trainer} survive Plan 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:29:26 +02:00
Mathias Bergqvist
f49850d23b chore(mcp): require bearer token for supervisor MCP
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
The pod now enforces SUPERVISOR_MCP_TOKEN; this matches the .mcp.json
header so a Claude Code session in this repo authenticates correctly.
Token comes from the operator's shell env, not the repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 13:57:14 +02:00
Mathias Bergqvist
928f23ab1b feat(mcp): optional bearer-token auth via SUPERVISOR_MCP_TOKEN
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 3s
Enables exposing the supervisor MCP via Tailscale Funnel for claude.ai
custom-connector tests. Auth is opt-in: empty SUPERVISOR_MCP_TOKEN
preserves the existing unauthenticated behavior for tailnet-internal
callers and local dev.

When the token is set, every request must carry
"Authorization: Bearer <token>" or it is rejected with HTTP 401 and a
JSON-RPC -32001 error. Comparison uses crypto/subtle.ConstantTimeCompare;
the token value and the supplied header are never logged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 07:31:29 +02:00
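
Note on 928f23ab1b: a rough sketch of opt-in bearer auth as this commit describes it. `requireBearer` and `statusFor` are hypothetical names, and the JSON-RPC error message text and `null` id are assumptions; only the 401 status, the -32001 code and the use of `crypto/subtle.ConstantTimeCompare` come from the commit message.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// requireBearer wraps next with bearer-token auth. An empty token disables
// auth entirely, preserving the unauthenticated behaviour.
func requireBearer(token string, next http.Handler) http.Handler {
	if token == "" {
		return next
	}
	want := []byte(token)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		// Constant-time comparison; neither value is logged.
		if !ok || subtle.ConstantTimeCompare([]byte(got), want) != 1 {
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusUnauthorized)
			// Message text is an assumption; the code -32001 is from the commit.
			_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":null,"error":{"code":-32001,"message":"unauthorized"}}`))
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor is a demo helper: it sends one request through the middleware
// and returns the resulting HTTP status code.
func statusFor(token, authHeader string) int {
	h := requireBearer(token, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	req := httptest.NewRequest(http.MethodPost, "/mcp", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("", ""))                     // auth disabled
	fmt.Println(statusFor("s3cret", "Bearer s3cret"))  // correct token
	fmt.Println(statusFor("s3cret", "Bearer wrong"))   // rejected
}
```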
Mathias Bergqvist
1b9c4905a5 docs(plans): patch Plan 5 — test helper sessions/ subdir + sessionlog docstring step
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Successful in 4s
Two corrections applied during Plan 5 execution:

  - Task 2 test helper writeSession now joins a sessions/ subdir so it
    matches the handler's <brainDir>/sessions/*.jsonl scan path.
    (The original heredoc would have produced 0 records in tests.)

  - Task 6 grew a Step 1.5 to update the session_log MCP tool's
    final_status description, picking up a spec requirement that
    didn't translate into a task in the original plan.
2026-05-03 22:59:21 +02:00
Mathias Bergqvist
400025715a feat: pass-rate /pass-rate endpoint and CLI — Plan 5 of hyperguild migration
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Successful in 3s
Adds the read side of pass-rate logging that consumes the SKILL.md
instrumentation landed in the local-dev companion merge:

  - GET /pass-rate?skill=X&window=Y on the brain pod (ingestion module)
    walks brain/sessions/*.jsonl, normalizes legacy {ok,error,skipped}
    to canonical {pass,fail,skip}, returns aggregate counts plus
    pass_rate (null when pass+fail == 0).

  - hyperguild brain pass-rate <skill> [--window 7d] [--json] CLI
    subcommand (third nested verb under brain).

  - session_log MCP tool docstring updated to lead with the new
    pass|fail|skip vocabulary; legacy values still accepted by both
    the tool's loose validator and the aggregator's normalization.

This is the brain HTTP REST API's first GET endpoint — pure reads
follow REST semantics; legacy POST routes (query, write, ingest, etc.)
all take JSON bodies. Future read endpoints SHOULD use GET.

Plan 6 routing pod is the consumer; one week of usage data between
this merge and Plan 6 deploy will mean Plan 6 lands on real numbers.

Plan: docs/superpowers/plans/2026-05-03-pass-rate-logging.md
Spec: docs/superpowers/specs/2026-05-03-pass-rate-logging-design.md
2026-05-03 22:58:24 +02:00
Mathias Bergqvist
986e3e1d12 docs(hyperguild): document brain pass-rate subcommand and /pass-rate endpoint
Adds pass-rate to the CLI README's subcommand block. Updates CLAUDE.md
to note the new /pass-rate endpoint alongside the existing brain
HTTP REST API surface. Updates the session_log MCP tool's
final_status description to reflect the new pass|fail|skip vocabulary
introduced by Plan 5's SKILL.md instrumentation; the aggregator
still accepts legacy ok|error|skipped values for backwards compat.
2026-05-03 22:55:35 +02:00
Mathias Bergqvist
593d1a4c6d feat(hyperguild): brain pass-rate subcommand
Adds 'hyperguild brain pass-rate <skill> [--window 7d] [--json]'
calling the new /pass-rate endpoint. Human output:
  tdd: 47 / 50 = 94% (window: 7d)
or 'no data (window: 7d)' when pass_rate is null.

PassRateResult mirrors the response envelope; PassRate is *float64
so null is preserved across decode.
2026-05-03 22:44:04 +02:00
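
Note on 593d1a4c6d: the reason for `*float64` can be illustrated with the sketch below. Only the `pass_rate` field and the two human output shapes come from the commit; the `pass` and `fail` field names, the helper names, and the assumption that `pass_rate` is a fraction between 0 and 1 are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PassRateResult mirrors part of the response envelope.
// Assumption: counts are named "pass" and "fail" in the JSON.
type PassRateResult struct {
	Pass     int      `json:"pass"`
	Fail     int      `json:"fail"`
	PassRate *float64 `json:"pass_rate"` // nil when the server sends null
}

func decodePassRate(body []byte) (PassRateResult, error) {
	var r PassRateResult
	if err := json.Unmarshal(body, &r); err != nil {
		return PassRateResult{}, fmt.Errorf("decode pass-rate response: %w", err)
	}
	return r, nil
}

// formatHuman renders the human-readable line.
// Assumption: PassRate is a fraction in [0, 1].
func formatHuman(skill, window string, r PassRateResult) string {
	if r.PassRate == nil {
		return fmt.Sprintf("no data (window: %s)", window)
	}
	return fmt.Sprintf("%s: %d / %d = %.0f%% (window: %s)",
		skill, r.Pass, r.Pass+r.Fail, *r.PassRate*100, window)
}

func main() {
	for _, body := range []string{
		`{"pass":47,"fail":3,"pass_rate":0.94}`,
		`{"pass":0,"fail":0,"pass_rate":null}`,
	} {
		r, err := decodePassRate([]byte(body))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(formatHuman("tdd", "7d", r))
	}
}
```

With a plain `float64` field, a JSON `null` would decode to `0` and "no data" would be indistinguishable from "never passes".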
Mathias Bergqvist
417bf224eb feat(brain): register GET /pass-rate route
Adds the route entry alongside the existing POST routes. Note: this
is the brain HTTP REST API's first GET endpoint — it follows REST
semantics for pure reads, while the legacy POST routes (query, write,
ingest, etc.) all take JSON bodies. Future read endpoints SHOULD use
GET; future write endpoints continue with POST.
2026-05-03 22:40:31 +02:00
Mathias Bergqvist
37dbd22eff feat(brain): /pass-rate aggregator and handler
Adds a new HTTP GET handler at the ingestion pod that walks
brain/sessions/*.jsonl, filters by skill name and timestamp window
(default 7d, accepts Nh and Nd), normalizes legacy status vocabulary
(ok->pass, error->fail, skipped->skip), and returns aggregated counts
plus pass_rate.

Pass rate is null when pass+fail == 0, distinguishing 'no data' from
'always passes'. Plan 6 routing pod will check for null before
making decisions.

Route registration in cmd/server/main.go lands in a follow-up commit.
2026-05-03 22:37:41 +02:00
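
Note on 37dbd22eff: the aggregation rules in this commit (legacy status normalization, `Nh`/`Nd` windows with a 7d default, null pass rate when pass+fail == 0) might look roughly like this. All function and type names, and the JSON names of the count fields, are hypothetical; the JSONL walk and timestamp filtering are left out to keep the sketch short.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"time"
)

// normalizeStatus maps the legacy vocabulary onto pass|fail|skip.
func normalizeStatus(s string) string {
	switch s {
	case "ok":
		return "pass"
	case "error":
		return "fail"
	case "skipped":
		return "skip"
	}
	return s
}

// parseWindow accepts "Nh" or "Nd"; an empty string means the 7d default.
func parseWindow(s string) (time.Duration, error) {
	if s == "" {
		return 7 * 24 * time.Hour, nil
	}
	if len(s) < 2 {
		return 0, fmt.Errorf("parse window %q: want Nh or Nd", s)
	}
	n, err := strconv.Atoi(s[:len(s)-1])
	if err != nil || n <= 0 {
		return 0, fmt.Errorf("parse window %q: want Nh or Nd", s)
	}
	switch s[len(s)-1] {
	case 'h':
		return time.Duration(n) * time.Hour, nil
	case 'd':
		return time.Duration(n) * 24 * time.Hour, nil
	}
	return 0, fmt.Errorf("parse window %q: want Nh or Nd", s)
}

// Summary holds the aggregate. PassRate stays nil when pass+fail == 0,
// which marshals to JSON null.
type Summary struct {
	Pass     int      `json:"pass"`
	Fail     int      `json:"fail"`
	Skip     int      `json:"skip"`
	PassRate *float64 `json:"pass_rate"`
}

func aggregate(statuses []string) Summary {
	var s Summary
	for _, st := range statuses {
		switch normalizeStatus(st) {
		case "pass":
			s.Pass++
		case "fail":
			s.Fail++
		case "skip":
			s.Skip++
		}
	}
	if total := s.Pass + s.Fail; total > 0 {
		rate := float64(s.Pass) / float64(total)
		s.PassRate = &rate
	}
	return s
}

func main() {
	out, _ := json.Marshal(aggregate([]string{"skipped", "skip"}))
	fmt.Println(string(out)) // only skips recorded, so pass_rate is null
}
```

Skips are counted but excluded from the denominator, so a skill that was only ever skipped reports "no data" rather than 0%.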
Mathias Bergqvist
cbf5cab5e7 docs(plans): pass-rate logging implementation plan — Plan 5
All checks were successful
CI / Lint / Test / Vet (push) Successful in 10s
CI / Mirror to GitHub (push) Successful in 4s
Seven-task plan spanning two worktrees: Phase 1 (local-dev) for
SKILL.md instrumentation (pilot tdd, then rollout to 6 binary-outcome
skills), Phase 2 (hyperguild) for the /pass-rate endpoint + CLI
subcommand. Uses GET for the pure-read endpoint (REST semantics)
despite the brain API's POST-everywhere precedent for legacy
endpoints.

Pilot-then-rollout structure: tdd SKILL.md lands first; the
endpoint TDD task naturally exercises the new logging contract
as the dogfood validation step before the other 6 skills get
instrumented.
2026-05-03 22:29:56 +02:00
Mathias Bergqvist
af52f501fe docs(specs): pass-rate logging design — Plan 5 of hyperguild migration
Skill instrumentation pattern, brain /pass-rate HTTP endpoint, and
optional hyperguild CLI subcommand for shell access. Pilot with tdd
SKILL.md, then roll out to 6 binary-outcome skills (code-review,
debug, feature-spec, session-retrospective, trainer, spec-driven-dev).

Decisions: SKILL.md as source of truth for the logging contract;
on-demand aggregation from JSONL (no materialized counters until
proven necessary); pass|fail|skip vocabulary forward, with
ok|error|skipped accepted by the read-side aggregator for backwards
compat.

Seven success criteria, ten out-of-scope items, six risks.
2026-05-03 22:23:28 +02:00
Mathias Bergqvist
b3b1fde825 feat: hyperguild CLI — Plan 4 of hyperguild migration
All checks were successful
CI / Lint / Test / Vet (push) Successful in 15s
CI / Mirror to GitHub (push) Successful in 3s
A small Go binary at cmd/hyperguild/ providing four subcommands:

  - tier              probe Anthropic + LiteLLM, print operating tier
  - brain query       BM25 search the brain via HTTP REST
  - brain write       write a knowledge entry from stdin
  - mode <name>       bootstrap .mcp.json (cloud|client-local|sovereign)

Reuses internal/tier verbatim. Brain access uses the HTTP REST API
at ${BRAIN_URL:-http://koala:30330} with POST /query and POST /write.
Mode templates include a Plan 6 routing-pod placeholder (client-local)
and a Crush-fallback note (sovereign).

Stdlib only — flag, net/http, encoding/json. 32 unit tests via
httptest.Server fakes; smoke-tested end-to-end against the live brain.

Replaces the supervisor's tier MCP. Plans 5 (pass-rate logging) and 6
(routing pod) build on the brain client and tier subcommand.

Plan: docs/superpowers/plans/2026-05-03-hyperguild-cli.md
Spec: docs/superpowers/specs/2026-05-03-hyperguild-cli-design.md
2026-05-03 22:06:48 +02:00
Mathias Bergqvist
ab4cfaaeb7 fix(hyperguild): remove redundant subcommand prefixes from error messages
dispatch() already prefixes errors with 'hyperguild <subcmd>: ', so
handlers re-prefixing with their own name produced stuttered output
like 'hyperguild brain: brain query: topic required'. Strip the
redundant prefixes from the seven affected errors.New / fmt.Errorf
calls in brain.go and mode.go.

Also fix the LITELLM_BASE_URL usage text — it's optional (empty
falls through to airplane tier), not required, matching the
README.
2026-05-03 22:06:33 +02:00
Mathias Bergqvist
eb844edb29 docs(specs): correct brain Query method (POST not GET)
The brain HTTP REST /query endpoint accepts POST with JSON body
{query, limit}, not GET with URL query string. Surfaced empirically
during Plan 4 Task 7 smoke testing. Spec text now matches the
implementation.
2026-05-03 22:00:05 +02:00
Mathias Bergqvist
317ec20392 fix(hyperguild): brain Query uses POST /query with JSON body
The brain HTTP REST /query endpoint accepts POST with JSON
{query, limit}, not GET with URL query string. Surfaced by
Task 7 smoke testing — GET returned 405 Method Not Allowed.

The response shape ({results:[...]}) is unchanged; only the
request side flips to POST + JSON body. brainClient.Write was
already using POST + JSON body and is unaffected.

Tests updated to assert POST + JSON body on the Query path.
2026-05-03 21:59:17 +02:00
Mathias Bergqvist
eab8775f5f feat(hyperguild): README + Taskfile integration
Adds cmd/hyperguild/README.md (subcommands, env vars, install path)
and three Taskfile targets:

  task hyperguild:dev     — go run from source
  task hyperguild:build   — build into ./bin/hyperguild
  task hyperguild:install — go install into $GOBIN

Concludes Plan 4 of the hyperguild migration. The binary replaces
the supervisor's tier MCP and surfaces brain HTTP REST access plus
mode bootstrap to shell pipelines and ad-hoc agent prompts.
2026-05-03 21:56:20 +02:00
Mathias Bergqvist
a0d0914a85 feat(hyperguild): mode subcommand
'hyperguild mode <cloud|client-local|sovereign>' writes a per-mode
.mcp.json template:

  - cloud:        brain MCP only
  - client-local: brain + routing placeholder with _routing_pending
                  pointer to Plan 6
  - sovereign:    brain only + top-level _mode_note explaining Crush
                  is primary; .mcp.json is Claude Code fallback

Default output is ./.mcp.json; --out overrides; --force overwrites.
Brain URL sourced from BRAIN_URL (default http://koala:30330) so the
template stays in lockstep with the user's brain host.

All three subcommands now wired; notYet/errNotImplemented removed
from main.go.
2026-05-03 21:50:05 +02:00
Mathias Bergqvist
8f9642df69 feat(hyperguild): brain write subcommand
Reads markdown from stdin, POSTs to the brain's /write endpoint with
type + slug, prints the resulting path. Pairs with 'brain query' for
shell-friendly read/write access to the brain HTTP REST API.

Tests cover success, missing args, backend error propagation, and
empty stdin (which produces an empty content payload — the brain
server's responsibility to validate).
2026-05-03 21:42:36 +02:00
Mathias Bergqvist
cd5f3c0175 feat(hyperguild): brain query subcommand
Adds 'hyperguild brain query <topic>' against the brain HTTP REST
/query endpoint. Default human output prints path + score + title;
--json passes through the response envelope. --limit overrides the
default 5-result cap.

runBrainWrite remains a stub for Task 5.
2026-05-03 21:37:39 +02:00
Mathias Bergqvist
ed4966927c feat(hyperguild): brain HTTP REST client
Adds brainClient with Query and Write methods against the brain's
HTTP REST endpoints (/query, /write). Constructor reads BRAIN_URL env
var, defaulting to http://koala:30330 — the Tailscale-exposed
NodePort that serves both MCP and REST.

Tests cover success, transport error, and non-200 cases via
httptest fakes; URL override is verified via t.Setenv.
2026-05-03 21:32:48 +02:00
Mathias Bergqvist
3c4e8e8bb8 feat(hyperguild): tier subcommand
Adds the tier subcommand to the hyperguild CLI. Reuses
internal/tier.Detect verbatim, sources probe URLs from
ANTHROPIC_PROBE_URL (default https://api.anthropic.com) and
LITELLM_BASE_URL (no default — empty triggers airplane).

Human-readable output by default; --json emits the same Info struct
as the supervisor's tier MCP returns. Tests cover all three tier
states via httptest fakes.
2026-05-03 21:27:33 +02:00
Mathias Bergqvist
5c88eff46f feat(hyperguild): subcommand router skeleton
Lays down the cmd/hyperguild/ entry point. Defines the subcommand
contract (ctx, args, stdin, stdout, stderr) error, the dispatch()
function that's testable without os.Exit, and stubs for tier / brain /
mode that return errNotImplemented. Subsequent commits replace each
stub.

Part of Plan 4 (hyperguild CLI) of the hyperguild migration.
2026-05-03 21:21:08 +02:00
Mathias Bergqvist
646a86f2c3 docs(specs): fix brain URL port (3300 → 30330)
Pod-internal port is 3300; Tailscale-exposed NodePort is 30330.
External clients including the planned hyperguild CLI hit 30330.
2026-05-03 21:01:51 +02:00
Mathias Bergqvist
adf0504116 docs(specs): hyperguild CLI design — Plan 4 of hyperguild migration
Implementation-level spec for the hyperguild CLI: a stdlib Go binary
at cmd/hyperguild/ with subcommands tier, brain query/write, and mode
(cloud|client-local|sovereign). Replaces the supervisor's tier MCP and
provides shell-friendly access to the brain HTTP REST API.

Six measurable success criteria, seven out-of-scope items, six risks.
Decisions logged: stdlib flag + inline router (no cobra), reuse
internal/tier verbatim, BRAIN_URL env override, mode subcommand writes
.mcp.json with per-mode template plus placeholder for the Plan 6
routing pod.
2026-05-03 20:59:45 +02:00
Mathias Bergqvist
d44427e71f docs: document brain MCP endpoint at koala:30330
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Successful in 3s
- README architecture diagram now shows two MCP servers (supervisor +
  brain) with the brain hosted by ingestion directly.
- Connect-a-project example includes both servers.
- .context/PROJECT.md replaces the boilerplate "Knowledge base access"
  block with the actual hyperguild MCP endpoints.
- Adapters regenerated via task context:sync.

Captures the transitional state where two MCPs coexist; the supervisor
MCP will shrink as skill workers move to SKILL.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 23:03:45 +02:00
Mathias Bergqvist
2635cdcaa7 chore: add brain MCP server alongside supervisor
The brain MCP at koala:30330 hosts the brain_* and session_log tools
formerly on supervisor. Supervisor stays connected during the
transition; its skill workers and the brain duplication will be
removed in a later plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 23:02:28 +02:00
Mathias Bergqvist
e922471229 fix(context-sync): short-circuit when root AGENT.md is unreachable
All checks were successful
CI / Lint / Test / Vet (push) Successful in 9s
CI / Mirror to GitHub (push) Successful in 3s
In CI's clean checkout the tree-walk for ~/dev/.context/AGENT.md
finds nothing, leaving ROOT_CONTEXT empty. The script previously
proceeded to regenerate AGENTS.md, .cursorrules,
.aider.conventions.md, and .context/system-prompt.txt as
project-only — but the committed versions are root+project, so
the drift gate added in cc401d9 fails CI on every push.

When no root context is reachable, only regenerate CLAUDE.md
(which is project-only by design — Claude Code walks up the tree
itself to find the root). The root-bearing adapters are left
untouched, eliminating the false-positive drift.

Local runs (with root context reachable) are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:55:43 +02:00
Mathias Bergqvist
87ff1f907c fix(ingestion): silence errcheck on resp.Body.Close in integration test
Some checks failed
CI / Lint / Test / Vet (push) Failing after 3s
CI / Mirror to GitHub (push) Has been skipped
CI's golangci-lint flagged the un-checked deferred Close. Match the
existing project pattern (defer func() { _ = ... }()).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:55:29 +02:00
135 changed files with 17953 additions and 984 deletions

@@ -36,6 +36,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -46,9 +58,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -58,9 +71,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -68,7 +84,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -100,18 +116,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules
@@ -216,13 +278,30 @@ Key skills:
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## Knowledge base access
## MCP endpoints
This project can query the shared knowledge base via MCP or HTTP:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **MCP endpoint**: `mcp://localhost:3100/knowledge`
- **HTTP fallback**: `http://localhost:3100/api/v1/search`
- **Scoping**: queries are filtered to collection `personal` + `public`
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions

@@ -45,13 +45,30 @@
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## Knowledge base access
## MCP endpoints
This project can query the shared knowledge base via MCP or HTTP:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **MCP endpoint**: `mcp://localhost:3100/knowledge`
- **HTTP fallback**: `http://localhost:3100/api/v1/search`
- **Scoping**: queries are filtered to collection `personal` + `public`
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions

@@ -41,6 +41,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -51,9 +63,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -63,9 +76,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -73,7 +89,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -105,18 +121,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules
@@ -221,13 +283,30 @@ Key skills:
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## Knowledge base access
## MCP endpoints
This project can query the shared knowledge base via MCP or HTTP:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **MCP endpoint**: `mcp://localhost:3100/knowledge`
- **HTTP fallback**: `http://localhost:3100/api/v1/search`
- **Scoping**: queries are filtered to collection `personal` + `public`
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -39,6 +39,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -49,9 +61,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -61,9 +74,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -71,7 +87,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -103,18 +119,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules
@@ -219,13 +281,30 @@ Key skills:
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## Knowledge base access
## MCP endpoints
This project can query the shared knowledge base via MCP or HTTP:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **MCP endpoint**: `mcp://localhost:3100/knowledge`
- **HTTP fallback**: `http://localhost:3100/api/v1/search`
- **Scoping**: queries are filtered to collection `personal` + `public`
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -11,37 +11,16 @@ jobs:
name: Build and deploy
runs-on: self-hosted
if: ${{ github.event.workflow_run.conclusion == 'success' && github.event.workflow_run.event == 'push' }}
environment: staging
env:
SERVICE: supervisor
IMAGE: gitea.d-ma.be/mathias/supervisor
INGESTION_IMAGE: gitea.d-ma.be/mathias/ingestion
ROUTING_IMAGE: gitea.d-ma.be/mathias/routing
INFRA_REPO: git@gitea.d-ma.be:mathias/infra.git
BUILDKIT_HOST: unix:///run/buildkit/buildkitd.sock
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Build and push supervisor image
run: |
set -e
trap 'rm -f /tmp/supervisor-image.tar' EXIT
IMAGE_TAG="${{ github.sha }}"
echo "Building ${IMAGE}:${IMAGE_TAG}"
buildctl --addr "${BUILDKIT_HOST}" build \
--frontend dockerfile.v0 \
--local context=. \
--local dockerfile=. \
--opt build-arg:VERSION="${IMAGE_TAG}" \
--output type=oci,dest=/tmp/supervisor-image.tar
skopeo copy \
oci-archive:/tmp/supervisor-image.tar \
docker://${IMAGE}:${IMAGE_TAG} \
--dest-creds "${{ secrets.REGISTRY_CREDS }}"
echo "Built and pushed ${IMAGE}:${IMAGE_TAG}"
- name: Build and push ingestion image
run: |
set -e
@@ -62,6 +41,28 @@ jobs:
echo "Built and pushed ${INGESTION_IMAGE}:${IMAGE_TAG}"
- name: Build and push routing image
run: |
set -e
trap 'rm -f /tmp/routing-image.tar' EXIT
IMAGE_TAG="${{ github.sha }}"
echo "Building ${ROUTING_IMAGE}:${IMAGE_TAG}"
buildctl --addr "${BUILDKIT_HOST}" build \
--frontend dockerfile.v0 \
--local context=. \
--local dockerfile=. \
--opt filename=Dockerfile.routing \
--opt build-arg:VERSION="${IMAGE_TAG}" \
--output type=oci,dest=/tmp/routing-image.tar
skopeo copy \
oci-archive:/tmp/routing-image.tar \
docker://${ROUTING_IMAGE}:${IMAGE_TAG} \
--dest-creds "${{ secrets.REGISTRY_CREDS }}"
echo "Built and pushed ${ROUTING_IMAGE}:${IMAGE_TAG}"
- name: Update infra repo
run: |
set -e
@@ -77,17 +78,89 @@ jobs:
cd /tmp/infra-update
sed -i "s|gitea.d-ma.be/mathias/supervisor:.*|gitea.d-ma.be/mathias/supervisor:${IMAGE_TAG}|" \
"k3s/apps/${SERVICE}/deployment.yaml"
sed -i "s|gitea.d-ma.be/mathias/ingestion:.*|gitea.d-ma.be/mathias/ingestion:${IMAGE_TAG}|" \
"k3s/apps/${SERVICE}/ingestion-deployment.yaml"
"k3s/apps/supervisor/ingestion-deployment.yaml"
sed -i "s|gitea.d-ma.be/mathias/routing:.*|gitea.d-ma.be/mathias/routing:${IMAGE_TAG}|" \
"k3s/apps/routing/deployment.yaml"
git config user.email "cd-bot@d-ma.be"
git config user.name "CD Bot"
git add "k3s/apps/${SERVICE}/deployment.yaml" "k3s/apps/${SERVICE}/ingestion-deployment.yaml"
git commit -m "chore(deploy): ${SERVICE}+ingestion → ${IMAGE_TAG}"
git add "k3s/apps/supervisor/ingestion-deployment.yaml" \
"k3s/apps/routing/deployment.yaml"
git commit -m "chore(deploy): ingestion+routing → ${IMAGE_TAG}"
GIT_SSH_COMMAND="ssh -i ~/.ssh/infra_deploy_key -o IdentitiesOnly=yes" \
git push
echo "Infra repo updated: ${SERVICE}+ingestion → ${IMAGE_TAG}"
echo "Infra repo updated: ingestion+routing → ${IMAGE_TAG}"
- name: Trigger Flux reconcile (immediate)
run: |
kubectl -n flux-system annotate gitrepository flux-system \
reconcile.fluxcd.io/requestedAt="$(date +%s)" --overwrite
kubectl -n flux-system annotate kustomization apps \
reconcile.fluxcd.io/requestedAt="$(date +%s)" --overwrite
- name: Wait for Flux to apply new ingestion image
run: |
EXPECTED="gitea.d-ma.be/mathias/ingestion:${{ github.sha }}"
for i in $(seq 1 60); do
CURRENT=$(kubectl get deploy ingestion -n supervisor \
-o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null || echo "")
if [ "$CURRENT" = "$EXPECTED" ]; then
echo "✓ Flux applied ingestion image after ${i}s"
break
fi
sleep 1
done
kubectl get deploy ingestion -n supervisor \
-o jsonpath='{.spec.template.spec.containers[0].image}' \
| grep -qx "$EXPECTED" \
|| { echo "✗ Flux did not apply ingestion image within 60s"; exit 1; }
- name: Verify ingestion rollout
run: |
kubectl rollout status deployment/ingestion \
--namespace supervisor \
--timeout=120s \
|| {
echo "── pod status ──"
kubectl get pods -n supervisor -o wide
echo "── events ──"
kubectl get events -n supervisor --sort-by='.lastTimestamp' | tail -20
echo "── describe ──"
kubectl describe pods -n supervisor -l app=ingestion | tail -40
exit 1
}
- name: Wait for Flux to apply new routing image
run: |
EXPECTED="gitea.d-ma.be/mathias/routing:${{ github.sha }}"
for i in $(seq 1 60); do
CURRENT=$(kubectl get deploy routing -n routing \
-o jsonpath='{.spec.template.spec.containers[0].image}' 2>/dev/null || echo "")
if [ "$CURRENT" = "$EXPECTED" ]; then
echo "✓ Flux applied routing image after ${i}s"
break
fi
sleep 1
done
kubectl get deploy routing -n routing \
-o jsonpath='{.spec.template.spec.containers[0].image}' \
| grep -qx "$EXPECTED" \
|| { echo "✗ Flux did not apply routing image within 60s"; exit 1; }
- name: Verify routing rollout
run: |
kubectl rollout status deployment/routing \
--namespace routing \
--timeout=120s \
|| {
echo "── pod status ──"
kubectl get pods -n routing -o wide
echo "── events ──"
kubectl get events -n routing --sort-by='.lastTimestamp' | tail -20
echo "── describe ──"
kubectl describe pods -n routing -l app=routing | tail -40
exit 1
}


@@ -1,8 +1,11 @@
{
"mcpServers": {
"supervisor": {
"brain": {
"type": "http",
"url": "http://koala:30320/mcp"
"url": "https://brain-mcp.d-ma.be/mcp",
"headers": {
"Authorization": "Bearer ${BRAIN_MCP_TOKEN}"
}
}
}
}

AGENTS.md

@@ -36,6 +36,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -46,9 +58,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -58,9 +71,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -68,7 +84,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -100,18 +116,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules
@@ -216,13 +278,30 @@ Key skills:
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## Knowledge base access
## MCP endpoints
This project can query the shared knowledge base via MCP or HTTP:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **MCP endpoint**: `mcp://localhost:3100/knowledge`
- **HTTP fallback**: `http://localhost:3100/api/v1/search`
- **Scoping**: queries are filtered to collection `personal` + `public`
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -45,13 +45,30 @@
- Client data never leaves local network unless explicitly cleared
- Dependencies: audit with `govulncheck` before adding
## Knowledge base access
## MCP endpoints
This project can query the shared knowledge base via MCP or HTTP:
Two MCP servers are live, both reachable over Tailscale and via HTTPS domain:
- **MCP endpoint**: `mcp://localhost:3100/knowledge`
- **HTTP fallback**: `http://localhost:3100/api/v1/search`
- **Scoping**: queries are filtered to collection `personal` + `public`
- **`brain`** at `https://brain-mcp.d-ma.be/mcp` (NodePort `koala:30330`) —
`brain_query`, `brain_write`, `brain_ingest`, `brain_ingest_raw`,
`brain_answer`, `brain_classify`, `session_log`. Hosted by the ingestion
service. Auth: Dex JWT (claude.ai OAuth) or static `BRAIN_MCP_TOKEN`.
- **`routing`** at `http://koala:30310/mcp` — Mode 2 routing pod. Advertises
`review`, `debug`, `retrospective`, `trainer`; per-call routes to local model
or Claude based on brain `/pass-rate`. Bearer auth via `ROUTING_MCP_TOKEN`
(opt-in). Only `mode client-local` registers this endpoint.
The supervisor MCP (`koala:30320`) was retired in Plan 7 (2026-05-12). Its
skill workers (`tdd`, `spec`) are now SKILL.md files; routed skills moved to
the routing pod; brain tools moved to the brain MCP.
The brain HTTP REST API (`/query`, `/write`, `/ingest`, `/ingest-raw`,
`/ingest-path`, `/backfill-refs`, `/pass-rate`) remains available on port 3300
for shell scripts and non-MCP clients.
`brain_answer(query)` performs BM25 retrieval + LLM synthesis (berget.ai
gemma4:31b → iguana fallback). `brain_classify(text)` infers doc type, title,
and tags. Both require `BRAIN_LLM_PRIMARY_URL` to be set in the ingestion pod.
## Agent instructions


@@ -67,6 +67,50 @@ Record *why* things are the way they are. Future-you will thank present-you.
---
## Plan 6: routing pod reuses internal/skills/{review,debug,retrospective,trainer}
Plan 6 (Mode 2 routing pod, 2026-05-04) introduces a second consumer of
the four cost-routable skill packages. The routing pod constructs each
skill via `<pkg>.New(Config{...})` and hands it `routing.Router.Run` as
the `CompleteFunc`.
**Preserved code (do not delete):**
- `internal/skills/{review,debug,retrospective,trainer}/`
- `internal/registry`, `internal/mcp`, `internal/exec/litellm.go`
- `internal/routing/`, `cmd/routing/`
---
## Plan 7: supervisor pod retired (2026-05-12)
**What was deleted:** `cmd/supervisor/`, `internal/skills/{tdd,spec}/`,
root `Dockerfile`, supervisor k8s manifests (Deployment, Service, Ingress,
NodePort 30320), `supervisor` entry removed from all `.mcp.json` configs.
**Coverage:** `tdd`/`spec` → SKILL.md files in `~/dev/.skills/`; `review`,
`debug`, `retrospective`, `trainer` → routing pod; `brain_*`/`session_log` →
brain MCP; `tier` → `hyperguild tier` CLI.
---
## 2026-05-12 — brain_answer and brain_classify: LLM routing via berget.ai → iguana
**Context:** Brain MCP returned raw BM25 excerpts with no synthesis. Adding
LLM-backed tools enables Q&A and ingestion enrichment without a separate service.
**Decision:** Two new MCP tools in the ingestion service (`ingestion/internal/mcp/`):
- `brain_answer(query)` — BM25 top-10 → LLM synthesis → answer + sources
- `brain_classify(text)` — LLM classifies doc into type/title/tags
Primary LLM: berget.ai `gemma4:31b` (EU cloud, spend tokens while available).
Fallback: iguana `gemma4:31b` (local Ollama). Reranker deferred to follow-up.
Router lives in `ingestion/internal/llm.Router`; opt-in via `BRAIN_LLM_PRIMARY_URL`.
**Consequences:** Brain becomes a knowledge assistant, not just a search index.
When berget.ai tokens run out, flip `BRAIN_LLM_PRIMARY_URL` to iguana.
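
The primary-then-fallback behaviour described in this decision can be sketched roughly as follows. This is a Python illustration only, not the Go code in `ingestion/internal/llm.Router`; `FallbackRouter`, `complete`, and the two stand-in backends are hypothetical names, and the real router's error handling may differ.

```python
from typing import Callable, Tuple

# Hypothetical type: a backend takes a prompt and returns the model's text.
Backend = Callable[[str], str]


class FallbackRouter:
    """Try the primary backend first; on any error, retry once on the fallback."""

    def __init__(self, primary: Backend, fallback: Backend) -> None:
        self.primary = primary
        self.fallback = fallback

    def complete(self, prompt: str) -> Tuple[str, str]:
        """Return (answer, label), where label names the backend that answered."""
        try:
            return self.primary(prompt), "primary"
        except Exception:
            # Assumption: any primary failure (network, quota, 5xx) triggers the fallback.
            return self.fallback(prompt), "fallback"


def unreachable_primary(prompt: str) -> str:
    """Stand-in for a cloud endpoint that is down or out of tokens."""
    raise ConnectionError("primary LLM unreachable")


def local_fallback(prompt: str) -> str:
    """Stand-in for a local model that always answers."""
    return "local answer to: " + prompt


if __name__ == "__main__":
    router = FallbackRouter(unreachable_primary, local_fallback)
    print(router.complete("what is the brain?"))
```

Returning the backend label alongside the answer makes it easy to see in logs how often the fallback path is actually taken.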
---
## 2026-04-08 — Mistral Vibe gets its own adapter
**Context**: Vibe doesn't read `AGENTS.md` — it uses `~/.vibe/prompts/` and `~/.vibe/agents/` with TOML config.
@@ -74,3 +118,52 @@ Record *why* things are the way they are. Future-you will thank present-you.
**Decision**: The root context-sync generates a `mathias.md` prompt and `mathias.toml` agent config in `~/.vibe/`. This is the one tool that needs a custom adapter path.
**Consequences**: Run `vibe --agent mathias` to use your conventions. Other Vibe users on the machine aren't affected.
---
## 2026-05-18 — project_create commits staging namespace directly to infra main
**Context:** `project_create` writes a k8s namespace manifest into the infra
repo so Flux brings up a staging environment for the new project. Initial
implementation pushed to a `staging/<name>` branch, which required manual PR
merge before Flux saw the namespace — defeating the "one tool call, project
exists, staging reconciling within 60s" goal.
**Decision:** Option A — commit directly to `main`. `callInfraCommit` passes
`branch: "main"` to gitea-mcp's `file_write_branch`; no PR, no merge step.
**Consequences:** Staging namespace appears in cluster within ~60s of the
`project_create` call. Consistent with project-wide TBD policy (CLAUDE.md):
commit directly to main, every commit deployable. Acceptable because the
manifest is a fresh namespace under `k3s/staging/<name>/` — isolated, low
blast-radius, and Flux will simply recreate it if the file is bad. Manual
review gating was friction for no compensating safety gain on experiment
namespaces.
---
## 2026-05-18 — pgvector over Qdrant for brain hybrid retrieval (supersedes 2026-04-08)
**Context:** The 2026-04-08 ADR chose Qdrant for vector store. Since then,
postgres18 with pgvector has been deployed in the `databases` namespace on
koala and is already the shared default for the rest of the project
(CLAUDE.md lists `pgvector (vector), BM25` as the primary search layer and
Qdrant only as a fallback "when >1M vectors or hybrid retrieval"). Qdrant
itself has never been deployed — `kubectl get` finds no pod, service, or
manifest. Standing up a new vector engine for a single consumer is friction
that the original ADR did not weigh.
**Decision:** Use pgvector for brain hybrid retrieval. Issue #8 — and any
follow-on embedding work — targets the existing `postgres18` instance:
- one table `brain_embeddings(path TEXT PRIMARY KEY, embedding VECTOR(768), updated_at TIMESTAMPTZ)`,
IVFFlat or HNSW index by feel once volume warrants
- BM25 stays as today (file walk + token frequency); cosine via pgvector
- hybrid scoring done in SQL or Go; pick once we measure
- nomic-embed-text on iguana ollama provides 768-dim vectors
**Consequences:** One database engine instead of two. Backups, monitoring,
and connection pooling already solved. Trade-off: pgvector at >1M vectors
or under hybrid-search load may underperform Qdrant — revisit only when
benchmarks hurt. The 2026-04-08 ADR is superseded for the brain use case;
Qdrant remains the noted fallback path in CLAUDE.md if scale demands it.
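
The decision above leaves the hybrid scoring formula open. One common option, shown here purely as a sketch, is a weighted sum of a max-normalised BM25 score and the cosine similarity between query and document embeddings. The function names, the `alpha` weight, and the normalisation are assumptions for illustration, not the method the project has chosen; the toy vectors are 2-dimensional, whereas the ADR specifies 768-dimensional embeddings.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_scores(
    bm25: Dict[str, float],
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank documents by alpha * lexical + (1 - alpha) * semantic, best first."""
    top = max(bm25.values(), default=0.0)
    scored = []
    for path, vec in doc_vecs.items():
        # Scale BM25 into [0, 1] so it is comparable with cosine similarity.
        lexical = bm25.get(path, 0.0) / top if top > 0 else 0.0
        # Clamp negative similarities so they cannot cancel lexical evidence.
        semantic = max(0.0, cosine(query_vec, vec))
        scored.append((path, alpha * lexical + (1 - alpha) * semantic))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


if __name__ == "__main__":
    bm25 = {"a.md": 2.0, "b.md": 1.0}
    vecs = {"a.md": [1.0, 0.0], "b.md": [0.0, 1.0]}
    print(hybrid_scores(bm25, [0.0, 1.0], vecs))
```

With `alpha = 1.0` the ranking reduces to pure BM25 and with `alpha = 0.0` to pure cosine, which gives a simple way to compare both extremes against the eval set before settling on a weight.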

@@ -1,50 +0,0 @@
# syntax=docker/dockerfile:1
# ── Build stage ───────────────────────────────────────────────────────────────
FROM golang:1.26-bookworm AS builder
ARG VERSION=dev
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
    go build -trimpath -ldflags="-s -w -X main.version=${VERSION}" \
    -o /out/supervisor ./cmd/supervisor
# ── Runtime stage ─────────────────────────────────────────────────────────────
# Node.js 22 slim — needed for claude CLI subprocess
FROM node:22-slim
# Install claude CLI (provides the `claude` binary the supervisor shells out to)
RUN npm install -g @anthropic-ai/claude-code \
    && claude --version \
    && echo "claude CLI installed"
# Copy supervisor binary
COPY --from=builder /out/supervisor /usr/local/bin/supervisor
# Bake in config (models.yaml + skill discipline files)
COPY config/ /app/config/
# Run as non-root
RUN groupadd -r supervisor && useradd -r -g supervisor -d /app supervisor
WORKDIR /app
# brain/ is writable state — mount a PersistentVolume here
VOLUME /app/brain
ENV SUPERVISOR_CONFIG_DIR=/app/config/supervisor
ENV SUPERVISOR_MODELS_FILE=/app/config/models.yaml
ENV SUPERVISOR_BRAIN_DIR=/app/brain
ENV SUPERVISOR_SESSIONS_DIR=/app/brain/sessions
ENV SUPERVISOR_PORT=3200
USER supervisor
EXPOSE 3200
ENTRYPOINT ["/usr/local/bin/supervisor"]

Dockerfile.routing Normal file

@@ -0,0 +1,30 @@
# syntax=docker/dockerfile:1
# ── Build stage ───────────────────────────────────────────────────────────────
FROM golang:1.26-bookworm AS builder
ARG VERSION=dev
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 \
    go build -trimpath -ldflags="-s -w -X main.version=${VERSION}" \
    -o /out/routing ./cmd/routing
# ── Runtime stage ─────────────────────────────────────────────────────────────
FROM gcr.io/distroless/base-debian12
COPY --from=builder /out/routing /usr/local/bin/routing
COPY config/ /app/config/
ENV SUPERVISOR_CONFIG_DIR=/app/config/supervisor
ENV ROUTING_PORT=3210
EXPOSE 3210
USER 65532:65532
ENTRYPOINT ["/usr/local/bin/routing"]

@@ -11,9 +11,11 @@ into a searchable brain.
Your Claude Code session (in any project)
│ MCP over HTTP (Tailscale)
supervisor :3200 (NodePort 30320 on koala) — skill workers: tdd, retrospective
ingestion :3300 — brain HTTP API: query wiki, write notes
├──▶ supervisor :3200 (NodePort 30320 on koala) — skill workers: tdd, debug, spec, …
├──▶ routing :3210 (NodePort 30310 on koala) — Mode 2 only: review, debug, retrospective, trainer
└──▶ brain :3300 (NodePort 30330 on koala) — brain_query, brain_write, brain_ingest, session_log
└─ also serves the legacy REST endpoints (/query, /write, /ingest, …)
brain/
@@ -57,16 +59,26 @@ Create `.mcp.json` in your project root:
    "supervisor": {
      "type": "http",
      "url": "http://koala:30320/mcp"
    },
    "brain": {
      "type": "http",
      "url": "http://koala:30330/mcp"
    }
  }
}
```
The supervisor MCP server is reachable over Tailscale at `koala:30320` (NodePort
to the in-cluster service on port 3200). No local binary or stdio shim is
required — Claude Code talks to it directly via HTTP.
Two MCP servers are exposed today, both reachable over Tailscale:
Open Claude Code in your project — run `/mcp` to confirm `supervisor` is listed.
- **`supervisor`** at `koala:30320` — skill workers (`tdd_red/green/refactor`,
`review`, `debug`, `spec`, `retrospective`, `trainer`, `tier`).
- **`brain`** at `koala:30330` — knowledge access (`brain_query`, `brain_write`,
`brain_ingest`, `brain_ingest_raw`) and `session_log`. Hosted by the ingestion
service directly, no separate pod.
No local binary or stdio shim is required — Claude Code talks to both via HTTP.
Open Claude Code in your project — run `/mcp` to confirm both servers are listed.
## A typical TDD session
@@ -100,6 +112,17 @@ The supervisor probes connectivity at call time:
| `SUPERVISOR_SESSIONS_DIR` | `./brain/sessions` | JSONL session logs |
| `INGEST_BASE_URL` | `http://localhost:3300` | Supervisor → ingestion |
| `LITELLM_BASE_URL` | — | LiteLLM proxy for Tier 2 model routing |
| `SUPERVISOR_MCP_TOKEN` | — | Optional bearer token for the supervisor MCP HTTP endpoint; when empty, no auth is enforced |
| `ROUTING_PORT` | `3210` | Routing pod's listen port |
| `ROUTING_MCP_TOKEN` | — | Optional bearer token for the routing MCP HTTP endpoint |
| `BRAIN_URL` | `http://ingestion.supervisor:3300` | Routing pod → brain (in-cluster) |
| `HYPERGUILD_FAST_MODEL` | `koala/qwen35-9b-fast` | Fast model for high-pass-rate skill calls |
| `HYPERGUILD_THINKING_MODEL` | `iguana/gemma4-26b` | Thinking model for low-pass-rate skill calls |
| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | At or above this pass rate, route to the fast model |
| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Below this pass rate, route to the thinking model. Pass rates between CEIL and FLOOR fall in the sample band. |
| `HYPERGUILD_PASS_RATE_TTL_SECONDS` | `60` | Per-skill pass-rate cache TTL |
> **Operator note:** LiteLLM at `LITELLM_BASE_URL` must register both `HYPERGUILD_FAST_MODEL` and `HYPERGUILD_THINKING_MODEL` for routing to do useful work. If a model is not registered, LiteLLM returns 4xx for calls to it. When both are missing, the routing pod's fast route fails, the fail-open retry on the thinking model likely fails as well, and the only signal is `final_status: "fail"` on `_routing` entries in the brain.
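
The threshold rows in the table above can be read as a three-way decision on a skill's pass rate. The sketch below is a Python illustration under that reading, not the routing pod's Go code; `route` and `thresholds_from_env` are hypothetical names, and what the pod actually does inside the sample band is not described here, so the function only labels it.

```python
from typing import Mapping, Tuple


def thresholds_from_env(env: Mapping[str, str]) -> Tuple[float, float]:
    """Read (floor, ceil) from an env-like mapping, using the documented defaults."""
    floor = float(env.get("HYPERGUILD_ROUTE_LOCAL_FLOOR", "0.90"))
    ceil = float(env.get("HYPERGUILD_ROUTE_LOCAL_CEIL", "0.70"))
    return floor, ceil


def route(pass_rate: float, floor: float = 0.90, ceil: float = 0.70) -> str:
    """Map a pass rate to 'fast', 'thinking', or 'sample'."""
    if pass_rate >= floor:
        return "fast"      # at or above FLOOR: fast model
    if pass_rate < ceil:
        return "thinking"  # below CEIL: thinking model
    return "sample"        # between CEIL and FLOOR: the sample band


if __name__ == "__main__":
    floor, ceil = thresholds_from_env({})
    for rate in (0.95, 0.80, 0.50):
        print(rate, route(rate, floor, ceil))
```

Note that FLOOR is the higher of the two thresholds: it is the floor for routing to the fast model, while CEIL is the ceiling for routing to the thinking model.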
## Phase 2 (planned)

@@ -39,6 +39,22 @@ tasks:
    cmds:
      - go run ./cmd/supervisor

  hyperguild:dev:
    desc: Run hyperguild CLI from source (e.g. task hyperguild:dev -- tier)
    cmds:
      - go run ./cmd/hyperguild {{.CLI_ARGS}}

  hyperguild:build:
    desc: Build the hyperguild binary into ./bin/hyperguild
    cmds:
      - mkdir -p bin
      - go build -o bin/hyperguild ./cmd/hyperguild

  hyperguild:install:
    desc: Install hyperguild into $GOBIN
    cmds:
      - go install ./cmd/hyperguild

  ingestion:dev:
    desc: Run ingestion server in development mode
    dir: ingestion
@@ -112,6 +128,11 @@ tasks:
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | jq .
  smoke:routing:
    desc: Boot the routing pod against live LiteLLM + brain and verify _routing logs land
    cmds:
      - bash scripts/smoke-routing.sh

  # ── Git / Release ──────────────────────────────────────────────────────────

  tag:

@@ -0,0 +1,167 @@
# baseline-pre-fix — 20 questions, k=5
top-1 hit rate: 4/20 = 20%
top-3 hit rate: 13/20 = 65%
## per-question detail
· rank=3 expected=dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
q: how do I stop dex from logging users out on every pod restart?
1. homelab-network-perimeter-model
2. 2026-05-12-koala-machine-state
3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart <-- expected
4. infra-litellm-absorption-2026-05-16
5. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
★ rank=1 expected=postgres-least-privilege-migration-tenant-grant-bypass-2026-05
q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
1. postgres-least-privilege-migration-tenant-grant-bypass-2026-05 <-- expected
2. infra-litellm-absorption-2026-05-16
3. brain-mcp-activation-runbook
4. extension-version-lags-platform-major-upgrade
5. ntfy-deny-all-rollout-ordering-keep-alert-pipeline-live-during-auth-flip
★ rank=1 expected=homelab-network-perimeter-model
q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
1. homelab-network-perimeter-model <-- expected
2. qwen3-thinking-model-empty-content-trap
3. mcpclient-empty-token-silent-401-envfrom-missing-key
4. 2026-05-12-koala-machine-state
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=3 expected=exit-255-unknown-reason-not-oom
q: what does container exit code 255 with reason Unknown mean?
1. qwen3-thinking-model-empty-content-trap
2. infra-litellm-absorption-2026-05-16
3. exit-255-unknown-reason-not-oom <-- expected
4. mcpclient-empty-token-silent-401-envfrom-missing-key
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=3 expected=gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
q: can gitea push-mirror create the github repo automatically?
1. infra-litellm-absorption-2026-05-16
2. Autoresearch
3. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo <-- expected
4. adr-new-project-gitea-first-github-mirror
5. adr-github-as-primary-remote
✗ rank=0 expected=flux-healthcheck-stale-on-resource-removal
q: a flux kustomization is stuck after I removed a resource — why?
1. qwen3-thinking-model-empty-content-trap
2. 2026-05-12-koala-machine-state
3. homelab-architecture-principles-2026-05
4. gitea-mcp: full stack shipped end-to-end (2026-05-05)
5. k8s-configmap-mount-no-reload-needs-pod-restart
· rank=2 expected=go-bytes-buffer-bytes-reset-aliasing-trap
q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
1. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
2. go-bytes-buffer-bytes-reset-aliasing-trap <-- expected
3. homelab-security-chains-not-bugs
4. training-on-rtx-5070-pretraining-vs-finetuning
5. Hash Encoding
★ rank=1 expected=homelab-architecture-principles-2026-05
q: what are the homelab architecture principles from may 2026?
1. homelab-architecture-principles-2026-05 <-- expected
2. homelab-network-perimeter-model
3. Claude Managed Agents — architecture notes relevant to homelab agent platform
4. homelab-core-glossary
5. 2026-05-12-koala-machine-state
✗ rank=0 expected=2026-05-04-sops-age-key-from-flux-cluster
q: where does the sops age private key live in the cluster?
1. 2026-05-12-koala-machine-state
2. homelab-network-perimeter-model
3. postgres-least-privilege-migration-tenant-grant-bypass-2026-05
4. brain-mcp-activation-runbook
5. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
✗ rank=0 expected=grafana-dashboards-as-code-not-ui-state
q: why do my grafana dashboards disappear after a pod restart?
1. infra-litellm-absorption-2026-05-16
2. 2026-05-12-koala-machine-state
3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
4. brain-mcp-activation-runbook
5. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
· rank=2 expected=double-diamond-methodology
q: what is the double diamond methodology?
1. Harnessing the Power of Hash Encoding for Categorical Data in Data Science
2. double-diamond-methodology <-- expected
3. unified-methodology-diamond-futures-autoresearch
4. futures-thinking-extended-double-diamond
5. insight-exploration-as-diamond-1
· rank=3 expected=2026-05-04-mcp-transport-version-claude-ai-strict
q: my MCP server works from claude code but fails on claude.ai — what's different?
1. qwen3-thinking-model-empty-content-trap
2. mcp-resource-url-empty-breaks-claude-ai-discovery-silently
3. 2026-05-04-mcp-transport-version-claude-ai-strict <-- expected
4. 2026-05-04-claude-ai-custom-mcp-connectors
5. finding-github-mcp-claudeai-vs-claudecode
· rank=2 expected=homelab-security-chains-not-bugs
q: how should I rate security findings — isolated bugs or exploit chains?
1. homelab-network-perimeter-model
2. homelab-security-chains-not-bugs <-- expected
3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
4. policy-audit-mode-blocks-nothing
5. homelab-document-accepted-risk-to-break-audit-cycle
· rank=2 expected=2026-05-03-canonical-vs-derived-context-flow
q: how should canonical context files relate to derived adapter files?
1. qwen3-thinking-model-empty-content-trap
2. 2026-05-03-canonical-vs-derived-context-flow <-- expected
3. 2026-05-12-koala-machine-state
4. 2026-05-04-claude-ai-custom-mcp-connectors
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=2 expected=homelab-core-glossary
q: what is the homelab core vocabulary glossary?
1. homelab-architecture-principles-2026-05
2. homelab-core-glossary <-- expected
3. Claude Managed Agents — architecture notes relevant to homelab agent platform
4. 2026-05-12-koala-machine-state
5. Autoresearch
★ rank=1 expected=koala-llama-swap-native-tool-calls-survey-2026-05
q: which models on koala llama-swap actually emit native tool_calls correctly?
1. koala-llama-swap-native-tool-calls-survey-2026-05 <-- expected
2. 2026-05-12-koala-machine-state
3. infra-litellm-absorption-2026-05-16
4. training-on-rtx-5070-pretraining-vs-finetuning
5. qwen3-thinking-model-empty-content-trap
✗ rank=0 expected=qwen35-9b-fast
q: what is qwen35-9b-fast and what's it used for?
1. koala-llama-swap-native-tool-calls-survey-2026-05
2. qwen3-thinking-model-empty-content-trap
3. Qwen35-9b-fast
4. infra-litellm-absorption-2026-05-16
5. 2026-05-12-koala-machine-state
✗ rank=0 expected=go-defer-errcheck-body-close
q: in go, how do I prevent defer body close from silently dropping errors?
1. infra-litellm-absorption-2026-05-16
2. homelab-network-perimeter-model
3. go-bytes-buffer-bytes-reset-aliasing-trap
4. mcpclient-empty-token-silent-401-envfrom-missing-key
5. brain-mcp-activation-runbook
✗ rank=0 expected=hyperguild-level3-pipeline-rewrite
q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
1. 2026-05-12-koala-machine-state
2. homelab-core-glossary
3. brain-mcp-activation-runbook
4. koala-llama-swap-native-tool-calls-survey-2026-05
5. infra-litellm-absorption-2026-05-16
? rank=4 expected=adr-new-project-gitea-first-github-mirror
q: what's the new-project ADR — is it gitea-first or github-first?
1. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
2. gitea-mcp: full stack shipped end-to-end (2026-05-05)
3. mcp-tool-design-get-needs-list-partner
4. adr-new-project-gitea-first-github-mirror <-- expected
5. 2026-05-04-gitea-mcp-build-session
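
The two headline figures of this baseline file follow directly from the per-question `rank=` values. The sketch below recomputes them in Python; `hit_rate` and `summary_line` are hypothetical helper names, not part of the eval tooling, and the list is transcribed by hand from the 20 entries above, with `rank=0` meaning the expected note was not in the top 5.

```python
from typing import List, Tuple

# Ranks of the expected note for the 20 baseline questions, in file order.
BASELINE_RANKS: List[int] = [
    3, 1, 1, 3, 3, 0, 2, 1, 0, 0,
    2, 3, 2, 2, 2, 1, 0, 0, 0, 4,
]


def hit_rate(ranks: List[int], k: int) -> Tuple[int, int]:
    """Count questions whose expected note appears at rank 1..k."""
    hits = sum(1 for rank in ranks if 1 <= rank <= k)
    return hits, len(ranks)


def summary_line(ranks: List[int], k: int) -> str:
    """Format a hit rate the way the eval files print it."""
    hits, total = hit_rate(ranks, k)
    return f"top-{k} hit rate: {hits}/{total} = {round(100 * hits / total)}%"


if __name__ == "__main__":
    print(summary_line(BASELINE_RANKS, 1))
    print(summary_line(BASELINE_RANKS, 3))
```

Running the same helper over the ranks in the later eval files gives a quick check that each reported improvement matches the per-question detail.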

brain/eval/post-fix.txt Normal file

@@ -0,0 +1,167 @@
# post-fix — 20 questions, k=5
top-1 hit rate: 4/20 = 20%
top-3 hit rate: 14/20 = 70%
## per-question detail
· rank=3 expected=dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
q: how do I stop dex from logging users out on every pod restart?
1. homelab-network-perimeter-model
2. 2026-05-12-koala-machine-state
3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart <-- expected
4. infra-litellm-absorption-2026-05-16
5. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
★ rank=1 expected=postgres-least-privilege-migration-tenant-grant-bypass-2026-05
q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
1. postgres-least-privilege-migration-tenant-grant-bypass-2026-05 <-- expected
2. infra-litellm-absorption-2026-05-16
3. brain-mcp-activation-runbook
4. extension-version-lags-platform-major-upgrade
5. ntfy-deny-all-rollout-ordering-keep-alert-pipeline-live-during-auth-flip
★ rank=1 expected=homelab-network-perimeter-model
q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
1. homelab-network-perimeter-model <-- expected
2. qwen3-thinking-model-empty-content-trap
3. mcpclient-empty-token-silent-401-envfrom-missing-key
4. 2026-05-12-koala-machine-state
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=3 expected=exit-255-unknown-reason-not-oom
q: what does container exit code 255 with reason Unknown mean?
1. qwen3-thinking-model-empty-content-trap
2. infra-litellm-absorption-2026-05-16
3. exit-255-unknown-reason-not-oom <-- expected
4. mcpclient-empty-token-silent-401-envfrom-missing-key
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=3 expected=gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
q: can gitea push-mirror create the github repo automatically?
1. infra-litellm-absorption-2026-05-16
2. Autoresearch
3. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo <-- expected
4. adr-new-project-gitea-first-github-mirror
5. adr-github-as-primary-remote
✗ rank=0 expected=flux-healthcheck-stale-on-resource-removal
q: a flux kustomization is stuck after I removed a resource — why?
1. qwen3-thinking-model-empty-content-trap
2. 2026-05-12-koala-machine-state
3. homelab-architecture-principles-2026-05
4. gitea-mcp: full stack shipped end-to-end (2026-05-05)
5. k8s-configmap-mount-no-reload-needs-pod-restart
· rank=2 expected=go-bytes-buffer-bytes-reset-aliasing-trap
q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
1. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
2. go-bytes-buffer-bytes-reset-aliasing-trap <-- expected
3. homelab-security-chains-not-bugs
4. training-on-rtx-5070-pretraining-vs-finetuning
5. Hash Encoding
★ rank=1 expected=homelab-architecture-principles-2026-05
q: what are the homelab architecture principles from may 2026?
1. homelab-architecture-principles-2026-05 <-- expected
2. homelab-network-perimeter-model
3. Claude Managed Agents — architecture notes relevant to homelab agent platform
4. homelab-core-glossary
5. 2026-05-12-koala-machine-state
✗ rank=0 expected=2026-05-04-sops-age-key-from-flux-cluster
q: where does the sops age private key live in the cluster?
1. 2026-05-12-koala-machine-state
2. homelab-network-perimeter-model
3. postgres-least-privilege-migration-tenant-grant-bypass-2026-05
4. brain-mcp-activation-runbook
5. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
✗ rank=0 expected=grafana-dashboards-as-code-not-ui-state
q: why do my grafana dashboards disappear after a pod restart?
1. infra-litellm-absorption-2026-05-16
2. 2026-05-12-koala-machine-state
3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
4. brain-mcp-activation-runbook
5. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
· rank=2 expected=double-diamond-methodology
q: what is the double diamond methodology?
1. Harnessing the Power of Hash Encoding for Categorical Data in Data Science
2. double-diamond-methodology <-- expected
3. unified-methodology-diamond-futures-autoresearch
4. futures-thinking-extended-double-diamond
5. insight-exploration-as-diamond-1
· rank=3 expected=2026-05-04-mcp-transport-version-claude-ai-strict
q: my MCP server works from claude code but fails on claude.ai — what's different?
1. qwen3-thinking-model-empty-content-trap
2. mcp-resource-url-empty-breaks-claude-ai-discovery-silently
3. 2026-05-04-mcp-transport-version-claude-ai-strict <-- expected
4. 2026-05-04-claude-ai-custom-mcp-connectors
5. finding-github-mcp-claudeai-vs-claudecode
· rank=2 expected=homelab-security-chains-not-bugs
q: how should I rate security findings — isolated bugs or exploit chains?
1. homelab-network-perimeter-model
2. homelab-security-chains-not-bugs <-- expected
3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
4. policy-audit-mode-blocks-nothing
5. homelab-document-accepted-risk-to-break-audit-cycle
· rank=2 expected=2026-05-03-canonical-vs-derived-context-flow
q: how should canonical context files relate to derived adapter files?
1. qwen3-thinking-model-empty-content-trap
2. 2026-05-03-canonical-vs-derived-context-flow <-- expected
3. 2026-05-12-koala-machine-state
4. 2026-05-04-claude-ai-custom-mcp-connectors
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=2 expected=homelab-core-glossary
q: what is the homelab core vocabulary glossary?
1. homelab-architecture-principles-2026-05
2. homelab-core-glossary <-- expected
3. Claude Managed Agents — architecture notes relevant to homelab agent platform
4. 2026-05-12-koala-machine-state
5. Autoresearch
★ rank=1 expected=koala-llama-swap-native-tool-calls-survey-2026-05
q: which models on koala llama-swap actually emit native tool_calls correctly?
1. koala-llama-swap-native-tool-calls-survey-2026-05 <-- expected
2. 2026-05-12-koala-machine-state
3. infra-litellm-absorption-2026-05-16
4. training-on-rtx-5070-pretraining-vs-finetuning
5. qwen3-thinking-model-empty-content-trap
· rank=2 expected=qwen35-9b-fast
q: what is qwen35-9b-fast and what's it used for?
1. koala-llama-swap-native-tool-calls-survey-2026-05
2. qwen35-9b-fast <-- expected
3. qwen3-thinking-model-empty-content-trap
4. infra-litellm-absorption-2026-05-16
5. 2026-05-12-koala-machine-state
✗ rank=0 expected=go-defer-errcheck-body-close
q: in go, how do I prevent defer body close from silently dropping errors?
1. infra-litellm-absorption-2026-05-16
2. homelab-network-perimeter-model
3. go-bytes-buffer-bytes-reset-aliasing-trap
4. mcpclient-empty-token-silent-401-envfrom-missing-key
5. brain-mcp-activation-runbook
✗ rank=0 expected=hyperguild-level3-pipeline-rewrite
q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
1. 2026-05-12-koala-machine-state
2. homelab-core-glossary
3. brain-mcp-activation-runbook
4. koala-llama-swap-native-tool-calls-survey-2026-05
5. infra-litellm-absorption-2026-05-16
? rank=4 expected=adr-new-project-gitea-first-github-mirror
q: what's the new-project ADR — is it gitea-first or github-first?
1. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
2. gitea-mcp: full stack shipped end-to-end (2026-05-05)
3. mcp-tool-design-get-needs-list-partner
4. adr-new-project-gitea-first-github-mirror <-- expected
5. 2026-05-04-gitea-mcp-build-session

brain/eval/post-m4.txt Normal file

@@ -0,0 +1,167 @@
# post-m4-tier-weighting — 20 questions, k=5
top-1 hit rate: 6/20 = 30%
top-3 hit rate: 15/20 = 75%
## per-question detail
· rank=3 expected=dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
q: how do I stop dex from logging users out on every pod restart?
1. homelab-network-perimeter-model
2. 2026-05-12-koala-machine-state
3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart <-- expected
4. infra-litellm-absorption-2026-05-16
5. k8s-configmap-mount-no-reload-needs-pod-restart
· rank=2 expected=postgres-least-privilege-migration-tenant-grant-bypass-2026-05
q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
1. infra-litellm-absorption-2026-05-16
2. postgres-least-privilege-migration-tenant-grant-bypass-2026-05 <-- expected
3. extension-version-lags-platform-major-upgrade
4. ntfy-deny-all-rollout-ordering-keep-alert-pipeline-live-during-auth-flip
5. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
★ rank=1 expected=homelab-network-perimeter-model
q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
1. homelab-network-perimeter-model <-- expected
2. qwen3-thinking-model-empty-content-trap
3. mcpclient-empty-token-silent-401-envfrom-missing-key
4. 2026-05-12-koala-machine-state
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=3 expected=exit-255-unknown-reason-not-oom
q: what does container exit code 255 with reason Unknown mean?
1. qwen3-thinking-model-empty-content-trap
2. infra-litellm-absorption-2026-05-16
3. exit-255-unknown-reason-not-oom <-- expected
4. mcpclient-empty-token-silent-401-envfrom-missing-key
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=2 expected=gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
q: can gitea push-mirror create the github repo automatically?
1. infra-litellm-absorption-2026-05-16
2. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo <-- expected
3. adr-new-project-gitea-first-github-mirror
4. adr-github-as-primary-remote
5. 2026-05-12-koala-machine-state
✗ rank=0 expected=flux-healthcheck-stale-on-resource-removal
q: a flux kustomization is stuck after I removed a resource — why?
1. qwen3-thinking-model-empty-content-trap
2. 2026-05-12-koala-machine-state
3. homelab-architecture-principles-2026-05
4. k8s-configmap-mount-no-reload-needs-pod-restart
5. training-on-rtx-5070-pretraining-vs-finetuning
★ rank=1 expected=go-bytes-buffer-bytes-reset-aliasing-trap
q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
1. go-bytes-buffer-bytes-reset-aliasing-trap <-- expected
2. homelab-security-chains-not-bugs
3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
4. training-on-rtx-5070-pretraining-vs-finetuning
5. flux-healthcheck-stale-on-resource-removal
★ rank=1 expected=homelab-architecture-principles-2026-05
q: what are the homelab architecture principles from may 2026?
1. homelab-architecture-principles-2026-05 <-- expected
2. homelab-network-perimeter-model
3. homelab-core-glossary
4. 2026-05-12-koala-machine-state
5. pattern-reddit-tmux-multiagent-conductor
? rank=4 expected=2026-05-04-sops-age-key-from-flux-cluster
q: where does the sops age private key live in the cluster?
1. 2026-05-12-koala-machine-state
2. homelab-network-perimeter-model
3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
4. 2026-05-04-sops-age-key-from-flux-cluster <-- expected
5. homelab-security-chains-not-bugs
★ rank=1 expected=grafana-dashboards-as-code-not-ui-state
q: why do my grafana dashboards disappear after a pod restart?
1. grafana-dashboards-as-code-not-ui-state <-- expected
2. infra-litellm-absorption-2026-05-16
3. 2026-05-12-koala-machine-state
4. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
5. k8s-configmap-mount-no-reload-needs-pod-restart
★ rank=1 expected=double-diamond-methodology
q: what is the double diamond methodology?
1. double-diamond-methodology <-- expected
2. unified-methodology-diamond-futures-autoresearch
3. futures-thinking-extended-double-diamond
4. insight-exploration-as-diamond-1
5. workflow-idea-to-running-service
· rank=3 expected=2026-05-04-mcp-transport-version-claude-ai-strict
q: my MCP server works from claude code but fails on claude.ai — what's different?
1. qwen3-thinking-model-empty-content-trap
2. mcp-resource-url-empty-breaks-claude-ai-discovery-silently
3. 2026-05-04-mcp-transport-version-claude-ai-strict <-- expected
4. 2026-05-04-claude-ai-custom-mcp-connectors
5. finding-github-mcp-claudeai-vs-claudecode
· rank=2 expected=homelab-security-chains-not-bugs
q: how should I rate security findings — isolated bugs or exploit chains?
1. homelab-network-perimeter-model
2. homelab-security-chains-not-bugs <-- expected
3. policy-audit-mode-blocks-nothing
4. homelab-document-accepted-risk-to-break-audit-cycle
5. audit-shortcut-tls-blocks-zero-equals-edge-only
· rank=2 expected=2026-05-03-canonical-vs-derived-context-flow
q: how should canonical context files relate to derived adapter files?
1. qwen3-thinking-model-empty-content-trap
2. 2026-05-03-canonical-vs-derived-context-flow <-- expected
3. 2026-05-12-koala-machine-state
4. 2026-05-04-claude-ai-custom-mcp-connectors
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=2 expected=homelab-core-glossary
q: what is the homelab core vocabulary glossary?
1. homelab-architecture-principles-2026-05
2. homelab-core-glossary <-- expected
3. 2026-05-12-koala-machine-state
4. flux-kustomization-depends-on-bootstrap-ordering
5. brain-ingest-ntfy-service
★ rank=1 expected=koala-llama-swap-native-tool-calls-survey-2026-05
q: which models on koala llama-swap actually emit native tool_calls correctly?
1. koala-llama-swap-native-tool-calls-survey-2026-05 <-- expected
2. 2026-05-12-koala-machine-state
3. infra-litellm-absorption-2026-05-16
4. training-on-rtx-5070-pretraining-vs-finetuning
5. qwen3-thinking-model-empty-content-trap
✗ rank=0 expected=qwen35-9b-fast
q: what is qwen35-9b-fast and what's it used for?
1. koala-llama-swap-native-tool-calls-survey-2026-05
2. qwen3-thinking-model-empty-content-trap
3. infra-litellm-absorption-2026-05-16
4. 2026-05-12-koala-machine-state
5. index
✗ rank=0 expected=go-defer-errcheck-body-close
q: in go, how do I prevent defer body close from silently dropping errors?
1. homelab-network-perimeter-model
2. infra-litellm-absorption-2026-05-16
3. go-bytes-buffer-bytes-reset-aliasing-trap
4. mcpclient-empty-token-silent-401-envfrom-missing-key
5. koala-llama-swap-native-tool-calls-survey-2026-05
✗ rank=0 expected=hyperguild-level3-pipeline-rewrite
q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
1. 2026-05-12-koala-machine-state
2. homelab-core-glossary
3. koala-llama-swap-native-tool-calls-survey-2026-05
4. infra-litellm-absorption-2026-05-16
5. homelab-architecture-principles-2026-05
· rank=3 expected=adr-new-project-gitea-first-github-mirror
q: what's the new-project ADR — is it gitea-first or github-first?
1. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
2. mcp-tool-design-get-needs-list-partner
3. adr-new-project-gitea-first-github-mirror <-- expected
4. 2026-05-04-gitea-mcp-build-session
5. adr-local-dev-vs-hyperguild-new-project

brain/eval/post-m4b.txt Normal file

@@ -0,0 +1,167 @@
# post-m4b-entities-promoted — 20 questions, k=5
top-1 hit rate: 7/20 = 35%
top-3 hit rate: 16/20 = 80%
## per-question detail
· rank=3 expected=dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
q: how do I stop dex from logging users out on every pod restart?
1. homelab-network-perimeter-model
2. 2026-05-12-koala-machine-state
3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart <-- expected
4. infra-litellm-absorption-2026-05-16
5. k8s-configmap-mount-no-reload-needs-pod-restart
· rank=2 expected=postgres-least-privilege-migration-tenant-grant-bypass-2026-05
q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
1. infra-litellm-absorption-2026-05-16
2. postgres-least-privilege-migration-tenant-grant-bypass-2026-05 <-- expected
3. extension-version-lags-platform-major-upgrade
4. ntfy-deny-all-rollout-ordering-keep-alert-pipeline-live-during-auth-flip
5. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
★ rank=1 expected=homelab-network-perimeter-model
q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
1. homelab-network-perimeter-model <-- expected
2. qwen3-thinking-model-empty-content-trap
3. mcpclient-empty-token-silent-401-envfrom-missing-key
4. 2026-05-12-koala-machine-state
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=3 expected=exit-255-unknown-reason-not-oom
q: what does container exit code 255 with reason Unknown mean?
1. qwen3-thinking-model-empty-content-trap
2. infra-litellm-absorption-2026-05-16
3. exit-255-unknown-reason-not-oom <-- expected
4. mcpclient-empty-token-silent-401-envfrom-missing-key
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=2 expected=gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
q: can gitea push-mirror create the github repo automatically?
1. infra-litellm-absorption-2026-05-16
2. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo <-- expected
3. adr-new-project-gitea-first-github-mirror
4. adr-github-as-primary-remote
5. 2026-05-12-koala-machine-state
✗ rank=0 expected=flux-healthcheck-stale-on-resource-removal
q: a flux kustomization is stuck after I removed a resource — why?
1. qwen3-thinking-model-empty-content-trap
2. 2026-05-12-koala-machine-state
3. homelab-architecture-principles-2026-05
4. k8s-configmap-mount-no-reload-needs-pod-restart
5. training-on-rtx-5070-pretraining-vs-finetuning
★ rank=1 expected=go-bytes-buffer-bytes-reset-aliasing-trap
q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
1. go-bytes-buffer-bytes-reset-aliasing-trap <-- expected
2. homelab-security-chains-not-bugs
3. Financial Sentiment Analysis on Stock Market Headlines With FinBERT & HuggingFace
4. training-on-rtx-5070-pretraining-vs-finetuning
5. flux-healthcheck-stale-on-resource-removal
★ rank=1 expected=homelab-architecture-principles-2026-05
q: what are the homelab architecture principles from may 2026?
1. homelab-architecture-principles-2026-05 <-- expected
2. homelab-network-perimeter-model
3. homelab-core-glossary
4. 2026-05-12-koala-machine-state
5. pattern-reddit-tmux-multiagent-conductor
? rank=4 expected=2026-05-04-sops-age-key-from-flux-cluster
q: where does the sops age private key live in the cluster?
1. 2026-05-12-koala-machine-state
2. homelab-network-perimeter-model
3. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
4. 2026-05-04-sops-age-key-from-flux-cluster <-- expected
5. homelab-security-chains-not-bugs
★ rank=1 expected=grafana-dashboards-as-code-not-ui-state
q: why do my grafana dashboards disappear after a pod restart?
1. grafana-dashboards-as-code-not-ui-state <-- expected
2. infra-litellm-absorption-2026-05-16
3. 2026-05-12-koala-machine-state
4. dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
5. k8s-configmap-mount-no-reload-needs-pod-restart
★ rank=1 expected=double-diamond-methodology
q: what is the double diamond methodology?
1. double-diamond-methodology <-- expected
2. unified-methodology-diamond-futures-autoresearch
3. futures-thinking-extended-double-diamond
4. insight-exploration-as-diamond-1
5. workflow-idea-to-running-service
· rank=3 expected=2026-05-04-mcp-transport-version-claude-ai-strict
q: my MCP server works from claude code but fails on claude.ai — what's different?
1. qwen3-thinking-model-empty-content-trap
2. mcp-resource-url-empty-breaks-claude-ai-discovery-silently
3. 2026-05-04-mcp-transport-version-claude-ai-strict <-- expected
4. 2026-05-04-claude-ai-custom-mcp-connectors
5. finding-github-mcp-claudeai-vs-claudecode
· rank=2 expected=homelab-security-chains-not-bugs
q: how should I rate security findings — isolated bugs or exploit chains?
1. homelab-network-perimeter-model
2. homelab-security-chains-not-bugs <-- expected
3. policy-audit-mode-blocks-nothing
4. homelab-document-accepted-risk-to-break-audit-cycle
5. audit-shortcut-tls-blocks-zero-equals-edge-only
· rank=2 expected=2026-05-03-canonical-vs-derived-context-flow
q: how should canonical context files relate to derived adapter files?
1. qwen3-thinking-model-empty-content-trap
2. 2026-05-03-canonical-vs-derived-context-flow <-- expected
3. 2026-05-12-koala-machine-state
4. 2026-05-04-claude-ai-custom-mcp-connectors
5. koala-llama-swap-native-tool-calls-survey-2026-05
· rank=2 expected=homelab-core-glossary
q: what is the homelab core vocabulary glossary?
1. homelab-architecture-principles-2026-05
2. homelab-core-glossary <-- expected
3. 2026-05-12-koala-machine-state
4. qwen35-9b-fast
5. flux-kustomization-depends-on-bootstrap-ordering
★ rank=1 expected=koala-llama-swap-native-tool-calls-survey-2026-05
q: which models on koala llama-swap actually emit native tool_calls correctly?
1. koala-llama-swap-native-tool-calls-survey-2026-05 <-- expected
2. 2026-05-12-koala-machine-state
3. infra-litellm-absorption-2026-05-16
4. training-on-rtx-5070-pretraining-vs-finetuning
5. qwen3-thinking-model-empty-content-trap
★ rank=1 expected=qwen35-9b-fast
q: what is qwen35-9b-fast and what's it used for?
1. qwen35-9b-fast <-- expected
2. koala-llama-swap-native-tool-calls-survey-2026-05
3. qwen3-thinking-model-empty-content-trap
4. infra-litellm-absorption-2026-05-16
5. 2026-05-12-koala-machine-state
✗ rank=0 expected=go-defer-errcheck-body-close
q: in go, how do I prevent defer body close from silently dropping errors?
1. homelab-network-perimeter-model
2. infra-litellm-absorption-2026-05-16
3. go-bytes-buffer-bytes-reset-aliasing-trap
4. mcpclient-empty-token-silent-401-envfrom-missing-key
5. koala-llama-swap-native-tool-calls-survey-2026-05
✗ rank=0 expected=hyperguild-level3-pipeline-rewrite
q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
1. 2026-05-12-koala-machine-state
2. homelab-core-glossary
3. koala-llama-swap-native-tool-calls-survey-2026-05
4. infra-litellm-absorption-2026-05-16
5. homelab-architecture-principles-2026-05
· rank=3 expected=adr-new-project-gitea-first-github-mirror
q: what's the new-project ADR — is it gitea-first or github-first?
1. gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
2. mcp-tool-design-get-needs-list-partner
3. adr-new-project-gitea-first-github-mirror <-- expected
4. 2026-05-04-gitea-mcp-build-session
5. adr-local-dev-vs-hyperguild-new-project
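The two summary figures at the top of this run can be recomputed from the per-question ranks. The sketch below copies the `rank=` values from the detail listing above, in order; `hit_rate` is a hypothetical helper name, and a rank of 0 is taken to mean that the expected slug was absent from the top k=5 results, as in `score.py`.

```python
# Rank of the expected slug for each of the 20 questions in the
# post-m4b run, in file order. 0 = not found in the top 5.
ranks = [3, 2, 1, 3, 2, 0, 1, 1, 4, 1, 1, 3, 2, 2, 2, 1, 1, 0, 0, 3]


def hit_rate(ranks, n):
    """Return (hits, total), where a hit means 1 <= rank <= n."""
    hits = sum(1 for r in ranks if 0 < r <= n)
    return hits, len(ranks)


top1 = hit_rate(ranks, 1)
top3 = hit_rate(ranks, 3)
print(f"top-1 hit rate: {top1[0]}/{top1[1]} = {100 * top1[0] / top1[1]:.0f}%")
print(f"top-3 hit rate: {top3[0]}/{top3[1]} = {100 * top3[0] / top3[1]:.0f}%")
```

The rank-4 result (the sops age key question) counts toward neither figure, which is why the top-3 total is 16 rather than 17.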

brain/eval/qa-2026-05.md Normal file

@@ -0,0 +1,76 @@
# Brain retrieval eval set — 2026-05-24
20 hand-authored Q→expected-top-1-slug pairs. Used by `score.py` to
measure brain_query top-1 + top-3 hit rate against the live brain.
Authoring rules:
- Each question maps to **one** clear-best entry. Avoid ambiguous
questions where multiple slugs could be the right answer.
- Questions are phrased the way a future-me would actually ask, not
the way the entry's title reads. Some lexical distance is the point.
- `expected` is the slug as stored in `brain_entities.slug`. Update
if the slug renames.
## Pairs
```
q: how do I stop dex from logging users out on every pod restart?
expected: dex-in-memory-storage-wipes-oauth-tokens-on-every-pod-restart
q: my postgres-exporter broke after revoking PUBLIC CONNECT — why?
expected: postgres-least-privilege-migration-tenant-grant-bypass-2026-05
q: when is a NodePort acceptable vs needing a public ingress with bearer gate?
expected: homelab-network-perimeter-model
q: what does container exit code 255 with reason Unknown mean?
expected: exit-255-unknown-reason-not-oom
q: can gitea push-mirror create the github repo automatically?
expected: gitea-push-mirror-cannot-create-remote-repo-needs-pre-existing-github-repo
q: a flux kustomization is stuck after I removed a resource — why?
expected: flux-healthcheck-stale-on-resource-removal
q: the bytes buffer aliasing trap with Reset in a loop — what's the bug?
expected: go-bytes-buffer-bytes-reset-aliasing-trap
q: what are the homelab architecture principles from may 2026?
expected: homelab-architecture-principles-2026-05
q: where does the sops age private key live in the cluster?
expected: 2026-05-04-sops-age-key-from-flux-cluster
q: why do my grafana dashboards disappear after a pod restart?
expected: grafana-dashboards-as-code-not-ui-state
q: what is the double diamond methodology?
expected: double-diamond-methodology
q: my MCP server works from claude code but fails on claude.ai — what's different?
expected: 2026-05-04-mcp-transport-version-claude-ai-strict
q: how should I rate security findings — isolated bugs or exploit chains?
expected: homelab-security-chains-not-bugs
q: how should canonical context files relate to derived adapter files?
expected: 2026-05-03-canonical-vs-derived-context-flow
q: what is the homelab core vocabulary glossary?
expected: homelab-core-glossary
q: which models on koala llama-swap actually emit native tool_calls correctly?
expected: koala-llama-swap-native-tool-calls-survey-2026-05
q: what is qwen35-9b-fast and what's it used for?
expected: qwen35-9b-fast
q: in go, how do I prevent defer body close from silently dropping errors?
expected: go-defer-errcheck-body-close
q: what was the level 3 rewrite of hyperguild's ingestion pipeline?
expected: hyperguild-level3-pipeline-rewrite
q: what's the new-project ADR — is it gitea-first or github-first?
expected: adr-new-project-gitea-first-github-mirror
```
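The authoring rules above can be partly checked mechanically before a scoring run. The following is a minimal sketch, not part of the commit: `check_pairs` is a hypothetical helper, and the "no spaces in a slug" test is an assumption drawn from the slugs listed in this file rather than a documented constraint.

```python
def check_pairs(text):
    """Return a list of problems found in q:/expected: eval text."""
    problems = []
    seen = set()
    pending = None  # question still waiting for its expected: line
    for n, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if line.startswith("q:"):
            if pending is not None:
                problems.append(f"line {n}: previous question has no expected slug")
            pending = line[2:].strip()
            if pending in seen:
                problems.append(f"line {n}: duplicate question")
            seen.add(pending)
        elif line.startswith("expected:"):
            slug = line[len("expected:"):].strip()
            if pending is None:
                problems.append(f"line {n}: expected without a question")
            elif not slug or " " in slug:
                # Assumption: slugs are non-empty and contain no spaces.
                problems.append(f"line {n}: slug looks malformed: {slug!r}")
            pending = None
    if pending is not None:
        problems.append("end of input: last question has no expected slug")
    return problems


good = "q: a?\nexpected: slug-a\nq: b?\nexpected: slug-b\n"
bad = "q: a?\nq: b?\nexpected: slug b\n"
print(check_pairs(good))
print(check_pairs(bad))
```

A check like this cannot judge whether a question is ambiguous; it only catches structural slips that would otherwise make `load_pairs` drop a pair without warning.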

brain/eval/score.py Normal file

@@ -0,0 +1,131 @@
#!/usr/bin/env python3
"""Score brain_query against the qa-2026-05.md eval set.

Reads `q:` / `expected:` pairs, calls brain_query MCP for each, records
top-1 + top-3 hit rate. Run:

    BRAIN_MCP_TOKEN=$(grep '^export BRAIN_MCP_TOKEN=' ~/.llmkeys | cut -d= -f2-) \\
        python3 score.py qa-2026-05.md

Optionally pass --baseline <name> to save the result as a labeled run.
"""
import argparse
import json
import os
import re
import sys
import time
import urllib.request

ENDPOINT = "https://brain-mcp.d-ma.be/mcp"


def load_pairs(path):
    # Pair each `q:` line with the next `expected:` line that follows it.
    pairs = []
    q = None
    with open(path) as f:
        for line in f:
            line = line.rstrip()
            if line.startswith("q:"):
                q = line[2:].strip()
            elif line.startswith("expected:") and q is not None:
                expected = line[len("expected:"):].strip()
                pairs.append((q, expected))
                q = None
    return pairs


def brain_query(token, query, k=5):
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": "brain_query", "arguments": {"query": query, "k": k}},
    }).encode()
    req = urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as r:
        raw = r.read().decode()
    # The server may answer with SSE framing; keep the first `data:` payload.
    for line in raw.splitlines():
        if line.startswith("data:"):
            raw = line[5:].strip()
            break
    d = json.loads(raw)
    if "error" in d:
        raise RuntimeError(d["error"])
    text = d["result"]["content"][0]["text"]
    return json.loads(text).get("results", [])


def slug_of(result):
    # `title` mirrors the slug in brain_entities for normal entries.
    # Fall back to basename(path) if title is missing.
    t = result.get("title", "")
    if t:
        return t
    p = result.get("path", "")
    return re.sub(r"\.md$", "", os.path.basename(p))


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("evalset")
    ap.add_argument("--baseline", default="run")
    ap.add_argument("--k", type=int, default=5)
    args = ap.parse_args()
    token = os.environ.get("BRAIN_MCP_TOKEN")
    if not token:
        sys.exit("BRAIN_MCP_TOKEN not set")
    pairs = load_pairs(args.evalset)
    if not pairs:
        sys.exit(f"no pairs in {args.evalset}")
    print(f"# {args.baseline} — {len(pairs)} questions, k={args.k}")
    print()
    hits1 = 0
    hits3 = 0
    detail = []
    for q, expected in pairs:
        try:
            results = brain_query(token, q, k=args.k)
        except Exception as e:
            # A failed call is recorded with a string in place of the rank.
            detail.append((q, expected, [], f"ERR {e}"))
            continue
        slugs = [slug_of(r) for r in results]
        # rank is 1-based; 0 means the expected slug was not returned.
        rank = slugs.index(expected) + 1 if expected in slugs else 0
        h1 = 1 if rank == 1 else 0
        h3 = 1 if 0 < rank <= 3 else 0
        hits1 += h1
        hits3 += h3
        detail.append((q, expected, slugs, rank))
    total = len(pairs)
    print(f"top-1 hit rate: {hits1}/{total} = {100*hits1/total:.0f}%")
    print(f"top-3 hit rate: {hits3}/{total} = {100*hits3/total:.0f}%")
    print()
    print("## per-question detail")
    print()
    for q, expected, slugs, rank in detail:
        marker = {0: "✗", 1: "★", 2: "·", 3: "·"}.get(rank, "?")
        if isinstance(rank, str):
            marker = "!"
        print(f"{marker} rank={rank} expected={expected}")
        print(f" q: {q}")
        for i, s in enumerate(slugs[:args.k], 1):
            mark = " <-- expected" if s == expected else ""
            print(f" {i}. {s}{mark}")
        print()


if __name__ == "__main__":
    main()
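The response handling in `brain_query` accepts either a plain JSON body or a server-sent-events body whose first `data:` line carries the JSON-RPC envelope. The sketch below isolates that step so it can be exercised without a network call; `unwrap_mcp_response` is a hypothetical name, and both sample payloads are made up for illustration rather than captured from the live endpoint.

```python
import json


def unwrap_mcp_response(raw):
    """Extract the tool's result list from a JSON or SSE-framed reply."""
    for line in raw.splitlines():
        if line.startswith("data:"):
            raw = line[5:].strip()  # keep only the first SSE data payload
            break
    envelope = json.loads(raw)
    if "error" in envelope:
        raise RuntimeError(envelope["error"])
    # The tool result is itself JSON, carried as text in the first content item.
    text = envelope["result"]["content"][0]["text"]
    return json.loads(text).get("results", [])


# Illustrative payloads (assumed shapes, not real server output).
inner = json.dumps({"results": [{"title": "homelab-core-glossary"}]})
plain = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": inner}]},
})
sse = "event: message\ndata: " + plain + "\n\n"

print(unwrap_mcp_response(plain))
print(unwrap_mcp_response(sse))
```

Only the first `data:` line is read, so a reply split across several SSE events would need more than this.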

cmd/hyperguild/README.md Normal file

@@ -0,0 +1,140 @@
# hyperguild CLI
A small Go binary for tier probing, brain HTTP REST access, and
`.mcp.json` mode bootstrap. Replaces the supervisor's `tier` MCP and
gives shell scripts a stable interface to the brain.
## Install
```bash
task hyperguild:install
# or: go install ./cmd/hyperguild
```
The binary lands at `$(go env GOBIN)/hyperguild` (typically
`~/go/bin/hyperguild`). Make sure that's on your PATH.
## Subcommands
### `hyperguild tier`
Probes Anthropic and LiteLLM and reports the current operating tier.
```bash
$ hyperguild tier
tier 1 (full-online) managed_agents=true
$ hyperguild tier --json
{
"tier": 1,
"label": "full-online",
"available_models": null,
"managed_agents": true
}
```
Probe URLs are read from environment:
| Var | Default |
|-----------------------|-------------------------------|
| `ANTHROPIC_PROBE_URL` | `https://api.anthropic.com` |
| `LITELLM_BASE_URL` | (empty → falls through to airplane) |
### `hyperguild brain query <topic>`
BM25 search over the brain's knowledge + wiki entries.
```bash
$ hyperguild brain query "find -H symlink"
knowledge/2026-05-03-find-h-not-l-symlinked-root.md score=12 Use find -H, not find -L
...
```
Flags:
- `--limit N` — max results (default 5)
- `--json` — emit the raw response envelope
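For callers that cannot shell out to the binary, the same search can be sketched directly against the REST endpoint. The request shape (POST `/query` with a JSON body of `query` and `limit`) and the response fields follow `cmd/hyperguild/http.go` in this commit; the function names and the sample response are illustrative assumptions.

```python
import json
import os
import urllib.request


def build_query_request(topic, limit=5, base_url=None):
    """Build (but do not send) the POST /query request the CLI issues."""
    base = base_url or os.environ.get("BRAIN_URL") or "http://koala:30330"
    body = json.dumps({"query": topic, "limit": limit}).encode()
    return urllib.request.Request(
        base + "/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def format_hits(response):
    """Render a decoded /query response the way the human output does."""
    return [
        f"{h['path']} score={h['score']} {h['title']}"
        for h in response.get("results", [])
    ]


req = build_query_request("find -H symlink", limit=3, base_url="http://example.invalid")
# Made-up response for illustration only.
sample = {"results": [{"path": "knowledge/a.md", "title": "A", "excerpt": "...", "score": 9}]}
print(req.get_method(), req.full_url)
print("\n".join(format_hits(sample)))
```

Sending the request is a matter of passing `req` to `urllib.request.urlopen` with a timeout; the CLI itself uses a 5 second client timeout.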
### `hyperguild brain write <type> <slug>`
Reads markdown from stdin, writes a knowledge entry.
```bash
$ cat <<EOF | hyperguild brain write knowledge example-lesson
# Example lesson
## Lesson
...
EOF
knowledge/example-lesson.md
```
### `hyperguild brain pass-rate <skill>`
Returns the pass rate for a skill over a lookback window. Computed
on-demand from `brain/sessions/*.jsonl`.
```bash
$ hyperguild brain pass-rate tdd
tdd: 47 / 50 = 94% (window: 7d)
$ hyperguild brain pass-rate tdd --window 30d --json
{
"skill": "tdd",
"window": "30d",
"pass": 142,
"fail": 8,
"skip": 5,
"total": 155,
"pass_rate": 0.9467
}
```
Flags:
- `--window` — lookback window (default `7d`; accepts `Nh`, `Nd`)
- `--json` — emit the raw response envelope
Skills with no logged invocations return zero counts and `pass_rate: null`
(indicating "no data", distinct from "always passes").
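The arithmetic behind the example can be sketched as follows. This is an inference from the numbers shown, not the server's code: 142 / (142 + 8) ≈ 0.9467 matches the sample `pass_rate`, whereas 142 / 155 would give about 0.916, so skipped runs appear to be left out of the denominator. The helper names, the four-decimal rounding, and the strictness of the window parser are assumptions.

```python
import re
from datetime import timedelta


def parse_window(window):
    """Parse a lookback window of the form Nh or Nd into a timedelta."""
    m = re.fullmatch(r"(\d+)([hd])", window)
    if not m:
        raise ValueError(f"bad window: {window!r}")
    n = int(m.group(1))
    return timedelta(hours=n) if m.group(2) == "h" else timedelta(days=n)


def pass_rate(passed, failed):
    """Return passed / (passed + failed), or None when there is no data.

    Assumption: skipped invocations are excluded from the denominator.
    """
    decided = passed + failed
    if decided == 0:
        return None
    return round(passed / decided, 4)


print(parse_window("30d"))
print(pass_rate(142, 8))
print(pass_rate(0, 0))
```

Returning `None` rather than `0.0` keeps "no data" distinguishable from "always fails" once the value is serialized to JSON `null`.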
### `hyperguild mode <cloud|client-local|sovereign>`
Writes a `.mcp.json` template for the chosen operating mode.
```bash
$ hyperguild mode cloud --out ./.mcp.json
wrote ./.mcp.json (mode: cloud)
```
Flags:
- `--out PATH` — output file (default `./.mcp.json`)
- `--force` — overwrite an existing file
Modes:
- **cloud** — brain MCP only. Claude Code with no routing.
- **client-local** — brain + routing pod. The `routing` entry points at
`koala:30310/mcp` (the routing pod, deployed in Plan 6). The
`X-Hyperguild-Mode: client-local` header is forward-compat for future
modes; the pod treats absent or unknown values as `client-local`.
- **sovereign** — brain only, with a `_mode_note` explaining that this
mode primarily uses Crush + LiteLLM and the `.mcp.json` is a Claude
Code fallback for emergency offline use.
## Environment
| Var | Default | Used by |
|-----------------------|--------------------------|---------------------|
| `BRAIN_URL` | `http://koala:30330` | `brain *`, `mode *` |
| `ANTHROPIC_PROBE_URL` | `https://api.anthropic.com` | `tier` |
| `LITELLM_BASE_URL` | (empty) | `tier` |
Override `BRAIN_URL` if your brain pod is at a different Tailscale name
or port.
## See also
- `docs/superpowers/specs/2026-05-03-hyperguild-cli-design.md` — full spec
- `docs/superpowers/plans/2026-05-03-hyperguild-cli.md` — implementation plan

cmd/hyperguild/brain.go Normal file

@@ -0,0 +1,106 @@
package main

import (
	"context"
	"encoding/json"
	"errors"
	"flag"
	"fmt"
	"io"
)

func runBrain(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error {
	if len(args) == 0 {
		return errors.New("subcommand required (query|write|pass-rate)")
	}
	switch args[0] {
	case "query":
		return runBrainQuery(ctx, args[1:], stdin, stdout, stderr)
	case "write":
		return runBrainWrite(ctx, args[1:], stdin, stdout, stderr)
	case "pass-rate":
		return runBrainPassRate(ctx, args[1:], stdin, stdout, stderr)
	default:
		return fmt.Errorf("unknown subcommand: %s (expected query|write|pass-rate)", args[0])
	}
}

func runBrainQuery(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
	fs := flag.NewFlagSet("brain query", flag.ContinueOnError)
	fs.SetOutput(stderr)
	asJSON := fs.Bool("json", false, "output JSON instead of human-readable")
	limit := fs.Int("limit", 5, "maximum number of results")
	if err := fs.Parse(args); err != nil {
		return fmt.Errorf("parse flags: %w", err)
	}
	if fs.NArg() < 1 {
		return errors.New("topic required")
	}
	topic := fs.Arg(0)
	res, err := newBrainClient().Query(ctx, topic, *limit)
	if err != nil {
		return err
	}
	if *asJSON {
		enc := json.NewEncoder(stdout)
		enc.SetIndent("", " ")
		return enc.Encode(res)
	}
	for _, hit := range res.Results {
		fmt.Fprintf(stdout, "%s score=%d %s\n", hit.Path, hit.Score, hit.Title) //nolint:errcheck
	}
	return nil
}

func runBrainWrite(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error {
	fs := flag.NewFlagSet("brain write", flag.ContinueOnError)
	fs.SetOutput(stderr)
	if err := fs.Parse(args); err != nil {
		return fmt.Errorf("parse flags: %w", err)
	}
	if fs.NArg() < 2 {
		return errors.New("type and slug required (e.g. brain write knowledge my-slug)")
	}
	kind := fs.Arg(0)
	slug := fs.Arg(1)
	res, err := newBrainClient().Write(ctx, kind, slug, stdin)
	if err != nil {
		return err
	}
	fmt.Fprintln(stdout, res.Path) //nolint:errcheck
	return nil
}

func runBrainPassRate(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
	fs := flag.NewFlagSet("brain pass-rate", flag.ContinueOnError)
	fs.SetOutput(stderr)
	asJSON := fs.Bool("json", false, "output JSON instead of human-readable")
	window := fs.String("window", "7d", "lookback window (e.g. 1h, 24h, 7d, 30d)")
	if err := fs.Parse(args); err != nil {
		return fmt.Errorf("parse flags: %w", err)
	}
	if fs.NArg() < 1 {
		return errors.New("skill required")
	}
	skill := fs.Arg(0)
	res, err := newBrainClient().PassRate(ctx, skill, *window)
	if err != nil {
		return err
	}
	if *asJSON {
		enc := json.NewEncoder(stdout)
		enc.SetIndent("", " ")
		return enc.Encode(res)
	}
	if res.PassRate == nil {
		fmt.Fprintf(stdout, "%s: no data (window: %s)\n", res.Skill, res.Window) //nolint:errcheck
		return nil
	}
	fmt.Fprintf(stdout, "%s: %d / %d = %.0f%% (window: %s)\n", res.Skill, res.Pass, res.Total, *res.PassRate*100, res.Window) //nolint:errcheck
	return nil
}


@@ -0,0 +1,220 @@
package main
import (
"bytes"
"context"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func brainQueryServer(t *testing.T, body string) *httptest.Server {
t.Helper()
return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(body))
}))
}
func TestRunBrainQuery_Human(t *testing.T) {
srv := brainQueryServer(t, `{"results":[{"path":"knowledge/a.md","title":"A","excerpt":"...","score":9},{"path":"knowledge/b.md","title":"B","excerpt":"...","score":3}]}`)
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"query", "topic"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
got := out.String()
assert.Contains(t, got, "knowledge/a.md")
assert.Contains(t, got, "score=9")
assert.Contains(t, got, "knowledge/b.md")
}
func TestRunBrainQuery_JSON(t *testing.T) {
srv := brainQueryServer(t, `{"results":[{"path":"x.md","title":"X","excerpt":"e","score":5}]}`)
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"query", "--json", "topic"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Contains(t, out.String(), `"path": "x.md"`)
assert.Contains(t, out.String(), `"score": 5`)
}
func TestRunBrainQuery_Limit(t *testing.T) {
gotLimit := -1
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
var p struct {
Query string `json:"query"`
Limit int `json:"limit"`
}
_ = json.Unmarshal(body, &p)
gotLimit = p.Limit
_, _ = w.Write([]byte(`{"results":[]}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"query", "--limit", "12", "topic"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Equal(t, 12, gotLimit)
}
func TestRunBrainQuery_MissingTopic(t *testing.T) {
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"query"}, strings.NewReader(""), &out, &errBuf)
assert.Error(t, err)
}
func TestRunBrain_NoSubsubcommand(t *testing.T) {
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
assert.Error(t, err)
assert.Contains(t, err.Error(), "subcommand required")
}
func TestRunBrain_UnknownSubsubcommand(t *testing.T) {
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"bogus"}, strings.NewReader(""), &out, &errBuf)
assert.Error(t, err)
}
func TestRunBrainWrite_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, http.MethodPost, r.Method)
assert.Equal(t, "/write", r.URL.Path)
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"path":"knowledge/test-slug.md"}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(
context.Background(),
[]string{"write", "knowledge", "test-slug"},
strings.NewReader("# Test\n\nSome body content.\n"),
&out, &errBuf,
)
require.NoError(t, err)
assert.Contains(t, out.String(), "knowledge/test-slug.md")
}
func TestRunBrainWrite_MissingArgs(t *testing.T) {
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"write", "knowledge"}, strings.NewReader("x"), &out, &errBuf)
assert.Error(t, err)
assert.Contains(t, err.Error(), "type and slug required")
}
func TestRunBrainWrite_BackendError(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusBadRequest)
_, _ = w.Write([]byte("invalid slug"))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(
context.Background(),
[]string{"write", "knowledge", "bad slug"},
strings.NewReader("body"),
&out, &errBuf,
)
assert.Error(t, err)
assert.Contains(t, err.Error(), "400")
}
func TestRunBrainWrite_EmptyStdin(t *testing.T) {
gotLen := -1
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
var p struct {
Content string `json:"content"`
}
_ = json.Unmarshal(body, &p)
gotLen = len(p.Content)
_, _ = w.Write([]byte(`{"path":"x.md"}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"write", "knowledge", "empty"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Equal(t, 0, gotLen, "empty stdin should produce empty content payload")
}
func TestRunBrainPassRate_Human(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":47,"fail":3,"skip":0,"total":50,"pass_rate":0.94}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"pass-rate", "tdd"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
got := out.String()
assert.Contains(t, got, "tdd")
assert.Contains(t, got, "47 / 50")
assert.Contains(t, got, "94%")
}
func TestRunBrainPassRate_NoData(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"pass-rate", "tdd"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Contains(t, out.String(), "no data")
}
func TestRunBrainPassRate_JSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":47,"fail":3,"skip":0,"total":50,"pass_rate":0.94}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"pass-rate", "--json", "tdd"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Contains(t, out.String(), `"pass_rate": 0.94`)
}
func TestRunBrainPassRate_MissingSkill(t *testing.T) {
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"pass-rate"}, strings.NewReader(""), &out, &errBuf)
assert.Error(t, err)
assert.Contains(t, err.Error(), "skill required")
}
func TestRunBrainPassRate_WindowFlag(t *testing.T) {
gotWindow := ""
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotWindow = r.URL.Query().Get("window")
_, _ = w.Write([]byte(`{"skill":"tdd","window":"30d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`))
}))
defer srv.Close()
t.Setenv("BRAIN_URL", srv.URL)
var out, errBuf bytes.Buffer
err := runBrain(context.Background(), []string{"pass-rate", "--window", "30d", "tdd"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Equal(t, "30d", gotWindow)
}

cmd/hyperguild/http.go Normal file

@@ -0,0 +1,159 @@
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"time"
)

const defaultBrainURL = "http://koala:30330"

// brainClient calls the brain HTTP REST API exposed alongside the MCP
// endpoint at the same host:port. /mcp serves MCP framing; /query and /write
// serve plain REST. We use the REST surface because the CLI is a
// shell-friendly client; MCP framing is unnecessary.
type brainClient struct {
	baseURL string
	http    *http.Client
}

func newBrainClient() *brainClient {
	u := os.Getenv("BRAIN_URL")
	if u == "" {
		u = defaultBrainURL
	}
	return &brainClient{
		baseURL: u,
		http:    &http.Client{Timeout: 5 * time.Second},
	}
}

// QueryHit mirrors a single result from the brain's /query endpoint.
type QueryHit struct {
	Path    string `json:"path"`
	Title   string `json:"title"`
	Excerpt string `json:"excerpt"`
	Score   int    `json:"score"`
}

// QueryResult mirrors the /query response envelope.
type QueryResult struct {
	Results []QueryHit `json:"results"`
}

func (c *brainClient) Query(ctx context.Context, topic string, limit int) (*QueryResult, error) {
	payload, err := json.Marshal(struct {
		Query string `json:"query"`
		Limit int    `json:"limit"`
	}{Query: topic, Limit: limit})
	if err != nil {
		return nil, fmt.Errorf("marshal payload: %w", err)
	}
	u := c.baseURL + "/query"
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, u, bytes.NewReader(payload))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := c.http.Do(req)
	if err != nil {
		return nil, fmt.Errorf("brain POST /query: %w", err)
	}
	defer func() { _ = resp.Body.Close() }()
	if resp.StatusCode != http.StatusOK {
		body, _ := io.ReadAll(resp.Body)
		return nil, fmt.Errorf("brain POST /query: status %d: %s", resp.StatusCode, string(body))
	}
	var out QueryResult
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("decode /query response: %w", err)
	}
	return &out, nil
}

// WriteResult mirrors the /write response envelope.
type WriteResult struct {
	Path string `json:"path"`
}

func (c *brainClient) Write(ctx context.Context, kind, slug string, content io.Reader) (*WriteResult, error) {
	body, err := io.ReadAll(content)
	if err != nil {
		return nil, fmt.Errorf("read content: %w", err)
	}
	payload, err := json.Marshal(struct {
		Type    string `json:"type"`
		Slug    string `json:"slug"`
		Content string `json:"content"`
	}{Type: kind, Slug: slug, Content: string(body)})
	if err != nil {
		return nil, fmt.Errorf("marshal payload: %w", err)
	}
	u := c.baseURL + "/write"
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, u, bytes.NewReader(payload))
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := c.http.Do(req)
	if err != nil {
		return nil, fmt.Errorf("brain POST /write: %w", err)
	}
	defer func() { _ = resp.Body.Close() }()
	if resp.StatusCode != http.StatusOK {
		respBody, _ := io.ReadAll(resp.Body)
		return nil, fmt.Errorf("brain POST /write: status %d: %s", resp.StatusCode, string(respBody))
	}
	var out WriteResult
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("decode /write response: %w", err)
	}
	return &out, nil
}

// PassRateResult mirrors the /pass-rate response envelope.
type PassRateResult struct {
	Skill    string   `json:"skill"`
	Window   string   `json:"window"`
	Pass     int      `json:"pass"`
	Fail     int      `json:"fail"`
	Skip     int      `json:"skip"`
	Total    int      `json:"total"`
	PassRate *float64 `json:"pass_rate"`
}

func (c *brainClient) PassRate(ctx context.Context, skill, window string) (*PassRateResult, error) {
	q := url.Values{}
	q.Set("skill", skill)
	q.Set("window", window)
	u := c.baseURL + "/pass-rate?" + q.Encode()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
	if err != nil {
		return nil, fmt.Errorf("build request: %w", err)
	}
	resp, err := c.http.Do(req)
	if err != nil {
		return nil, fmt.Errorf("brain GET /pass-rate: %w", err)
	}
	defer func() { _ = resp.Body.Close() }()
	if resp.StatusCode != http.StatusOK {
		body, _ := io.ReadAll(resp.Body)
		return nil, fmt.Errorf("brain GET /pass-rate: status %d: %s", resp.StatusCode, string(body))
	}
	var out PassRateResult
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("decode /pass-rate response: %w", err)
	}
	return &out, nil
}
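Unlike `/query` and `/write`, the `/pass-rate` call is a GET with its parameters in the query string. A rough Python equivalent of the URL construction and of the non-200 error text is sketched below; `pass_rate_url` and `status_error` are hypothetical names, and the sketch mirrors only what `http.go` above shows, not the server side.

```python
from urllib.parse import urlencode


def pass_rate_url(base_url, skill, window="7d"):
    """Build the GET /pass-rate URL with escaped query parameters."""
    query = urlencode({"skill": skill, "window": window})
    return f"{base_url}/pass-rate?{query}"


def status_error(method, path, status, body):
    """Format a non-200 reply in the same shape as the client's errors."""
    return f"brain {method} {path}: status {status}: {body}"


print(pass_rate_url("http://koala:30330", "tdd", "30d"))
print(status_error("POST", "/write", 400, "invalid slug"))
```

Including the status code in the message is what lets the tests in this commit assert on substrings such as "400" and "500".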

cmd/hyperguild/http_test.go Normal file

@@ -0,0 +1,131 @@
package main
import (
"context"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestBrainClient_Query_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, http.MethodPost, r.Method)
assert.Equal(t, "/query", r.URL.Path)
body, _ := io.ReadAll(r.Body)
var got struct {
Query string `json:"query"`
Limit int `json:"limit"`
}
require.NoError(t, json.Unmarshal(body, &got))
assert.Equal(t, "find-h", got.Query)
assert.Equal(t, 3, got.Limit)
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"results":[{"path":"knowledge/x.md","title":"x","excerpt":"...","score":7}]}`))
}))
defer srv.Close()
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
res, err := c.Query(context.Background(), "find-h", 3)
require.NoError(t, err)
require.Len(t, res.Results, 1)
assert.Equal(t, "knowledge/x.md", res.Results[0].Path)
assert.Equal(t, 7, res.Results[0].Score)
}
func TestBrainClient_Query_TransportError(t *testing.T) {
c := &brainClient{baseURL: "http://127.0.0.1:1", http: http.DefaultClient}
_, err := c.Query(context.Background(), "x", 5)
assert.Error(t, err)
}
func TestBrainClient_Query_Non200(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusInternalServerError)
_, _ = w.Write([]byte("boom"))
}))
defer srv.Close()
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
_, err := c.Query(context.Background(), "x", 5)
require.Error(t, err)
assert.Contains(t, err.Error(), "500")
}
func TestBrainClient_Write_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, "/write", r.URL.Path)
assert.Equal(t, http.MethodPost, r.Method)
body, _ := io.ReadAll(r.Body)
var got struct {
Type string `json:"type"`
Slug string `json:"slug"`
Content string `json:"content"`
}
require.NoError(t, json.Unmarshal(body, &got))
assert.Equal(t, "knowledge", got.Type)
assert.Equal(t, "find-h", got.Slug)
assert.Equal(t, "# body\n", got.Content)
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"path":"knowledge/find-h.md"}`))
}))
defer srv.Close()
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
res, err := c.Write(context.Background(), "knowledge", "find-h", strings.NewReader("# body\n"))
require.NoError(t, err)
assert.Equal(t, "knowledge/find-h.md", res.Path)
}
func TestBrainClient_PassRate_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, http.MethodGet, r.Method)
assert.Equal(t, "/pass-rate", r.URL.Path)
assert.Equal(t, "tdd", r.URL.Query().Get("skill"))
assert.Equal(t, "7d", r.URL.Query().Get("window"))
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":47,"fail":3,"skip":0,"total":50,"pass_rate":0.94}`))
}))
defer srv.Close()
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
res, err := c.PassRate(context.Background(), "tdd", "7d")
require.NoError(t, err)
assert.Equal(t, "tdd", res.Skill)
assert.Equal(t, 47, res.Pass)
assert.Equal(t, 3, res.Fail)
require.NotNil(t, res.PassRate)
assert.InDelta(t, 0.94, *res.PassRate, 0.001)
}
func TestBrainClient_PassRate_NullRate(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"skill":"tdd","window":"7d","pass":0,"fail":0,"skip":0,"total":0,"pass_rate":null}`))
}))
defer srv.Close()
c := &brainClient{baseURL: srv.URL, http: srv.Client()}
res, err := c.PassRate(context.Background(), "tdd", "7d")
require.NoError(t, err)
assert.Nil(t, res.PassRate)
}
func TestNewBrainClient_DefaultURL(t *testing.T) {
t.Setenv("BRAIN_URL", "")
c := newBrainClient()
assert.Equal(t, "http://koala:30330", c.baseURL)
}
func TestNewBrainClient_OverrideURL(t *testing.T) {
t.Setenv("BRAIN_URL", "http://localhost:9999")
c := newBrainClient()
assert.Equal(t, "http://localhost:9999", c.baseURL)
}

71
cmd/hyperguild/main.go Normal file

@@ -0,0 +1,71 @@
// Package main implements the hyperguild CLI: tier probe, brain HTTP REST
// access, and .mcp.json mode bootstrap. See docs/superpowers/specs/.
package main
import (
"context"
"fmt"
"io"
"os"
)
// subcommand is the contract every hyperguild subcommand satisfies.
// Functions take an explicit context, args (without the subcommand name
// itself), and explicit IO so tests can exercise full flows without
// touching os.Stdin / os.Stdout / os.Exit.
type subcommand func(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error
func subcommands() map[string]subcommand {
return map[string]subcommand{
"tier": runTier,
"brain": runBrain,
"mode": runMode,
}
}
const usage = `Usage: hyperguild <subcommand> [options]
Subcommands:
tier Probe Anthropic + LiteLLM, print current operating tier.
brain query <q> BM25 search the brain (HTTP REST).
brain write <t> <s>
Write stdin as a knowledge entry of type <t>, slug <s>.
mode <name> Bootstrap .mcp.json for a chosen mode:
cloud | client-local | sovereign
Environment:
BRAIN_URL Brain HTTP REST + MCP base URL.
Default: http://koala:30330
ANTHROPIC_PROBE_URL Tier probe URL for the Anthropic API.
Default: https://api.anthropic.com
LITELLM_BASE_URL Tier probe URL for the LiteLLM gateway.
Optional; if empty, falls through to airplane tier.
`
// dispatch routes args to a subcommand and returns the process exit code.
// Split from main() so tests can drive it without process exit.
func dispatch(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) int {
if len(args) == 0 {
fmt.Fprint(stderr, usage) //nolint:errcheck
return 2
}
switch args[0] {
case "-h", "--help", "help":
fmt.Fprint(stdout, usage) //nolint:errcheck
return 0
}
cmd, ok := subcommands()[args[0]]
if !ok {
fmt.Fprintf(stderr, "hyperguild: unknown subcommand: %s\n%s", args[0], usage) //nolint:errcheck
return 2
}
if err := cmd(ctx, args[1:], stdin, stdout, stderr); err != nil {
fmt.Fprintf(stderr, "hyperguild %s: %v\n", args[0], err) //nolint:errcheck
return 1
}
return 0
}
func main() {
os.Exit(dispatch(context.Background(), os.Args[1:], os.Stdin, os.Stdout, os.Stderr))
}


@@ -0,0 +1,45 @@
package main
import (
"bytes"
"context"
"strings"
"testing"
"github.com/stretchr/testify/assert"
)
func TestDispatch_Help_PrintsUsageAndReturnsZero(t *testing.T) {
var out, errBuf bytes.Buffer
code := dispatch(context.Background(), []string{"--help"}, strings.NewReader(""), &out, &errBuf)
assert.Equal(t, 0, code)
assert.Contains(t, out.String(), "Usage: hyperguild")
assert.Contains(t, out.String(), "tier")
assert.Contains(t, out.String(), "brain")
assert.Contains(t, out.String(), "mode")
}
func TestDispatch_NoArgs_PrintsUsageAndReturnsTwo(t *testing.T) {
var out, errBuf bytes.Buffer
code := dispatch(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
assert.Equal(t, 2, code)
assert.Contains(t, errBuf.String(), "Usage: hyperguild")
}
func TestDispatch_UnknownSubcommand_ReturnsTwo(t *testing.T) {
var out, errBuf bytes.Buffer
code := dispatch(context.Background(), []string{"bogus"}, strings.NewReader(""), &out, &errBuf)
assert.Equal(t, 2, code)
assert.Contains(t, errBuf.String(), "unknown subcommand: bogus")
}
func TestDispatch_KnownSubcommand_RoutesToHandler(t *testing.T) {
// "mode" without args fails → exit 1, message on stderr.
// (Confirms dispatch reached the handler rather than printing "unknown
// subcommand: mode".)
var out, errBuf bytes.Buffer
code := dispatch(context.Background(), []string{"mode"}, strings.NewReader(""), &out, &errBuf)
assert.Equal(t, 1, code)
assert.Contains(t, errBuf.String(), "name required")
assert.NotContains(t, errBuf.String(), "unknown subcommand")
}

101
cmd/hyperguild/mode.go Normal file

@@ -0,0 +1,101 @@
package main
import (
"context"
"encoding/json"
"errors"
"flag"
"fmt"
"io"
"os"
)
func runMode(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
fs := flag.NewFlagSet("mode", flag.ContinueOnError)
fs.SetOutput(stderr)
out := fs.String("out", ".mcp.json", "output file path")
force := fs.Bool("force", false, "overwrite an existing file")
// Pull the first positional (mode name) out so flags after it still parse
// with stdlib flag (which stops at the first non-flag arg).
if len(args) < 1 {
return errors.New("name required (cloud|client-local|sovereign)")
}
name := args[0]
if err := fs.Parse(args[1:]); err != nil {
return fmt.Errorf("parse flags: %w", err)
}
brainURL := os.Getenv("BRAIN_URL")
if brainURL == "" {
brainURL = defaultBrainURL
}
var doc map[string]any
switch name {
case "cloud":
doc = modeCloud(brainURL)
case "client-local":
doc = modeClientLocal(brainURL)
case "sovereign":
doc = modeSovereign(brainURL)
default:
return fmt.Errorf("unknown mode: %s (expected cloud|client-local|sovereign)", name)
}
if !*force {
if _, err := os.Stat(*out); err == nil {
return fmt.Errorf("%s exists (use --force to overwrite)", *out)
}
}
body, err := json.MarshalIndent(doc, "", " ")
if err != nil {
return fmt.Errorf("marshal mode doc: %w", err)
}
if err := os.WriteFile(*out, append(body, '\n'), 0o644); err != nil {
return fmt.Errorf("write %s: %w", *out, err)
}
fmt.Fprintf(stdout, "wrote %s (mode: %s)\n", *out, name) //nolint:errcheck
return nil
}
func modeCloud(brainURL string) map[string]any {
return map[string]any{
"mcpServers": map[string]any{
"brain": map[string]any{
"url": brainURL + "/mcp",
"description": "Brain MCP — knowledge query, write, ingestion, session log",
},
},
}
}
func modeClientLocal(brainURL string) map[string]any {
return map[string]any{
"mcpServers": map[string]any{
"brain": map[string]any{
"url": brainURL + "/mcp",
"description": "Brain MCP — knowledge query, write, ingestion, session log",
},
"routing": map[string]any{
"url": "http://koala:30310/mcp",
"description": "Mode 2 routing pod — routes skill calls to LiteLLM/local",
"headers": map[string]any{
"X-Hyperguild-Mode": "client-local",
},
},
},
}
}
func modeSovereign(brainURL string) map[string]any {
return map[string]any{
"_mode_note": "Sovereign mode primarily uses Crush + LiteLLM. This .mcp.json is provided as Claude Code fallback (e.g. emergency offline editing).",
"mcpServers": map[string]any{
"brain": map[string]any{
"url": brainURL + "/mcp",
"description": "Brain MCP — knowledge query, write, ingestion, session log",
},
},
}
}
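
Reviewer note: the comment in `runMode` about pulling the first positional out before `fs.Parse` is the key trick here. The stdlib `flag` package stops parsing at the first non-flag argument, so `mode cloud --force` would otherwise leave `--force` unparsed. A minimal standalone sketch of the same pattern follows; `modeArgs` and `parseModeArgs` are hypothetical names introduced for illustration, and only the `-out` and `-force` flags are taken from the diff.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// modeArgs is a hypothetical container for the parsed command line.
type modeArgs struct {
	Name  string
	Out   string
	Force bool
}

// parseModeArgs peels off the leading positional (the mode name) and hands
// only the remainder to the FlagSet, so flags placed after the name parse.
func parseModeArgs(args []string) (modeArgs, error) {
	if len(args) < 1 {
		return modeArgs{}, errors.New("name required")
	}
	fs := flag.NewFlagSet("mode", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet; the diff writes to stderr
	out := fs.String("out", ".mcp.json", "output file path")
	force := fs.Bool("force", false, "overwrite an existing file")
	if err := fs.Parse(args[1:]); err != nil {
		return modeArgs{}, fmt.Errorf("parse flags: %w", err)
	}
	return modeArgs{Name: args[0], Out: *out, Force: *force}, nil
}

func main() {
	parsed, err := parseModeArgs([]string{"cloud", "--out", "/tmp/x.json", "--force"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("name=%s out=%s force=%t\n", parsed.Name, parsed.Out, parsed.Force)
}
```

One consequence of this design is that the name must come first: with `mode --force cloud`, the string `--force` would be taken as the mode name and rejected as unknown.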

148
cmd/hyperguild/mode_test.go Normal file

@@ -0,0 +1,148 @@
package main
import (
"bytes"
"context"
"encoding/json"
"os"
"path/filepath"
"strings"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func readJSON(t *testing.T, path string) map[string]any {
t.Helper()
b, err := os.ReadFile(path)
require.NoError(t, err)
var out map[string]any
require.NoError(t, json.Unmarshal(b, &out))
return out
}
func TestRunMode_Cloud_Default(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
t.Setenv("BRAIN_URL", "http://koala:30330")
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"cloud", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
got := readJSON(t, outPath)
servers, ok := got["mcpServers"].(map[string]any)
require.True(t, ok, "mcpServers must be a JSON object")
assert.Contains(t, servers, "brain")
assert.NotContains(t, servers, "routing")
assert.NotContains(t, got, "_mode_note")
}
func TestRunMode_ClientLocal_HasRoutingEntry(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
t.Setenv("BRAIN_URL", "http://koala:30330")
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"client-local", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
got := readJSON(t, outPath)
servers := got["mcpServers"].(map[string]any)
require.Contains(t, servers, "brain")
require.Contains(t, servers, "routing")
routing := servers["routing"].(map[string]any)
assert.NotContains(t, routing, "_routing_pending", "placeholder should be removed once Plan 6 ships")
headers, ok := routing["headers"].(map[string]any)
require.True(t, ok, "routing entry should have headers block")
assert.Equal(t, "client-local", headers["X-Hyperguild-Mode"])
}
func TestModeClientLocalHasRoutingHeader(t *testing.T) {
tmp := t.TempDir() + "/mcp.json"
out := &bytes.Buffer{}
stderr := &bytes.Buffer{}
require.NoError(t, runMode(context.Background(), []string{"client-local", "--out", tmp}, nil, out, stderr))
body, err := os.ReadFile(tmp)
require.NoError(t, err)
var doc map[string]any
require.NoError(t, json.Unmarshal(body, &doc))
servers := doc["mcpServers"].(map[string]any)
routing := servers["routing"].(map[string]any)
assert.Equal(t, "http://koala:30310/mcp", routing["url"])
assert.NotContains(t, routing, "_routing_pending", "placeholder should be removed once Plan 6 ships")
headers, ok := routing["headers"].(map[string]any)
require.True(t, ok, "routing entry should have headers block")
assert.Equal(t, "client-local", headers["X-Hyperguild-Mode"])
}
func TestRunMode_Sovereign_HasModeNote(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"sovereign", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
got := readJSON(t, outPath)
assert.Contains(t, got, "_mode_note")
servers := got["mcpServers"].(map[string]any)
assert.Contains(t, servers, "brain")
assert.NotContains(t, servers, "routing")
}
func TestRunMode_DefaultsOutToCwd(t *testing.T) {
dir := t.TempDir()
t.Chdir(dir) // Go 1.24+ — replaces the older os.Chdir-with-cleanup pattern
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"cloud"}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
_, statErr := os.Stat(filepath.Join(dir, ".mcp.json"))
assert.NoError(t, statErr, ".mcp.json should exist in cwd")
}
func TestRunMode_UnknownMode(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"bogus", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
assert.Error(t, err)
assert.Contains(t, err.Error(), "unknown mode")
}
func TestRunMode_NoArgs(t *testing.T) {
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{}, strings.NewReader(""), &stdout, &stderr)
assert.Error(t, err)
}
func TestRunMode_RefusesToOverwrite(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
require.NoError(t, os.WriteFile(outPath, []byte(`{"existing":"file"}`), 0o644))
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"cloud", "--out", outPath}, strings.NewReader(""), &stdout, &stderr)
require.Error(t, err)
assert.Contains(t, err.Error(), "exists")
}
func TestRunMode_Force(t *testing.T) {
dir := t.TempDir()
outPath := filepath.Join(dir, ".mcp.json")
require.NoError(t, os.WriteFile(outPath, []byte(`{"existing":"file"}`), 0o644))
var stdout, stderr bytes.Buffer
err := runMode(context.Background(), []string{"cloud", "--out", outPath, "--force"}, strings.NewReader(""), &stdout, &stderr)
require.NoError(t, err)
got := readJSON(t, outPath)
assert.Contains(t, got, "mcpServers")
assert.NotContains(t, got, "existing")
}

42
cmd/hyperguild/tier.go Normal file

@@ -0,0 +1,42 @@
package main
import (
"context"
"encoding/json"
"flag"
"fmt"
"io"
"os"
"github.com/mathiasbq/supervisor/internal/tier"
)
const defaultAnthropicProbe = "https://api.anthropic.com"
func runTier(ctx context.Context, args []string, _ io.Reader, stdout, stderr io.Writer) error {
fs := flag.NewFlagSet("tier", flag.ContinueOnError)
fs.SetOutput(stderr)
asJSON := fs.Bool("json", false, "output JSON instead of human-readable")
if err := fs.Parse(args); err != nil {
return fmt.Errorf("parse flags: %w", err)
}
anthropicURL := os.Getenv("ANTHROPIC_PROBE_URL")
if anthropicURL == "" {
anthropicURL = defaultAnthropicProbe
}
liteLLMURL := os.Getenv("LITELLM_BASE_URL") // empty → tier falls through to airplane
info := tier.Detect(ctx, anthropicURL, liteLLMURL)
if *asJSON {
enc := json.NewEncoder(stdout)
enc.SetIndent("", " ")
if err := enc.Encode(info); err != nil {
return fmt.Errorf("encode json: %w", err)
}
return nil
}
fmt.Fprintf(stdout, "tier %d (%s) managed_agents=%t\n", int(info.Tier), info.Label, info.ManagedAgents) //nolint:errcheck
return nil
}
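
Reviewer note: `tier.Detect` lives in `internal/tier` and is not shown in this diff. The tests only pin three outcomes: both probes reachable gives tier 1 `full-online` with managed agents, Anthropic unreachable with LiteLLM reachable gives tier 2 `lan-only`, and no usable LiteLLM gives tier 3 `airplane`. The sketch below is one decision table consistent with those outcomes, not the actual implementation. In particular, treating "Anthropic reachable, LiteLLM down" as tier 1 and `ManagedAgents=false` for airplane are assumptions, and `tierInfo` and `classify` are hypothetical names.

```go
package main

import "fmt"

// tierInfo is a hypothetical stand-in for the tier.Info type used in the diff.
type tierInfo struct {
	Tier          int
	Label         string
	ManagedAgents bool
}

// classify maps two reachability results to a tier. The ordering (Anthropic
// first, then LiteLLM, else airplane) is an assumption inferred from the
// "falls through to airplane" comment, not confirmed by the source.
func classify(anthropicOK, liteLLMOK bool) tierInfo {
	switch {
	case anthropicOK:
		return tierInfo{Tier: 1, Label: "full-online", ManagedAgents: true}
	case liteLLMOK:
		return tierInfo{Tier: 2, Label: "lan-only", ManagedAgents: false}
	default:
		return tierInfo{Tier: 3, Label: "airplane", ManagedAgents: false}
	}
}

func main() {
	for _, probe := range [][2]bool{{true, true}, {false, true}, {false, false}} {
		info := classify(probe[0], probe[1])
		fmt.Printf("tier %d (%s) managed_agents=%t\n", info.Tier, info.Label, info.ManagedAgents)
	}
}
```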


@@ -0,0 +1,77 @@
package main
import (
"bytes"
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func okServer(t *testing.T) *httptest.Server {
t.Helper()
return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
}))
}
func TestRunTier_Full_Human(t *testing.T) {
anthropic := okServer(t)
defer anthropic.Close()
litellm := okServer(t)
defer litellm.Close()
t.Setenv("ANTHROPIC_PROBE_URL", anthropic.URL)
t.Setenv("LITELLM_BASE_URL", litellm.URL)
var out, errBuf bytes.Buffer
err := runTier(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Contains(t, out.String(), "tier 1")
assert.Contains(t, out.String(), "full-online")
assert.Contains(t, out.String(), "managed_agents=true")
}
func TestRunTier_LANOnly_JSON(t *testing.T) {
litellm := okServer(t)
defer litellm.Close()
t.Setenv("ANTHROPIC_PROBE_URL", "http://127.0.0.1:1") // unreachable
t.Setenv("LITELLM_BASE_URL", litellm.URL)
var out, errBuf bytes.Buffer
err := runTier(context.Background(), []string{"--json"}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
var got struct {
Tier int `json:"tier"`
Label string `json:"label"`
ManagedAgents bool `json:"managed_agents"`
}
require.NoError(t, json.Unmarshal(out.Bytes(), &got))
assert.Equal(t, 2, got.Tier)
assert.Equal(t, "lan-only", got.Label)
assert.False(t, got.ManagedAgents)
}
func TestRunTier_Airplane_NoLiteLLMBaseURL(t *testing.T) {
t.Setenv("ANTHROPIC_PROBE_URL", "http://127.0.0.1:1")
t.Setenv("LITELLM_BASE_URL", "")
var out, errBuf bytes.Buffer
err := runTier(context.Background(), []string{}, strings.NewReader(""), &out, &errBuf)
require.NoError(t, err)
assert.Contains(t, out.String(), "tier 3")
assert.Contains(t, out.String(), "airplane")
}
func TestRunTier_UnknownFlag_ReturnsError(t *testing.T) {
var out, errBuf bytes.Buffer
err := runTier(context.Background(), []string{"--bogus"}, strings.NewReader(""), &out, &errBuf)
assert.Error(t, err)
}

170
cmd/routing/main.go Normal file
View File

@@ -0,0 +1,170 @@
package main
// The internal/skills/{debug,retrospective,review,trainer} packages imported
// below are also imported by cmd/supervisor. Plan 7 (supervisor retirement)
// MUST NOT delete these four packages — the routing pod is their second
// consumer. Plan 7 deletes only internal/skills/{tdd,spec,tier} (the skills
// that don't route to local), the supervisor binary, and supervisor manifests.
// See docs/superpowers/specs/2026-05-04-mode-2-routing-pod-design.md (Constraints).
import (
"context"
"log/slog"
"net/http"
"os"
"time"
"github.com/mathiasbq/supervisor/internal/auth"
"github.com/mathiasbq/supervisor/internal/config"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/githubclient"
"github.com/mathiasbq/supervisor/internal/mcp"
"github.com/mathiasbq/supervisor/internal/mcpclient"
"github.com/mathiasbq/supervisor/internal/registry"
"github.com/mathiasbq/supervisor/internal/routing"
"github.com/mathiasbq/supervisor/internal/skills/debug"
"github.com/mathiasbq/supervisor/internal/skills/project"
"github.com/mathiasbq/supervisor/internal/skills/retrospective"
"github.com/mathiasbq/supervisor/internal/skills/review"
"github.com/mathiasbq/supervisor/internal/skills/trainer"
)
func main() {
logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
slog.SetDefault(logger)
cfg, err := config.LoadRouting()
if err != nil {
logger.Error("config load failed", "err", err)
os.Exit(1)
}
configDir := envOr("SUPERVISOR_CONFIG_DIR", "/app/config/supervisor")
mustRead := func(path string) string {
b, err := os.ReadFile(configDir + "/" + path)
if err != nil {
logger.Error("read prompt failed", "path", path, "err", err)
os.Exit(1)
}
return string(b)
}
llm := iexec.NewLiteLLM(cfg.LiteLLMBaseURL, cfg.LiteLLMAPIKey, 0)
router := &routing.Router{
Fetcher: routing.NewFetcher(cfg.BrainURL, "7d", time.Duration(cfg.PassRateTTLSeconds)*time.Second),
Logger: routing.NewLogger(cfg.BrainURL),
Policy: routing.Policy{Floor: cfg.RouteLocalFloor, Ceil: cfg.RouteLocalCeil},
FastModel: cfg.FastModel,
ThinkingModel: cfg.ThinkingModel,
Complete: llm.Complete,
}
// Skill packages call CompleteFunc(ctx, model, system, user) — no session_id
// or project_root in the signature. Rather than modifying every skill's API
// (and inflating Plan 6's blast radius), the routing pod logs every decision
// under a fixed session_id "_routing". Operators query
// `GET /pass-rate?skill=_routing&window=...` to inspect routing health.
const routingSessionID = "_routing"
wrap := func(skillName string) routing.CompleteFunc {
return func(ctx context.Context, _, system, user string) (string, int64, error) {
// The model param is ignored: the router picks the model based on policy.
return router.Run(ctx, routing.RunInput{
Skill: skillName,
System: system,
User: user,
SessionID: routingSessionID,
ProjectRoot: "",
})
}
}
reg := registry.New()
reg.Register(review.New(review.Config{
SkillPrompt: mustRead("review.md"),
DefaultModel: cfg.FastModel,
CompleteFunc: review.CompleteFunc(wrap("review")),
}))
reg.Register(debug.New(debug.Config{
SkillPrompt: mustRead("debug.md"),
DefaultModel: cfg.FastModel,
CompleteFunc: debug.CompleteFunc(wrap("debug")),
}))
reg.Register(retrospective.New(retrospective.Config{
SkillPrompt: mustRead("retrospective.md"),
DefaultModel: cfg.FastModel,
CompleteFunc: retrospective.CompleteFunc(wrap("retrospective")),
}))
reg.Register(trainer.New(trainer.Config{
ReaderPrompt: mustRead("trainer-reader.md"),
WriterPrompt: mustRead("trainer-writer.md"),
DefaultModel: cfg.FastModel,
CompleteFunc: trainer.CompleteFunc(wrap("trainer")),
}))
if cfg.GiteaMCPURL != "" {
mcpC, err := mcpclient.New(cfg.GiteaMCPURL, cfg.GiteaMCPToken)
if err != nil {
logger.Error("mcpclient init for project_create — GITEA_MCP_URL is set but GITEA_MCP_TOKEN is empty (check routing-secrets)", "err", err)
os.Exit(1)
}
var ghClient *githubclient.Client
if cfg.GitHubPAT != "" {
ghClient = githubclient.New(cfg.GitHubPAT)
}
reg.Register(project.New(project.Config{
Client: mcpC,
GitHub: ghClient,
GiteaOwner: cfg.GiteaOwner,
GitHubOwner: cfg.GitHubOwner,
GitHubPAT: cfg.GitHubPAT,
InfraRepo: cfg.InfraRepo,
}))
logger.Info("project_create registered", "gitea_mcp_url", cfg.GiteaMCPURL,
"gitea_owner", cfg.GiteaOwner, "github_owner", cfg.GitHubOwner,
"infra_repo", cfg.InfraRepo, "github_pat_set", cfg.GitHubPAT != "")
} else {
logger.Info("project_create skipped — GITEA_MCP_URL not set")
}
var validator *auth.Validator
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
audience := os.Getenv("MCP_AUDIENCE")
v, err := auth.NewValidator(dexURL, audience)
if err != nil {
logger.Error("build jwt validator", "err", err)
os.Exit(1)
}
validator = v
logger.Info("jwt auth enabled", "issuer", dexURL)
}
srv := mcp.NewServer(reg, cfg.MCPAuthToken, validator)
mux := http.NewServeMux()
mux.Handle("/mcp", srv)
mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
w.WriteHeader(http.StatusOK)
})
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
resourceURL := os.Getenv("MCP_RESOURCE_URL")
mux.HandleFunc("GET /.well-known/oauth-protected-resource",
auth.ProtectedResourceHandler(resourceURL, dexURL))
}
addr := ":" + cfg.Port
logger.Info("routing pod starting", "addr", addr,
"fast", cfg.FastModel, "thinking", cfg.ThinkingModel,
"floor", cfg.RouteLocalFloor, "ceil", cfg.RouteLocalCeil)
if err := http.ListenAndServe(addr, mux); err != nil { //nolint:gosec
logger.Error("server stopped", "err", err)
os.Exit(1)
}
}
func envOr(key, def string) string {
if v := os.Getenv(key); v != "" {
return v
}
return def
}
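
Reviewer note: the `wrap` closure in `main` is an adapter. Skills keep calling a four-argument completion function, while the router decides the model and stamps a fixed session ID. A stripped-down sketch of that shape is below. Everything in it (`completeFunc`, `runInput`, `stubRouter`, `demo`, the model name `local-fast`) is hypothetical and stands in for the real `routing` package, whose internals this diff does not show. The `int64` return is passed through untouched because its meaning is not visible here.

```go
package main

import (
	"context"
	"fmt"
)

// completeFunc mirrors the four-argument callback shape the skills expect.
type completeFunc func(ctx context.Context, model, system, user string) (string, int64, error)

// runInput is a hypothetical, trimmed-down version of routing.RunInput.
type runInput struct {
	Skill     string
	System    string
	User      string
	SessionID string
}

// stubRouter is a hypothetical router that always picks one fixed model.
type stubRouter struct{ model string }

// Run echoes what it was asked to do so the adapter's behaviour is visible.
func (r *stubRouter) Run(_ context.Context, in runInput) (string, int64, error) {
	return fmt.Sprintf("model=%s skill=%s session=%s user=%s", r.model, in.Skill, in.SessionID, in.User), 0, nil
}

const routingSessionID = "_routing"

// wrap adapts the router to the skill-facing signature. The caller's model
// argument is deliberately ignored: the router chooses the model.
func wrap(r *stubRouter, skill string) completeFunc {
	return func(ctx context.Context, _, system, user string) (string, int64, error) {
		return r.Run(ctx, runInput{Skill: skill, System: system, User: user, SessionID: routingSessionID})
	}
}

// demo runs one wrapped call and returns the router's echo.
func demo() string {
	complete := wrap(&stubRouter{model: "local-fast"}, "review")
	out, _, err := complete(context.Background(), "caller-model", "system prompt", "hello")
	if err != nil {
		return "error: " + err.Error()
	}
	return out
}

func main() {
	fmt.Println(demo())
}
```

The trade-off named in the source comment carries over: skill packages stay unchanged, but every decision is logged under one shared session ID rather than the caller's.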

135
cmd/routing/main_test.go Normal file

@@ -0,0 +1,135 @@
package main_test
import (
"context"
"encoding/json"
"io"
"net"
"net/http"
"net/http/httptest"
"os"
"os/exec"
"strconv"
"strings"
"sync/atomic"
"testing"
"time"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// TestRoutingPodEndToEnd boots the binary against fake LiteLLM + brain servers,
// calls tools/list and one tools/call, and verifies the brain saw a session_log POST.
func TestRoutingPodEndToEnd(t *testing.T) {
if testing.Short() {
t.Skip("end-to-end binary boot")
}
// brainHits is written by the fake brain's handler goroutines and read by the
// test goroutine, so it is atomic to stay clean under `go test -race`.
var brainHits atomic.Int32
llm := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
_ = json.NewEncoder(w).Encode(map[string]any{
"choices": []map[string]any{{"message": map[string]any{"role": "assistant", "content": "stub"}}},
})
}))
defer llm.Close()
brain := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
case "/pass-rate":
brainHits.Add(1)
_ = json.NewEncoder(w).Encode(map[string]any{"pass_rate": 0.95})
case "/mcp":
brainHits.Add(1)
_ = json.NewEncoder(w).Encode(map[string]any{"jsonrpc": "2.0", "id": 1, "result": map[string]any{}})
}
}))
defer brain.Close()
port := freePort(t)
addr := "127.0.0.1:" + port
baseURL := "http://" + addr
bin := buildRouting(t)
cmd := exec.Command(bin)
cmd.Env = []string{
"ROUTING_PORT=" + port,
"LITELLM_BASE_URL=" + llm.URL,
"LITELLM_API_KEY=stub",
"BRAIN_URL=" + brain.URL,
"SUPERVISOR_CONFIG_DIR=../../config/supervisor",
"PATH=" + os.Getenv("PATH"),
"HOME=" + os.Getenv("HOME"),
}
require.NoError(t, cmd.Start())
t.Cleanup(func() { _ = cmd.Process.Kill() })
require.NoError(t, waitForPort(t, addr, 30*time.Second))
resp := mcpCall(t, baseURL+"/mcp", `{"jsonrpc":"2.0","id":1,"method":"tools/list"}`)
assert.Contains(t, resp, `"review"`)
assert.Contains(t, resp, `"debug"`)
assert.Contains(t, resp, `"retrospective"`)
assert.Contains(t, resp, `"trainer"`)
resp = mcpCall(t, baseURL+"/mcp", `{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"review","arguments":{"project_root":"/tmp","files":["README.md"]}}}`)
_ = resp // shape varies by skill; mcpCall only requires the request to complete
// Wait briefly for the async session_log to land.
deadline := time.Now().Add(2 * time.Second)
for time.Now().Before(deadline) && brainHits.Load() < 2 {
time.Sleep(50 * time.Millisecond)
}
assert.GreaterOrEqual(t, brainHits.Load(), int32(2), "expected at least one /pass-rate hit and one /mcp session_log hit")
}
func buildRouting(t *testing.T) string {
t.Helper()
bin := t.TempDir() + "/routing"
out, err := exec.Command("go", "build", "-o", bin, "github.com/mathiasbq/supervisor/cmd/routing").CombinedOutput()
require.NoError(t, err, "build failed: %s", out)
return bin
}
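// waitForPort polls addr until the server answers any HTTP request or dur
// elapses. Any response counts as "up"; the status code is not inspected.
func waitForPort(_ *testing.T, addr string, dur time.Duration) error {
deadline := time.Now().Add(dur)
for time.Now().Before(deadline) {
// Primary probe: the /healthz endpoint.
healthResp, err := http.Get("http://" + addr + "/healthz") //nolint:noctx
if err == nil {
_ = healthResp.Body.Close()
return nil
}
// Fallback probe: an empty POST to /mcp, in case /healthz is not served.
req, err := http.NewRequest(http.MethodPost, "http://"+addr+"/mcp", strings.NewReader(`{}`))
if err == nil {
mcpResp, err := http.DefaultClient.Do(req)
if err == nil {
_ = mcpResp.Body.Close()
return nil
}
}
time.Sleep(50 * time.Millisecond)
}
return context.DeadlineExceeded
}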
func mcpCall(t *testing.T, url, body string) string {
t.Helper()
r, err := http.Post(url, "application/json", strings.NewReader(body)) //nolint:noctx
require.NoError(t, err)
defer func() { _ = r.Body.Close() }()
raw, err := io.ReadAll(r.Body)
require.NoError(t, err)
return string(raw)
}
// freePort grabs an OS-assigned TCP port and releases it. There is a small
// race window before the subprocess re-binds it, but it is acceptable for
// test isolation against a hardcoded port colliding with another test or
// stray process.
func freePort(t *testing.T) string {
t.Helper()
l, err := net.Listen("tcp", "127.0.0.1:0")
require.NoError(t, err)
port := l.Addr().(*net.TCPAddr).Port
require.NoError(t, l.Close())
return strconv.Itoa(port)
}


@@ -1,163 +0,0 @@
package main
import (
"context"
"log/slog"
"net/http"
"os"
"github.com/mathiasbq/supervisor/internal/config"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/mcp"
"github.com/mathiasbq/supervisor/internal/registry"
"github.com/mathiasbq/supervisor/internal/skills/brain"
"github.com/mathiasbq/supervisor/internal/skills/org"
"github.com/mathiasbq/supervisor/internal/skills/retrospective"
skilldebug "github.com/mathiasbq/supervisor/internal/skills/debug"
"github.com/mathiasbq/supervisor/internal/skills/review"
"github.com/mathiasbq/supervisor/internal/skills/spec"
"github.com/mathiasbq/supervisor/internal/skills/trainer"
"github.com/mathiasbq/supervisor/internal/skills/sessionlog"
"github.com/mathiasbq/supervisor/internal/skills/tdd"
"github.com/mathiasbq/supervisor/internal/tier"
)
func main() {
logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
cfg, err := config.Load()
if err != nil {
logger.Error("load config", "err", err)
os.Exit(1)
}
models, err := config.LoadModels(cfg.ModelsFile)
if err != nil {
logger.Error("load models", "err", err)
os.Exit(1)
}
protocolsPrompt, err := os.ReadFile(cfg.ConfigDir + "/protocols.md")
if err != nil {
logger.Error("read protocols.md", "path", cfg.ConfigDir+"/protocols.md", "err", err)
os.Exit(1)
}
// prependProtocols prepends the shared protocols to a skill discipline file.
prependProtocols := func(skillPrompt []byte) string {
return string(protocolsPrompt) + "\n---\n\n" + string(skillPrompt)
}
tddPrompt, err := os.ReadFile(cfg.ConfigDir + "/tdd.md")
if err != nil {
logger.Error("read tdd.md", "path", cfg.ConfigDir+"/tdd.md", "err", err)
os.Exit(1)
}
retroPrompt, err := os.ReadFile(cfg.ConfigDir + "/retrospective.md")
if err != nil {
logger.Error("read retrospective.md", "path", cfg.ConfigDir+"/retrospective.md", "err", err)
os.Exit(1)
}
reviewPrompt, err := os.ReadFile(cfg.ConfigDir + "/review.md")
if err != nil {
		logger.Error("read review.md", "path", cfg.ConfigDir+"/review.md", "err", err)
		os.Exit(1)
	}
	debugPrompt, err := os.ReadFile(cfg.ConfigDir + "/debug.md")
	if err != nil {
		logger.Error("read debug.md", "path", cfg.ConfigDir+"/debug.md", "err", err)
		os.Exit(1)
	}
	specPrompt, err := os.ReadFile(cfg.ConfigDir + "/spec.md")
	if err != nil {
		logger.Error("read spec.md", "path", cfg.ConfigDir+"/spec.md", "err", err)
		os.Exit(1)
	}
	trainerReaderPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-reader.md")
	if err != nil {
		logger.Error("read trainer-reader.md", "path", cfg.ConfigDir+"/trainer-reader.md", "err", err)
		os.Exit(1)
	}
	trainerWriterPrompt, err := os.ReadFile(cfg.ConfigDir + "/trainer-writer.md")
	if err != nil {
		logger.Error("read trainer-writer.md", "path", cfg.ConfigDir+"/trainer-writer.md", "err", err)
		os.Exit(1)
	}
	litellm := iexec.NewLiteLLM(cfg.LiteLLMBaseURL, cfg.LiteLLMAPIKey, 0)
	tierFn := func(ctx context.Context) tier.Info {
		return tier.Detect(ctx, "https://api.anthropic.com", cfg.LiteLLMBaseURL)
	}
	reg := registry.New()
	reg.Register(tdd.New(tdd.Config{
		SkillPrompt: prependProtocols(tddPrompt),
		DefaultModel: models.ModelFor("tdd", ""),
		CompleteFunc: litellm.Complete,
		SessionsDir: cfg.SessionsDir,
		IngestBaseURL: cfg.IngestBaseURL,
	}))
	reg.Register(brain.New(brain.Config{
		IngestBaseURL: cfg.IngestBaseURL,
		IngestSvcURL: cfg.IngestSvcURL,
		KBRetrievalURL: cfg.KBRetrievalURL,
	}))
	reg.Register(org.New(org.Config{
		TierFn: tierFn,
	}))
	reg.Register(sessionlog.New(sessionlog.Config{
		SessionsDir: cfg.SessionsDir,
	}))
	reg.Register(retrospective.New(retrospective.Config{
		SkillPrompt: prependProtocols(retroPrompt),
		DefaultModel: models.ModelFor("retrospective", ""),
		SessionsDir: cfg.SessionsDir,
		CompleteFunc: litellm.Complete,
	}))
	reg.Register(review.New(review.Config{
		SkillPrompt: prependProtocols(reviewPrompt),
		DefaultModel: models.ModelFor("review", ""),
		CompleteFunc: litellm.Complete,
		SessionsDir: cfg.SessionsDir,
		IngestBaseURL: cfg.IngestBaseURL,
	}))
	reg.Register(skilldebug.New(skilldebug.Config{
		SkillPrompt: prependProtocols(debugPrompt),
		DefaultModel: models.ModelFor("debug", ""),
		CompleteFunc: litellm.Complete,
		SessionsDir: cfg.SessionsDir,
		IngestBaseURL: cfg.IngestBaseURL,
	}))
	reg.Register(spec.New(spec.Config{
		SkillPrompt: prependProtocols(specPrompt),
		DefaultModel: models.ModelFor("spec", ""),
		CompleteFunc: litellm.Complete,
		SessionsDir: cfg.SessionsDir,
		IngestBaseURL: cfg.IngestBaseURL,
	}))
	reg.Register(trainer.New(trainer.Config{
		ReaderPrompt: prependProtocols(trainerReaderPrompt),
		WriterPrompt: prependProtocols(trainerWriterPrompt),
		DefaultModel: models.ModelFor("trainer", ""),
		CompleteFunc: litellm.Complete,
		SessionsDir: cfg.SessionsDir,
		BrainDir: cfg.BrainDir,
	}))
	srv := mcp.NewServer(reg)
	mux := http.NewServeMux()
	mux.Handle("/mcp", srv)
	addr := ":" + cfg.Port
	logger.Info("supervisor starting", "addr", addr, "version", "v0.5.0")
	if err := http.ListenAndServe(addr, mux); err != nil {
		logger.Error("server stopped", "err", err)
		os.Exit(1)
	}
}

@@ -1,14 +0,0 @@
package main

import (
	"os/exec"
	"testing"
)

func TestBinaryCompiles(t *testing.T) {
	cmd := exec.Command("go", "build", "./...")
	out, err := cmd.CombinedOutput()
	if err != nil {
		t.Fatalf("build failed: %s\n%s", err, out)
	}
}

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -0,0 +1,79 @@
# Spec: hyperguild CLI
> Plan 4 of 7 — Hyperguild Skill Migration. Loaded after `feature-spec` skill.
## Problem Statement
Three needs converge on a single small Go binary:
1. **Tier probing as MCP is overkill.** The supervisor's `tier` MCP runs on `koala:30320` and answers a one-shot question (which models are reachable right now?). Pulling Claude Code through MCP startup, tool listing, and a JSON-RPC call for a 2-second probe is wasteful and adds a network hop the answer doesn't need.
2. **Brain access from shell scripts has no good front door.** The brain's HTTP REST API exists (Plan 1) at `koala:3300` for non-MCP clients, but every shell script that wants to query or write to the brain re-implements the curl invocation. A CLI gives shell pipelines, ad-hoc agent prompts, and quick-debug scenarios a stable interface.
3. **Mode bootstrap is manual.** Each new project that wants to operate in a chosen mode (cloud / client-local / sovereign) needs a `.mcp.json` written by hand. Without automation, mode adoption is gated on remembering the right MCP server URLs.
**Why now:** Plans 1–3 are merged. The CLI is the next building block in shrinking the supervisor pod toward a thin Mode-2 routing layer. Plans 5 and 6 build on the CLI's tier and brain helpers.
## Success Criteria
- [ ] `hyperguild tier` returns the same `tier.Info` that `internal/tier.Detect` produces for the same probe URLs, in < 3 s under all three tier conditions, with both human-readable and `--json` output.
- [ ] `hyperguild brain query <topic>` returns BM25 results from the brain HTTP REST `/query` endpoint, exit 0 on success and non-zero on transport failure.
- [ ] `hyperguild brain write <type> <slug>` reads markdown content from stdin, posts to `/write` with the type and slug, and creates `brain/knowledge/<slug>.md`. A round-trip (`hyperguild brain query <slug>` immediately after) finds the entry.
- [ ] `hyperguild mode <cloud|client-local|sovereign>` writes a parseable JSON file at the target path with the per-mode `mcpServers` entries; `jq -e .mcpServers` succeeds on the output.
- [ ] All commands print usage on `--help`, exit 2 on unknown flags, exit non-zero on operational errors.
- [ ] `task check` passes (lint + test + vet) on each task and on the merged branch.
## Constraints
- **Stdlib only.** No `cobra`, `urfave/cli`, `viper`, etc. CLI router and flag parsing use `flag.NewFlagSet`.
- **Go 1.26.1**, project default.
- **Module:** `github.com/mathiasbq/supervisor`, peer to `cmd/supervisor/`. New code at `cmd/hyperguild/`. The module name keeps its historical `supervisor` value — renaming the module is out of scope and would touch every import.
- **Reuse `internal/tier`** unchanged. The CLI is a thin wrapper around `tier.Detect`.
- **Brain endpoint configurable** via `BRAIN_URL` env var (default `http://koala:30330` — Tailscale-exposed NodePort, both MCP at `/mcp` and HTTP REST at `/query`, `/write`, etc., share the port). No hostname literals embedded in the CLI body — sourced from env per the existing "logical-addresses-in-instructions" memory.
- **Test discipline:** table-driven, testify, fakes for HTTP and tier probing. No live network in tests.
- **Errors:** wrapped via `fmt.Errorf("op: %w", err)`. No naked returns. Stderr for errors, stdout for results.
## Out of Scope
- The Mode 2 routing pod itself (Plan 6): `mode client-local` writes a placeholder entry pointing at the future routing URL with a `_routing_pending` annotation; the CLI does not provision the pod.
- Pass-rate logging (Plan 5) — the CLI's `brain write` does not emit `session_log` events.
- Skill worker CLIs (`hyperguild tdd_red`, `hyperguild review`, etc.) — those stay on the supervisor MCP until Plan 7.
- Brain HTTP server changes — the REST endpoints already exist.
- Authentication / TLS — Tailscale provides network isolation; no auth currently.
- Windows/Linux binaries — macOS-only per the user's setup. `go build` is portable but no cross-compilation in CI.
- A `crush` config writer for Mode 3 — Mode 3 (sovereign) writes a Claude-Code-compatible `.mcp.json` with brain-only MCP, on the assumption that even Crush-primary users may fall back to Claude Code with brain access. Crush's own config is owned by the user manually.
- A unified `--config` file for the CLI — env var + flags is enough today.
## Technical Approach
- **Single binary, inline subcommand router.** `cmd/hyperguild/main.go` dispatches on `os.Args[1]` to per-subcommand functions, each owning its own `flag.NewFlagSet`. Rationale: 4 top-level subcommands (`tier`, `brain`, `mode`, plus `--help`) and one nested level (`brain query`, `brain write`); ~80 lines of routing plumbing in stdlib beats pulling cobra's ~3 KLOC of dependencies for a tiny CLI. The router is testable by injecting `args []string` instead of reading `os.Args` directly.
- **`tier` subcommand reuses `internal/tier.Detect` verbatim.** Probe URLs (`https://api.anthropic.com` and the LiteLLM base URL) come from environment: `ANTHROPIC_PROBE_URL` (default the literal Anthropic URL) and `LITELLM_BASE_URL` (no default — error if `--mode-needs-llm` and unset). Rationale: matching the supervisor's existing wiring means the CLI cannot disagree with the supervisor about tier; a single source of truth.
- **`brain` subcommand calls the HTTP REST API.** Two nested subcommands:
  - `brain query <topic>` issues `POST /query` with JSON body `{query, limit}` (default `--limit 5`), prints results in human-readable form by default and with `--json` for machine consumption.
  - `brain write <type> <slug>` reads stdin, posts `POST /write` with JSON body `{type, slug, content}`, prints the resulting path on success.

  Rationale: HTTP REST is simpler than MCP framing for a CLI. Per CLAUDE.md, the REST endpoints are documented as the official non-MCP interface.
- **`mode <name>` writes a per-mode `.mcp.json` template.** Defaults to writing `./.mcp.json` (cwd); accepts `--out <path>`. Per-mode bodies:
  - `cloud` → `mcpServers` contains only `brain` at `http://koala:30330/mcp`.
  - `client-local` → `mcpServers` contains `brain` at `http://koala:30330/mcp` and a `routing` placeholder entry with `url` set to a marker (`http://koala:30310/mcp`) and an extra field `"_routing_pending": "Plan 6 — routing pod not deployed yet"`. Rationale: keeping strict-JSON parseable means using a placeholder field rather than a JSON comment, which the spec parser would reject.
  - `sovereign` → `mcpServers` contains only `brain`, plus a top-level `"_mode_note": "Sovereign mode primarily uses Crush + LiteLLM. This .mcp.json is provided as Claude Code fallback."`.

  All three are valid JSON and all three round-trip through `jq` for verification.

  Rationale: a single subcommand with three clearly-different outputs is easier to evolve than three nearly-duplicate subcommands. The placeholder fields are intentional documentation in the file itself, which the user actually opens and edits.
- **No global state.** Each subcommand is a function `(ctx context.Context, args []string, stdin io.Reader, stdout, stderr io.Writer) error`, allowing table-driven tests to exercise full subcommand flows without `os.Exit` or fd capture.
- **HTTP client injection.** A package-level `http.Client` with 5s timeout for `brain` calls, overridable in tests via a constructor. Real client for `main`, `httptest.Server` for tests.
## Risks
- **`.mcp.json` schema may evolve.** Claude Code's MCP config format is defined by the harness, and Anthropic could change it. Mitigation: document the format in the CLI's `--help` text and in the spec; if it breaks, the fix is local to one template function.
- **Brain endpoint hostname drift.** If the brain moves off `koala`, the env-var override avoids breaking the CLI but the `mode` template's hardcoded `koala:30330` becomes stale. Mitigation: source the URL in the `mode` template from the same env var (`BRAIN_URL`) so all three subcommands stay in lockstep with the user's actual environment.
- **`tier` probe URL gap.** The CLI inherits the supervisor's hardcoded `https://api.anthropic.com` probe URL via `internal/tier`. If Anthropic changes the URL, both supervisor and CLI break together. Mitigation: env-var override `ANTHROPIC_PROBE_URL`; default unchanged.
- **No HTTP retry logic.** The CLI returns first-error to the user. For ad-hoc shell use this is fine; for automation a future `--retry` flag may be needed. Out of scope for this iteration.
- **Tests don't cover live network.** Pure-fake tests catch regression but not "does the brain pod actually answer." Mitigation: add a smoke-test `task hyperguild:smoke` in a follow-up that runs against the real brain — separate concern, not in Plan 4.
- **Mode 3 sovereign output may surprise users** who expect Mode 3 to skip writing a `.mcp.json` entirely (since Crush is the primary harness). Mitigation: the `_mode_note` field explains the choice; the `--out /dev/null` escape hatch lets users skip the write if they want.

@@ -0,0 +1,125 @@
# Spec: Pass-rate logging
> Plan 5 of 7 — Hyperguild Skill Migration. Loaded after `feature-spec` skill.
## Problem Statement
Plan 6 (Mode 2 routing pod) needs a per-skill signal to decide whether to route a call to the local model or keep it on Claude. The natural signal is recent pass rate: a skill that succeeds 95% of the time on local is safe to route; a skill that succeeds 60% is not. Today there is no such signal — the `session_log` MCP exists (shipped in Plan 1) but skills don't reliably call it, and no endpoint computes pass rate from the resulting logs.
Two consequences:
1. **Plan 6 cannot be trusted without baseline data.** Routing decisions made on guesses will produce regressions that erode confidence in Mode 2 entirely.
2. **The skill library has no observability.** When a skill regresses (model swap, prompt drift, environment change), there's no way to notice until a downstream task explicitly fails.
**Why now:** Plans 1–4 are merged. Plan 5 instruments the discipline that Plan 6 will consume. Several weeks of usage data between Plan 5 merge and Plan 6 deploy will mean Plan 6 lands on real numbers, not synthetic.
## Success Criteria
- [ ] After Plan 5 merges, every invocation of `tdd` (pilot skill) calls `session_log` at the end of each phase (red, green, refactor) with `final_status` ∈ {pass, fail, skip}.
- [ ] At least 6 of the remaining "binary-outcome" skills get the same treatment: `code-review`, `debug`, `feature-spec`, `session-retrospective`, `trainer`, `spec-driven-dev`. (Skills with no clear pass/fail — `clean-code`, `cognitive-load`, `solid`, `refactoring`, `test-design`, `problem-analysis`, `user-stories`, `planning`, `atdd`, `gitea-ci` — are out of scope.)
- [ ] A new HTTP REST endpoint `GET /pass-rate?skill=X&window=7d` on the brain pod returns valid JSON `{skill, window, pass, fail, skip, total, pass_rate}` for any skill name. Skills with no logged invocations return zeros (not 404, not error). Pass rate is `pass / (pass + fail)`; if `pass + fail == 0`, returns `pass_rate: null`.
- [ ] The endpoint's aggregator normalizes legacy values: `ok` → `pass`, `error` → `fail`, `skipped` → `skip`. No data loss when scanning historical logs.
- [ ] An optional CLI subcommand `hyperguild brain pass-rate <skill> [--window 7d] [--json]` calls the endpoint and prints either human-readable (`tdd: 47 / 50 = 94% (window: 7d)`) or JSON.
- [ ] `task check` passes (lint + test + vet + drift + govulncheck) on each task and on the merged branch.
- [ ] One week post-merge, `GET /pass-rate?skill=tdd&window=7d` returns non-zero counts and a real `pass_rate`.
## Constraints
- **Stdlib + existing deps only.** The endpoint adds to the existing ingestion pod's HTTP handler (Go, `net/http`). No new service, no new pod, no new persistence layer.
- **No auth on `/pass-rate`.** Same model as the rest of the brain HTTP REST API: Tailscale-only network, no token.
- **Schema:** the SKILL.md template uses `pass | fail | skip` for `final_status`. The aggregator treats `pass` and `ok` as equivalent, `fail` and `error` as equivalent, `skip` and `skipped` as equivalent. New writes from skills MUST use the new vocabulary; the aggregator handles both for read-back.
- **Storage:** continues to use the existing JSONL files at `<pod>/brain/sessions/*.jsonl`. No format change. No materialized aggregates. If on-demand scans become slow (>500ms p99), revisit in a follow-up; not now.
- **Backwards compatibility:** the existing `session_log` MCP tool's signature does not change. Its docstring should be updated to reflect the new vocabulary, but argument types stay the same.
- **Pilot-before-rollout:** the first SKILL.md instrumentation (`tdd`) must dogfood successfully — at least one real `tdd` invocation post-instrumentation produces a session log entry — before the other six skills get their updates.
## Out of Scope
- Plan 6 routing pod itself (the consumer of `/pass-rate`).
- Materialized rolling counters (compute on-demand for now).
- Auth, rate limiting, or per-user filtering on `/pass-rate`.
- Dashboards or visualization (`hyperguild brain pass-rate` text/JSON is the only UI).
- Real-time streaming or push notifications (`/pass-rate` is poll-only).
- Skills with no clear binary outcome (the 10 skills listed in Success Criteria).
- Per-model or per-mode breakdown (`session_log` already records `model_used`; the endpoint aggregates across all models for now). Plan 6 may want sharper aggregation; we'll add fields when it lands.
- Migration of the one historical entry in `2026-04-17-validate-hyperguild.jsonl`: it already uses `pass` (the new vocabulary, by accident), so no migration is needed.
## Technical Approach
### Component A — SKILL.md instrumentation pattern
Each instrumented skill gets a standardized "Logging" subsection under its existing "Brain MCP Integration" section. The subsection names the required `session_log` fields with explicit copy-paste examples:
```
**At each phase end:** call `session_log` with:
- `skill`: "<this-skill-name>"
- `phase`: "<the-phase>"
- `final_status`: "pass" | "fail" | "skip"
- `message`: "<one-line summary>"
- `duration_ms`: <wall clock>
- `project_root`: "<absolute path to the project under work>"
```
The pilot SKILL.md (`~/dev/.skills/tdd/SKILL.md`) gets instrumented first. The implementation defines the contract; the rollout commits replicate the pattern across the other six SKILL.md files.
Rationale: SKILL.md as the source of truth means the contract is visible to every agent that loads the skill — no hidden middleware. Mode-agnostic: the agent calls `session_log` whether it's Claude (Mode 1), Claude+routing (Mode 2), or Crush (Mode 3). The pattern is uniform; only the skill name + phase set differ.
### Component B — `/pass-rate` HTTP endpoint
New handler at the existing ingestion pod, peer to `/query`, `/write`, `/ingest`, etc.
```
GET /pass-rate?skill=<name>&window=<duration>
→ 200 { "skill": "tdd", "window": "7d", "pass": 47, "fail": 3, "skip": 0, "total": 50, "pass_rate": 0.94 }
```
Algorithm:
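1. Parse `skill` (required) and `window` (default `7d`; accept Go-style durations such as `1h` and `12h`, plus day values such as `7d` and `30d`, which `time.ParseDuration` does not accept and which therefore need separate handling).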
2. Walk `brain/sessions/*.jsonl` in the pod's volume. For each line: parse JSON, filter by `skill == query.skill` and `timestamp >= now - window`.
3. Tally `pass` (counts both `pass` and `ok`), `fail` (`fail` and `error`), `skip` (`skip` and `skipped`).
4. Compute `pass_rate = pass / (pass + fail)`; if `pass + fail == 0`, return `pass_rate: null`.
5. Return JSON.
Rationale for on-demand: the JSONL files are append-only and small (one entry per skill phase, kilobytes per session at most). For the first months of Plan 5 usage, scanning all sessions for a single query is fast enough. If it ever isn't, a materialized index is a follow-up — the endpoint shape doesn't change.
### Component C — Optional CLI subcommand
`hyperguild brain pass-rate <skill> [--window 7d] [--json]`. Adds a third nested verb under `brain` (sibling to `query` and `write`). Calls `GET /pass-rate?skill=<>&window=<>` via the existing `brainClient` infrastructure. Default human output: `tdd: 47 / 50 = 94% (window: 7d)`. `--json` passes through the response envelope.
Rationale: shell access to pass-rate without curl + jq. Optional in the strict sense — Plan 6's routing pod will call the endpoint directly, not via the CLI — but cheap to add (one new method on `brainClient`, one new dispatch case in `runBrain`).
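
A small sketch of the human-readable output path, assuming a hypothetical `formatPassRate` helper. Two points are assumptions rather than spec: the denominator shown is `pass + fail` (which matches the `47 / 50` example only because that example has no skips), and the wording of the no-data line is invented for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// passRate mirrors the fields of the /pass-rate response used for printing.
type passRate struct {
	Skill    string   `json:"skill"`
	Window   string   `json:"window"`
	Pass     int      `json:"pass"`
	Fail     int      `json:"fail"`
	PassRate *float64 `json:"pass_rate"`
}

// formatPassRate renders one line of human output; a nil rate means no data.
func formatPassRate(p passRate) string {
	if p.PassRate == nil {
		return fmt.Sprintf("%s: no pass/fail data (window: %s)", p.Skill, p.Window)
	}
	pct := int(math.Round(*p.PassRate * 100))
	return fmt.Sprintf("%s: %d / %d = %d%% (window: %s)", p.Skill, p.Pass, p.Pass+p.Fail, pct, p.Window)
}

func main() {
	rate := 0.94
	fmt.Println(formatPassRate(passRate{Skill: "tdd", Window: "7d", Pass: 47, Fail: 3, PassRate: &rate}))
	fmt.Println(formatPassRate(passRate{Skill: "debug", Window: "7d"}))
}
```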
### Schema and normalization
`session_log` JSONL line shape (unchanged today, codified by this plan):
```json
{
  "session_id": "<id>",
  "timestamp": "2026-05-03T20:30:00Z",
  "skill": "tdd",
  "phase": "red",
  "project_root": "/abs/path",
  "final_status": "pass",
  "duration_ms": 12345,
  "message": "Test written, function undefined, red confirmed."
}
```
`final_status` values:
- New writes (this plan onward): `pass | fail | skip`
- Read aggregator accepts both new and legacy: `pass`/`ok` → pass, `fail`/`error` → fail, `skip`/`skipped` → skip
- Anything else → counted as `skip` for safety (don't pollute pass/fail with malformed entries)
### Tests
- Endpoint: table-driven tests with a temp `brain/sessions/` directory containing JSONL files spanning multiple skills, multiple statuses (both vocabularies), edge cases (empty file, malformed line, timestamp outside window, future timestamp). Tests run via `httptest.NewServer` against the real handler.
- CLI: tests for `runBrainPassRate` against `httptest.Server` fake of `/pass-rate`. Human and `--json` output paths.
- Pilot dogfood: after instrumenting `tdd/SKILL.md`, one real TDD task in this plan exercises the logging path. The corresponding session log entry verifies end-to-end.
- `task check` per task.
## Risks
- **Skills that don't reliably log produce missing data.** The aggregator returns zero counts for those, which Plan 6 may misread as "this skill always passes" or "this skill is broken". Mitigation: the endpoint returns `pass_rate: null` when `pass + fail == 0`, signalling "no data" distinct from "always passes". Plan 6 must check for null.
- **Agents may forget to call `session_log` mid-skill.** No way to enforce in cloud Mode 1 — Claude may skip the call if instructions are unclear. Mitigation: SKILL.md template makes the call literal and copy-pasteable. After 1 week, if instrumentation rate is < 80% of expected calls, escalate; consider a wrapper at the routing-pod layer in Plan 6 as belt-and-suspenders.
- **Schema drift between legacy `ok` and new `pass`.** Mitigation: the aggregator's normalization rule. Documented in the endpoint's response and in the `session_log` tool docstring update.
- **`/pass-rate` walks all session files for each request.** With ~1 file per session and tens of sessions per week, this is microseconds today. At hundreds of files per day, may need a date-bounded directory layout. Mitigation: monitor; if scan time > 100ms p99, revisit. Not in this plan.
- **The pilot may fail on the first dogfood.** If `tdd` instrumentation doesn't produce a log entry (e.g. agent didn't call `session_log`, JSON shape wrong, file permissions), the rollout to the other six skills is blocked until the pilot succeeds. Mitigation: explicit "pilot validates end-to-end" gate as the last step of Component A.
- **Adding a third verb under `brain` slightly stretches the inline-router pattern.** Three verbs in a switch is still simple; if it grows to five, the CLI may want a per-verb registration map. Mitigation: deferred — three is fine.

@@ -0,0 +1,240 @@
# Spec: Mode 2 routing pod
> Plan 6 of 7 — Hyperguild Skill Migration. Loaded after `feature-spec` skill.
## Problem Statement
Mode 2 (`client-local`) is the cost-and-sovereignty mode for paid client work: keep skill calls inside Tailscale, save tokens, but stay reliable. Plans 1–5 produced everything Mode 2 needs except the consumer: the brain MCP at `:30330` is live, four skills are instrumented to log `pass | fail | skip`, and `GET /pass-rate?skill=X&window=Y` returns honest numbers (or `null` when there is no data). What is still missing is the policy layer that reads pass-rate and acts on it.
The supervisor pod (`:30320`) historically hosted full skill workers (`tdd_red/green/refactor`, `code_review`, `debug`, `spec`, `retrospective`, `trainer`, `tier`) but with no routing — every call ran local regardless of skill quality, and Claude Code in client-local mode silently lost access to Claude-quality work even when local was wrong. That's the regression Plan 6 fixes.
**Why now:** the supervisor pod is scheduled for retirement (Plan 7) and the data plumbing for routing decisions exists but has no consumer. Without Plan 6, Plan 7 cannot land.
## Success Criteria
- [ ] A new pod `routing` is deployed via Flux at NodePort `:30310`, alongside (not replacing) the supervisor and ingestion pods. Image built by gitea CI, deployment manifest under `infra/k3s/apps/routing/`. `kubectl -n routing get deployment` shows `1/1 Ready`.
- [ ] `POST http://koala:30310/mcp` responds to `tools/list` with exactly four tools: `code_review`, `debug`, `retrospective`, `trainer`. Each tool's name + JSON schema is byte-identical to the supervisor's current advertisement (verified by snapshot test).
- [ ] Bearer-token auth via env var `ROUTING_MCP_TOKEN` (same opt-in pattern as `SUPERVISOR_MCP_TOKEN` shipped in `f49850d`). Empty token = no auth; populated token = `Authorization: Bearer <token>` required, otherwise HTTP 401 + JSON-RPC `-32001`.
- [ ] On every tool call, the pod queries `${BRAIN_URL}/pass-rate?skill=<tool>&window=7d` and applies a configurable policy:
  - `pass_rate == null` → route to local (default-to-local)
  - `pass_rate ≥ HYPERGUILD_ROUTE_LOCAL_FLOOR` (default `0.90`) → route to local
  - `HYPERGUILD_ROUTE_LOCAL_CEIL ≤ pass_rate < FLOOR` (CEIL default `0.70`) → 50/50 deterministic sample (hash of canonical request body)
  - `pass_rate < CEIL` → route to Claude
- [ ] Both routes resolve to a LiteLLM call: local route uses `HYPERGUILD_LOCAL_MODEL` (default `qwen35`), Claude route uses `HYPERGUILD_CLAUDE_MODEL` (default `claude-sonnet-4-6`). LiteLLM at `${LITELLM_BASE_URL}` (default `http://piguard:4000`) handles provider routing. The routing pod has no direct Anthropic SDK.
- [ ] Every routing decision is logged via `session_log` to the brain pod with `{skill: "_routing", phase: "decide", final_status: "skip", message: "<tool>: <decision>", duration_ms, project_root}`. `final_status: "skip"` keeps these entries out of any skill's pass-rate aggregation.
- [ ] LiteLLM unreachable → fail open to a Claude decision *and* log `final_status: "fail"` for `_routing`. The pod must still serve requests even if LiteLLM is down for hours.
- [ ] `cmd/hyperguild/mode.go` updated: `mode client-local` writes the routing entry with `"headers": {"X-Hyperguild-Mode": "client-local"}` and the `_routing_pending` placeholder field is removed. The pod accepts but does not branch on the header (forward-compat only).
- [ ] `task check` (lint + test + vet + drift + govulncheck) passes on each task and on the merged branch. The CI gate that bit Plan 1 must not bite Plan 6 (per `feedback_per_task_verification` memory).
- [ ] A new `task smoke:routing` target boots the binary against the live LiteLLM at `piguard:4000` and the live `/pass-rate` at `koala:30330`, calls each of the four advertised tools once, and verifies a `_routing` entry appears in the brain via `GET /pass-rate?skill=_routing&window=1h`. This is the live-contract test (per `2026-05-03-fake-tests-vs-real-contract` brain entry); fake-server unit tests verify policy logic, the smoke step verifies the contract.
- [ ] Mode 1 (`cloud`) and Mode 3 (`sovereign`) are byte-identically unchanged. Verified by `git diff` showing no changes to `mode.go`'s `modeCloud` or `modeSovereign` functions.
## Constraints
- **Stdlib + existing deps only.** The routing pod reuses `internal/exec/litellm.go` (`NewLiteLLM`, `Complete`), `internal/registry`, and `internal/skills/{review,debug,retrospective,trainer}/`. No new third-party dependency. Auth code may be duplicated from `internal/mcp/server.go` or extracted to a shared helper — implementer's call.
- **No new persistence.** Pass-rate data lives in the brain pod's session JSONL files (Plan 5). Routing-decision logs land in the same place via `session_log`. Routing pod has no DB, no cache, no on-disk state beyond an optional in-memory pass-rate cache (TTL = 60 seconds — protects the brain from per-call hammering during an active session).
- **MCP wire format identical to supervisor's.** Tools have the same names and JSON schemas as today. A consumer switches modes by changing only the URL in `.mcp.json` — no schema-level differences. Snapshot tests pin this.
- **Pod must start and serve degraded.** If LiteLLM is down at startup, the pod still binds to `:3210`, advertises tools, and serves requests with the fail-open-to-Claude behavior described in success criteria.
- **`internal/skills/{review,debug,retrospective,trainer}/` survives Plan 6.** Plan 7's note about deleting them is amended: those four packages are reused by the routing pod and must NOT be deleted in Plan 7. Plan 7 deletes only `internal/skills/{tdd,spec}/`, the supervisor binary, the supervisor manifests, and frees NodePort `:30320`. This spec calls out the change so Plan 7's author doesn't delete needed code (per `2026-05-03-implicit-cleanup-third-category` brain entry).
- **No retries beyond fail-open.** A LiteLLM call that errors becomes a Claude decision and a `final_status: "fail"` log. No exponential backoff, no circuit breaker — that's policy for a future plan once the failure shape is observed.
- **Determinism in sampling.** When pass-rate is in the sample band (`CEIL ≤ pr < FLOOR`), the local-vs-Claude choice for a given request is reproducible: hash a canonical JSON of the request body, low bit picks local. Same input → same decision. Avoids per-call variance confusing the operator during a debugging session.
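
One way the "canonical JSON, then hash" step could be done with the standard library is sketched below. The choices are assumptions, not part of this spec: canonicalization is a decode/re-encode round trip (`encoding/json` writes map keys in sorted order and drops insignificant whitespace), and the hash is 64-bit FNV-1a. `requestHash` is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"hash/fnv"
)

// requestHash hashes a canonical re-encoding of a JSON request body, so that
// key order and whitespace do not change the routing decision.
func requestHash(body []byte) (uint64, error) {
	var v any
	if err := json.Unmarshal(body, &v); err != nil {
		return 0, fmt.Errorf("request hash: decode: %w", err)
	}
	canon, err := json.Marshal(v) // map keys are emitted in sorted order
	if err != nil {
		return 0, fmt.Errorf("request hash: encode: %w", err)
	}
	h := fnv.New64a()
	h.Write(canon) // hash.Hash writes never return an error
	return h.Sum64(), nil
}

func main() {
	a, _ := requestHash([]byte(`{"tool":"debug","args":{"x":1}}`))
	b, _ := requestHash([]byte(`{ "args": {"x": 1}, "tool": "debug" }`))
	fmt.Println(a == b, a&1)
}
```

Decoding into `any` turns every number into a `float64`, which is deterministic but lossy for very large integers; if request bodies can carry such values, a decoder with `UseNumber` would be the safer variant.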
## Out of Scope
- **Plan 7 (supervisor retirement).** Separate plan, executed after Plan 6 stabilizes. Plan 6 leaves the supervisor pod running; nothing about supervisor changes in this plan.
- **Routing for `tdd_red/green/refactor`, `spec`, `tier`.** Per `project_per_skill_routing.md`, these are SKILL.md or CLI, not routing-pod tools. They never appear in the routing pod's `tools/list`. If a future plan changes that decision, it adds them then.
- **Routing for `brain_ingest`.** Already routed at the brain pod (Plan 1). No change.
- **Per-mode policy branching.** The pod accepts `X-Hyperguild-Mode` for forward-compat but treats absent or unknown values as `client-local`. No code path differs on the header value yet.
- **OAuth, IP allowlisting, rate limiting, audit logging.** Bearer-token only; same risk model as the supervisor MCP after `f49850d`.
- **Decision-log read endpoints.** Routing decisions land in the brain via `session_log`. Reads happen via the existing `GET /pass-rate` endpoint and JSONL inspection. No new read API.
- **Materialized routing-decision aggregates.** Out of scope for the same reason Plan 5 deferred materialized counters: on-demand scans are fast enough at current data volumes.
- **Tunable per-skill thresholds.** `FLOOR` and `CEIL` are global. If the operator decides `debug` needs a different floor than `code_review`, that's a follow-up plan with real data behind the choice.
- **Sampling beyond a 50/50 hash split.** No epsilon-decay schedules, no Thompson sampling, no per-skill exploration policies. Add when data justifies.
- **Migration of any existing supervisor-skill `.mcp.json` registrations.** Consumers update their `.mcp.json` (via `hyperguild mode client-local`) when they want Mode 2 behavior. No silent redirect.
- **Routing-pod-side prompt customization.** The four skill packages already own their prompts; the routing pod just calls into them via the existing `Skill` interface. Prompt edits remain a SKILL.md or `internal/skills/<x>/` concern.
## Technical Approach
### A. Binary layout: `cmd/routing/`
A new Go binary at `cmd/routing/main.go`. Stdlib + `internal/*`. Wires:
1. Config from env (typed struct in `internal/config/routing.go` — peer to `Config` for the supervisor; deliberately a separate type because the surfaces are different and merging would force every routing-pod field onto the supervisor and vice versa).
2. `internal/exec/litellm.NewLiteLLM(...)` — same client the supervisor uses.
3. `internal/skills/{review,debug,retrospective,trainer}.New(...)` constructors, each receiving a `CompleteFunc` that wraps the routing decision (see C below).
4. `internal/registry.New()` populated with the four skills.
5. `internal/mcp.NewServer(reg, cfg.MCPAuthToken)` — reuse the existing handler with bearer auth from `f49850d`. The handler is generic; nothing in it is supervisor-specific.
**Rationale:** the supervisor's runtime is already 80% of what the routing pod needs. Reusing it saves the routing pod from re-implementing skill dispatch, MCP protocol handling, and bearer auth. The only new code is the routing decision itself (C below) and the deployment manifests (G).
### B. Configuration via env
Typed struct, parsed at startup. New env vars introduced by Plan 6:
| Env var | Default | Purpose |
|---|---|---|
| `ROUTING_PORT` | `3210` | Pod's HTTP port (NodePort `:30310` maps to this) |
| `ROUTING_MCP_TOKEN` | — | Bearer token, opt-in (empty = no auth) |
| `LITELLM_BASE_URL` | `http://piguard:4000` | LiteLLM proxy (reused) |
| `LITELLM_API_KEY` | — | Reused, sourced from `routing-secrets` Secret |
| `BRAIN_URL` | `http://ingestion.supervisor:3300` | In-cluster brain pod for `/pass-rate` and `session_log` |
| `HYPERGUILD_LOCAL_MODEL` | `qwen35` | Model name passed to LiteLLM for the local decision |
| `HYPERGUILD_CLAUDE_MODEL` | `claude-sonnet-4-6` | Model name for the Claude decision |
| `HYPERGUILD_ROUTE_LOCAL_FLOOR` | `0.90` | At/above this pass-rate, always local |
| `HYPERGUILD_ROUTE_LOCAL_CEIL` | `0.70` | Below this, always Claude. Between CEIL and FLOOR is the sample band. |
| `HYPERGUILD_PASS_RATE_TTL_SECONDS` | `60` | Per-skill in-memory cache TTL |
**Rationale:** every value an operator might want to tune is an env var, not a hardcoded constant. Defaults are the recommendations from the kickoff and the per-skill-routing memory; sensible cluster values flow in via the Flux-managed Secret. No config file to manage.
### C. Decision policy (`internal/routing/policy.go`)
Pure function, no I/O:
```go
type Decision int

const (
	DecideLocal Decision = iota
	DecideClaude
)

type Policy struct{ Floor, Ceil float64 }

// Decide returns the routing decision. passRate may be nil when the brain has no data.
// requestHash is a deterministic 64-bit hash of the canonical request body — used only
// when passRate is in the sample band; same hash → same decision.
func (p Policy) Decide(passRate *float64, requestHash uint64) Decision { ... }
```
Rules (in order):
1. `passRate == nil` → `DecideLocal` (default-to-local)
2. `*passRate >= p.Floor` → `DecideLocal`
3. `*passRate < p.Ceil` → `DecideClaude`
4. Otherwise (sample band) → `requestHash & 1` picks local on `0`, claude on `1`
**Rationale:** no I/O in the policy means the function is trivially testable (table-driven, no fixtures, no servers). Network calls happen in a wrapping layer that calls `Decide`, the same separation by which `internal/skills/*/skill.go` keeps prompt strings apart from `Complete` calls. The default-to-local rule is justified in `project_per_skill_routing.md`: the four advertised skills are exactly the skills marked "MCP→local" in that target architecture.
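The four rules translate almost line for line into a `switch`. The following is a sketch of one possible body for `Decide`, not the committed implementation; it is paired with the kind of table-driven check the rationale describes.

```go
package main

import "fmt"

type Decision int

const (
	DecideLocal Decision = iota
	DecideClaude
)

type Policy struct{ Floor, Ceil float64 }

// Decide applies the rules in order; the first matching case wins.
func (p Policy) Decide(passRate *float64, requestHash uint64) Decision {
	switch {
	case passRate == nil: // rule 1: no data, default to local
		return DecideLocal
	case *passRate >= p.Floor: // rule 2: at or above the floor
		return DecideLocal
	case *passRate < p.Ceil: // rule 3: below the ceiling
		return DecideClaude
	case requestHash&1 == 0: // rule 4: sample band, even hash
		return DecideLocal
	default: // rule 4: sample band, odd hash
		return DecideClaude
	}
}

func main() {
	p := Policy{Floor: 0.90, Ceil: 0.70}
	high, mid, low := 0.95, 0.80, 0.50
	cases := []struct {
		name string
		rate *float64
		hash uint64
		want Decision
	}{
		{"no data", nil, 1, DecideLocal},
		{"above floor", &high, 1, DecideLocal},
		{"below ceil", &low, 0, DecideClaude},
		{"band, even hash", &mid, 2, DecideLocal},
		{"band, odd hash", &mid, 3, DecideClaude},
	}
	for _, c := range cases {
		got := p.Decide(c.rate, c.hash)
		fmt.Printf("%-16s ok=%v\n", c.name, got == c.want)
	}
}
```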
### D. Pass-rate fetcher (`internal/routing/passrate.go`)
```go
type Fetcher struct {
	BaseURL    string
	HTTPClient *http.Client // 1s timeout
	Cache      *ttlCache    // map[string]*float64 with 60s TTL, struct-internal
}

func (f *Fetcher) Get(ctx context.Context, skill string) (*float64, error)
```
Calls `GET ${BaseURL}/pass-rate?skill=<skill>&window=7d`. On success, caches the parsed `pass_rate` (which may be `null`) for `HYPERGUILD_PASS_RATE_TTL_SECONDS`. On error, returns `(nil, err)`; the dispatch wrapper treats this the same as `passRate == nil` and routes to local, so the default-to-local fallback also covers a brain pod that is down.
**Rationale:** GET is correct REST per `2026-05-03-rest-semantics-vs-precedent` (this is a pure read with query params; it shouldn't follow the legacy POST-everywhere precedent). Cache TTL of 60s prevents per-call hammering during a tight Claude Code loop while staying fresh enough that a flapping pass-rate visibly affects routing within a minute. No persistence — restart loses cache, that's fine.
### E. Dispatch wrapper
The four skills are constructed with their existing `CompleteFunc` signature (`(ctx, model, system, user) (string, int64, error)`). The routing pod wraps it:
```go
func (r *Router) Complete(ctx context.Context, skill, model, system, user string) (string, int64, error) {
	// Fetch error is deliberately ignored: pr stays nil and Decide routes local.
	pr, _ := r.fetcher.Get(ctx, skill)
	decision := r.policy.Decide(pr, hashCanonical(system, user))
	chosenModel := r.cfg.ClaudeModel
	if decision == DecideLocal {
		chosenModel = r.cfg.LocalModel
	}
	out, ms, err := r.litellm.Complete(ctx, chosenModel, system, user)
	r.logDecision(skill, decision, err, ms)
	if err != nil {
		// fail open: try Claude once if we routed local; if Claude also fails, return error.
		if decision == DecideLocal {
			chosenModel = r.cfg.ClaudeModel
			out, ms, err = r.litellm.Complete(ctx, chosenModel, system, user)
			r.logDecision(skill, DecideClaude, err, ms) // second log entry, marked fail if still erroring
		}
		return out, ms, err
	}
	return out, ms, nil
}
```
The skill packages don't know about routing — they receive a `CompleteFunc` and call it. The wrapper substitutes routing logic at construction time.
**Rationale:** keeps the skill packages oblivious to mode. Same `internal/skills/review/` works under the supervisor (no routing) and under the routing pod (routed) without any conditional logic in the skill itself. Plan 7's deletion of the supervisor leaves the skills' shape intact for the routing pod.
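The wrapper calls `hashCanonical(system, user)` without defining it. A minimal sketch follows, assuming FNV-1a from the standard library as the 64-bit hash and a NUL byte as the field separator; both choices are illustrative, and the spec only requires that the hash be deterministic.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashCanonical returns a deterministic 64-bit hash of the request body.
// The separator keeps ("ab", "c") and ("a", "bc") from hashing identically.
func hashCanonical(system, user string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(system)) // hash.Hash writes never return an error
	h.Write([]byte{0})
	h.Write([]byte(user))
	return h.Sum64()
}

func main() {
	a := hashCanonical("You are a reviewer.", "Review this diff.")
	b := hashCanonical("You are a reviewer.", "Review this diff.")
	fmt.Println("stable across calls:", a == b)
	fmt.Println("low bit:", a&1)
}
```

Because the hash depends only on the two prompt strings, a retried request lands on the same side of the sample band.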
### F. Decision logging (`internal/routing/log.go`)
After every decision, POST a session log entry to `${BRAIN_URL}/write` (the brain pod's existing endpoint, which appends to `brain/sessions/<session>.jsonl`). Entry shape:
```json
{
"skill": "_routing",
"phase": "decide",
"final_status": "skip",
"message": "<original_skill>: <decision> (pass_rate=<value or 'null'>, model=<chosen>)",
"duration_ms": <litellm_round_trip>,
"project_root": "<path from request, or 'unknown'>",
"timestamp": "<RFC3339>",
"session_id": "<from request, or generated>"
}
```
`final_status: "skip"` keeps these entries out of any real skill's pass-rate aggregation (Plan 5's aggregator counts only `pass`/`fail`). Operators can still query `GET /pass-rate?skill=_routing&window=7d` for routing-failure visibility: when LiteLLM is down, the second log entry carries `final_status: "fail"`.
**Rationale:** closes the observability loop without adding a new endpoint or schema. `_routing` namespaces routing entries away from skill names. `skip` is the only honest classification — routing isn't itself a pass/fail event in the skill sense.
### G. Deployment
New manifest directory `infra/k3s/apps/routing/` mirroring `infra/k3s/apps/supervisor/`'s shape:
- `namespace.yaml` — namespace `routing` (peer to `supervisor`)
- `deployment.yaml` — single replica, nodeSelector koala, image from gitea registry, `envFrom: secretRef: routing-secrets`
- `service.yaml` — ClusterIP on port 3210
- `nodeport.yaml` — NodePort 30310 → service 3210
- `secrets.enc.yaml` — SOPS-encrypted, contains `LITELLM_API_KEY` and (optionally) `ROUTING_MCP_TOKEN`
- `kustomization.yaml` — bundles the above
The supervisor pod's CI image build pattern (gitea Actions → `gitea.d-ma.be/mathias/supervisor:<sha>`) is replicated for `gitea.d-ma.be/mathias/routing:<sha>`. Flux's existing image-automation will bump the manifest's image tag on each push.
**Rationale:** copying the supervisor pod's manifest shape (rather than designing from scratch) is the YAGNI move. Flux + image automation already proven on supervisor; same pattern, same operator mental model. Mode 2 setup is now a Flux change, not a one-off `kubectl` ritual.
### H. Live smoke test
`task smoke:routing` (in the project Taskfile) does:
1. Boot the binary locally with `LITELLM_BASE_URL=http://piguard:4000` and `BRAIN_URL=http://koala:30330`. Bind to a random localhost port (so it doesn't conflict with anything else).
2. Send `tools/list` and assert four tool names.
3. For each tool, send a minimal valid `tools/call`. Don't assert on response content — assert response shape (no error, has content).
4. After all four calls, query `GET http://koala:30330/pass-rate?skill=_routing&window=1h` and assert `total >= 4`.
5. Tear down.
Skipped automatically when LiteLLM is unreachable or when run outside Tailscale (tier 3) — emits a `SKIP` line and exits 0. `task check` does NOT include `task smoke:routing` (CI runner doesn't have Tailscale); operator runs it manually before bumping production.
**Rationale:** unit tests with `httptest.Server` fakes verify the policy and the dispatch wrapper logic. The smoke test is the only thing that will catch a contract drift between the routing pod's `Complete` calls and the actual LiteLLM API, or a schema drift between `/pass-rate` and what the fetcher expects (per `2026-05-03-fake-tests-vs-real-contract`).
### I. Mode-template update (`cmd/hyperguild/mode.go`)
`modeClientLocal` is amended:
- The `routing` entry's `url` stays at `http://koala:30310/mcp`.
- A new key `headers` is added with `{"X-Hyperguild-Mode": "client-local"}`.
- The placeholder `_routing_pending` field is **removed**, since the routing pod now exists.
Tests in `cmd/hyperguild/mode_test.go` are updated to assert the new structure. README in `cmd/hyperguild/README.md` updated to drop the "not deployed yet" note.
**Rationale:** Plan 4 deliberately scaffolded the placeholder for Plan 6 to fill in. This is the fill-in. Removing `_routing_pending` is the implicit cleanup the kickoff anticipates — making it explicit in the spec avoids a Plan-completeness gap (per `2026-05-03-implicit-cleanup-third-category`).
## Risks
- **Empty pass-rate window in the first weeks.** Plans 3-5 merged on 2026-05-03; usage data has not accumulated. With default-to-local active for all four routed skills, the first weeks of Mode 2 = "everything goes local." If local quality is rough on `code_review` or `debug`, the operator's first impression of Mode 2 is bad, and confidence in Plan 6 erodes before data lands. **Mitigation:** the FLOOR / CEIL are env-tunable. If local quality is unworkable in the first week, set `HYPERGUILD_ROUTE_LOCAL_FLOOR=2.0` (impossible threshold) and the pod becomes default-to-Claude with no code change. This is a deliberate kill switch for the early window.
- **LiteLLM-as-single-dependency.** The routing pod has exactly one upstream LLM provider: `piguard:4000`. If LiteLLM is misconfigured (wrong model name routed to wrong provider, expired Anthropic key in LiteLLM's config), every routing-pod call returns garbage. **Mitigation:** the smoke test catches gross misconfig before deploy; once deployed, LiteLLM's own `/health` endpoint is the canary (the pod doesn't probe it — operator monitors LiteLLM separately). If a deeper failure mode emerges, add a routing-pod liveness probe in a follow-up.
- **Skill-schema drift.** The routing pod's `tools/list` is asserted byte-identical to the supervisor's via snapshot test. If someone evolves the supervisor's schemas between Plan 6 merge and Plan 7 (a long window), the snapshot drifts. **Mitigation:** the spec documents that Plan 6 freezes the schemas; supervisor edits to skill schemas are out of scope until Plan 7 deletes the supervisor. This is a soft constraint enforced by the spec, not by code. If the supervisor genuinely needs a schema change before Plan 7, that's a separate plan.
- **Flux drift on `kubectl rollout restart`.** Demonstrated during the bearer-auth rollout earlier today: Flux server-side-applies the deployment every 30s and strips the `kubectl.kubernetes.io/restartedAt` annotation, which deletes the new ReplicaSet's pod. **Mitigation:** the Plan 6 implementer prompt and the README note that `kubectl delete pod -l app=routing` is the correct way to force a restart on Flux-managed deployments — the existing ReplicaSet recreates without an annotation Flux can revert. (This finding is worth a brain entry; capture in retrospective.)
- **Mode header not forwarded by Claude Code.** Plan 6 assumes Claude Code propagates `headers` from `.mcp.json`. The bearer-auth rollout proved this works for `Authorization`. The same path should work for `X-Hyperguild-Mode`. **Mitigation:** the pod treats absent header as `client-local` (the only mode that registers the pod). If forwarding silently breaks, behavior is identical — header is forward-compat only.
- **Sample-band hash collision producing skewed routing.** Hash inputs are `(system, user)` strings. If skill prompts produce highly similar bodies (debug bug A vs debug bug B with similar wording), low-bit hash distribution might cluster on one side. **Mitigation:** at the volumes Plan 6 expects (single operator, ~10s of routed calls/hour at peak), bias is statistically invisible. If volume ever rises, swap `hash & 1` for a stronger split. Not the first failure mode worth pre-engineering.
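If the low-bit split ever needs replacing, one candidate is to compare the whole hash against a tunable fraction instead of testing a single bit. The sketch below is an illustration of that idea, not something Plan 6 specifies; `sampleLocal` and `localFraction` are hypothetical names.

```go
package main

import "fmt"

// sampleLocal maps the top 53 bits of the hash onto [0, 1) and routes local
// when that value falls below localFraction. Using the high bits avoids
// depending on the distribution of any single low bit.
func sampleLocal(requestHash uint64, localFraction float64) bool {
	u := float64(requestHash>>11) / float64(1<<53) // exact: 53 bits fit a float64 mantissa
	return u < localFraction
}

func main() {
	hashes := []uint64{0, 1 << 62, 1 << 63, ^uint64(0)}
	for _, h := range hashes {
		fmt.Printf("hash=%#x local=%v\n", h, sampleLocal(h, 0.5))
	}
}
```

With `localFraction` at 0.5 this behaves like a fair coin over the full hash, and other fractions would allow an uneven split without touching the policy's other rules.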
## Cross-references
- Spec for Plan 5 (consumer of `/pass-rate`): `docs/superpowers/specs/2026-05-03-pass-rate-logging-design.md`
- Spec for Plan 4 (which scaffolded the `:30310` placeholder): `docs/superpowers/specs/2026-05-03-hyperguild-cli-design.md`
- Auto-memory entries `project_three_modes`, `project_skill_migration_plans`, `project_per_skill_routing`, `feedback_per_task_verification`, `feedback_sudo`
- Brain entries `2026-05-03-rest-semantics-vs-precedent`, `2026-05-03-aggregator-normalization-backwards-compat`, `2026-05-03-fake-tests-vs-real-contract`, `2026-05-03-implicit-cleanup-third-category`, `2026-05-03-code-reviewer-output-as-candidates`, `2026-05-03-done-with-concerns-vs-blocked`, `2026-05-03-verification-depth-formula`, `2026-05-03-plan-canonical-dispatch-ephemeral`

go.mod

@@ -2,10 +2,23 @@ module github.com/mathiasbq/supervisor
go 1.26.1
require github.com/stretchr/testify v1.11.1
require (
github.com/lestrrat-go/jwx/v2 v2.1.6
github.com/stretchr/testify v1.11.1
gopkg.in/yaml.v3 v3.0.1
)
require (
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 // indirect
github.com/goccy/go-json v0.10.3 // indirect
github.com/lestrrat-go/blackmagic v1.0.3 // indirect
github.com/lestrrat-go/httpcc v1.0.1 // indirect
github.com/lestrrat-go/httprc v1.0.6 // indirect
github.com/lestrrat-go/iter v1.0.2 // indirect
github.com/lestrrat-go/option v1.0.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
github.com/segmentio/asm v1.2.0 // indirect
golang.org/x/crypto v0.32.0 // indirect
golang.org/x/sys v0.31.0 // indirect
)

go.sum

@@ -1,10 +1,37 @@
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 h1:NMZiJj8QnKe1LgsbDayM4UoHwbvwDRwnI3hwNaAHRnc=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0/go.mod h1:ZXNYxsqcloTdSy/rNShjYzMhyjf0LaoftYK0p+A3h40=
github.com/goccy/go-json v0.10.3 h1:KZ5WoDbxAIgm2HNbYckL0se1fHD6rz5j4ywS6ebzDqA=
github.com/goccy/go-json v0.10.3/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
github.com/lestrrat-go/blackmagic v1.0.3 h1:94HXkVLxkZO9vJI/w2u1T0DAoprShFd13xtnSINtDWs=
github.com/lestrrat-go/blackmagic v1.0.3/go.mod h1:6AWFyKNNj0zEXQYfTMPfZrAXUWUfTIZ5ECEUEJaijtw=
github.com/lestrrat-go/httpcc v1.0.1 h1:ydWCStUeJLkpYyjLDHihupbn2tYmZ7m22BGkcvZZrIE=
github.com/lestrrat-go/httpcc v1.0.1/go.mod h1:qiltp3Mt56+55GPVCbTdM9MlqhvzyuL6W/NMDA8vA5E=
github.com/lestrrat-go/httprc v1.0.6 h1:qgmgIRhpvBqexMJjA/PmwSvhNk679oqD1RbovdCGW8k=
github.com/lestrrat-go/httprc v1.0.6/go.mod h1:mwwz3JMTPBjHUkkDv/IGJ39aALInZLrhBp0X7KGUZlo=
github.com/lestrrat-go/iter v1.0.2 h1:gMXo1q4c2pHmC3dn8LzRhJfP1ceCbgSiT9lUydIzltI=
github.com/lestrrat-go/iter v1.0.2/go.mod h1:Momfcq3AnRlRjI5b5O8/G5/BvpzrhoFTZcn06fEOPt4=
github.com/lestrrat-go/jwx/v2 v2.1.6 h1:hxM1gfDILk/l5ylers6BX/Eq1m/pnxe9NBwW6lVfecA=
github.com/lestrrat-go/jwx/v2 v2.1.6/go.mod h1:Y722kU5r/8mV7fYDifjug0r8FK8mZdw0K0GpJw/l8pU=
github.com/lestrrat-go/option v1.0.1 h1:oAzP2fvZGQKWkvHa1/SAcFolBEca1oN+mQ7eooNBEYU=
github.com/lestrrat-go/option v1.0.1/go.mod h1:5ZHFbivi4xwXxhxY9XHDe2FHo6/Z7WWmtT7T5nBBp3I=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/segmentio/asm v1.2.0 h1:9BQrFxC+YOHJlTlHGkTrFWf59nbL3XnCoFLTwDCI7ys=
github.com/segmentio/asm v1.2.0/go.mod h1:BqMnlJP91P8d+4ibuonYZw9mfnzI9HfxselHZr5aAcs=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
golang.org/x/crypto v0.32.0 h1:euUpcYgM8WcP71gNpTqQCn6rC2t6ULUPiOzfWaXVVfc=
golang.org/x/crypto v0.32.0/go.mod h1:ZnnJkOaASj8g0AjIduWNlq2NRxL0PlBrbKVyZ6V/Ugc=
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=


@@ -5,6 +5,15 @@ FROM golang:1.26-bookworm AS builder
ARG VERSION=dev
WORKDIR /src
# Fetch internal gitea-hosted Go modules (mcp-chassis) without going through
# proxy.golang.org and without HTTP→HTTPS surprises. The Gitea server returns
# http:// in its go-import meta tag (config-level limitation), so rewrite to
# https here and bypass the module proxy + sumdb.
RUN git config --global url."https://gitea.d-ma.be/".insteadOf "http://gitea.d-ma.be/"
ENV GOPRIVATE=gitea.d-ma.be
ENV GOPROXY=direct
ENV GOSUMDB=off
COPY go.mod go.sum ./
RUN go mod download


@@ -6,17 +6,102 @@ import (
"fmt"
"log/slog"
"net/http"
"net/url"
"os"
"strconv"
"strings"
"time"
chassisauth "gitea.d-ma.be/mathias/mcp-chassis/auth"
"github.com/mathiasbq/hyperguild/ingestion/internal/api"
"github.com/mathiasbq/hyperguild/ingestion/internal/claudewatcher"
"github.com/mathiasbq/hyperguild/ingestion/internal/embed"
"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
"github.com/mathiasbq/hyperguild/ingestion/internal/graphsync"
"github.com/mathiasbq/hyperguild/ingestion/internal/llm"
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
"github.com/mathiasbq/hyperguild/ingestion/internal/metrics"
"github.com/mathiasbq/hyperguild/ingestion/internal/oauth"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
"github.com/mathiasbq/hyperguild/ingestion/internal/reranker"
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
"github.com/mathiasbq/hyperguild/ingestion/internal/vectorstore"
"github.com/mathiasbq/hyperguild/ingestion/internal/watcher"
)
// claudeSink converts each claudewatcher.Batch into one wiki note under
// brain/wiki/claude-sessions/facts/. v1 emits one note per session
// keyed by host + session id; classifier-driven hall routing is a
// follow-up (hyperguild#27 v2).
type claudeSink struct {
brainDir string
logger *slog.Logger
}
func (s *claudeSink) Ingest(ctx context.Context, b claudewatcher.Batch) error {
if len(b.Turns) == 0 {
return nil
}
var sb strings.Builder
fmt.Fprintf(&sb, "# Claude session %s (%s)\n\n", b.SessionID, b.Host)
fmt.Fprintf(&sb, "_Project: `%s`. File: `%s`. Turns: %d._\n\n", b.ProjectID, b.FilePath, len(b.Turns))
for _, t := range b.Turns {
fmt.Fprintf(&sb, "## %s — %s\n\n", t.Type, t.Timestamp.UTC().Format(time.RFC3339))
if t.ToolName != "" {
fmt.Fprintf(&sb, "_tool: `%s`_\n\n", t.ToolName)
}
// Cap per-turn excerpt to keep page size bounded; the full
// transcript lives on disk under ~/.claude/projects/ already.
content := t.Content
if len(content) > 2000 {
content = content[:2000] + "…"
}
sb.WriteString(content)
sb.WriteString("\n\n")
}
slug := "session-" + b.Host + "-" + b.SessionID
if _, err := api.WriteNote(s.brainDir, api.WriteNoteOptions{
Filename: slug,
Wing: "claude-sessions",
Hall: "facts",
Type: "source",
Domain: b.ProjectID,
Content: sb.String(),
}); err != nil {
return fmt.Errorf("write claude session note: %w", err)
}
return nil
}
// redactDSN parses a Postgres URL and replaces its password with `***`
// for safe inclusion in logs. Falls back to a non-leaking placeholder
// if parsing fails — we never log a raw DSN.
func redactDSN(dsn string) string {
u, err := url.Parse(dsn)
if err != nil || u.User == nil {
return "postgres://***"
}
return u.Redacted()
}
// vectorAdapter bridges *vectorstore.PGStore (returns []vectorstore.Hit)
// to the search.VectorSearcher interface (which uses []search.VectorHit).
// Kept here, not in either package, so neither has to import the other.
type vectorAdapter struct{ s *vectorstore.PGStore }
func (a vectorAdapter) Search(ctx context.Context, q []float32, limit int) ([]search.VectorHit, error) {
hits, err := a.s.Search(ctx, q, limit)
if err != nil {
return nil, err
}
out := make([]search.VectorHit, len(hits))
for i, h := range hits {
out[i] = search.VectorHit{Path: h.Path, Distance: h.Distance}
}
return out, nil
}
func envOr(key, fallback string) string {
if v := os.Getenv(key); v != "" {
return v
@@ -33,6 +118,16 @@ func envInt(key string, fallback int) int {
return fallback
}
// systemHostname returns os.Hostname() with an "unknown" fallback so the
// caller never has to handle the rare error path.
func systemHostname() string {
h, err := os.Hostname()
if err != nil || h == "" {
return "unknown"
}
return h
}
func main() {
logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
@@ -55,7 +150,93 @@ func main() {
h := api.NewHandler(brainDir, logger, pipelineCfg)
mcpSrv := mcp.NewServer(brainDir, &pipelineCfg, llmClient.Complete)
var answerComplete pipeline.CompleteFunc
if primaryURL := os.Getenv("BRAIN_LLM_PRIMARY_URL"); primaryURL != "" {
primaryModel := envOr("BRAIN_LLM_PRIMARY_MODEL", "gemma4:31b")
primaryKey := os.Getenv("BERGET_API_KEY")
timeoutMS := envInt("BRAIN_LLM_TIMEOUT_MS", 10000)
timeout := time.Duration(timeoutMS) * time.Millisecond
primary := llm.New(primaryURL, primaryKey, primaryModel, timeout)
router := &llm.Router{Primary: primary}
if fallbackURL := os.Getenv("BRAIN_LLM_FALLBACK_URL"); fallbackURL != "" {
fallbackModel := envOr("BRAIN_LLM_FALLBACK_MODEL", "gemma4:31b")
router.Fallback = llm.New(fallbackURL, "", fallbackModel, timeout)
}
answerComplete = router.Complete
logger.Info("brain answer LLM configured", "primary", primaryURL, "model", primaryModel)
}
mcpSrv := mcp.NewServer(brainDir, &pipelineCfg, llmClient.Complete, answerComplete)
if rerankURL := os.Getenv("BRAIN_RERANKER_URL"); rerankURL != "" {
rerankModel := envOr("BRAIN_RERANKER_MODEL", "dengcao/Qwen3-Reranker-0.6B:F16")
mcpSrv = mcpSrv.WithReranker(reranker.New(rerankURL, rerankModel))
logger.Info("brain reranker configured", "url", rerankURL, "model", rerankModel)
}
// Hybrid retrieval (pgvector + nomic-embed-text). Both env vars must
// be set together for the path to wire on; otherwise BM25-only.
var vectorStore *vectorstore.PGStore
pgDSN := os.Getenv("BRAIN_PG_DSN")
embedURL := os.Getenv("BRAIN_EMBED_URL")
switch {
case pgDSN != "" && embedURL != "":
embedModel := envOr("BRAIN_EMBED_MODEL", "nomic-embed-text:latest")
store, err := vectorstore.New(context.Background(), pgDSN)
if err != nil {
logger.Error("vector store init", "err", err)
os.Exit(1)
}
if err := store.Init(context.Background()); err != nil {
logger.Error("vector store migrate", "err", err)
os.Exit(1)
}
vectorStore = store
embedder := embed.New(embedURL, embedModel)
mcpSrv = mcpSrv.WithHybridRetrieval(vectorAdapter{s: store}, embedder)
h.WithEmbedSync(store, embedder)
logger.Info("brain hybrid retrieval enabled",
"pg", redactDSN(pgDSN),
"embed_url", embedURL, "embed_model", embedModel)
// Graph store shares the same postgres18 DSN as the vector
// store and is opt-in via BRAIN_GRAPH_ENABLED=true. Defaults
// to off so first rollout doesn't surprise — flip on after
// the migration completes and the backfill finishes.
if envOr("BRAIN_GRAPH_ENABLED", "false") == "true" {
gstore, gerr := graphstore.New(context.Background(), pgDSN)
if gerr != nil {
logger.Error("graph store init", "err", gerr)
os.Exit(1)
}
if gerr := gstore.Init(context.Background()); gerr != nil {
logger.Error("graph store migrate", "err", gerr)
os.Exit(1)
}
mcpSrv = mcpSrv.WithGraph(gstore)
if envOr("BRAIN_GRAPH_BACKFILL", "false") == "true" {
n, berr := graphsync.BackfillFromBrainDir(context.Background(), gstore, brainDir)
if berr != nil {
logger.Warn("graph backfill incomplete", "indexed", n, "err", berr)
} else {
logger.Info("graph backfill complete", "indexed", n)
}
}
logger.Info("brain graph enabled", "pg", redactDSN(pgDSN))
}
case pgDSN == "" && embedURL == "":
// disabled — fine
default:
logger.Error("BRAIN_PG_DSN and BRAIN_EMBED_URL must be set together")
os.Exit(1)
}
mcpToken := os.Getenv("BRAIN_MCP_TOKEN")
if mcpToken == "" {
logger.Error("BRAIN_MCP_TOKEN not set")
os.Exit(1)
}
ctx := context.Background()
if watchInterval > 0 {
@@ -66,14 +247,134 @@ func main() {
})
}
// Claude Code session ingestion (hyperguild#27 / infra#73 Track E.1).
// Off by default — explicitly opt in by setting CLAUDE_SESSIONS_DIR
// to the ~/.claude/projects path. Requires BRAIN_PG_DSN for the
// cursor table (resumable offsets across restarts).
if claudeDir := os.Getenv("CLAUDE_SESSIONS_DIR"); claudeDir != "" {
if pgDSN == "" {
logger.Error("CLAUDE_SESSIONS_DIR set but BRAIN_PG_DSN missing — claudewatcher needs the cursor table")
os.Exit(1)
}
// Client-name guard. The env value is a regex alternation
// (e.g. "SEB|Mastercard"); we wrap it with word boundaries
// and case-insensitive flag so substrings inside longer
// identifiers don't false-match. Sourced from a SOPS secret
// so client identities never live in source.
if clientBlock := os.Getenv("CLAUDE_INGEST_CLIENT_BLOCK"); clientBlock != "" {
pattern := `(?i)\b(` + clientBlock + `)\b`
if err := claudewatcher.RegisterRule("client-name", pattern); err != nil {
logger.Error("claudewatcher client-block rule invalid", "err", err)
os.Exit(1)
}
logger.Info("claudewatcher client-block guard registered")
}
cursorStore, cerr := claudewatcher.NewCursorStore(ctx, pgDSN)
if cerr != nil {
logger.Error("claudewatcher cursor init", "err", cerr)
os.Exit(1)
}
if cerr := cursorStore.Init(ctx); cerr != nil {
logger.Error("claudewatcher cursor migrate", "err", cerr)
os.Exit(1)
}
host := envOr("CLAUDE_INGEST_HOST", systemHostname())
interval := time.Duration(envInt("CLAUDE_INGEST_INTERVAL", 60)) * time.Second
sink := &claudeSink{brainDir: brainDir, logger: logger}
go func() {
if err := claudewatcher.Watch(ctx, claudewatcher.Config{
SessionsDir: claudeDir,
Host: host,
Interval: interval,
Sink: sink,
Cursors: cursorStore,
Logger: logger,
}); err != nil && err != context.Canceled {
logger.Error("claudewatcher exited", "err", err)
}
}()
logger.Info("claudewatcher started",
"sessions_dir", claudeDir, "host", host, "interval", interval)
}
if vectorStore != nil {
embedSyncInterval := envInt("BRAIN_EMBED_SYNC_INTERVAL", 300)
vectorstore.StartSync(ctx, brainDir, vectorStore,
embed.New(os.Getenv("BRAIN_EMBED_URL"),
envOr("BRAIN_EMBED_MODEL", "nomic-embed-text:latest")),
time.Duration(embedSyncInterval)*time.Second)
logger.Info("embed sync started", "interval_s", embedSyncInterval)
}
mux := http.NewServeMux()
mux.HandleFunc("POST /query", h.Query)
mux.HandleFunc("POST /write", h.Write)
mux.HandleFunc("POST /index", h.Index)
mux.HandleFunc("POST /ingest", h.Ingest)
mux.HandleFunc("POST /ingest-path", h.IngestPath)
mux.HandleFunc("POST /ingest-raw", h.IngestRaw)
mux.HandleFunc("POST /backfill-refs", h.BackfillRefs)
mux.Handle("POST /mcp", mcpSrv)
mux.HandleFunc("POST /backfill-embeddings", h.BackfillEmbeddings)
mux.HandleFunc("GET /pass-rate", h.PassRate)
jwtValidator, err := chassisauth.NewJWTValidator(ctx, os.Getenv("DEX_ISSUER_URL"), os.Getenv("MCP_AUDIENCE"))
if err != nil {
logger.Error("build jwt validator", "err", err)
os.Exit(1)
}
if jwtValidator != nil {
logger.Info("jwt auth enabled", "issuer", os.Getenv("DEX_ISSUER_URL"))
}
// Resource-metadata URL is only emitted on 401 when Dex OAuth is
// configured. Static-Bearer-only deployments leave this empty so
// clients never see an OAuth challenge.
var resourceMetadataURL string
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
resourceURL := os.Getenv("MCP_RESOURCE_URL")
mux.HandleFunc("GET /.well-known/oauth-protected-resource",
chassisauth.ProtectedResourceHandler(resourceURL, dexURL))
if resourceURL != "" {
resourceMetadataURL = strings.TrimRight(resourceURL, "/") + "/.well-known/oauth-protected-resource"
}
}
mux.Handle("/mcp", chassisauth.BearerMiddleware(mcpToken, jwtValidator, "brain", resourceMetadataURL, mcpSrv))
// Opt-in OAuth 2.0 client_credentials flow for claude.ai's custom-MCP
// integration UI, which has no static-Bearer field. Setting both
// OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET enables the token exchange;
// setting only one is misconfiguration → fail fast.
oauthID := os.Getenv("OAUTH_CLIENT_ID")
oauthSecret := os.Getenv("OAUTH_CLIENT_SECRET")
switch {
case oauthID != "" && oauthSecret != "":
issuer := os.Getenv("MCP_RESOURCE_URL")
if issuer == "" {
logger.Error("OAUTH_CLIENT_ID/SECRET set but MCP_RESOURCE_URL is empty; cannot derive issuer")
os.Exit(1)
}
mux.HandleFunc("GET /.well-known/oauth-authorization-server",
oauth.MetadataHandler(issuer))
mux.HandleFunc("POST /oauth/token", oauth.TokenHandler(oauth.TokenConfig{
ClientID: oauthID,
ClientSecret: oauthSecret,
AccessToken: mcpToken,
}))
logger.Info("oauth client_credentials enabled", "issuer", strings.TrimRight(issuer, "/"))
case oauthID == "" && oauthSecret == "":
// disabled — that's fine
default:
logger.Error("OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET must be set together")
os.Exit(1)
}
// /metrics — unauthenticated Prometheus endpoint. kube-prometheus-stack
// scrapes it via the ServiceMonitor in k3s/apps/supervisor/. The metrics
// middleware below wraps every other registered handler so it observes
// real request latency. /metrics itself is excluded from its own
// observation by registering it on the outer mux (post-wrap).
reg := metrics.New()
mux.HandleFunc("GET /metrics", reg.Handler())
logger.Info("metrics endpoint registered", "path", "/metrics")
addr := ":" + port
watchIntervalLog := "disabled"
@@ -89,7 +390,7 @@ func main() {
"watch_interval", watchIntervalLog,
"mcp_enabled", true,
)
if err := http.ListenAndServe(addr, mux); err != nil {
if err := http.ListenAndServe(addr, reg.Middleware(mux)); err != nil {
logger.Error("server stopped", "err", err)
os.Exit(1)
}


@@ -2,10 +2,30 @@ module github.com/mathiasbq/hyperguild/ingestion
go 1.26.1
require github.com/stretchr/testify v1.11.1
require (
github.com/lestrrat-go/jwx/v2 v2.1.6
github.com/stretchr/testify v1.11.1
)
require (
gitea.d-ma.be/mathias/mcp-chassis v0.1.0 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 // indirect
github.com/goccy/go-json v0.10.3 // indirect
github.com/jackc/pgpassfile v1.0.0 // indirect
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 // indirect
github.com/jackc/pgx/v5 v5.9.2 // indirect
github.com/jackc/puddle/v2 v2.2.2 // indirect
github.com/lestrrat-go/blackmagic v1.0.3 // indirect
github.com/lestrrat-go/httpcc v1.0.1 // indirect
github.com/lestrrat-go/httprc v1.0.6 // indirect
github.com/lestrrat-go/iter v1.0.2 // indirect
github.com/lestrrat-go/option v1.0.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/segmentio/asm v1.2.0 // indirect
golang.org/x/crypto v0.32.0 // indirect
golang.org/x/sync v0.17.0 // indirect
golang.org/x/sys v0.31.0 // indirect
golang.org/x/text v0.29.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)


@@ -1,9 +1,54 @@
gitea.d-ma.be/mathias/mcp-chassis v0.1.0 h1:8RXO34+n7Vu8HnUMagars6fc4oemqRpMu7MVtjaj4qY=
gitea.d-ma.be/mathias/mcp-chassis v0.1.0/go.mod h1:ajbLlwr2L7FAN3TBU39KucZkKJM02wTbKbDKDEW2YvE=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 h1:NMZiJj8QnKe1LgsbDayM4UoHwbvwDRwnI3hwNaAHRnc=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0/go.mod h1:ZXNYxsqcloTdSy/rNShjYzMhyjf0LaoftYK0p+A3h40=
github.com/goccy/go-json v0.10.3 h1:KZ5WoDbxAIgm2HNbYckL0se1fHD6rz5j4ywS6ebzDqA=
github.com/goccy/go-json v0.10.3/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M=
github.com/jackc/pgpassfile v1.0.0 h1:/6Hmqy13Ss2zCq62VdNG8tM1wchn8zjSGOBJ6icpsIM=
github.com/jackc/pgpassfile v1.0.0/go.mod h1:CEx0iS5ambNFdcRtxPj5JhEz+xB6uRky5eyVu/W2HEg=
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761 h1:iCEnooe7UlwOQYpKFhBabPMi4aNAfoODPEFNiAnClxo=
github.com/jackc/pgservicefile v0.0.0-20240606120523-5a60cdf6a761/go.mod h1:5TJZWKEWniPve33vlWYSoGYefn3gLQRzjfDlhSJ9ZKM=
github.com/jackc/pgx/v5 v5.9.2 h1:3ZhOzMWnR4yJ+RW1XImIPsD1aNSz4T4fyP7zlQb56hw=
github.com/jackc/pgx/v5 v5.9.2/go.mod h1:mal1tBGAFfLHvZzaYh77YS/eC6IX9OWbRV1QIIM0Jn4=
github.com/jackc/puddle/v2 v2.2.2 h1:PR8nw+E/1w0GLuRFSmiioY6UooMp6KJv0/61nB7icHo=
github.com/jackc/puddle/v2 v2.2.2/go.mod h1:vriiEXHvEE654aYKXXjOvZM39qJ0q+azkZFrfEOc3H4=
github.com/lestrrat-go/blackmagic v1.0.3 h1:94HXkVLxkZO9vJI/w2u1T0DAoprShFd13xtnSINtDWs=
github.com/lestrrat-go/blackmagic v1.0.3/go.mod h1:6AWFyKNNj0zEXQYfTMPfZrAXUWUfTIZ5ECEUEJaijtw=
github.com/lestrrat-go/httpcc v1.0.1 h1:ydWCStUeJLkpYyjLDHihupbn2tYmZ7m22BGkcvZZrIE=
github.com/lestrrat-go/httpcc v1.0.1/go.mod h1:qiltp3Mt56+55GPVCbTdM9MlqhvzyuL6W/NMDA8vA5E=
github.com/lestrrat-go/httprc v1.0.6 h1:qgmgIRhpvBqexMJjA/PmwSvhNk679oqD1RbovdCGW8k=
github.com/lestrrat-go/httprc v1.0.6/go.mod h1:mwwz3JMTPBjHUkkDv/IGJ39aALInZLrhBp0X7KGUZlo=
github.com/lestrrat-go/iter v1.0.2 h1:gMXo1q4c2pHmC3dn8LzRhJfP1ceCbgSiT9lUydIzltI=
github.com/lestrrat-go/iter v1.0.2/go.mod h1:Momfcq3AnRlRjI5b5O8/G5/BvpzrhoFTZcn06fEOPt4=
github.com/lestrrat-go/jwx/v2 v2.1.6 h1:hxM1gfDILk/l5ylers6BX/Eq1m/pnxe9NBwW6lVfecA=
github.com/lestrrat-go/jwx/v2 v2.1.6/go.mod h1:Y722kU5r/8mV7fYDifjug0r8FK8mZdw0K0GpJw/l8pU=
github.com/lestrrat-go/option v1.0.1 h1:oAzP2fvZGQKWkvHa1/SAcFolBEca1oN+mQ7eooNBEYU=
github.com/lestrrat-go/option v1.0.1/go.mod h1:5ZHFbivi4xwXxhxY9XHDe2FHo6/Z7WWmtT7T5nBBp3I=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/segmentio/asm v1.2.0 h1:9BQrFxC+YOHJlTlHGkTrFWf59nbL3XnCoFLTwDCI7ys=
github.com/segmentio/asm v1.2.0/go.mod h1:BqMnlJP91P8d+4ibuonYZw9mfnzI9HfxselHZr5aAcs=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U=
golang.org/x/crypto v0.32.0 h1:euUpcYgM8WcP71gNpTqQCn6rC2t6ULUPiOzfWaXVVfc=
golang.org/x/crypto v0.32.0/go.mod h1:ZnnJkOaASj8g0AjIduWNlq2NRxL0PlBrbKVyZ6V/Ugc=
golang.org/x/sync v0.17.0 h1:l60nONMj9l5drqw6jlhIELNv9I0A4OFgRsG9k2oT9Ug=
golang.org/x/sync v0.17.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI=
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
golang.org/x/text v0.29.0 h1:1neNs90w9YzJ9BocxfsQNHKuAT4pkghyXc4nhZ6sJvk=
golang.org/x/text v0.29.0/go.mod h1:7MhJOA9CD2qZyOKYazxdYMF85OwPdEr9jTtBpO7ydH4=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=

@@ -11,9 +11,11 @@ import (
"strings"
"time"
"github.com/mathiasbq/hyperguild/ingestion/internal/brain"
"github.com/mathiasbq/hyperguild/ingestion/internal/extract"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
"github.com/mathiasbq/hyperguild/ingestion/internal/vectorstore"
)
// Handler serves the ingestion HTTP API.
@@ -21,6 +23,8 @@ type Handler struct {
brainDir string
logger *slog.Logger
pipeline pipeline.Config
embedStore vectorstore.Store
embedClient vectorstore.Embedder
}
// NewHandler constructs a Handler. brainDir is the absolute path to brain/.
@@ -31,9 +35,19 @@ func NewHandler(brainDir string, logger *slog.Logger, pipelineCfg pipeline.Confi
return &Handler{brainDir: brainDir, logger: logger, pipeline: pipelineCfg}
}
// WithEmbedSync wires the optional vector store + embedder used by the
// POST /backfill-embeddings endpoint. Calling with either nil is a no-op.
func (h *Handler) WithEmbedSync(store vectorstore.Store, embedder vectorstore.Embedder) *Handler {
h.embedStore = store
h.embedClient = embedder
return h
}
type queryRequest struct {
Query string `json:"query"`
Limit int `json:"limit,omitempty"`
Wing string `json:"wing,omitempty"`
Hall string `json:"hall,omitempty"`
}
type writeRequest struct {
@@ -41,6 +55,8 @@ type writeRequest struct {
Filename string `json:"filename,omitempty"`
Type string `json:"type,omitempty"`
Domain string `json:"domain,omitempty"`
Wing string `json:"wing,omitempty"`
Hall string `json:"hall,omitempty"`
}
type ingestRequest struct {
@@ -75,7 +91,12 @@ func (h *Handler) Query(w http.ResponseWriter, r *http.Request) {
req.Limit = 5
}
results, err := search.Query(h.brainDir, req.Query, req.Limit)
results, err := search.Query(h.brainDir, search.QueryOptions{
Query: req.Query,
Limit: req.Limit,
Wing: req.Wing,
Hall: req.Hall,
})
if err != nil {
h.logger.Error("query failed", "err", err)
writeError(w, http.StatusInternalServerError, "search error")
@@ -85,13 +106,78 @@ func (h *Handler) Query(w http.ResponseWriter, r *http.Request) {
writeJSON(w, map[string]any{"results": results})
}
// WriteNote writes a markdown file to brainDir/knowledge/<filename>, optionally
// prefixed with YAML frontmatter built from typ and domain. Returns the path
// WriteNoteOptions configures how a brain note is written.
//
// When both Wing and Hall are non-empty, the note routes into the
// structured wiki at brain/wiki/<wing>/<hall>/<slug>.md and gets
// wing/hall/created_at injected into its YAML frontmatter.
//
// When either is empty, the note falls back to brain/knowledge/<filename>
// with optional type/domain frontmatter (legacy behaviour).
type WriteNoteOptions struct {
Content string
Filename string
Type string
Domain string
Wing string
Hall string
}
// WriteNote writes a markdown note into the brain. Returns the path
// relative to brainDir (forward-slashed). Filename traversal is rejected.
func WriteNote(brainDir, content, filename, typ, domain string) (string, error) {
if content == "" {
func WriteNote(brainDir string, opts WriteNoteOptions) (string, error) {
if opts.Content == "" {
return "", fmt.Errorf("content is required")
}
if opts.Wing != "" && opts.Hall != "" {
return writeHallNote(brainDir, opts)
}
if opts.Wing != "" || opts.Hall != "" {
return "", fmt.Errorf("wing and hall must be set together")
}
return writeLegacyNote(brainDir, opts)
}
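The three-way branch in `WriteNote` can be looked at in isolation. The sketch below is a minimal illustration, assuming a hypothetical `route` helper (not part of this change) that returns only the top-level destination directory instead of writing a file.

```go
package main

import (
	"errors"
	"fmt"
)

// route is a hypothetical helper, not part of the diff. It mirrors the
// branching in WriteNote: both fields set -> structured wiki, exactly one
// set -> error, neither set -> legacy knowledge directory.
func route(wing, hall string) (string, error) {
	switch {
	case wing != "" && hall != "":
		return "wiki", nil
	case wing != "" || hall != "":
		return "", errors.New("wing and hall must be set together")
	default:
		return "knowledge", nil
	}
}

func main() {
	cases := [][2]string{{"jepa-fx", "facts"}, {"jepa-fx", ""}, {"", ""}}
	for _, c := range cases {
		dest, err := route(c[0], c[1])
		fmt.Printf("wing=%q hall=%q -> dest=%q err=%v\n", c[0], c[1], dest, err)
	}
}
```

Rejecting the half-specified case, rather than silently falling back to the legacy path, keeps a caller's typo in `hall` from landing a note in the wrong tree.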
// writeHallNote routes a note into brain/wiki/<wing>/<hall>/ and injects
// wing/hall/created_at frontmatter.
func writeHallNote(brainDir string, opts WriteNoteOptions) (string, error) {
slug := opts.Filename
if slug == "" {
slug = time.Now().UTC().Format("2006-01-02-150405") + "-auto"
}
dest, err := brain.NotePath(brainDir, opts.Wing, opts.Hall, slug)
if err != nil {
return "", err
}
if err := os.MkdirAll(filepath.Dir(dest), 0o755); err != nil {
return "", fmt.Errorf("create hall dir: %w", err)
}
var fm strings.Builder
fm.WriteString("---\n")
fmt.Fprintf(&fm, "wing: %s\n", brain.Sanitise(opts.Wing))
fmt.Fprintf(&fm, "hall: %s\n", opts.Hall)
fmt.Fprintf(&fm, "created_at: %s\n", time.Now().UTC().Format(time.RFC3339))
if opts.Type != "" {
fmt.Fprintf(&fm, "type: %s\n", opts.Type)
}
if opts.Domain != "" {
fmt.Fprintf(&fm, "domain: %s\n", opts.Domain)
}
fm.WriteString("---\n")
if err := os.WriteFile(dest, []byte(fm.String()+opts.Content), 0o644); err != nil {
return "", fmt.Errorf("write: %w", err)
}
rel, _ := filepath.Rel(brainDir, dest)
return filepath.ToSlash(rel), nil
}
// writeLegacyNote preserves the original brain/knowledge/ behaviour for
// callers that have not adopted the wing/hall taxonomy.
func writeLegacyNote(brainDir string, opts WriteNoteOptions) (string, error) {
filename := opts.Filename
if filename == "" {
filename = fmt.Sprintf("%s-auto.md", time.Now().UTC().Format("2006-01-02-150405"))
}
@@ -101,26 +187,24 @@ func WriteNote(brainDir, content, filename, typ, domain string) (string, error)
return "", fmt.Errorf("create raw dir: %w", err)
}
finalContent := content
if typ != "" || domain != "" {
finalContent := opts.Content
if opts.Type != "" || opts.Domain != "" {
var fm strings.Builder
fm.WriteString("---\n")
if typ != "" {
fmt.Fprintf(&fm, "type: %s\n", typ)
if opts.Type != "" {
fmt.Fprintf(&fm, "type: %s\n", opts.Type)
}
if domain != "" {
fmt.Fprintf(&fm, "domain: %s\n", domain)
if opts.Domain != "" {
fmt.Fprintf(&fm, "domain: %s\n", opts.Domain)
}
fm.WriteString("---\n")
finalContent = fm.String() + content
finalContent = fm.String() + opts.Content
}
// Reject path separators outright; any non-flat filename is misuse.
if strings.ContainsAny(filename, `/\`) {
return "", fmt.Errorf("invalid filename")
}
base := filepath.Base(filename)
// filepath.Base can still return "." or ".."; reject those before adding .md.
if base == "." || base == ".." || base == "" {
return "", fmt.Errorf("invalid filename")
}
@@ -143,15 +227,77 @@ func (h *Handler) Write(w http.ResponseWriter, r *http.Request) {
writeError(w, http.StatusBadRequest, "invalid JSON")
return
}
relPath, err := WriteNote(h.brainDir, req.Content, req.Filename, req.Type, req.Domain)
relPath, err := WriteNote(h.brainDir, WriteNoteOptions(req))
if err != nil {
h.logger.Error("write failed", "err", err)
writeError(w, http.StatusBadRequest, err.Error())
return
}
if req.Wing != "" && req.Hall != "" {
if err := brain.BuildWingIndex(h.brainDir, req.Wing); err != nil {
h.logger.Warn("auto-index failed", "wing", req.Wing, "err", err)
}
}
writeJSON(w, map[string]string{"path": relPath})
}
// BackfillEmbeddings handles POST /backfill-embeddings — synchronously
// embeds every note under brain/wiki/ that's not yet in the vector
// store, and deletes rows for files no longer on disk.
func (h *Handler) BackfillEmbeddings(w http.ResponseWriter, r *http.Request) {
if h.embedStore == nil || h.embedClient == nil {
writeError(w, http.StatusServiceUnavailable,
"embeddings not configured (set BRAIN_PG_DSN and BRAIN_EMBED_URL)")
return
}
res, err := vectorstore.Sync(r.Context(), h.brainDir, h.embedStore, h.embedClient)
if err != nil {
h.logger.Error("backfill failed", "err", err)
writeError(w, http.StatusInternalServerError, "backfill error")
return
}
errStrs := make([]string, 0, len(res.Errors))
for _, e := range res.Errors {
errStrs = append(errStrs, e.Error())
}
writeJSON(w, map[string]any{
"added": res.Added,
"deleted": res.Deleted,
"errors": errStrs,
})
}
type indexRequest struct {
Wing string `json:"wing,omitempty"`
}
// Index handles POST /index — regenerate the _index.md MOC for one wing
// (when "wing" is set) or for every wing (when omitted).
func (h *Handler) Index(w http.ResponseWriter, r *http.Request) {
var req indexRequest
if r.ContentLength > 0 {
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, http.StatusBadRequest, "invalid JSON")
return
}
}
if req.Wing == "" {
if err := brain.BuildAllWingIndexes(h.brainDir); err != nil {
h.logger.Error("index all failed", "err", err)
writeError(w, http.StatusInternalServerError, "index error")
return
}
writeJSON(w, map[string]any{"status": "ok", "scope": "all"})
return
}
if err := brain.BuildWingIndex(h.brainDir, req.Wing); err != nil {
h.logger.Error("index failed", "wing", req.Wing, "err", err)
writeError(w, http.StatusBadRequest, err.Error())
return
}
writeJSON(w, map[string]any{"status": "ok", "scope": req.Wing})
}
// Ingest handles POST /ingest — run the pipeline on provided content.
func (h *Handler) Ingest(w http.ResponseWriter, r *http.Request) {
var req ingestRequest

@@ -0,0 +1,140 @@
package api
import (
"encoding/json"
"net/http"
"os"
"path/filepath"
"strings"
"time"
)
type passRateResponse struct {
Skill string `json:"skill"`
Window string `json:"window"`
Pass int `json:"pass"`
Fail int `json:"fail"`
Skip int `json:"skip"`
Total int `json:"total"`
PassRate *float64 `json:"pass_rate"`
}
// PassRate handles GET /pass-rate?skill=X&window=Y.
// Walks brainDir/sessions/*.jsonl, filters by skill name and timestamp,
// returns aggregated counts and pass rate.
func (h *Handler) PassRate(w http.ResponseWriter, r *http.Request) {
skill := r.URL.Query().Get("skill")
if skill == "" {
writeError(w, http.StatusBadRequest, "skill is required")
return
}
windowStr := r.URL.Query().Get("window")
if windowStr == "" {
windowStr = "7d"
}
window, err := parseWindow(windowStr)
if err != nil {
writeError(w, http.StatusBadRequest, "invalid window: "+err.Error())
return
}
cutoff := time.Now().UTC().Add(-window)
pass, fail, skip := 0, 0, 0
sessionsDir := filepath.Join(h.brainDir, "sessions")
entries, err := os.ReadDir(sessionsDir)
if err != nil && !os.IsNotExist(err) {
writeError(w, http.StatusInternalServerError, "read sessions dir: "+err.Error())
return
}
for _, entry := range entries {
if entry.IsDir() || !strings.HasSuffix(entry.Name(), ".jsonl") {
continue
}
body, err := os.ReadFile(filepath.Join(sessionsDir, entry.Name()))
if err != nil {
continue // skip unreadable files
}
for _, line := range strings.Split(string(body), "\n") {
line = strings.TrimSpace(line)
if line == "" {
continue
}
var rec struct {
Timestamp string `json:"timestamp"`
Skill string `json:"skill"`
FinalStatus string `json:"final_status"`
}
if err := json.Unmarshal([]byte(line), &rec); err != nil {
continue // malformed — skip
}
if rec.Skill != skill {
continue
}
ts, err := time.Parse(time.RFC3339, rec.Timestamp)
if err != nil {
continue
}
if ts.Before(cutoff) {
continue
}
switch normalizeStatus(rec.FinalStatus) {
case "pass":
pass++
case "fail":
fail++
case "skip":
skip++
}
}
}
total := pass + fail + skip
resp := passRateResponse{
Skill: skill,
Window: windowStr,
Pass: pass,
Fail: fail,
Skip: skip,
Total: total,
}
if pass+fail > 0 {
rate := float64(pass) / float64(pass+fail)
resp.PassRate = &rate
}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(resp)
}
// normalizeStatus maps both new (pass/fail/skip) and legacy (ok/error/skipped)
// vocabularies to the canonical pass/fail/skip set. Unknown values are treated
// as skip for safety.
func normalizeStatus(s string) string {
switch s {
case "pass", "ok":
return "pass"
case "fail", "error":
return "fail"
case "skip", "skipped":
return "skip"
default:
return "skip"
}
}
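The pass-rate arithmetic excludes skips from the denominator and leaves the rate unset when nothing passed or failed. The sketch below is a minimal illustration of that rule, assuming a hypothetical `passRate` helper (not in this change) that takes raw status strings instead of reading session files.

```go
package main

import "fmt"

// normalizeStatus mirrors the helper in the diff: legacy and current
// vocabularies map to pass/fail/skip, unknown values count as skip.
func normalizeStatus(s string) string {
	switch s {
	case "pass", "ok":
		return "pass"
	case "fail", "error":
		return "fail"
	default:
		return "skip"
	}
}

// passRate is a hypothetical helper that isolates the counting logic.
// rate stays nil when pass+fail == 0, which encodes as JSON null.
func passRate(statuses []string) (pass, fail, skip int, rate *float64) {
	for _, s := range statuses {
		switch normalizeStatus(s) {
		case "pass":
			pass++
		case "fail":
			fail++
		default:
			skip++
		}
	}
	if pass+fail > 0 {
		r := float64(pass) / float64(pass+fail)
		rate = &r
	}
	return pass, fail, skip, rate
}

func main() {
	_, _, _, rate := passRate([]string{"ok", "pass", "error", "skipped"})
	fmt.Printf("%.4f\n", *rate) // prints 0.6667

	_, _, skip, none := passRate([]string{"skipped", "unknown"})
	fmt.Println(skip, none == nil) // prints 2 true
}
```

Using a pointer for the rate lets clients distinguish "no data" from a genuine 0% pass rate.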
// parseWindow accepts Go-style durations plus "Nd" for days.
func parseWindow(s string) (time.Duration, error) {
if strings.HasSuffix(s, "d") {
// Parse the numeric part as hours, then multiply by 24 to get days.
days := strings.TrimSuffix(s, "d")
d, err := time.ParseDuration(days + "h")
if err != nil {
return 0, err
}
return d * 24, nil
}
return time.ParseDuration(s)
}

@@ -0,0 +1,172 @@
package api
import (
"encoding/json"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"testing"
"time"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// writeSession writes one or more JSONL entries to <dir>/sessions/<sessionID>.jsonl.
// The handler scans <brainDir>/sessions/, so test fixtures must mirror that layout.
func writeSession(t *testing.T, dir, sessionID string, entries ...string) {
t.Helper()
sessionsDir := filepath.Join(dir, "sessions")
require.NoError(t, os.MkdirAll(sessionsDir, 0o755))
path := filepath.Join(sessionsDir, sessionID+".jsonl")
body := ""
for _, e := range entries {
body += e + "\n"
}
require.NoError(t, os.WriteFile(path, []byte(body), 0o644))
}
func TestPassRate_HappyPath(t *testing.T) {
dir := t.TempDir()
now := time.Now().UTC()
recent := now.Add(-1 * time.Hour).Format(time.RFC3339)
writeSession(t, dir, "s1",
`{"timestamp":"`+recent+`","skill":"tdd","phase":"red","final_status":"pass"}`,
`{"timestamp":"`+recent+`","skill":"tdd","phase":"green","final_status":"pass"}`,
`{"timestamp":"`+recent+`","skill":"tdd","phase":"refactor","final_status":"fail"}`,
)
writeSession(t, dir, "s2",
`{"timestamp":"`+recent+`","skill":"code-review","phase":"review","final_status":"pass"}`,
)
h := &Handler{brainDir: dir}
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
w := httptest.NewRecorder()
h.PassRate(w, req)
resp := w.Result()
require.Equal(t, http.StatusOK, resp.StatusCode)
var got passRateResponse
require.NoError(t, json.NewDecoder(resp.Body).Decode(&got))
assert.Equal(t, "tdd", got.Skill)
assert.Equal(t, "24h", got.Window)
assert.Equal(t, 2, got.Pass)
assert.Equal(t, 1, got.Fail)
assert.Equal(t, 0, got.Skip)
assert.Equal(t, 3, got.Total)
require.NotNil(t, got.PassRate)
assert.InDelta(t, 0.6667, *got.PassRate, 0.001)
}
func TestPassRate_LegacyVocabulary(t *testing.T) {
dir := t.TempDir()
now := time.Now().UTC().Format(time.RFC3339)
writeSession(t, dir, "s1",
`{"timestamp":"`+now+`","skill":"tdd","final_status":"ok"}`,
`{"timestamp":"`+now+`","skill":"tdd","final_status":"error"}`,
`{"timestamp":"`+now+`","skill":"tdd","final_status":"skipped"}`,
)
h := &Handler{brainDir: dir}
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
w := httptest.NewRecorder()
h.PassRate(w, req)
var got passRateResponse
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
assert.Equal(t, 1, got.Pass, "ok→pass")
assert.Equal(t, 1, got.Fail, "error→fail")
assert.Equal(t, 1, got.Skip, "skipped→skip")
}
func TestPassRate_OutsideWindow_Excluded(t *testing.T) {
dir := t.TempDir()
old := time.Now().UTC().Add(-30 * 24 * time.Hour).Format(time.RFC3339)
recent := time.Now().UTC().Add(-1 * time.Hour).Format(time.RFC3339)
writeSession(t, dir, "s1",
`{"timestamp":"`+old+`","skill":"tdd","final_status":"pass"}`,
`{"timestamp":"`+recent+`","skill":"tdd","final_status":"pass"}`,
)
h := &Handler{brainDir: dir}
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
w := httptest.NewRecorder()
h.PassRate(w, req)
var got passRateResponse
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
assert.Equal(t, 1, got.Pass)
assert.Equal(t, 1, got.Total)
}
func TestPassRate_NoData_ReturnsZerosAndNullRate(t *testing.T) {
dir := t.TempDir()
h := &Handler{brainDir: dir}
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
w := httptest.NewRecorder()
h.PassRate(w, req)
var got passRateResponse
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
assert.Equal(t, 0, got.Pass)
assert.Equal(t, 0, got.Fail)
assert.Equal(t, 0, got.Skip)
assert.Equal(t, 0, got.Total)
assert.Nil(t, got.PassRate, "pass_rate must be null when pass+fail == 0")
}
func TestPassRate_DefaultsTo7d(t *testing.T) {
dir := t.TempDir()
now := time.Now().UTC().Format(time.RFC3339)
writeSession(t, dir, "s1", `{"timestamp":"`+now+`","skill":"tdd","final_status":"pass"}`)
h := &Handler{brainDir: dir}
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd", nil) // no window
w := httptest.NewRecorder()
h.PassRate(w, req)
var got passRateResponse
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
assert.Equal(t, "7d", got.Window)
assert.Equal(t, 1, got.Pass)
}
func TestPassRate_MissingSkill_ReturnsBadRequest(t *testing.T) {
dir := t.TempDir()
h := &Handler{brainDir: dir}
req := httptest.NewRequest(http.MethodGet, "/pass-rate", nil)
w := httptest.NewRecorder()
h.PassRate(w, req)
assert.Equal(t, http.StatusBadRequest, w.Result().StatusCode)
}
func TestPassRate_BadWindow_ReturnsBadRequest(t *testing.T) {
dir := t.TempDir()
h := &Handler{brainDir: dir}
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=foo", nil)
w := httptest.NewRecorder()
h.PassRate(w, req)
assert.Equal(t, http.StatusBadRequest, w.Result().StatusCode)
}
func TestPassRate_MalformedLine_Skipped(t *testing.T) {
dir := t.TempDir()
now := time.Now().UTC().Format(time.RFC3339)
writeSession(t, dir, "s1",
`{"timestamp":"`+now+`","skill":"tdd","final_status":"pass"}`,
`not valid json`,
`{"timestamp":"`+now+`","skill":"tdd","final_status":"pass"}`,
)
h := &Handler{brainDir: dir}
req := httptest.NewRequest(http.MethodGet, "/pass-rate?skill=tdd&window=24h", nil)
w := httptest.NewRecorder()
h.PassRate(w, req)
var got passRateResponse
require.NoError(t, json.NewDecoder(w.Result().Body).Decode(&got))
assert.Equal(t, 2, got.Pass, "the malformed line is silently skipped")
}

@@ -0,0 +1,161 @@
package brain
import (
"bufio"
"fmt"
"os"
"path/filepath"
"sort"
"strings"
"time"
)
// noteEntry is one row in a Wing _index.md.
type noteEntry struct {
Hall string
Slug string
Title string
Created string
}
// BuildWingIndex regenerates brain/wiki/<wing>/_index.md as a Map of
// Content listing every note in that wing with its Hall and creation
// date. Returns nil if the wing directory does not exist.
func BuildWingIndex(brainDir, wing string) error {
w := Sanitise(wing)
if w == "" {
return fmt.Errorf("invalid wing %q", wing)
}
wingDir := filepath.Join(brainDir, "wiki", w)
if _, err := os.Stat(wingDir); os.IsNotExist(err) {
return nil
} else if err != nil {
return fmt.Errorf("stat wing: %w", err)
}
entries, err := collectWingEntries(wingDir)
if err != nil {
return err
}
sort.Slice(entries, func(i, j int) bool {
if entries[i].Hall != entries[j].Hall {
return entries[i].Hall < entries[j].Hall
}
return entries[i].Slug < entries[j].Slug
})
var b strings.Builder
fmt.Fprintf(&b, "# %s\n\n", w)
b.WriteString("| Hall | Note | Created |\n")
b.WriteString("|------|------|---------|\n")
for _, e := range entries {
fmt.Fprintf(&b, "| %s | [%s](%s/%s.md) | %s |\n", e.Hall, e.Title, e.Hall, e.Slug, e.Created)
}
dest := filepath.Join(wingDir, "_index.md")
return os.WriteFile(dest, []byte(b.String()), 0o644)
}
// BuildAllWingIndexes regenerates _index.md for every wing under brain/wiki/.
func BuildAllWingIndexes(brainDir string) error {
wikiDir := filepath.Join(brainDir, "wiki")
ents, err := os.ReadDir(wikiDir)
if os.IsNotExist(err) {
return nil
}
if err != nil {
return fmt.Errorf("read wiki: %w", err)
}
for _, e := range ents {
if !e.IsDir() {
continue
}
if err := BuildWingIndex(brainDir, e.Name()); err != nil {
return fmt.Errorf("index %s: %w", e.Name(), err)
}
}
return nil
}
func collectWingEntries(wingDir string) ([]noteEntry, error) {
var out []noteEntry
ents, err := os.ReadDir(wingDir)
if err != nil {
return nil, fmt.Errorf("read wing: %w", err)
}
for _, hallEnt := range ents {
if !hallEnt.IsDir() {
continue
}
hall := hallEnt.Name()
if !IsValidHall(hall) {
continue
}
hallDir := filepath.Join(wingDir, hall)
notes, err := os.ReadDir(hallDir)
if err != nil {
return nil, fmt.Errorf("read hall %s: %w", hall, err)
}
for _, n := range notes {
if n.IsDir() || !strings.HasSuffix(n.Name(), ".md") || n.Name() == "_index.md" {
continue
}
slug := strings.TrimSuffix(n.Name(), ".md")
full := filepath.Join(hallDir, n.Name())
title, created := readTitleAndCreated(full, slug)
out = append(out, noteEntry{Hall: hall, Slug: slug, Title: title, Created: created})
}
}
return out, nil
}
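// readTitleAndCreated reads YAML frontmatter for title + created_at. When
// title is absent it falls back to the slug with hyphens replaced by
// spaces; when created_at is absent it falls back to the file mtime. If
// the file cannot be opened it returns the raw slug and an empty date.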
func readTitleAndCreated(path, slug string) (string, string) {
f, err := os.Open(path)
if err != nil {
return slug, ""
}
defer func() { _ = f.Close() }()
title, created := "", ""
scanner := bufio.NewScanner(f)
inFrontmatter := false
for scanner.Scan() {
line := scanner.Text()
if strings.TrimSpace(line) == "---" {
if !inFrontmatter {
inFrontmatter = true
continue
}
break
}
if !inFrontmatter {
continue
}
key, val, ok := strings.Cut(line, ":")
if !ok {
continue
}
v := strings.Trim(strings.TrimSpace(val), `"'`)
switch strings.TrimSpace(key) {
case "title":
title = v
case "created_at":
if t, err := time.Parse(time.RFC3339, v); err == nil {
created = t.UTC().Format("2006-01-02")
} else {
created = v
}
}
}
if title == "" {
title = strings.ReplaceAll(slug, "-", " ")
}
if created == "" {
if info, err := os.Stat(path); err == nil {
created = info.ModTime().UTC().Format("2006-01-02")
}
}
return title, created
}
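The frontmatter reader splits each line at the first colon only, which matters because RFC 3339 timestamps contain colons themselves. The sketch below is a minimal illustration of that step, assuming two hypothetical helpers (`parseFrontmatterLine` and `createdDate`) that are not part of this change.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parseFrontmatterLine is a hypothetical helper. It splits "key: value"
// at the first colon and strips surrounding whitespace and quotes, as
// readTitleAndCreated does inline.
func parseFrontmatterLine(line string) (key, val string, ok bool) {
	k, v, found := strings.Cut(line, ":")
	if !found {
		return "", "", false
	}
	return strings.TrimSpace(k), strings.Trim(strings.TrimSpace(v), `"'`), true
}

// createdDate is a hypothetical helper. It shortens an RFC 3339
// timestamp to YYYY-MM-DD and passes anything else through unchanged.
func createdDate(v string) string {
	if t, err := time.Parse(time.RFC3339, v); err == nil {
		return t.UTC().Format("2006-01-02")
	}
	return v
}

func main() {
	key, val, _ := parseFrontmatterLine(`created_at: "2026-05-06T10:00:00Z"`)
	fmt.Println(key, createdDate(val)) // prints created_at 2026-05-06
}
```

This is a line-oriented reader rather than a YAML parser, so multi-line or nested values are outside what it handles.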

@@ -0,0 +1,85 @@
package brain_test
import (
"os"
"path/filepath"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/brain"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestBuildWingIndex(t *testing.T) {
dir := t.TempDir()
for _, p := range []struct{ rel, body string }{
{"wiki/jepa-fx/decisions/val-vol.md", "---\ntitle: Val Vol R2\ncreated_at: 2026-05-06T10:00:00Z\n---\nbody\n"},
{"wiki/jepa-fx/facts/architecture.md", "---\ntitle: Architecture\ncreated_at: 2026-05-04T10:00:00Z\n---\nbody\n"},
{"wiki/jepa-fx/sources/paper.md", "---\n---\nbody\n"},
} {
full := filepath.Join(dir, p.rel)
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
require.NoError(t, os.WriteFile(full, []byte(p.body), 0o644))
}
require.NoError(t, brain.BuildWingIndex(dir, "jepa-fx"))
got, err := os.ReadFile(filepath.Join(dir, "wiki", "jepa-fx", "_index.md"))
require.NoError(t, err)
s := string(got)
assert.Contains(t, s, "# jepa-fx")
assert.Contains(t, s, "| Hall | Note | Created |")
assert.Contains(t, s, "| decisions | [Val Vol R2](decisions/val-vol.md) | 2026-05-06 |")
assert.Contains(t, s, "| facts | [Architecture](facts/architecture.md) | 2026-05-04 |")
assert.Contains(t, s, "| sources | [paper](sources/paper.md) |")
// Halls sorted alphabetically.
assert.Less(t, indexOf(s, "decisions"), indexOf(s, "facts"))
assert.Less(t, indexOf(s, "facts"), indexOf(s, "sources"))
}
func TestBuildWingIndex_SkipsInvalidHalls(t *testing.T) {
dir := t.TempDir()
wingDir := filepath.Join(dir, "wiki", "jepa-fx")
require.NoError(t, os.MkdirAll(filepath.Join(wingDir, "garbage"), 0o755))
require.NoError(t, os.WriteFile(filepath.Join(wingDir, "garbage", "x.md"), []byte("x"), 0o644))
require.NoError(t, os.MkdirAll(filepath.Join(wingDir, "facts"), 0o755))
require.NoError(t, os.WriteFile(filepath.Join(wingDir, "facts", "y.md"), []byte("y"), 0o644))
require.NoError(t, brain.BuildWingIndex(dir, "jepa-fx"))
got, err := os.ReadFile(filepath.Join(wingDir, "_index.md"))
require.NoError(t, err)
s := string(got)
assert.Contains(t, s, "facts")
assert.NotContains(t, s, "garbage")
}
func TestBuildAllWingIndexes(t *testing.T) {
dir := t.TempDir()
for _, p := range []struct{ rel, body string }{
{"wiki/a/facts/x.md", "x"},
{"wiki/b/facts/y.md", "y"},
} {
full := filepath.Join(dir, p.rel)
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
require.NoError(t, os.WriteFile(full, []byte(p.body), 0o644))
}
require.NoError(t, brain.BuildAllWingIndexes(dir))
_, err := os.Stat(filepath.Join(dir, "wiki", "a", "_index.md"))
require.NoError(t, err)
_, err = os.Stat(filepath.Join(dir, "wiki", "b", "_index.md"))
require.NoError(t, err)
}
func TestBuildWingIndex_NoWingDir(t *testing.T) {
dir := t.TempDir()
require.NoError(t, brain.BuildWingIndex(dir, "ghost"))
}
func indexOf(s, sub string) int {
for i := 0; i+len(sub) <= len(s); i++ {
if s[i:i+len(sub)] == sub {
return i
}
}
return -1
}

@@ -0,0 +1,70 @@
// Package brain provides the wing/hall path taxonomy used by the brain
// wiki layout. A note's canonical location is
// brain/wiki/<wing>/<hall>/<slug>.md, where Wing is a free-form topic
// domain and Hall is one of a closed vocabulary of memory types.
package brain
import (
"fmt"
"path/filepath"
"strings"
)
// ValidHalls is the closed vocabulary of hall names. A hall captures the
// memory type of a note within any wing.
var ValidHalls = map[string]bool{
"facts": true,
"decisions": true,
"failures": true,
"hypotheses": true,
"sources": true,
}
// IsValidHall reports whether h is in the closed Hall vocabulary.
func IsValidHall(h string) bool {
return ValidHalls[h]
}
// NotePath resolves the canonical filesystem path for a note given a
// wing, hall, and slug. Returns an error if hall is not in ValidHalls
// or if wing/slug sanitise to empty strings.
//
// The returned path is brain/wiki/<wing>/<hall>/<slug>.md with all
// segments sanitised: lowercased, alphanumerics and hyphens only.
func NotePath(brainDir, wing, hall, slug string) (string, error) {
if !IsValidHall(hall) {
return "", fmt.Errorf("invalid hall %q: must be one of facts/decisions/failures/hypotheses/sources", hall)
}
w := Sanitise(wing)
if w == "" {
return "", fmt.Errorf("invalid wing %q: must contain at least one alphanumeric character", wing)
}
s := Sanitise(strings.TrimSuffix(slug, ".md"))
if s == "" {
return "", fmt.Errorf("invalid slug %q: must contain at least one alphanumeric character", slug)
}
return filepath.Join(brainDir, "wiki", w, hall, s+".md"), nil
}
// Sanitise lowercases s and keeps only [a-z0-9-]. Separator characters
// (hyphen, underscore, space, slash, backslash, dot) become a hyphen;
// any other character is dropped. Runs of hyphens are collapsed to one
// and leading/trailing hyphens are trimmed.
func Sanitise(s string) string {
s = strings.ToLower(strings.TrimSpace(s))
var b strings.Builder
prevHyphen := true
for _, r := range s {
switch {
case r >= 'a' && r <= 'z', r >= '0' && r <= '9':
b.WriteRune(r)
prevHyphen = false
case r == '-' || r == '_' || r == ' ' || r == '/' || r == '\\' || r == '.':
if !prevHyphen {
b.WriteByte('-')
prevHyphen = true
}
}
}
out := b.String()
return strings.Trim(out, "-")
}
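Since every path segment passes through `Sanitise`, it is worth seeing what it does to hostile or messy input. The sketch below copies the function into a standalone `main` package under the lowercase name `sanitise`; the sample inputs are illustrative assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// sanitise is a copy of Sanitise from the diff: lowercase, keep
// [a-z0-9], turn separators into single hyphens, drop everything else.
func sanitise(s string) string {
	s = strings.ToLower(strings.TrimSpace(s))
	var b strings.Builder
	prevHyphen := true // suppresses a leading hyphen
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z', r >= '0' && r <= '9':
			b.WriteRune(r)
			prevHyphen = false
		case r == '-' || r == '_' || r == ' ' || r == '/' || r == '\\' || r == '.':
			if !prevHyphen {
				b.WriteByte('-')
				prevHyphen = true
			}
		}
	}
	return strings.Trim(b.String(), "-")
}

func main() {
	// A traversal attempt collapses to one flat segment.
	fmt.Println(sanitise("../../etc/passwd")) // prints etc-passwd

	// Punctuation outside the separator set is dropped, not hyphenated.
	fmt.Println(sanitise("Jepa FX!")) // prints jepa-fx

	// Segments are joined only after each one has been sanitised.
	fmt.Println(filepath.Join("brain", "wiki",
		sanitise("Jepa FX!"), "facts", sanitise("Val Vol R2")+".md"))
}
```

Because slashes and dots never survive sanitisation, a wing or slug cannot escape its directory regardless of what the caller sends.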

@@ -0,0 +1,73 @@
package brain_test
import (
"path/filepath"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/brain"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestNotePath_Valid(t *testing.T) {
got, err := brain.NotePath("/b", "jepa-fx", "decisions", "val-vol-r2")
require.NoError(t, err)
assert.Equal(t, filepath.Join("/b", "wiki", "jepa-fx", "decisions", "val-vol-r2.md"), got)
}
func TestNotePath_StripsMdSuffix(t *testing.T) {
got, err := brain.NotePath("/b", "x", "facts", "note.md")
require.NoError(t, err)
assert.Equal(t, filepath.Join("/b", "wiki", "x", "facts", "note.md"), got)
}
func TestNotePath_SanitisesWingAndSlug(t *testing.T) {
got, err := brain.NotePath("/b", "Jepa FX!", "facts", "Val Vol R2")
require.NoError(t, err)
assert.Equal(t, filepath.Join("/b", "wiki", "jepa-fx", "facts", "val-vol-r2.md"), got)
}
func TestNotePath_RejectsInvalidHall(t *testing.T) {
_, err := brain.NotePath("/b", "x", "garbage", "y")
require.Error(t, err)
assert.Contains(t, err.Error(), "invalid hall")
}
func TestNotePath_RejectsEmptyWing(t *testing.T) {
_, err := brain.NotePath("/b", "!!!", "facts", "y")
require.Error(t, err)
assert.Contains(t, err.Error(), "invalid wing")
}
func TestNotePath_RejectsEmptySlug(t *testing.T) {
_, err := brain.NotePath("/b", "x", "facts", "!!!")
require.Error(t, err)
assert.Contains(t, err.Error(), "invalid slug")
}
func TestSanitise(t *testing.T) {
cases := map[string]string{
"Jepa-FX": "jepa-fx",
" foo bar ": "foo-bar",
"Val/Vol\\R2.md": "val-vol-r2-md",
"!!!": "",
"___leading": "leading",
"trailing___": "trailing",
"multi---hyphen": "multi-hyphen",
"UPPER 123 mixed": "upper-123-mixed",
}
for in, want := range cases {
t.Run(in, func(t *testing.T) {
assert.Equal(t, want, brain.Sanitise(in))
})
}
}
func TestIsValidHall(t *testing.T) {
for _, h := range []string{"facts", "decisions", "failures", "hypotheses", "sources"} {
assert.True(t, brain.IsValidHall(h), h)
}
for _, h := range []string{"", "Facts", "facts ", "rooms", "concepts", "entities"} {
assert.False(t, brain.IsValidHall(h), h)
}
}

@@ -0,0 +1,286 @@
package brain
import (
"bufio"
"fmt"
"os"
"path/filepath"
"strings"
"time"
)
// seeAlsoHeader is the markdown heading used to group cross-wing links.
const seeAlsoHeader = "## See also"
// TunnelCandidate is a cross-wing match surfaced by DetectTunnels. It is
// not yet a written link — the caller decides whether confidence is high
// enough to commit it via WriteTunnel.
type TunnelCandidate struct {
// TargetPath is the candidate note's path relative to brainDir
// (forward-slashed), e.g. "wiki/hyperguild/decisions/routing.md".
TargetPath string
// MatchedTerm is the title that matched in the source content.
MatchedTerm string
// Exact is true when the match was a case-insensitive whole-token
// hit on the target's frontmatter title. Fuzzy matches (substring
// only) are flagged Exact=false and should not be auto-written.
Exact bool
}
// DetectTunnels scans brain/wiki/ for notes whose title appears in
// content. Returns one TunnelCandidate per matching note. Exact is true
// when content contains the title as a whole-word case-insensitive
// token; false when only a substring matched (caller treats these as
// fuzzy and should not auto-write them).
//
// A note's title is read from YAML frontmatter `title:`; failing that,
// the filename slug (sans `.md`, hyphens → spaces) is used.
func DetectTunnels(brainDir, content string) ([]TunnelCandidate, error) {
wikiDir := filepath.Join(brainDir, "wiki")
if _, err := os.Stat(wikiDir); os.IsNotExist(err) {
return nil, nil
} else if err != nil {
return nil, fmt.Errorf("stat wiki: %w", err)
}
lowerContent := strings.ToLower(content)
var out []TunnelCandidate
err := filepath.WalkDir(wikiDir, func(path string, d os.DirEntry, err error) error {
if err != nil {
return err
}
if d.IsDir() || !strings.HasSuffix(path, ".md") || d.Name() == "_index.md" {
return nil
}
title, _ := readTitleAndCreated(path, strings.TrimSuffix(d.Name(), ".md"))
needle := strings.ToLower(strings.TrimSpace(title))
if needle == "" {
return nil
}
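// Scan every occurrence: an early substring-only hit must not mask a
// later whole-word hit.
found, exact := false, false
for from := 0; ; {
i := strings.Index(lowerContent[from:], needle)
if i == -1 {
break
}
found = true
if isWholeWord(lowerContent, from+i, len(needle)) {
exact = true
break
}
from += i + 1
}
if !found {
return nil
}
rel, err := filepath.Rel(brainDir, path)
if err != nil {
return err
}
out = append(out, TunnelCandidate{
TargetPath: filepath.ToSlash(rel),
MatchedTerm: title,
Exact: exact,
})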
return nil
})
if err != nil {
return nil, err
}
return out, nil
}
// isWholeWord reports whether the substring at [idx, idx+n) in s is
// bounded by non-alphanumeric characters (or string edges).
func isWholeWord(s string, idx, n int) bool {
left := idx == 0 || !isWordByte(s[idx-1])
right := idx+n == len(s) || !isWordByte(s[idx+n])
return left && right
}
func isWordByte(b byte) bool {
return (b >= 'a' && b <= 'z') ||
(b >= 'A' && b <= 'Z') ||
(b >= '0' && b <= '9')
}
// AutoTunnel runs DetectTunnels against content and, for each
// candidate, either writes a bidirectional tunnel (when the match is
// exact and in a different wing) or stages it for human review in
// brain/raw/tunnel-candidates-<YYYY-MM-DD>.md.
//
// sourcePath is the note that originated the content — used to skip
// self-matches and same-wing tunnels. Errors writing individual
// tunnels are recorded into the candidates file but never abort the
// rest of the scan; the caller's primary write has already succeeded
// and auto-linking is best-effort.
func AutoTunnel(brainDir, sourcePath, content string) error {
srcWing, err := wingOf(sourcePath)
if err != nil {
return err
}
candidates, err := DetectTunnels(brainDir, content)
if err != nil {
return err
}
var fuzzy []TunnelCandidate
for _, c := range candidates {
if c.TargetPath == sourcePath {
continue
}
tgtWing, err := wingOf(c.TargetPath)
if err != nil || tgtWing == srcWing {
continue
}
if !c.Exact {
fuzzy = append(fuzzy, c)
continue
}
if err := WriteTunnel(brainDir, sourcePath, c.TargetPath); err != nil {
fuzzy = append(fuzzy, c)
}
}
return logFuzzyCandidates(brainDir, sourcePath, fuzzy)
}
// logFuzzyCandidates appends one row per candidate to
// brain/raw/tunnel-candidates-<YYYY-MM-DD>.md, creating the file with a
// header on first write of the day. No-op when the candidate list is empty.
func logFuzzyCandidates(brainDir, sourcePath string, cs []TunnelCandidate) error {
if len(cs) == 0 {
return nil
}
rawDir := filepath.Join(brainDir, "raw")
if err := os.MkdirAll(rawDir, 0o755); err != nil {
return err
}
stamp := time.Now().UTC().Format("2006-01-02")
path := filepath.Join(rawDir, "tunnel-candidates-"+stamp+".md")
existed := fileExists(path)
f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
if err != nil {
return err
}
defer func() { _ = f.Close() }()
if !existed {
if _, err := f.WriteString("# Tunnel candidates " + stamp + "\n\nFuzzy cross-wing matches surfaced by AutoTunnel. Review and promote to a tunnel via `brain_tunnel` if relevant.\n\n"); err != nil {
return err
}
}
for _, c := range cs {
line := fmt.Sprintf("- `%s` ↔ `%s` (term: %q)\n", sourcePath, c.TargetPath, c.MatchedTerm)
if _, err := f.WriteString(line); err != nil {
return err
}
}
return nil
}
func fileExists(p string) bool {
_, err := os.Stat(p)
return err == nil
}
// WriteTunnel appends a bidirectional wikilink between sourcePath and
// targetPath under a `## See also` section in each note. Paths are
// relative to brainDir (forward-slashed), e.g. wiki/<wing>/<hall>/<slug>.md.
//
// Idempotent: re-calling with the same pair does not duplicate links or
// section headers. Rejects same-wing pairs (a tunnel is by definition
// cross-wing) and missing notes.
func WriteTunnel(brainDir, sourcePath, targetPath string) error {
srcWing, err := wingOf(sourcePath)
if err != nil {
return fmt.Errorf("source: %w", err)
}
tgtWing, err := wingOf(targetPath)
if err != nil {
return fmt.Errorf("target: %w", err)
}
if srcWing == tgtWing {
return fmt.Errorf("tunnel must cross wings; got both in %q", srcWing)
}
srcFull := filepath.Join(brainDir, filepath.FromSlash(sourcePath))
tgtFull := filepath.Join(brainDir, filepath.FromSlash(targetPath))
if _, err := os.Stat(srcFull); err != nil {
return fmt.Errorf("source note: %w", err)
}
if _, err := os.Stat(tgtFull); err != nil {
return fmt.Errorf("target note: %w", err)
}
if err := appendSeeAlso(srcFull, wikilinkOf(targetPath)); err != nil {
return fmt.Errorf("update source: %w", err)
}
if err := appendSeeAlso(tgtFull, wikilinkOf(sourcePath)); err != nil {
return fmt.Errorf("update target: %w", err)
}
return nil
}
// wikilinkOf turns "wiki/<wing>/<hall>/<slug>.md" into "<wing>/<hall>/<slug>"
// for use inside `[[...]]`.
func wikilinkOf(relPath string) string {
p := strings.TrimSuffix(relPath, ".md")
p = strings.TrimPrefix(p, "wiki/")
return p
}
// wingOf extracts the wing segment from a relative wiki path
// "wiki/<wing>/<hall>/<slug>.md".
func wingOf(relPath string) (string, error) {
parts := strings.Split(relPath, "/")
if len(parts) < 4 || parts[0] != "wiki" {
return "", fmt.Errorf("not a wiki path: %q", relPath)
}
if parts[1] == "" {
return "", fmt.Errorf("empty wing in path: %q", relPath)
}
return parts[1], nil
}
// appendSeeAlso inserts `- [[link]]` under the file's See also section,
// creating the section if absent. No-op when the link is already present.
func appendSeeAlso(filePath, link string) error {
content, err := os.ReadFile(filePath)
if err != nil {
return err
}
wikilink := "[[" + link + "]]"
if strings.Contains(string(content), wikilink) {
return nil
}
bullet := "- " + wikilink
if !strings.Contains(string(content), seeAlsoHeader) {
// No section yet — append a fresh one. Always emit a trailing
// newline so subsequent appends don't merge into the previous line.
trimmed := strings.TrimRight(string(content), "\n")
out := trimmed + "\n\n" + seeAlsoHeader + "\n\n" + bullet + "\n"
return os.WriteFile(filePath, []byte(out), 0o644)
}
// Section exists — splice the bullet in just before the next `## `
// heading (or EOF). Reading the file line-by-line keeps this robust
// against arbitrary section ordering.
var b strings.Builder
scanner := bufio.NewScanner(strings.NewReader(string(content)))
scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
inSeeAlso, inserted := false, false
for scanner.Scan() {
line := scanner.Text()
if !inserted && inSeeAlso && strings.HasPrefix(line, "## ") &&
strings.TrimSpace(line) != seeAlsoHeader {
b.WriteString(bullet)
b.WriteByte('\n')
b.WriteByte('\n')
inserted = true
}
if strings.TrimSpace(line) == seeAlsoHeader {
inSeeAlso = true
}
b.WriteString(line)
b.WriteByte('\n')
}
if err := scanner.Err(); err != nil {
return err
}
if !inserted {
// section was the last thing in the file — just append bullet
out := strings.TrimRight(b.String(), "\n") + "\n" + bullet + "\n"
return os.WriteFile(filePath, []byte(out), 0o644)
}
return os.WriteFile(filePath, []byte(b.String()), 0o644)
}
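
The exact versus fuzzy distinction that DetectTunnels reports can be tried in isolation. The sketch below is an illustration, not the package's code: `matchTitle` is a hypothetical helper, and it deliberately checks every occurrence of the title so that a substring hit such as "rerouting" does not hide a later standalone "routing". Boundaries are byte-based ASCII checks, as in `isWordByte` above, so non-ASCII letters count as word breaks.

```go
package main

import (
	"fmt"
	"strings"
)

// isWordByte mirrors the ASCII-only boundary test; input is already lowercased.
func isWordByte(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= '0' && b <= '9')
}

// matchTitle (hypothetical) reports whether title occurs in content at
// all (found) and whether any occurrence is a whole word (exact).
func matchTitle(content, title string) (found, exact bool) {
	hay := strings.ToLower(content)
	needle := strings.ToLower(strings.TrimSpace(title))
	if needle == "" {
		return false, false
	}
	for from := 0; from <= len(hay)-len(needle); {
		i := strings.Index(hay[from:], needle)
		if i == -1 {
			break
		}
		found = true
		start, end := from+i, from+i+len(needle)
		left := start == 0 || !isWordByte(hay[start-1])
		right := end == len(hay) || !isWordByte(hay[end])
		if left && right {
			return true, true
		}
		from = start + 1 // keep looking past this substring-only hit
	}
	return found, false
}

func main() {
	fmt.Println(matchTitle("rerouting handles failover", "Routing"))
	fmt.Println(matchTitle("rerouting, then routing.", "Routing"))
}
```

Under these assumptions the first call yields a fuzzy match and the second an exact one, which is the split AutoTunnel uses to decide between writing a link and staging it for review.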


@@ -0,0 +1,177 @@
package brain_test
import (
"os"
"path/filepath"
"strings"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/brain"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// seedNote writes a minimal markdown note at brainDir/relPath with the given body.
func seedNote(t *testing.T, brainDir, relPath, body string) {
t.Helper()
full := filepath.Join(brainDir, relPath)
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
require.NoError(t, os.WriteFile(full, []byte(body), 0o644))
}
func TestWriteTunnel_AppendsBidirectionalLinks(t *testing.T) {
dir := t.TempDir()
seedNote(t, dir, "wiki/jepa-fx/decisions/val-vol.md",
"---\nwing: jepa-fx\nhall: decisions\n---\n# Val Vol R2\n\nbody.\n")
seedNote(t, dir, "wiki/hyperguild/decisions/routing.md",
"---\nwing: hyperguild\nhall: decisions\n---\n# Routing\n\nbody.\n")
err := brain.WriteTunnel(dir,
"wiki/jepa-fx/decisions/val-vol.md",
"wiki/hyperguild/decisions/routing.md",
)
require.NoError(t, err)
src, err := os.ReadFile(filepath.Join(dir, "wiki/jepa-fx/decisions/val-vol.md"))
require.NoError(t, err)
assert.Contains(t, string(src), "## See also")
assert.Contains(t, string(src), "[[hyperguild/decisions/routing]]")
tgt, err := os.ReadFile(filepath.Join(dir, "wiki/hyperguild/decisions/routing.md"))
require.NoError(t, err)
assert.Contains(t, string(tgt), "## See also")
assert.Contains(t, string(tgt), "[[jepa-fx/decisions/val-vol]]")
}
func TestWriteTunnel_Idempotent(t *testing.T) {
dir := t.TempDir()
seedNote(t, dir, "wiki/a/facts/x.md", "# X\n\nbody.\n")
seedNote(t, dir, "wiki/b/facts/y.md", "# Y\n\nbody.\n")
for i := 0; i < 3; i++ {
require.NoError(t, brain.WriteTunnel(dir,
"wiki/a/facts/x.md", "wiki/b/facts/y.md"))
}
src, err := os.ReadFile(filepath.Join(dir, "wiki/a/facts/x.md"))
require.NoError(t, err)
assert.Equal(t, 1, strings.Count(string(src), "[[b/facts/y]]"),
"link should appear exactly once after 3 calls")
assert.Equal(t, 1, strings.Count(string(src), "## See also"))
tgt, err := os.ReadFile(filepath.Join(dir, "wiki/b/facts/y.md"))
require.NoError(t, err)
assert.Equal(t, 1, strings.Count(string(tgt), "[[a/facts/x]]"))
assert.Equal(t, 1, strings.Count(string(tgt), "## See also"))
}
func TestWriteTunnel_RejectsSameWing(t *testing.T) {
dir := t.TempDir()
seedNote(t, dir, "wiki/jepa-fx/facts/x.md", "x")
seedNote(t, dir, "wiki/jepa-fx/facts/y.md", "y")
err := brain.WriteTunnel(dir,
"wiki/jepa-fx/facts/x.md", "wiki/jepa-fx/facts/y.md")
require.Error(t, err)
assert.Contains(t, err.Error(), "cross wings")
}
func TestWriteTunnel_RejectsMissingNote(t *testing.T) {
dir := t.TempDir()
seedNote(t, dir, "wiki/a/facts/x.md", "x")
err := brain.WriteTunnel(dir,
"wiki/a/facts/x.md", "wiki/b/facts/ghost.md")
require.Error(t, err)
}
func TestDetectTunnels_ExactTitleMatch(t *testing.T) {
dir := t.TempDir()
seedNote(t, dir, "wiki/jepa-fx/decisions/val-vol.md",
"---\nwing: jepa-fx\nhall: decisions\ntitle: Val Vol R2\n---\nbody.\n")
seedNote(t, dir, "wiki/jepa-fx/facts/lejpa.md",
"---\nwing: jepa-fx\nhall: facts\ntitle: LeJPA Architecture\n---\nbody.\n")
candidates, err := brain.DetectTunnels(dir,
"We need to revisit Val Vol R2 in light of new tier data.")
require.NoError(t, err)
require.Len(t, candidates, 1)
assert.Equal(t, "wiki/jepa-fx/decisions/val-vol.md", candidates[0].TargetPath)
assert.Equal(t, "Val Vol R2", candidates[0].MatchedTerm)
assert.True(t, candidates[0].Exact)
}
func TestDetectTunnels_FuzzyMatch(t *testing.T) {
dir := t.TempDir()
seedNote(t, dir, "wiki/x/facts/routing.md",
"---\ntitle: Routing\n---\nbody.\n")
// The title appears in content only inside a longer word, never as a whole word.
candidates, err := brain.DetectTunnels(dir, "rerouting handles failover")
require.NoError(t, err)
require.Len(t, candidates, 1)
assert.False(t, candidates[0].Exact, "substring-only match should be fuzzy")
}
func TestDetectTunnels_NoFrontmatterFallsBackToSlug(t *testing.T) {
dir := t.TempDir()
seedNote(t, dir, "wiki/x/facts/widget-flags.md", "# widget flags\n\nbody.\n")
candidates, err := brain.DetectTunnels(dir,
"Documented Widget Flags after the deploy issue.")
require.NoError(t, err)
require.Len(t, candidates, 1)
assert.True(t, candidates[0].Exact)
assert.Equal(t, "widget flags", candidates[0].MatchedTerm)
}
func TestAutoTunnel_FuzzyGoesToCandidatesFile(t *testing.T) {
dir := t.TempDir()
// Existing note in a different wing whose title is "Routing".
seedNote(t, dir, "wiki/other/facts/routing.md",
"---\nwing: other\nhall: facts\ntitle: Routing\n---\nbody.\n")
// Source note in another wing whose body mentions "rerouting" (substring match only).
seedNote(t, dir, "wiki/jepa-fx/facts/new.md",
"---\nwing: jepa-fx\nhall: facts\n---\nrerouting traffic\n")
require.NoError(t, brain.AutoTunnel(dir,
"wiki/jepa-fx/facts/new.md", "rerouting traffic"))
// Source must not get auto-linked (fuzzy).
got, err := os.ReadFile(filepath.Join(dir, "wiki/jepa-fx/facts/new.md"))
require.NoError(t, err)
assert.NotContains(t, string(got), "[[other/facts/routing]]")
// Candidates file must list the pair.
matches, err := filepath.Glob(filepath.Join(dir, "raw", "tunnel-candidates-*.md"))
require.NoError(t, err)
require.Len(t, matches, 1)
body, err := os.ReadFile(matches[0])
require.NoError(t, err)
assert.Contains(t, string(body), "wiki/jepa-fx/facts/new.md")
assert.Contains(t, string(body), "wiki/other/facts/routing.md")
assert.Contains(t, string(body), "Routing")
}
func TestDetectTunnels_EmptyWiki(t *testing.T) {
dir := t.TempDir()
cs, err := brain.DetectTunnels(dir, "anything")
require.NoError(t, err)
assert.Empty(t, cs)
}
func TestWriteTunnel_AppendsToExistingSeeAlso(t *testing.T) {
dir := t.TempDir()
seedNote(t, dir, "wiki/a/facts/x.md",
"# X\n\nbody.\n\n## See also\n\n- [[a/facts/old]]\n")
seedNote(t, dir, "wiki/b/facts/y.md", "# Y\n\nbody.\n")
require.NoError(t, brain.WriteTunnel(dir,
"wiki/a/facts/x.md", "wiki/b/facts/y.md"))
src, err := os.ReadFile(filepath.Join(dir, "wiki/a/facts/x.md"))
require.NoError(t, err)
s := string(src)
assert.Equal(t, 1, strings.Count(s, "## See also"), "should reuse existing section")
assert.Contains(t, s, "[[a/facts/old]]")
assert.Contains(t, s, "[[b/facts/y]]")
}
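
The idempotency these tests assert (one header, one bullet per link, however many calls) can be sketched on plain strings. This is a simplified illustration under one stated assumption: the `## See also` section, when present, is the last section of the note. The real `appendSeeAlso` also handles a section followed by further headings. `addSeeAlso` and `header` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

const header = "## See also"

// addSeeAlso (hypothetical) returns doc with "- [[link]]" listed under
// a trailing "## See also" section, creating the section if needed.
func addSeeAlso(doc, link string) string {
	wikilink := "[[" + link + "]]"
	if strings.Contains(doc, wikilink) {
		return doc // already linked: repeated calls change nothing
	}
	body := strings.TrimRight(doc, "\n")
	if !strings.Contains(doc, header) {
		body += "\n\n" + header + "\n"
	}
	// Always end with a newline so the next append starts on its own line.
	return body + "\n- " + wikilink + "\n"
}

func main() {
	doc := addSeeAlso("# X\n\nbody.\n", "b/facts/y")
	doc = addSeeAlso(doc, "b/facts/y") // second call is a no-op
	fmt.Print(doc)
}
```

Checking for the wikilink anywhere in the document is the cheapest possible duplicate guard; the trade-off is that a link already mentioned in the body text also suppresses the See also entry.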


@@ -0,0 +1,110 @@
package claudewatcher
import (
"context"
"errors"
"fmt"
"github.com/jackc/pgx/v5"
"github.com/jackc/pgx/v5/pgxpool"
)
// CursorStore tracks how far the watcher has ingested into each
// session JSONL file. Keyed by (host, file_path) so the same `~/.claude`
// path on different hosts doesn't collide and resumability survives
// pod restarts. Idempotent Init lives alongside the rest of the
// claudewatcher schema; no separate migration framework.
type CursorStore struct {
pool *pgxpool.Pool
}
// NewCursorStore opens a pool against dsn. Caller closes the store.
func NewCursorStore(ctx context.Context, dsn string) (*CursorStore, error) {
pool, err := pgxpool.New(ctx, dsn)
if err != nil {
return nil, fmt.Errorf("pgxpool: %w", err)
}
if err := pool.Ping(ctx); err != nil {
pool.Close()
return nil, fmt.Errorf("ping: %w", err)
}
return &CursorStore{pool: pool}, nil
}
// NewCursorStoreFromPool wraps an existing pool (so the watcher can
// share the brain DSN pool with vectorstore/graphstore without a
// second connection set). Caller must NOT close the wrapped pool via
// the store — close the pool directly.
func NewCursorStoreFromPool(pool *pgxpool.Pool) *CursorStore {
return &CursorStore{pool: pool}
}
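// Close closes the underlying connection pool unconditionally; the
// store does not track ownership. Call it only on stores built with
// NewCursorStore. For a pool injected via NewCursorStoreFromPool,
// close the pool directly instead (pgxpool.Close is idempotent, so a
// stray double close is harmless, but closing a shared pool here
// would break its other users).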
func (s *CursorStore) Close() {
if s.pool != nil {
s.pool.Close()
}
}
// Init creates the claude_session_cursors table when missing.
func (s *CursorStore) Init(ctx context.Context) error {
const ddl = `
CREATE TABLE IF NOT EXISTS claude_session_cursors (
host TEXT NOT NULL,
file_path TEXT NOT NULL,
byte_offset BIGINT NOT NULL DEFAULT 0,
last_seen_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (host, file_path)
);
CREATE INDEX IF NOT EXISTS claude_session_cursors_host_idx
ON claude_session_cursors (host);
`
_, err := s.pool.Exec(ctx, ddl)
return err
}
// GetOffset returns the last recorded byte offset for (host, filePath).
// Missing rows are reported as offset=0, ok=false so the caller can
// distinguish "never ingested" from "ingested at the start of the
// file" (both produce identical behaviour but the metric is useful).
func (s *CursorStore) GetOffset(ctx context.Context, host, filePath string) (int64, bool, error) {
if host == "" || filePath == "" {
return 0, false, errors.New("host and file_path are required")
}
var offset int64
err := s.pool.QueryRow(ctx, `
SELECT byte_offset FROM claude_session_cursors WHERE host = $1 AND file_path = $2
`, host, filePath).Scan(&offset)
if errors.Is(err, pgx.ErrNoRows) {
return 0, false, nil
}
if err != nil {
return 0, false, fmt.Errorf("query: %w", err)
}
return offset, true, nil
}
// SetOffset writes the new offset for (host, filePath). Used after
// every successful parse + ingest batch so a crash mid-file rewinds
// only to the last committed checkpoint.
func (s *CursorStore) SetOffset(ctx context.Context, host, filePath string, offset int64) error {
if host == "" || filePath == "" {
return errors.New("host and file_path are required")
}
if offset < 0 {
return errors.New("offset must be >= 0")
}
_, err := s.pool.Exec(ctx, `
INSERT INTO claude_session_cursors (host, file_path, byte_offset, last_seen_at)
VALUES ($1, $2, $3, now())
ON CONFLICT (host, file_path) DO UPDATE
SET byte_offset = EXCLUDED.byte_offset,
last_seen_at = now()
`, host, filePath, offset)
if err != nil {
return fmt.Errorf("upsert offset: %w", err)
}
return nil
}
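
For unit tests that should not need Postgres, the cursor semantics above (missing row reads as offset 0 with ok=false, writes are upserts, empty keys and negative offsets are rejected) can be mimicked in memory. The sketch below is an assumption-laden stand-in, not part of the package: `memCursors`, `cursorKey`, and the example path are hypothetical, and it drops the `context.Context` parameter for brevity.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// cursorKey mirrors the (host, file_path) primary key.
type cursorKey struct{ host, filePath string }

// memCursors is a hypothetical in-memory analogue of CursorStore.
type memCursors struct {
	mu      sync.Mutex
	offsets map[cursorKey]int64
}

func newMemCursors() *memCursors {
	return &memCursors{offsets: make(map[cursorKey]int64)}
}

// GetOffset returns (0, false, nil) for a pair that was never written.
func (m *memCursors) GetOffset(host, filePath string) (int64, bool, error) {
	if host == "" || filePath == "" {
		return 0, false, errors.New("host and file_path are required")
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	off, ok := m.offsets[cursorKey{host, filePath}]
	return off, ok, nil
}

// SetOffset overwrites any previous value, like the SQL upsert.
func (m *memCursors) SetOffset(host, filePath string, offset int64) error {
	if host == "" || filePath == "" {
		return errors.New("host and file_path are required")
	}
	if offset < 0 {
		return errors.New("offset must be >= 0")
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.offsets[cursorKey{host, filePath}] = offset
	return nil
}

func main() {
	c := newMemCursors()
	_, ok, _ := c.GetOffset("koala", "/sessions/a.jsonl")
	fmt.Println(ok) // prints false
	_ = c.SetOffset("koala", "/sessions/a.jsonl", 4096)
	off, ok, _ := c.GetOffset("koala", "/sessions/a.jsonl")
	fmt.Println(off, ok) // prints 4096 true
}
```

Keying on both host and path keeps two machines that share the same `~/.claude` layout from overwriting each other's progress, which is the same reason the table uses a composite primary key.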


@@ -0,0 +1,305 @@
// Package claudewatcher ingests Claude Code session transcripts
// (`~/.claude/projects/*/<uuid>.jsonl`) into the brain corpus.
//
// Schema (observed 2026-05-25 across ~30 session files on koala):
//
// type=user — user prompts + tool results
// type=assistant — model turns; tool_use blocks live in message.content
// type=attachment — hook outputs, ingested files
// type=system — turn-boundary metadata
// type=file-history-snapshot — git-style snapshot of edited files
// type=queue-operation, last-prompt, permission-mode, ai-title,
// bridge-session — internal bookkeeping, ignored
//
// The parser is intentionally tolerant: malformed lines are skipped
// (caller logs and advances), missing optional fields default to "",
// and unknown `type` values are returned as Turn entries with
// `Skip=true` so callers can filter cheaply.
package claudewatcher
import (
"bufio"
"encoding/json"
"errors"
"fmt"
"io"
"strings"
"time"
)
// Turn is one parsed JSONL entry from a Claude Code session log.
//
// Skip is true for entry types we never want to ingest (queue
// bookkeeping, snapshots, etc.). Callers fast-path these without
// running the scrubber or classifier.
type Turn struct {
SessionID string
Type string
ParentUUID string
Timestamp time.Time
Cwd string
GitBranch string
Content string // plain-text projection of the entry, ready for the scrubber/classifier
ToolName string // populated when an assistant turn invokes a tool
OffsetAfter int64 // byte offset in the file just past this entry
Skip bool
ParseWarning string // non-empty when the entry parsed but had a sub-field we couldn't normalise
}
// ParseStream reads JSONL lines from r starting at startOffset and
// invokes emit for each parsed entry. emit may return ErrStop to
// terminate the scan cleanly. Other emit errors propagate.
//
// startOffset is informational — the caller is expected to have already
// seeked the underlying reader to that offset. ParseStream adds the
// number of bytes consumed per line to it to compute Turn.OffsetAfter.
//
// Lines that fail to unmarshal are reported via warnf and skipped. The
// running offset still advances past the malformed line, so the value
// returned by ParseStream (and the OffsetAfter of the next valid Turn)
// lands after it and the line is not re-read on the next pass.
func ParseStream(
r io.Reader,
startOffset int64,
warnf func(format string, args ...any),
emit func(Turn) error,
) (int64, error) {
scanner := bufio.NewScanner(r)
scanner.Buffer(make([]byte, 0, 64*1024), 8*1024*1024) // some lines are big (tool outputs)
offset := startOffset
for scanner.Scan() {
raw := scanner.Bytes()
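// +1 for the newline. bufio.ScanLines also strips a trailing '\r' and
// returns a final unterminated line as-is, so this count is exact only
// for LF-terminated input.
lineLen := int64(len(raw)) + 1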
t, err := parseTurn(raw)
if err != nil {
if warnf != nil {
warnf("parse: %v (%d bytes)", err, len(raw))
}
offset += lineLen
continue
}
t.OffsetAfter = offset + lineLen
if err := emit(t); err != nil {
if errors.Is(err, ErrStop) {
return t.OffsetAfter, nil
}
return offset, fmt.Errorf("emit: %w", err)
}
offset = t.OffsetAfter
}
if err := scanner.Err(); err != nil {
return offset, fmt.Errorf("scan: %w", err)
}
return offset, nil
}
// ErrStop terminates a ParseStream loop without surfacing an error.
var ErrStop = errors.New("claudewatcher: stop")
// rawEntry is a permissive shape that covers every type observed in
// the JSONL files. Fields we don't care about are intentionally
// omitted to keep the unmarshal cheap.
type rawEntry struct {
Type string `json:"type"`
SessionID string `json:"sessionId"`
ParentUUID string `json:"parentUuid"`
Timestamp string `json:"timestamp"`
Cwd string `json:"cwd"`
GitBranch string `json:"gitBranch"`
Message json.RawMessage `json:"message"`
Attachment json.RawMessage `json:"attachment"`
Content string `json:"content"` // queue-operation
LastPrompt string `json:"lastPrompt"` // last-prompt
Subtype string `json:"subtype"` // system
}
// skipTypes lists every entry type we want to never ingest. Marked Skip
// at parse time so the caller's filter is a single boolean check.
var skipTypes = map[string]struct{}{
"queue-operation": {},
"last-prompt": {},
"permission-mode": {},
"ai-title": {},
"bridge-session": {},
"file-history-snapshot": {},
}
func parseTurn(raw []byte) (Turn, error) {
var e rawEntry
if err := json.Unmarshal(raw, &e); err != nil {
return Turn{}, fmt.Errorf("unmarshal: %w", err)
}
t := Turn{
Type: e.Type,
SessionID: e.SessionID,
ParentUUID: e.ParentUUID,
Cwd: e.Cwd,
GitBranch: e.GitBranch,
}
if _, skip := skipTypes[e.Type]; skip {
t.Skip = true
return t, nil
}
if e.Timestamp != "" {
if ts, err := time.Parse(time.RFC3339Nano, e.Timestamp); err == nil {
t.Timestamp = ts
} else {
t.ParseWarning = "timestamp"
}
}
switch e.Type {
case "user":
t.Content = extractMessageText(e.Message)
case "assistant":
t.Content, t.ToolName = extractAssistantTurn(e.Message)
case "attachment":
t.Content = extractAttachmentText(e.Attachment)
case "system":
t.Content = "[system " + e.Subtype + "]"
default:
// Unknown type — keep the row but mark Skip so callers ignore.
t.Skip = true
}
return t, nil
}
// extractMessageText pulls the textual projection out of a user/assistant
// message field. The shape is the Anthropic Messages API content-block
// array (an array of {type, text|tool_use|tool_result, ...}). We
// concatenate every text-bearing block and ignore the rest.
func extractMessageText(raw json.RawMessage) string {
if len(raw) == 0 {
return ""
}
// Only content is decoded. Declaring unused fields with strict types
// (e.g. a map[string]string) would make Unmarshal fail on an
// otherwise usable message whenever their shape differs.
var msg struct {
Content json.RawMessage `json:"content"`
}
if err := json.Unmarshal(raw, &msg); err != nil {
// Some user turns have message as plain string.
var s string
if err2 := json.Unmarshal(raw, &s); err2 == nil {
return s
}
return ""
}
// Content can be a string OR an array.
var asString string
if err := json.Unmarshal(msg.Content, &asString); err == nil {
return asString
}
var blocks []struct {
Type string `json:"type"`
Text string `json:"text"`
Content json.RawMessage `json:"content"`
}
if err := json.Unmarshal(msg.Content, &blocks); err != nil {
return ""
}
var sb strings.Builder
for _, b := range blocks {
switch b.Type {
case "text":
sb.WriteString(b.Text)
sb.WriteByte('\n')
case "tool_result":
// Tool result content may itself be a string or array of blocks.
var s string
if err := json.Unmarshal(b.Content, &s); err == nil {
sb.WriteString("[tool_result] ")
sb.WriteString(s)
sb.WriteByte('\n')
continue
}
var sub []struct {
Type string `json:"type"`
Text string `json:"text"`
}
if err := json.Unmarshal(b.Content, &sub); err == nil {
for _, s := range sub {
if s.Type == "text" {
sb.WriteString("[tool_result] ")
sb.WriteString(s.Text)
sb.WriteByte('\n')
}
}
}
}
}
return strings.TrimRight(sb.String(), "\n")
}
// extractAssistantTurn pulls text + the first tool name (if any) from
// an assistant content-block array. Multi-tool turns lose the second
// name; the goal is signal for classification, not perfect fidelity.
func extractAssistantTurn(raw json.RawMessage) (string, string) {
if len(raw) == 0 {
return "", ""
}
var msg struct {
Content json.RawMessage `json:"content"`
}
if err := json.Unmarshal(raw, &msg); err != nil {
return "", ""
}
var blocks []struct {
Type string `json:"type"`
Text string `json:"text"`
Name string `json:"name"`
Tool json.RawMessage `json:"input"`
}
if err := json.Unmarshal(msg.Content, &blocks); err != nil {
return "", ""
}
var sb strings.Builder
var firstTool string
for _, b := range blocks {
switch b.Type {
case "text":
sb.WriteString(b.Text)
sb.WriteByte('\n')
case "tool_use":
if firstTool == "" {
firstTool = b.Name
}
sb.WriteString("[tool_use:")
sb.WriteString(b.Name)
sb.WriteString("]\n")
}
}
return strings.TrimRight(sb.String(), "\n"), firstTool
}
// extractAttachmentText pulls text content from an attachment payload,
// or returns a short tag when the attachment is a hook event.
func extractAttachmentText(raw json.RawMessage) string {
if len(raw) == 0 {
return ""
}
var a struct {
Type string `json:"type"`
HookName string `json:"hookName"`
HookEvent string `json:"hookEvent"`
Content string `json:"content"`
Text string `json:"text"`
}
if err := json.Unmarshal(raw, &a); err != nil {
return ""
}
if a.Content != "" {
return a.Content
}
if a.Text != "" {
return a.Text
}
if a.HookName != "" {
return "[hook " + a.HookEvent + ":" + a.HookName + "]"
}
return ""
}
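
Byte-offset bookkeeping is the part of a resumable JSONL reader that is easiest to get subtly wrong, so here is a small standalone sketch of one conservative approach. It is an alternative to the Scanner-based counting above, not a description of it: `completeLineOffsets` and `offsetsOf` are hypothetical helpers that use `bufio.Reader.ReadBytes` so that the length of each line includes its real terminator, and a trailing fragment without a newline (for example a line still being written) does not advance the cursor.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// completeLineOffsets returns the offset just past each
// newline-terminated line in r, counting from start.
func completeLineOffsets(r io.Reader, start int64) ([]int64, error) {
	br := bufio.NewReader(r)
	offset := start
	var out []int64
	for {
		line, err := br.ReadBytes('\n')
		if err == io.EOF {
			// Any bytes in line are an unterminated fragment: leave
			// them for the next pass instead of counting them.
			return out, nil
		}
		if err != nil {
			return out, err
		}
		offset += int64(len(line)) // includes '\n' and any preceding '\r'
		out = append(out, offset)
	}
}

// offsetsOf is a convenience wrapper for in-memory input.
func offsetsOf(body string, start int64) []int64 {
	offs, _ := completeLineOffsets(strings.NewReader(body), start)
	return offs
}

func main() {
	// Two complete lines (one CRLF-terminated) and one partial line.
	fmt.Println(offsetsOf("{\"a\":1}\n{\"b\":2}\r\n{\"c\":", 100)) // prints [108 117]
}
```

Storing only offsets that follow a real newline means a resumed read always starts at a line boundary, at the cost of re-reading a final line that had no terminator yet.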


@@ -0,0 +1,157 @@
package claudewatcher
import (
"errors"
"strings"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func collect(t *testing.T, body string) ([]Turn, int64, error) {
t.Helper()
var out []Turn
end, err := ParseStream(strings.NewReader(body), 0, nil, func(tr Turn) error {
out = append(out, tr)
return nil
})
return out, end, err
}
func TestParseStream_UserTurnStringContent(t *testing.T) {
body := `{"type":"user","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","message":"hello world"}
`
turns, end, err := collect(t, body)
require.NoError(t, err)
require.Len(t, turns, 1)
assert.Equal(t, "user", turns[0].Type)
assert.Equal(t, "S", turns[0].SessionID)
assert.Equal(t, "hello world", turns[0].Content)
assert.False(t, turns[0].Skip)
assert.Equal(t, int64(len(body)), end)
}
func TestParseStream_UserTurnContentBlocks(t *testing.T) {
body := `{"type":"user","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","message":{"role":"user","content":[{"type":"text","text":"line 1"},{"type":"text","text":"line 2"}]}}
`
turns, _, err := collect(t, body)
require.NoError(t, err)
require.Len(t, turns, 1)
assert.Equal(t, "line 1\nline 2", turns[0].Content)
}
func TestParseStream_AssistantToolUse(t *testing.T) {
body := `{"type":"assistant","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","message":{"content":[{"type":"text","text":"calling now"},{"type":"tool_use","name":"Edit","input":{}}]}}
`
turns, _, err := collect(t, body)
require.NoError(t, err)
require.Len(t, turns, 1)
assert.Equal(t, "Edit", turns[0].ToolName)
assert.Contains(t, turns[0].Content, "calling now")
assert.Contains(t, turns[0].Content, "[tool_use:Edit]")
}
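// tool_result blocks arrive on user-typed entries (the reply to an
// assistant tool_use), hence type=user in the fixture below.
func TestParseStream_AssistantToolResult(t *testing.T) {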
body := `{"type":"user","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","message":{"content":[{"type":"tool_result","content":"output of cmd"}]}}
`
turns, _, err := collect(t, body)
require.NoError(t, err)
require.Len(t, turns, 1)
assert.Contains(t, turns[0].Content, "[tool_result] output of cmd")
}
func TestParseStream_SkipsBookkeepingTypes(t *testing.T) {
body := strings.Join([]string{
`{"type":"queue-operation","sessionId":"S","content":"x"}`,
`{"type":"last-prompt","sessionId":"S","lastPrompt":"y"}`,
`{"type":"permission-mode","sessionId":"S","permissionMode":"auto"}`,
`{"type":"ai-title","sessionId":"S","aiTitle":"My session"}`,
`{"type":"file-history-snapshot","messageId":"abc"}`,
}, "\n") + "\n"
turns, _, err := collect(t, body)
require.NoError(t, err)
require.Len(t, turns, 5)
for _, tr := range turns {
assert.True(t, tr.Skip, "expected Skip=true for %q", tr.Type)
}
}
func TestParseStream_UnknownTypeIsSkip(t *testing.T) {
body := `{"type":"future-thing","sessionId":"S"}` + "\n"
turns, _, err := collect(t, body)
require.NoError(t, err)
require.Len(t, turns, 1)
assert.True(t, turns[0].Skip)
}
func TestParseStream_MalformedLineIsSkippedNotFatal(t *testing.T) {
body := strings.Join([]string{
`{"type":"user","sessionId":"S","message":"first"}`,
`{not valid json`,
`{"type":"user","sessionId":"S","message":"third"}`,
}, "\n") + "\n"
var warnings int
var turns []Turn
_, err := ParseStream(strings.NewReader(body), 0, func(format string, args ...any) {
warnings++
}, func(tr Turn) error {
turns = append(turns, tr)
return nil
})
require.NoError(t, err)
require.Len(t, turns, 2, "first + third should make it through")
assert.Equal(t, 1, warnings)
}
func TestParseStream_EmitErrStopHaltsCleanly(t *testing.T) {
body := strings.Join([]string{
`{"type":"user","sessionId":"S","message":"a"}`,
`{"type":"user","sessionId":"S","message":"b"}`,
`{"type":"user","sessionId":"S","message":"c"}`,
}, "\n") + "\n"
count := 0
end, err := ParseStream(strings.NewReader(body), 0, nil, func(tr Turn) error {
count++
if count == 2 {
return ErrStop
}
return nil
})
require.NoError(t, err)
assert.Equal(t, 2, count)
assert.Greater(t, end, int64(0))
}
func TestParseStream_EmitOtherErrorPropagates(t *testing.T) {
body := `{"type":"user","sessionId":"S","message":"a"}` + "\n"
want := errors.New("boom")
_, err := ParseStream(strings.NewReader(body), 0, nil, func(tr Turn) error {
return want
})
require.Error(t, err)
assert.Contains(t, err.Error(), "boom")
}
func TestParseStream_AttachmentHookEvent(t *testing.T) {
body := `{"type":"attachment","sessionId":"S","timestamp":"2026-05-25T07:00:00Z","attachment":{"type":"hook_success","hookName":"SessionStart:startup","hookEvent":"SessionStart","content":"hook body"}}
`
turns, _, err := collect(t, body)
require.NoError(t, err)
require.Len(t, turns, 1)
assert.Equal(t, "hook body", turns[0].Content)
}
func TestParseStream_OffsetAdvances(t *testing.T) {
body := `{"type":"user","sessionId":"S","message":"a"}` + "\n" +
`{"type":"user","sessionId":"S","message":"b"}` + "\n"
var offsets []int64
_, err := ParseStream(strings.NewReader(body), 100, nil, func(tr Turn) error {
offsets = append(offsets, tr.OffsetAfter)
return nil
})
require.NoError(t, err)
require.Len(t, offsets, 2)
assert.Greater(t, offsets[0], int64(100))
assert.Greater(t, offsets[1], offsets[0])
}


@@ -0,0 +1,114 @@
package claudewatcher
import (
"fmt"
"regexp"
"sync"
)
// Scrubber drops any turn whose content matches a known-bad pattern.
// Fail-closed by design: we'd rather lose signal than ingest credentials
// into a public-readable brain. The caller logs the drop reason.
//
// Rules cover the credential shapes most common to leak through Claude
// Code sessions: bearer tokens, postgres URIs with embedded auth, OAuth
// secret values, SOPS-encrypted secret blobs (we don't want the
// ciphertext either — it's a marker that the original message contained
// secret state), PEM-encoded private keys, and the explicit env-var
// naming conventions used in the homelab.
//
// Pattern philosophy: match by shape, not by content. A 40-char hex
// string in isolation is fine; the same string after `Authorization:
// Bearer ` is not. Tuned to catch known leak vectors from prior
// secret-hygiene incidents (POSTGRES_PASSWORD via kubectl exec env,
// INFRA_MCP_TOKEN via sops -d output) without dropping every Edit on a
// config file.
// Rule is a single named regex. Name doubles as the redaction hint
// shown in the warn log when a turn is dropped.
type Rule struct {
Name string
RE *regexp.Regexp
}
// DefaultRules is the regex set applied by Scrub. Mutable for tests but
// callers should treat it as read-only at runtime.
var DefaultRules = []Rule{
// authorization-header is checked before the bare bearer rule so
// contextual hits ("Authorization: Bearer X") report the more
// specific match name in logs.
{Name: "authorization-header", RE: regexp.MustCompile(`(?i)Authorization\s*:\s*[A-Za-z]+\s+\S{8,}`)},
{Name: "bearer-token", RE: regexp.MustCompile(`(?i)Bearer\s+[A-Za-z0-9._\-]{16,}`)},
{Name: "postgres-uri-with-password", RE: regexp.MustCompile(`postgres(?:ql)?://[^:\s/]+:[^@\s/]+@`)},
{Name: "private-key", RE: regexp.MustCompile(`-----BEGIN[^-]*PRIVATE KEY-----`)},
{Name: "ssh-key", RE: regexp.MustCompile(`ssh-(?:rsa|ed25519|ecdsa)\s+[A-Za-z0-9+/=]{40,}`)},
{Name: "github-pat", RE: regexp.MustCompile(`\b(?:ghp|gho|ghu|ghs|ghr|gha)_[A-Za-z0-9]{30,}\b`)},
{Name: "openai-sk", RE: regexp.MustCompile(`\bsk-(?:proj-)?[A-Za-z0-9]{32,}\b`)},
{Name: "anthropic-sk", RE: regexp.MustCompile(`\bsk-ant-[A-Za-z0-9_\-]{32,}\b`)},
{Name: "aws-access-key", RE: regexp.MustCompile(`\bAKIA[0-9A-Z]{16}\b`)},
{Name: "homelab-env-token", RE: regexp.MustCompile(`(?i)(?:_TOKEN|_PASSWORD|_API_KEY|_SECRET)\s*[:=]\s*['"]?[A-Za-z0-9._/+\-]{12,}`)},
{Name: "sops-encrypted-marker", RE: regexp.MustCompile(`ENC\[AES256_GCM,data:[A-Za-z0-9+/=]{8,}`)},
}
// extraRules holds rules added at process startup via RegisterRule;
// Scrub checks them after DefaultRules. The mutex guards concurrent
// RegisterRule calls (rare) against concurrent Scrub reads (hot path).
// Scrub takes the read lock on every call that gets past DefaultRules,
// even when extraRules is empty; an uncontended RLock is cheap, so the
// steady-state cost stays small when no client-name guard is configured.
var (
extraRulesMu sync.RWMutex
extraRules []Rule
)
// RegisterRule appends a runtime-configured regex to the scrubber's
// rule set. Used by main to inject client-name guards from
// CLAUDE_INGEST_CLIENT_BLOCK env var (or equivalent SOPS-encrypted
// secret) without baking client identities into source code.
//
// pattern is compiled as-is — callers wrap with `\b...\b` and case
// flags as needed. Duplicate names are accepted (rules are positional);
// the second registration just fires after the first.
func RegisterRule(name, pattern string) error {
re, err := regexp.Compile(pattern)
if err != nil {
return fmt.Errorf("compile rule %q: %w", name, err)
}
extraRulesMu.Lock()
extraRules = append(extraRules, Rule{Name: name, RE: re})
extraRulesMu.Unlock()
return nil
}
// ResetExtraRules clears every RegisterRule-added rule. Test-only.
func ResetExtraRules() {
extraRulesMu.Lock()
extraRules = nil
extraRulesMu.Unlock()
}
// Scrub reports the first matching rule, or empty when content is clean.
// Empty string is treated as clean. Caller decides what to do on a hit;
// the convention in claudewatcher is to drop the turn entirely and emit
// a slog.Warn naming the rule.
//
// Rule order: DefaultRules first (credential shapes), then runtime
// RegisterRule additions (client-name guards). Credential leaks
// outrank client-name hits in the log because they're strictly more
// dangerous.
func Scrub(content string) string {
if content == "" {
return ""
}
for _, r := range DefaultRules {
if r.RE.MatchString(content) {
return r.Name
}
}
extraRulesMu.RLock()
defer extraRulesMu.RUnlock()
for _, r := range extraRules {
if r.RE.MatchString(content) {
return r.Name
}
}
return ""
}


@@ -0,0 +1,117 @@
package claudewatcher
import (
"testing"
"github.com/stretchr/testify/assert"
)
func TestScrub_PoisonedFixtures(t *testing.T) {
// One representative bad-string per rule. If a rule fires for the
// wrong content shape later, this table localises the regression.
cases := []struct {
name string
content string
want string
}{
{"bearer-token", "curl -H 'Authorization: Bearer abcdef1234567890ghijklmnop'", "authorization-header"},
{"bearer-no-header", "header = Bearer eyJhbGciOiJIUzI1NiJ9.payload.sig", "bearer-token"},
{"postgres-uri", "DATABASE_URL=postgres://user:s3cret@10.0.1.20:5432/brain", "postgres-uri-with-password"},
{"private-key", "-----BEGIN OPENSSH PRIVATE KEY-----\nb3BlbnNzaC1rZXktdjEAAAAA", "private-key"},
{"ssh-public", "deploy: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIK1234567890abcdefghij user@host", "ssh-key"},
{"github-pat-classic", "GH_TOKEN=ghp_aBcD1234EfGh5678IjKl9012MnOp3456QrSt", "github-pat"},
{"openai-key", "OPENAI_API_KEY=sk-proj-AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII", "openai-sk"},
{"anthropic-key", "ANTHROPIC_API_KEY=sk-ant-api03-aaaaBBBBccccDDDDeeeeFFFFggggHHHHiiiiJJJJkkkk", "anthropic-sk"},
{"aws-access-key", "AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE", "aws-access-key"},
{"homelab-env", "POSTGRES_PASSWORD=hunter2supersecretvalue", "homelab-env-token"},
{"sops-marker", "value: ENC[AES256_GCM,data:abc123def456,iv:zzz]", "sops-encrypted-marker"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := Scrub(tc.content)
assert.Equal(t, tc.want, got)
})
}
}
func TestScrub_CleanContentPassesThrough(t *testing.T) {
cases := []string{
"",
"plain text with no credentials",
"a 40 char hex string aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa is fine in isolation",
"`Bearer` token mentioned in docs without an actual value",
"file at ~/.ssh/id_ed25519",
"the function Authorization() takes no args",
"comment: see API key in 1Password",
}
for _, c := range cases {
assert.Empty(t, Scrub(c), "expected clean for %q", c)
}
}
func TestScrub_FirstMatchWins(t *testing.T) {
// Content matching multiple rules: report the first matching rule in
// DefaultRules order. Stability matters for log triage.
content := "Authorization: Bearer ghp_aBcD1234EfGh5678IjKl9012MnOp3456QrSt"
assert.Equal(t, "authorization-header", Scrub(content))
}
func TestRegisterRule_ClientNameGuard(t *testing.T) {
t.Cleanup(ResetExtraRules)
require := func(err error) {
if err != nil {
t.Fatalf("unexpected err: %v", err)
}
}
require(RegisterRule("client-name", `(?i)\b(SEB|Mastercard)\b`))
// Hits — case variations + word-boundary respect.
for _, hit := range []string{
"mentioned SEB in this commit",
"the Mastercard project deadline",
"working on mastercard scope",
"SEB internal review",
} {
assert.Equal(t, "client-name", Scrub(hit), "should match %q", hit)
}
// Misses: a substring within a longer word should NOT match thanks
// to \b. "Sebastian" contains "seb" but \b prevents the hit.
for _, miss := range []string{
"Sebastian wrote the docs",
"unrelated text",
"researcher",
} {
assert.Empty(t, Scrub(miss), "should NOT match %q", miss)
}
// Documented tradeoff: '?' and '=' are non-word characters, so "seb"
// in a query string sits between two word boundaries and DOES match.
assert.Equal(t, "client-name", Scrub("https://example.com/search?seb=1"))
}
func TestRegisterRule_CredentialsTakePrecedence(t *testing.T) {
t.Cleanup(ResetExtraRules)
require := func(err error) {
if err != nil {
t.Fatalf("unexpected err: %v", err)
}
}
require(RegisterRule("client-name", `\b(SEB)\b`))
// Content matches both a credential rule AND a client rule —
// credential rule wins by ordering, so log triage points at the
// strictly more dangerous leak.
content := "SEB project uses OPENAI_API_KEY=sk-proj-AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIII"
assert.Equal(t, "openai-sk", Scrub(content))
}
func TestRegisterRule_RejectsInvalidPattern(t *testing.T) {
t.Cleanup(ResetExtraRules)
err := RegisterRule("bad", "[unclosed")
assert.Error(t, err)
}


@@ -0,0 +1,234 @@
package claudewatcher
import (
"context"
"fmt"
"log/slog"
"os"
"path/filepath"
"strings"
"time"
)
// Sink consumes batches of ingest-ready turns from the watcher. The
// production implementation builds wiki pages and calls pipeline.RunRaw
// against the brain. Tests substitute a counter.
//
// A Batch represents the turns ingested from one session file between
// two cursor checkpoints. Implementations must be idempotent — the
// watcher only advances the cursor on a nil return.
type Sink interface {
Ingest(ctx context.Context, b Batch) error
}
// Batch is a per-file slice of turns plus identifying metadata.
type Batch struct {
Host string // origin host, e.g. "koala"
FilePath string // absolute path to the source .jsonl file
SessionID string // first session_id seen in the batch
ProjectID string // basename of the parent dir, e.g. "-home-mathias-dev"
Turns []Turn // never empty; caller filters Skip + scrubber matches
}
// Config drives one Watch loop. SessionsDir is the absolute path to the
// Claude Code projects directory (~/.claude/projects). Host is the
// label written into cursors and ingested page frontmatter. Interval
// is the poll cadence; Watch rejects a zero or negative value with an
// error, while TickOnce ignores Interval.
//
// Sink is required. Cursors is optional — when nil the watcher
// re-reads from byte 0 on every tick (useful for first-run testing
// without a postgres dependency).
type Config struct {
SessionsDir string
Host string
Interval time.Duration
Sink Sink
Cursors *CursorStore
Logger *slog.Logger
}
// Watch runs the polling loop until ctx is cancelled. Returns ctx.Err()
// on shutdown. Each tick walks SessionsDir for *.jsonl files, advances
// each file's cursor, and emits one Batch per file with new turns.
// Errors during a single file's parse or ingest are logged but do not
// abort the loop — a single bad file shouldn't block the others.
func Watch(ctx context.Context, cfg Config) error {
if cfg.SessionsDir == "" {
return fmt.Errorf("sessions dir is required")
}
if cfg.Sink == nil {
return fmt.Errorf("sink is required")
}
if cfg.Interval <= 0 {
return fmt.Errorf("interval must be positive")
}
if cfg.Host == "" {
cfg.Host = "unknown"
}
if cfg.Logger == nil {
cfg.Logger = slog.Default()
}
cfg.Logger.Info("claudewatcher: started",
"sessions_dir", cfg.SessionsDir,
"host", cfg.Host,
"interval", cfg.Interval)
ticker := time.NewTicker(cfg.Interval)
defer ticker.Stop()
// Run an immediate first sweep so first-launch users don't wait one
// tick before anything happens.
runTick(ctx, cfg)
for {
select {
case <-ctx.Done():
return ctx.Err()
case <-ticker.C:
runTick(ctx, cfg)
}
}
}
// runTick is one polling pass. It is unexported; tests and ad-hoc
// callers reach it through TickOnce.
func runTick(ctx context.Context, cfg Config) {
files, err := listSessionFiles(cfg.SessionsDir)
if err != nil {
cfg.Logger.Warn("claudewatcher: list session files", "err", err)
return
}
for _, f := range files {
if ctx.Err() != nil {
return
}
if err := processFile(ctx, cfg, f); err != nil {
cfg.Logger.Warn("claudewatcher: file failed",
"path", f, "err", err)
}
}
}
// TickOnce runs one sweep synchronously and returns. Used by tests +
// by ad-hoc CLI invocations.
func TickOnce(ctx context.Context, cfg Config) error {
if cfg.SessionsDir == "" || cfg.Sink == nil {
return fmt.Errorf("config invalid")
}
if cfg.Host == "" {
cfg.Host = "unknown"
}
if cfg.Logger == nil {
cfg.Logger = slog.Default()
}
runTick(ctx, cfg)
return nil
}
func listSessionFiles(root string) ([]string, error) {
var out []string
err := filepath.WalkDir(root, func(path string, d os.DirEntry, walkErr error) error {
if walkErr != nil {
return walkErr
}
if d.IsDir() {
return nil
}
if !strings.HasSuffix(path, ".jsonl") {
return nil
}
out = append(out, path)
return nil
})
if err != nil {
return nil, fmt.Errorf("walk %s: %w", root, err)
}
return out, nil
}
func processFile(ctx context.Context, cfg Config, path string) error {
startOffset := int64(0)
if cfg.Cursors != nil {
off, _, err := cfg.Cursors.GetOffset(ctx, cfg.Host, path)
if err != nil {
return fmt.Errorf("get cursor: %w", err)
}
startOffset = off
}
stat, err := os.Stat(path)
if err != nil {
return fmt.Errorf("stat: %w", err)
}
if stat.Size() <= startOffset {
return nil // nothing new
}
f, err := os.Open(path)
if err != nil {
return fmt.Errorf("open: %w", err)
}
defer func() { _ = f.Close() }()
if _, err := f.Seek(startOffset, 0); err != nil {
return fmt.Errorf("seek: %w", err)
}
var keep []Turn
var sessionID string
var droppedScrub int
endOffset, err := ParseStream(f, startOffset,
func(format string, args ...any) {
cfg.Logger.Warn(fmt.Sprintf("claudewatcher: parse: "+format, args...))
},
func(t Turn) error {
if t.Skip || t.Content == "" {
return nil
}
if rule := Scrub(t.Content); rule != "" {
droppedScrub++
cfg.Logger.Warn("claudewatcher: turn dropped by scrubber",
"rule", rule, "path", path, "session_id", t.SessionID)
return nil
}
if sessionID == "" {
sessionID = t.SessionID
}
keep = append(keep, t)
return nil
})
if err != nil {
return fmt.Errorf("parse stream: %w", err)
}
if len(keep) == 0 {
if cfg.Cursors != nil {
if err := cfg.Cursors.SetOffset(ctx, cfg.Host, path, endOffset); err != nil {
return fmt.Errorf("advance cursor (no-turns): %w", err)
}
}
if droppedScrub > 0 {
cfg.Logger.Info("claudewatcher: only scrubbed turns this tick",
"path", path, "dropped", droppedScrub)
}
return nil
}
batch := Batch{
Host: cfg.Host,
FilePath: path,
SessionID: sessionID,
ProjectID: filepath.Base(filepath.Dir(path)),
Turns: keep,
}
if err := cfg.Sink.Ingest(ctx, batch); err != nil {
return fmt.Errorf("sink ingest: %w", err)
}
if cfg.Cursors != nil {
if err := cfg.Cursors.SetOffset(ctx, cfg.Host, path, endOffset); err != nil {
return fmt.Errorf("advance cursor: %w", err)
}
}
cfg.Logger.Info("claudewatcher: ingested batch",
"path", path, "session_id", sessionID,
"turns_kept", len(keep), "dropped_scrub", droppedScrub,
"new_offset", endOffset)
return nil
}


@@ -0,0 +1,174 @@
package claudewatcher
import (
"context"
"os"
"path/filepath"
"strings"
"sync"
"testing"
"time"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// memSink captures batches without touching postgres. Thread-safe so
// TickOnce can run from any goroutine in concurrent tests.
type memSink struct {
mu sync.Mutex
batches []Batch
failOn string // file basename to error on
}
func (m *memSink) Ingest(_ context.Context, b Batch) error {
m.mu.Lock()
defer m.mu.Unlock()
if m.failOn != "" && strings.Contains(b.FilePath, m.failOn) {
return assert.AnError
}
m.batches = append(m.batches, b)
return nil
}
func writeSession(t *testing.T, dir, sessionID string, lines []string) string {
t.Helper()
path := filepath.Join(dir, sessionID+".jsonl")
body := strings.Join(lines, "\n") + "\n"
require.NoError(t, os.WriteFile(path, []byte(body), 0o644))
return path
}
func TestTickOnce_NoCursorReingestsEverythingEveryTick(t *testing.T) {
tmp := t.TempDir()
projectDir := filepath.Join(tmp, "-home-mathias-dev")
require.NoError(t, os.MkdirAll(projectDir, 0o755))
writeSession(t, projectDir, "sess1", []string{
`{"type":"user","sessionId":"sess1","message":"first prompt"}`,
`{"type":"assistant","sessionId":"sess1","message":{"content":[{"type":"text","text":"first answer"}]}}`,
})
sink := &memSink{}
cfg := Config{
SessionsDir: tmp,
Host: "koala",
Sink: sink,
}
require.NoError(t, TickOnce(context.Background(), cfg))
require.NoError(t, TickOnce(context.Background(), cfg))
require.Len(t, sink.batches, 2, "no cursor => re-emits same batch every tick")
assert.Equal(t, "sess1", sink.batches[0].SessionID)
assert.Equal(t, "koala", sink.batches[0].Host)
assert.Equal(t, "-home-mathias-dev", sink.batches[0].ProjectID)
assert.Len(t, sink.batches[0].Turns, 2)
}
func TestTickOnce_FiltersSkipTurnsAndScrubberMatches(t *testing.T) {
tmp := t.TempDir()
proj := filepath.Join(tmp, "-home-mathias-dev")
require.NoError(t, os.MkdirAll(proj, 0o755))
writeSession(t, proj, "sess-scrub", []string{
`{"type":"queue-operation","sessionId":"sess-scrub","content":"x"}`, // Skip
`{"type":"user","sessionId":"sess-scrub","message":"normal prompt"}`,
`{"type":"assistant","sessionId":"sess-scrub","message":{"content":[{"type":"text","text":"value POSTGRES_PASSWORD=hunter2supersecretvalue"}]}}`, // scrubbed
})
sink := &memSink{}
require.NoError(t, TickOnce(context.Background(), Config{
SessionsDir: tmp, Host: "koala", Sink: sink,
}))
require.Len(t, sink.batches, 1)
turns := sink.batches[0].Turns
require.Len(t, turns, 1, "skip + scrubbed turns must not reach the sink")
assert.Equal(t, "user", turns[0].Type)
}
func TestTickOnce_AllScrubbedNoBatchEmitted(t *testing.T) {
tmp := t.TempDir()
proj := filepath.Join(tmp, "-home-mathias-dev")
require.NoError(t, os.MkdirAll(proj, 0o755))
writeSession(t, proj, "all-bad", []string{
`{"type":"user","sessionId":"all-bad","message":"Authorization: Bearer abcdef1234567890ghijklmnop"}`,
})
sink := &memSink{}
require.NoError(t, TickOnce(context.Background(), Config{
SessionsDir: tmp, Host: "koala", Sink: sink,
}))
assert.Empty(t, sink.batches, "no usable turns => no batch")
}
func TestTickOnce_IgnoresNonJsonlFiles(t *testing.T) {
tmp := t.TempDir()
proj := filepath.Join(tmp, "-home-mathias-dev")
require.NoError(t, os.MkdirAll(proj, 0o755))
require.NoError(t, os.WriteFile(filepath.Join(proj, "README.md"), []byte("ignore me"), 0o644))
require.NoError(t, os.WriteFile(filepath.Join(proj, "config.json"), []byte("{}"), 0o644))
sink := &memSink{}
require.NoError(t, TickOnce(context.Background(), Config{
SessionsDir: tmp, Host: "koala", Sink: sink,
}))
assert.Empty(t, sink.batches)
}
func TestTickOnce_HandlesMultipleProjectsAndSessions(t *testing.T) {
tmp := t.TempDir()
projA := filepath.Join(tmp, "-home-mathias-dev")
projB := filepath.Join(tmp, "-home-mathias-AI-infra")
require.NoError(t, os.MkdirAll(projA, 0o755))
require.NoError(t, os.MkdirAll(projB, 0o755))
writeSession(t, projA, "a1", []string{`{"type":"user","sessionId":"a1","message":"q1"}`})
writeSession(t, projA, "a2", []string{`{"type":"user","sessionId":"a2","message":"q2"}`})
writeSession(t, projB, "b1", []string{`{"type":"user","sessionId":"b1","message":"q3"}`})
sink := &memSink{}
require.NoError(t, TickOnce(context.Background(), Config{
SessionsDir: tmp, Host: "koala", Sink: sink,
}))
require.Len(t, sink.batches, 3)
projects := map[string]int{}
for _, b := range sink.batches {
projects[b.ProjectID]++
}
assert.Equal(t, 2, projects["-home-mathias-dev"])
assert.Equal(t, 1, projects["-home-mathias-AI-infra"])
}
func TestTickOnce_SinkErrorDoesNotKillOtherFiles(t *testing.T) {
tmp := t.TempDir()
proj := filepath.Join(tmp, "-home-mathias-dev")
require.NoError(t, os.MkdirAll(proj, 0o755))
writeSession(t, proj, "good", []string{`{"type":"user","sessionId":"good","message":"q"}`})
writeSession(t, proj, "bad-session", []string{`{"type":"user","sessionId":"bad-session","message":"q"}`})
sink := &memSink{failOn: "bad-session"}
require.NoError(t, TickOnce(context.Background(), Config{
SessionsDir: tmp, Host: "koala", Sink: sink,
}))
require.Len(t, sink.batches, 1, "good session still ingested")
assert.Equal(t, "good", sink.batches[0].SessionID)
}
func TestWatch_RespectsContextCancel(t *testing.T) {
tmp := t.TempDir()
require.NoError(t, os.MkdirAll(filepath.Join(tmp, "-home-mathias-dev"), 0o755))
sink := &memSink{}
ctx, cancel := context.WithCancel(context.Background())
done := make(chan error, 1)
go func() {
done <- Watch(ctx, Config{
SessionsDir: tmp,
Host: "koala",
Interval: 10 * time.Millisecond,
Sink: sink,
})
}()
time.Sleep(50 * time.Millisecond)
cancel()
select {
case err := <-done:
assert.ErrorIs(t, err, context.Canceled)
case <-time.After(2 * time.Second):
t.Fatal("Watch did not return after cancel")
}
}


@@ -0,0 +1,76 @@
// Package embed produces dense vector embeddings for brain content.
//
// Wire format is Ollama's `/api/embed`, with the canonical request shape
// `{"model": "...", "input": "..."}` and a 2-D `embeddings` response.
// Default deployment runs `nomic-embed-text` on iguana, which returns
// 768-dim vectors compatible with the brain_embeddings table schema.
package embed
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"strings"
"time"
)
// Client posts embedding requests to an Ollama-compatible endpoint.
type Client struct {
URL string
Model string
HTTP *http.Client
}
// New constructs a Client. Returns nil when url is empty so callers can
// treat a missing BRAIN_EMBED_URL as "feature disabled" via a single nil
// check.
func New(url, model string) *Client {
if url == "" {
return nil
}
return &Client{
URL: strings.TrimRight(url, "/"),
Model: model,
HTTP: &http.Client{Timeout: 30 * time.Second},
}
}
// Embed returns the embedding vector for text. Empty or whitespace-only
// text is rejected up-front to keep upstream errors from masking caller
// mistakes.
func (c *Client) Embed(ctx context.Context, text string) ([]float32, error) {
if strings.TrimSpace(text) == "" {
return nil, fmt.Errorf("embed: empty text")
}
reqBody, _ := json.Marshal(map[string]any{
"model": c.Model,
"input": text,
})
req, err := http.NewRequestWithContext(ctx, http.MethodPost,
c.URL+"/api/embed", bytes.NewReader(reqBody))
if err != nil {
return nil, err
}
req.Header.Set("Content-Type", "application/json")
resp, err := c.HTTP.Do(req)
if err != nil {
return nil, err
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode/100 != 2 {
body, _ := io.ReadAll(resp.Body)
return nil, fmt.Errorf("embed: status %d: %s", resp.StatusCode, string(body))
}
var out struct {
Embeddings [][]float32 `json:"embeddings"`
}
if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
return nil, fmt.Errorf("embed: decode: %w", err)
}
if len(out.Embeddings) == 0 || len(out.Embeddings[0]) == 0 {
return nil, fmt.Errorf("embed: empty embeddings in response")
}
return out.Embeddings[0], nil
}
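The decode step at the end of Embed can be exercised without a server. The sketch below isolates it in a hypothetical helper, `firstEmbedding`, using the same 2-D `embeddings` response shape as the listing; the sample JSON body is made up for illustration:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// firstEmbedding is a hypothetical helper mirroring Embed's decode
// step: it unmarshals an /api/embed-style body and returns row 0.
func firstEmbedding(body []byte) ([]float32, error) {
	var out struct {
		Embeddings [][]float32 `json:"embeddings"`
	}
	if err := json.Unmarshal(body, &out); err != nil {
		return nil, fmt.Errorf("decode: %w", err)
	}
	// A single-input request yields one row; an absent or empty row is
	// treated as an error rather than a zero-length vector.
	if len(out.Embeddings) == 0 || len(out.Embeddings[0]) == 0 {
		return nil, errors.New("empty embeddings in response")
	}
	return out.Embeddings[0], nil
}

func main() {
	v, err := firstEmbedding([]byte(`{"embeddings":[[0.1,0.2,0.3]]}`))
	fmt.Println(len(v), err)

	_, err = firstEmbedding([]byte(`{"embeddings":[]}`))
	fmt.Println(err)
}
```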


@@ -0,0 +1,74 @@
package embed_test
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/embed"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestNew_EmptyURLReturnsNil(t *testing.T) {
assert.Nil(t, embed.New("", "model"))
}
func TestEmbed_ReturnsVector(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, "/api/embed", r.URL.Path)
var req map[string]any
require.NoError(t, json.NewDecoder(r.Body).Decode(&req))
assert.Equal(t, "nomic", req["model"])
assert.Equal(t, "hello", req["input"])
_ = json.NewEncoder(w).Encode(map[string]any{
"embeddings": [][]float32{{0.1, 0.2, 0.3}},
})
}))
defer srv.Close()
c := embed.New(srv.URL, "nomic")
require.NotNil(t, c)
v, err := c.Embed(context.Background(), "hello")
require.NoError(t, err)
assert.Equal(t, []float32{0.1, 0.2, 0.3}, v)
}
func TestEmbed_StripsTrailingSlashFromURL(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, "/api/embed", r.URL.Path)
_ = json.NewEncoder(w).Encode(map[string]any{"embeddings": [][]float32{{1.0}}})
}))
defer srv.Close()
c := embed.New(srv.URL+"/", "nomic")
_, err := c.Embed(context.Background(), "x")
require.NoError(t, err)
}
func TestEmbed_PropagatesUpstreamError(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.WriteHeader(http.StatusBadGateway)
}))
defer srv.Close()
c := embed.New(srv.URL, "m")
_, err := c.Embed(context.Background(), "x")
require.Error(t, err)
}
func TestEmbed_RejectsEmptyEmbeddingsArray(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
_ = json.NewEncoder(w).Encode(map[string]any{"embeddings": [][]float32{}})
}))
defer srv.Close()
c := embed.New(srv.URL, "m")
_, err := c.Embed(context.Background(), "x")
require.Error(t, err)
}
func TestEmbed_RejectsEmptyText(t *testing.T) {
c := embed.New("http://127.0.0.1:1", "m")
_, err := c.Embed(context.Background(), "")
require.Error(t, err)
}


@@ -0,0 +1,263 @@
// Package graph extracts entity + edge records from brain markdown
// documents for the brain_entities / brain_edges relational graph.
//
// The extractor is pure: it takes markdown bytes and a document path and
// returns the entity (one per doc) and the wikilink edges (zero or more)
// it found, with source line numbers so the graph store can record
// provenance.
//
// Edge types in v1: only "wikilink" — derived from [[slug]] and
// [[slug|Display]] occurrences in the body. Section-header edges are
// deferred (see infra#62 grill addendum).
package graph
import (
"bufio"
"bytes"
"path/filepath"
"regexp"
"strings"
)
// Entity represents one brain document for graph indexing.
//
// Slug is the basename without ".md" — the same identity used by
// wiki canonicalization and the wikilink target syntax.
//
// Type categorises the doc into a coarse bucket so callers can filter
// graph traversals (e.g. "only entity nodes"). When the doc lives
// under brain/wiki/<wing>/<hall>/, Wing and Hall capture the
// taxonomy; otherwise they're empty (legacy brain/knowledge/ docs).
type Entity struct {
DocPath string // forward-slash, relative to brainDir
Slug string
Type string // "concept" | "entity" | "source" | "hall" | "knowledge"
Wing string // optional; from frontmatter or path
Hall string // optional; from frontmatter or path
Title string // optional; from frontmatter
// DIKW tier — infra#72. Empty until M3 migration writes `tier:`
// frontmatter to every entry. Path-inferred tier kicks in as a
// fallback so the column populates immediately on backfill even
// for entries that haven't had their frontmatter rewritten yet.
Tier string // "inbox" | "note" | "knowledge"
Topic string // kebab-slug; the thing the entry is about
}
// Edge represents a directed relationship between two slugs.
//
// SrcLine is the 1-indexed line in the source document where the link
// was found, so callers can re-find the linking text after an edit.
type Edge struct {
SrcDoc string // forward-slash, relative to brainDir
SrcSlug string // == Entity.Slug for SrcDoc
DstSlug string
EdgeType string // "wikilink" in v1
SrcLine int // 1-indexed
}
// linkRE matches both [[slug]] and [[slug|Display Name]] wikilinks.
// Group 1 is the slug; group 2 (if present) is the display.
var linkRE = regexp.MustCompile(`\[\[([^\]|]+)(?:\|([^\]]+))?\]\]`)
// Extract parses one markdown document and returns its Entity plus the
// outgoing wikilink Edges. docPath is forward-slash, relative to
// brainDir; content is the raw markdown bytes.
//
// Returns ok=false when docPath does not yield a usable slug (e.g.
// non-markdown file slipped through).
func Extract(docPath string, content []byte) (Entity, []Edge, bool) {
slug := slugFromPath(docPath)
if slug == "" {
return Entity{}, nil, false
}
ent := Entity{DocPath: docPath, Slug: slug}
classifyByPath(&ent, docPath)
readFrontmatter(&ent, content)
inferTierFromPath(&ent, docPath)
edges := extractEdges(docPath, slug, content)
return ent, edges, true
}
// inferTierFromPath fills Tier when frontmatter didn't already set it.
// The new layout has dedicated subtrees per tier; pre-migration paths
// (knowledge/, wiki/, raw/, sessions/) get their best-guess mapping so
// the column populates on backfill before the M3 file moves run.
func inferTierFromPath(e *Entity, docPath string) {
if e.Tier != "" {
return
}
parts := strings.Split(docPath, "/")
if len(parts) == 0 {
return
}
switch parts[0] {
case "inbox":
e.Tier = "inbox"
case "notes":
e.Tier = "note"
case "knowledge":
e.Tier = "knowledge"
case "wiki":
// Pre-M3 wiki layout. Most subdirs are I-level:
// wiki/sources/ — synth summaries of raw inbox material
// wiki/concepts/ — definitions, not lessons
// One exception: wiki/entities/ holds anchor facts about
// concrete things (models, services, people) that the eval
// expects to surface when queried directly. Those map to K
// to match the post-M3 layout target (knowledge/facts/).
if len(parts) >= 2 && parts[1] == "entities" {
e.Tier = "knowledge"
} else {
e.Tier = "note"
}
case "raw", "sessions", "clips":
e.Tier = "inbox"
}
}
func slugFromPath(docPath string) string {
base := filepath.Base(docPath)
if !strings.HasSuffix(base, ".md") {
return ""
}
return strings.TrimSuffix(base, ".md")
}
// classifyByPath fills Type / Wing / Hall from the path layout when the
// doc lives under brain/wiki/. Layout: wiki/<wing>/<hall>/<slug>.md
// or wiki/<bucket>/<slug>.md for the legacy concept/entity/source dirs.
//
// Files directly under wiki/ (no subdirectory — e.g. wiki/index.md) used
// to incorrectly land Type="hall" Wing="index.md" because the path's
// second segment was the file itself. Now they fall through to Type
// "knowledge" and leave wing/hall to frontmatter.
func classifyByPath(e *Entity, docPath string) {
parts := strings.Split(docPath, "/")
if len(parts) < 2 || parts[0] != "wiki" {
e.Type = "knowledge"
return
}
if len(parts) < 3 {
// wiki/<slug>.md — no subdirectory. Treat as plain knowledge
// and let frontmatter set wing/hall if they're present.
e.Type = "knowledge"
return
}
switch parts[1] {
case "concepts":
e.Type = "concept"
case "entities":
e.Type = "entity"
case "sources":
e.Type = "source"
default:
// wiki/<wing>/<hall>/<slug>.md
e.Type = "hall"
e.Wing = parts[1]
if len(parts) >= 4 {
e.Hall = parts[2]
}
}
}
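// readFrontmatter pulls title/wing/hall/tier/topic from a YAML
// frontmatter block that opens on the first line. Frontmatter is
// optional; missing fields leave the entity unchanged, and fields
// already set (e.g. by classifyByPath) are not overwritten.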
func readFrontmatter(e *Entity, content []byte) {
scanner := bufio.NewScanner(bytes.NewReader(content))
inFM := false
for scanner.Scan() {
line := scanner.Text()
if strings.TrimSpace(line) == "---" {
if !inFM {
inFM = true
continue
}
return
}
if !inFM {
return
}
key, val, ok := strings.Cut(line, ":")
if !ok {
continue
}
v := strings.Trim(strings.TrimSpace(val), `"'`)
switch strings.TrimSpace(key) {
case "title":
if e.Title == "" {
e.Title = v
}
case "wing":
if e.Wing == "" {
e.Wing = v
}
case "hall":
if e.Hall == "" {
e.Hall = v
}
case "tier":
if e.Tier == "" {
e.Tier = v
}
case "topic":
if e.Topic == "" {
e.Topic = v
}
}
}
}
func extractEdges(docPath, srcSlug string, content []byte) []Edge {
var edges []Edge
seen := make(map[string]struct{}) // dedupe (dst, line)
scanner := bufio.NewScanner(bytes.NewReader(content))
line := 0
for scanner.Scan() {
line++
matches := linkRE.FindAllStringSubmatch(scanner.Text(), -1)
for _, m := range matches {
dst := strings.TrimSpace(m[1])
if dst == "" || dst == srcSlug {
continue
}
key := dst + "|" + itoa(line)
if _, dup := seen[key]; dup {
continue
}
seen[key] = struct{}{}
edges = append(edges, Edge{
SrcDoc: docPath,
SrcSlug: srcSlug,
DstSlug: dst,
EdgeType: "wikilink",
SrcLine: line,
})
}
}
return edges
}
// itoa avoids the fmt dependency on a hot path. The zero fast path and
// a fixed-size stack buffer keep overhead negligible for typical line
// counts.
func itoa(n int) string {
if n == 0 {
return "0"
}
var buf [20]byte
i := len(buf)
neg := n < 0
if neg {
n = -n
}
for n > 0 {
i--
buf[i] = byte('0' + n%10)
n /= 10
}
if neg {
i--
buf[i] = '-'
}
return string(buf[i:])
}
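To make the wikilink capture groups concrete, the sketch below reuses the `linkRE` pattern from the listing in a standalone program. `linkTargets` is a hypothetical helper, not part of the package; it returns only group 1 (the slug) and drops the optional display text in group 2:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Same pattern as linkRE in the listing: group 1 is the slug, group 2
// (optional) is the display text after the pipe.
var linkRE = regexp.MustCompile(`\[\[([^\]|]+)(?:\|([^\]]+))?\]\]`)

// linkTargets is a hypothetical helper returning the trimmed slug of
// every wikilink on one line, in order of appearance.
func linkTargets(line string) []string {
	var out []string
	for _, m := range linkRE.FindAllStringSubmatch(line, -1) {
		out = append(out, strings.TrimSpace(m[1]))
	}
	return out
}

func main() {
	line := "See also [[other-decision]] and [[parent-concept|Parent Concept]]."
	fmt.Println(linkTargets(line))
}
```

Unlike extractEdges, this helper does not dedupe repeated targets or skip self-links; it only shows what the regex captures.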


@@ -0,0 +1,179 @@
package graph
import (
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestExtract_HallDoc(t *testing.T) {
content := []byte(`---
wing: jepa-fx
hall: decisions
title: Val Vol Decision
---
# Val Vol
See also [[other-decision]] and [[parent-concept|Parent Concept]].
Linking to [[unrelated]].
`)
ent, edges, ok := Extract("wiki/jepa-fx/decisions/val-vol.md", content)
require.True(t, ok)
assert.Equal(t, "val-vol", ent.Slug)
assert.Equal(t, "hall", ent.Type)
assert.Equal(t, "jepa-fx", ent.Wing)
assert.Equal(t, "decisions", ent.Hall)
assert.Equal(t, "Val Vol Decision", ent.Title)
require.Len(t, edges, 3)
assert.Equal(t, "other-decision", edges[0].DstSlug)
assert.Equal(t, "parent-concept", edges[1].DstSlug)
assert.Equal(t, "unrelated", edges[2].DstSlug)
for _, e := range edges {
assert.Equal(t, "wikilink", e.EdgeType)
assert.Equal(t, "val-vol", e.SrcSlug)
assert.Equal(t, "wiki/jepa-fx/decisions/val-vol.md", e.SrcDoc)
assert.Greater(t, e.SrcLine, 0)
}
}
func TestExtract_LegacyConceptDoc(t *testing.T) {
content := []byte(`---
title: Hash Encoding
---
# Hash Encoding
Linked to [[financial-sentiment-analysis|FSA]].
`)
ent, edges, ok := Extract("wiki/concepts/hash-encoding.md", content)
require.True(t, ok)
assert.Equal(t, "hash-encoding", ent.Slug)
assert.Equal(t, "concept", ent.Type)
assert.Empty(t, ent.Wing)
assert.Empty(t, ent.Hall)
assert.Equal(t, "Hash Encoding", ent.Title)
require.Len(t, edges, 1)
assert.Equal(t, "financial-sentiment-analysis", edges[0].DstSlug)
}
func TestExtract_KnowledgeDoc(t *testing.T) {
content := []byte("# No frontmatter, no links here.\n")
ent, edges, ok := Extract("knowledge/some-note.md", content)
require.True(t, ok)
assert.Equal(t, "some-note", ent.Slug)
assert.Equal(t, "knowledge", ent.Type)
assert.Empty(t, edges)
}
func TestExtract_DedupesRepeatedLinkOnSameLine(t *testing.T) {
content := []byte("See [[foo]] and [[foo]] again on the same line.\n")
_, edges, ok := Extract("knowledge/dup.md", content)
require.True(t, ok)
require.Len(t, edges, 1)
assert.Equal(t, "foo", edges[0].DstSlug)
}
func TestExtract_KeepsMultipleEdgesOnDifferentLines(t *testing.T) {
content := []byte("First mention [[foo]].\n\nSecond mention [[foo]].\n")
_, edges, ok := Extract("knowledge/multi.md", content)
require.True(t, ok)
require.Len(t, edges, 2)
assert.NotEqual(t, edges[0].SrcLine, edges[1].SrcLine)
}
func TestExtract_IgnoresSelfLinks(t *testing.T) {
content := []byte("Self-reference [[self]] should be ignored.\n")
_, edges, ok := Extract("knowledge/self.md", content)
require.True(t, ok)
assert.Empty(t, edges)
}
func TestExtract_RejectsNonMarkdown(t *testing.T) {
_, _, ok := Extract("wiki/concepts/not-markdown.txt", []byte("anything"))
assert.False(t, ok)
}
func TestExtract_LineNumbersAre1Indexed(t *testing.T) {
content := []byte("line 1\nline 2 [[bar]]\n")
_, edges, ok := Extract("knowledge/lines.md", content)
require.True(t, ok)
require.Len(t, edges, 1)
assert.Equal(t, 2, edges[0].SrcLine)
}
// Files directly under wiki/ (no subdirectory) used to land
// Type="hall" Wing="<filename>.md" because the path's second segment
// was the file itself. The fix routes them to Type="knowledge" with
// empty Wing/Hall and lets frontmatter set them if present.
func TestExtract_WikiRootFileIsKnowledgeNotHall(t *testing.T) {
content := []byte("# Index\n\n- [[foo]]\n")
ent, _, ok := Extract("wiki/index.md", content)
require.True(t, ok)
assert.Equal(t, "index", ent.Slug)
assert.Equal(t, "knowledge", ent.Type)
assert.Empty(t, ent.Wing)
assert.Empty(t, ent.Hall)
}
func TestExtract_TierFromFrontmatter(t *testing.T) {
content := []byte(`---
tier: knowledge
topic: postgres-roles
title: Least-privilege migration trap
---
# body
`)
ent, _, ok := Extract("knowledge/some-lesson.md", content)
require.True(t, ok)
assert.Equal(t, "knowledge", ent.Tier)
assert.Equal(t, "postgres-roles", ent.Topic)
}
func TestExtract_TierInferredFromPath(t *testing.T) {
cases := []struct {
path string
want string
}{
{"knowledge/foo.md", "knowledge"},
{"wiki/sources/x.md", "note"},
{"wiki/concepts/x.md", "note"},
{"wiki/x.md", "note"},
{"inbox/clips/x.md", "inbox"},
{"notes/x.md", "note"},
{"raw/x.md", "inbox"},
{"sessions/x.md", "inbox"},
}
for _, tc := range cases {
ent, _, ok := Extract(tc.path, []byte("# x\n"))
require.True(t, ok, tc.path)
assert.Equal(t, tc.want, ent.Tier, tc.path)
}
}
func TestExtract_FrontmatterTierBeatsPathInference(t *testing.T) {
// A clip explicitly promoted via frontmatter wins over the path's
// inbox inference. Catches the case where a file has been moved
// to a new location but frontmatter hasn't been updated.
content := []byte("---\ntier: knowledge\n---\n# x\n")
ent, _, ok := Extract("inbox/clips/x.md", content)
require.True(t, ok)
assert.Equal(t, "knowledge", ent.Tier)
}
func TestExtract_WikiRootFileWithFrontmatterWingHall(t *testing.T) {
content := []byte(`---
wing: homelab
hall: facts
---
# Some root note
`)
ent, _, ok := Extract("wiki/some-note.md", content)
require.True(t, ok)
assert.Equal(t, "knowledge", ent.Type)
assert.Equal(t, "homelab", ent.Wing)
assert.Equal(t, "facts", ent.Hall)
}
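
The tier rules covered by `TestExtract_TierInferredFromPath` and `TestExtract_FrontmatterTierBeatsPathInference` amount to a small lookup on the first path segment, with frontmatter taking precedence. A rough sketch, assuming hypothetical helper names `inferTier` and `resolveTier` (the real logic inside `Extract` is not shown in this diff, and treating every unlisted root as `note` is an assumption drawn only from the test table):

```go
package main

import (
	"fmt"
	"strings"
)

// inferTier maps the first segment of a brain-relative path to a tier.
// The mapping mirrors the table-driven test cases; the default branch
// is an assumption covering wiki/ and notes/.
func inferTier(relPath string) string {
	root, _, _ := strings.Cut(relPath, "/")
	switch root {
	case "knowledge":
		return "knowledge"
	case "inbox", "raw", "sessions":
		return "inbox"
	default:
		return "note"
	}
}

// resolveTier lets an explicit frontmatter tier win over path inference.
func resolveTier(frontmatterTier, relPath string) string {
	if frontmatterTier != "" {
		return frontmatterTier
	}
	return inferTier(relPath)
}

func main() {
	for _, p := range []string{"knowledge/foo.md", "wiki/concepts/x.md", "raw/x.md"} {
		fmt.Println(p, "=>", inferTier(p))
	}
	fmt.Println(resolveTier("knowledge", "inbox/clips/x.md"))
}
```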

@@ -0,0 +1,365 @@
// Package graphstore stores the brain knowledge graph (entities +
// directed edges) in PostgreSQL on the shared postgres18 instance,
// alongside the pgvector embeddings in [vectorstore].
//
// Schema (created idempotently by Init):
//
// brain_entities(slug PK, type, wing, hall, doc_path, title, tier, topic,
// updated_at)
// brain_edges(id PK, src_slug, dst_slug, edge_type, src_doc, src_line,
// weight, updated_at)
//
// Neither src_slug nor dst_slug carries a foreign-key constraint.
//
// Edges fan-out from a source document; calling [PGStore.ReplaceEdgesForDoc]
// replaces every edge previously emitted from that document so re-ingest is
// idempotent without bookkeeping.
//
// All slug strings are stored verbatim — callers are expected to canonicalise
// before persisting. Dst slugs may reference entities that don't yet exist
// (dangling edges); resolution is deferred to query time so ingestion order
// doesn't matter.
package graphstore
import (
"context"
"errors"
"fmt"
"github.com/jackc/pgx/v5"
"github.com/jackc/pgx/v5/pgxpool"
"github.com/mathiasbq/hyperguild/ingestion/internal/graph"
)
// PGStore is the postgres-backed brain knowledge-graph store. Construct
// with New + call Init once to create tables and indexes. Use Close to
// release the pool.
type PGStore struct {
pool *pgxpool.Pool
}
// New opens a pgxpool against dsn and pings to verify connectivity. The
// caller owns the resulting PGStore and must invoke Close.
func New(ctx context.Context, dsn string) (*PGStore, error) {
pool, err := pgxpool.New(ctx, dsn)
if err != nil {
return nil, fmt.Errorf("pgxpool: %w", err)
}
if err := pool.Ping(ctx); err != nil {
pool.Close()
return nil, fmt.Errorf("ping: %w", err)
}
return &PGStore{pool: pool}, nil
}
// Close releases the underlying connection pool.
func (s *PGStore) Close() {
if s.pool != nil {
s.pool.Close()
}
}
// Init creates brain_entities + brain_edges tables and their indexes if
// they don't yet exist. Safe to call on every startup. No-op when the
// schema already matches.
func (s *PGStore) Init(ctx context.Context) error {
const ddl = `
CREATE TABLE IF NOT EXISTS brain_entities (
slug TEXT PRIMARY KEY,
type TEXT NOT NULL DEFAULT 'knowledge',
wing TEXT NOT NULL DEFAULT '',
hall TEXT NOT NULL DEFAULT '',
doc_path TEXT NOT NULL,
title TEXT NOT NULL DEFAULT '',
tier TEXT NOT NULL DEFAULT '',
topic TEXT NOT NULL DEFAULT '',
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Idempotent migration for clusters created before the DIKW tier
-- redesign (infra#72). ADD COLUMN IF NOT EXISTS is safe across
-- repeated startups.
ALTER TABLE brain_entities
ADD COLUMN IF NOT EXISTS tier TEXT NOT NULL DEFAULT '',
ADD COLUMN IF NOT EXISTS topic TEXT NOT NULL DEFAULT '';
CREATE INDEX IF NOT EXISTS brain_entities_wing_idx
ON brain_entities (wing) WHERE wing <> '';
CREATE INDEX IF NOT EXISTS brain_entities_type_idx
ON brain_entities (type);
CREATE INDEX IF NOT EXISTS brain_entities_tier_idx
ON brain_entities (tier) WHERE tier <> '';
CREATE INDEX IF NOT EXISTS brain_entities_topic_idx
ON brain_entities (topic) WHERE topic <> '';
CREATE TABLE IF NOT EXISTS brain_edges (
id BIGSERIAL PRIMARY KEY,
src_slug TEXT NOT NULL,
dst_slug TEXT NOT NULL,
edge_type TEXT NOT NULL DEFAULT 'wikilink',
src_doc TEXT NOT NULL,
src_line INTEGER NOT NULL DEFAULT 0,
weight REAL NOT NULL DEFAULT 1.0,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS brain_edges_src_idx
ON brain_edges (src_slug, edge_type);
CREATE INDEX IF NOT EXISTS brain_edges_dst_idx
ON brain_edges (dst_slug, edge_type);
CREATE INDEX IF NOT EXISTS brain_edges_src_doc_idx
ON brain_edges (src_doc);
`
_, err := s.pool.Exec(ctx, ddl)
return err
}
// UpsertEntity inserts or updates one entity by slug.
func (s *PGStore) UpsertEntity(ctx context.Context, e graph.Entity) error {
if e.Slug == "" {
return errors.New("entity slug is required")
}
if e.Type == "" {
e.Type = "knowledge"
}
_, err := s.pool.Exec(ctx, `
INSERT INTO brain_entities (slug, type, wing, hall, doc_path, title, tier, topic, updated_at)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, now())
ON CONFLICT (slug) DO UPDATE
SET type = EXCLUDED.type,
wing = EXCLUDED.wing,
hall = EXCLUDED.hall,
doc_path = EXCLUDED.doc_path,
title = EXCLUDED.title,
tier = EXCLUDED.tier,
topic = EXCLUDED.topic,
updated_at = now()
`, e.Slug, e.Type, e.Wing, e.Hall, e.DocPath, e.Title, e.Tier, e.Topic)
if err != nil {
return fmt.Errorf("upsert entity %q: %w", e.Slug, err)
}
return nil
}
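// ReplaceEdgesForDoc deletes every edge previously emitted from docPath
// and inserts the new set in one transaction. Caller should pass the
// complete edge set for the doc; partial updates are not supported.
// Edges with an empty SrcSlug or DstSlug are skipped, and weight is always
// stored as 1.0. Each edge's SrcDoc should equal docPath: the delete is
// keyed on docPath, so an edge inserted under a different SrcDoc would not
// be removed by the next call.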
func (s *PGStore) ReplaceEdgesForDoc(ctx context.Context, docPath string, edges []graph.Edge) error {
if docPath == "" {
return errors.New("doc path is required")
}
tx, err := s.pool.BeginTx(ctx, pgx.TxOptions{})
if err != nil {
return fmt.Errorf("begin: %w", err)
}
defer func() { _ = tx.Rollback(ctx) }()
if _, err := tx.Exec(ctx, `DELETE FROM brain_edges WHERE src_doc = $1`, docPath); err != nil {
return fmt.Errorf("delete prior edges for %q: %w", docPath, err)
}
for _, e := range edges {
if e.SrcSlug == "" || e.DstSlug == "" {
continue
}
if _, err := tx.Exec(ctx, `
INSERT INTO brain_edges (src_slug, dst_slug, edge_type, src_doc, src_line, weight)
VALUES ($1, $2, $3, $4, $5, 1.0)
`, e.SrcSlug, e.DstSlug, e.EdgeType, e.SrcDoc, e.SrcLine); err != nil {
return fmt.Errorf("insert edge %s->%s: %w", e.SrcSlug, e.DstSlug, err)
}
}
if err := tx.Commit(ctx); err != nil {
return fmt.Errorf("commit: %w", err)
}
return nil
}
// DeleteByDoc removes the entity at docPath and every edge it sourced.
// Use when a wiki page is deleted on disk.
func (s *PGStore) DeleteByDoc(ctx context.Context, docPath string) error {
if docPath == "" {
return errors.New("doc path is required")
}
tx, err := s.pool.BeginTx(ctx, pgx.TxOptions{})
if err != nil {
return fmt.Errorf("begin: %w", err)
}
defer func() { _ = tx.Rollback(ctx) }()
if _, err := tx.Exec(ctx, `DELETE FROM brain_edges WHERE src_doc = $1`, docPath); err != nil {
return fmt.Errorf("delete edges: %w", err)
}
if _, err := tx.Exec(ctx, `DELETE FROM brain_entities WHERE doc_path = $1`, docPath); err != nil {
return fmt.Errorf("delete entity: %w", err)
}
return tx.Commit(ctx)
}
// Neighbor is one row in a Neighbors / Subgraph response.
type Neighbor struct {
Slug string
Type string
Wing string
Hall string
DocPath string
Title string
EdgeType string
Distance int // hop count from origin; 1 for direct neighbors
}
// Neighbors returns the direct (1-hop) outgoing neighbours of slug.
// edgeType filters by relationship kind; "" returns all kinds.
// limit defaults to 25 when <= 0.
func (s *PGStore) Neighbors(ctx context.Context, slug, edgeType string, limit int) ([]Neighbor, error) {
if slug == "" {
return nil, errors.New("slug is required")
}
if limit <= 0 {
limit = 25
}
q := `
SELECT e.dst_slug, COALESCE(t.type,''), COALESCE(t.wing,''), COALESCE(t.hall,''),
COALESCE(t.doc_path,''), COALESCE(t.title,''), e.edge_type, 1
FROM brain_edges e
LEFT JOIN brain_entities t ON t.slug = e.dst_slug
WHERE e.src_slug = $1
AND ($2 = '' OR e.edge_type = $2)
ORDER BY e.updated_at DESC
LIMIT $3
`
rows, err := s.pool.Query(ctx, q, slug, edgeType, limit)
if err != nil {
return nil, fmt.Errorf("query neighbors: %w", err)
}
defer rows.Close()
return scanNeighbors(rows)
}
// Subgraph returns every slug reachable from origin within depth
// outgoing hops, annotated with the shortest hop distance. Rows are
// grouped by (slug, edge_type), so a slug reached through more than one
// edge type appears once per type. The origin itself is omitted. depth
// defaults to 2 when <= 0; values above 6 are clamped to 6 to bound
// traversal cost.
func (s *PGStore) Subgraph(ctx context.Context, origin string, depth int) ([]Neighbor, error) {
if origin == "" {
return nil, errors.New("origin slug is required")
}
if depth <= 0 {
depth = 2
}
if depth > 6 {
depth = 6
}
q := `
WITH RECURSIVE walk(slug, edge_type, distance) AS (
SELECT e.dst_slug, e.edge_type, 1
FROM brain_edges e
WHERE e.src_slug = $1
UNION
SELECT e.dst_slug, e.edge_type, w.distance + 1
FROM walk w
JOIN brain_edges e ON e.src_slug = w.slug
WHERE w.distance < $2
)
SELECT w.slug, COALESCE(t.type,''), COALESCE(t.wing,''), COALESCE(t.hall,''),
COALESCE(t.doc_path,''), COALESCE(t.title,''), w.edge_type, MIN(w.distance)
FROM walk w
LEFT JOIN brain_entities t ON t.slug = w.slug
WHERE w.slug <> $1
GROUP BY w.slug, t.type, t.wing, t.hall, t.doc_path, t.title, w.edge_type
ORDER BY MIN(w.distance), w.slug
`
rows, err := s.pool.Query(ctx, q, origin, depth)
if err != nil {
return nil, fmt.Errorf("query subgraph: %w", err)
}
defer rows.Close()
return scanNeighbors(rows)
}
// PathStep is one hop in a Path response.
type PathStep struct {
FromSlug string
ToSlug string
EdgeType string
}
// Path returns the shortest directed path from src to dst within
// maxDepth hops, as an ordered list of edges. Empty slice means no
// path exists. maxDepth defaults to 4 when <= 0; values above 8 are
// clamped to 8.
func (s *PGStore) Path(ctx context.Context, src, dst string, maxDepth int) ([]PathStep, error) {
if src == "" || dst == "" {
return nil, errors.New("src and dst are required")
}
if maxDepth <= 0 {
maxDepth = 4
}
if maxDepth > 8 {
maxDepth = 8
}
q := `
WITH RECURSIVE walk(cur, path_slugs, path_edges, distance) AS (
SELECT e.dst_slug,
ARRAY[e.src_slug, e.dst_slug]::TEXT[],
ARRAY[e.edge_type]::TEXT[],
1
FROM brain_edges e
WHERE e.src_slug = $1
UNION ALL
SELECT e.dst_slug,
w.path_slugs || e.dst_slug,
w.path_edges || e.edge_type,
w.distance + 1
FROM walk w
JOIN brain_edges e ON e.src_slug = w.cur
WHERE w.distance < $3
AND NOT (e.dst_slug = ANY(w.path_slugs))
)
SELECT path_slugs, path_edges
FROM walk
WHERE cur = $2
ORDER BY distance ASC
LIMIT 1
`
row := s.pool.QueryRow(ctx, q, src, dst, maxDepth)
var (
slugs []string
kinds []string
)
if err := row.Scan(&slugs, &kinds); err != nil {
if errors.Is(err, pgx.ErrNoRows) {
return nil, nil
}
return nil, fmt.Errorf("scan path: %w", err)
}
if len(slugs) < 2 || len(kinds) == 0 {
return nil, nil
}
steps := make([]PathStep, 0, len(kinds))
for i := 0; i < len(kinds) && i+1 < len(slugs); i++ {
steps = append(steps, PathStep{
FromSlug: slugs[i],
ToSlug: slugs[i+1],
EdgeType: kinds[i],
})
}
return steps, nil
}
// CountEdges is a debug helper — returns the total edges currently stored.
// Used by tests and by the volume-gate diagnostic.
func (s *PGStore) CountEdges(ctx context.Context) (int64, error) {
var n int64
err := s.pool.QueryRow(ctx, `SELECT count(*) FROM brain_edges`).Scan(&n)
return n, err
}
func scanNeighbors(rows pgx.Rows) ([]Neighbor, error) {
var out []Neighbor
for rows.Next() {
var n Neighbor
if err := rows.Scan(
&n.Slug, &n.Type, &n.Wing, &n.Hall,
&n.DocPath, &n.Title, &n.EdgeType, &n.Distance,
); err != nil {
return nil, fmt.Errorf("scan: %w", err)
}
out = append(out, n)
}
return out, rows.Err()
}
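
For readers who want to reason about `Subgraph` without a database, the recursive CTE behaves like a depth-bounded breadth-first search over outgoing edges. The sketch below is an in-memory analogue under simplifying assumptions: a single edge type, an adjacency map in place of `brain_edges`, and hypothetical names (`adjacency`, `clampDepth`, `reachable`). It reuses the same defaults as the store (depth 2 when unset, clamped to 6).

```go
package main

import "fmt"

// adjacency maps a source slug to its outgoing destination slugs.
type adjacency map[string][]string

// clampDepth applies the same defaulting and clamping as Subgraph.
func clampDepth(depth int) int {
	if depth <= 0 {
		return 2
	}
	if depth > 6 {
		return 6
	}
	return depth
}

// reachable returns each slug reachable from origin within depth hops,
// mapped to its shortest hop distance. The origin is never included.
func reachable(g adjacency, origin string, depth int) map[string]int {
	depth = clampDepth(depth)
	dist := map[string]int{}
	visited := map[string]bool{origin: true}
	frontier := []string{origin}
	for d := 1; d <= depth && len(frontier) > 0; d++ {
		var next []string
		for _, cur := range frontier {
			for _, dst := range g[cur] {
				if visited[dst] {
					continue // already reached at an equal or shorter distance
				}
				visited[dst] = true
				dist[dst] = d
				next = append(next, dst)
			}
		}
		frontier = next
	}
	return dist
}

func main() {
	g := adjacency{"a": {"b"}, "b": {"c"}, "c": {"a", "d"}}
	fmt.Println(reachable(g, "a", 2))
	fmt.Println(reachable(g, "a", 10))
}
```

The SQL version does not keep a visited set; it relies on `UNION` deduplication plus the depth bound to terminate on cycles, then takes `MIN(distance)` per group. The BFS reaches the same shortest distances with less repeated work.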

@@ -0,0 +1,112 @@
// Package graphsync glues the disk-resident brain markdown documents to
// the relational graph in [graphstore]. It is a tiny seam so that the
// MCP handlers can call one function after every successful write or
// ingest without having to know either the parser or the postgres
// schema.
//
// Every operation is best-effort from the caller's perspective: if the
// graph store is unconfigured or the doc parses to nothing usable, the
// helpers return nil. Real database errors are surfaced so the caller
// can log them.
package graphsync
import (
"context"
"fmt"
"os"
"path/filepath"
"github.com/mathiasbq/hyperguild/ingestion/internal/graph"
"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
)
// Store is the subset of graphstore.PGStore that graphsync requires.
// Tests can substitute a fake by satisfying this interface.
type Store interface {
UpsertEntity(ctx context.Context, e graph.Entity) error
ReplaceEdgesForDoc(ctx context.Context, docPath string, edges []graph.Edge) error
DeleteByDoc(ctx context.Context, docPath string) error
}
// Compile-time assertion that *graphstore.PGStore satisfies Store.
var _ Store = (*graphstore.PGStore)(nil)
// IndexDoc reads docPath under brainDir and pushes one Entity + its
// outgoing wikilink Edges into store. relPath must be the
// forward-slash path relative to brainDir (the same shape returned by
// api.WriteNote).
//
// nil store is a valid no-op so callers can wire the helper
// unconditionally and let configuration decide whether the graph is
// populated.
func IndexDoc(ctx context.Context, store Store, brainDir, relPath string) error {
if store == nil {
return nil
}
if relPath == "" {
return nil
}
abs := filepath.Join(brainDir, filepath.FromSlash(relPath))
content, err := os.ReadFile(abs)
if err != nil {
return fmt.Errorf("read %q: %w", relPath, err)
}
ent, edges, ok := graph.Extract(relPath, content)
if !ok {
return nil
}
if err := store.UpsertEntity(ctx, ent); err != nil {
return fmt.Errorf("upsert entity: %w", err)
}
if err := store.ReplaceEdgesForDoc(ctx, relPath, edges); err != nil {
return fmt.Errorf("replace edges: %w", err)
}
return nil
}
// BackfillFromBrainDir walks every markdown file under brainDir/wiki/
// and brainDir/knowledge/, parses each, and upserts the resulting
// Entity + Edges. Existing rows are overwritten; orphan rows for
// already-deleted files are NOT cleaned up — call this only on a
// fresh store, or follow with a separate prune pass.
//
// Intended for one-shot startup runs against a populated brain dir.
// Cost scales linearly with corpus size; ~30 wiki pages plus the
// knowledge corpus is a few hundred ms.
func BackfillFromBrainDir(ctx context.Context, store Store, brainDir string) (indexed int, _ error) {
if store == nil {
return 0, nil
}
roots := []string{"wiki", "knowledge"}
for _, root := range roots {
base := filepath.Join(brainDir, root)
if _, err := os.Stat(base); os.IsNotExist(err) {
continue
}
err := filepath.WalkDir(base, func(path string, d os.DirEntry, walkErr error) error {
if walkErr != nil {
return walkErr
}
if d.IsDir() {
return nil
}
if filepath.Ext(path) != ".md" {
return nil
}
rel, relErr := filepath.Rel(brainDir, path)
if relErr != nil {
return fmt.Errorf("rel %q: %w", path, relErr)
}
rel = filepath.ToSlash(rel)
if err := IndexDoc(ctx, store, brainDir, rel); err != nil {
return fmt.Errorf("index %q: %w", rel, err)
}
indexed++
return nil
})
if err != nil {
return indexed, fmt.Errorf("walk %s: %w", root, err)
}
}
return indexed, nil
}
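
One Go detail is worth keeping in mind around the "nil store is a valid no-op" contract: the `store == nil` check only fires for a nil interface value, not for a nil pointer that has been wrapped in the interface. The sketch below illustrates this with hypothetical stand-ins (`Store`, `pgStore`, `isNoop`, `asStore`); whether the real server can ever pass a typed nil depends on how its graph field is declared, which this diff does not show.

```go
package main

import "fmt"

// Store is a hypothetical one-method stand-in for graphsync.Store.
type Store interface {
	Name() string
}

// pgStore is a hypothetical stand-in for *graphstore.PGStore.
type pgStore struct{}

func (s *pgStore) Name() string { return "pg" }

// isNoop mirrors the guard at the top of IndexDoc.
func isNoop(store Store) bool {
	return store == nil
}

// asStore converts a possibly-nil pointer into a Store without
// producing a non-nil interface that wraps a nil pointer.
func asStore(p *pgStore) Store {
	if p == nil {
		return nil
	}
	return p
}

func main() {
	var unset *pgStore
	fmt.Println(isNoop(nil))            // true: nil interface
	fmt.Println(isNoop(unset))          // false: interface holds a typed nil pointer
	fmt.Println(isNoop(asStore(unset))) // true: normalised before the call
}
```

Keeping the optional store in an interface-typed field, or normalising it once at wiring time as `asStore` does, is one way to keep the guard reliable.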

@@ -0,0 +1,134 @@
package graphsync
import (
"context"
"errors"
"os"
"path/filepath"
"sync"
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"github.com/mathiasbq/hyperguild/ingestion/internal/graph"
)
// fakeStore captures the calls IndexDoc / BackfillFromBrainDir made.
type fakeStore struct {
mu sync.Mutex
upserts []graph.Entity
replaces map[string][]graph.Edge
deletes []string
failOn string // upsert fails when entity slug == failOn
}
func newFakeStore() *fakeStore {
return &fakeStore{replaces: make(map[string][]graph.Edge)}
}
func (f *fakeStore) UpsertEntity(_ context.Context, e graph.Entity) error {
f.mu.Lock()
defer f.mu.Unlock()
if f.failOn != "" && e.Slug == f.failOn {
return errors.New("synthetic failure")
}
f.upserts = append(f.upserts, e)
return nil
}
func (f *fakeStore) ReplaceEdgesForDoc(_ context.Context, docPath string, edges []graph.Edge) error {
f.mu.Lock()
defer f.mu.Unlock()
f.replaces[docPath] = append([]graph.Edge(nil), edges...)
return nil
}
func (f *fakeStore) DeleteByDoc(_ context.Context, docPath string) error {
f.mu.Lock()
defer f.mu.Unlock()
f.deletes = append(f.deletes, docPath)
return nil
}
func writeBrain(t *testing.T, brainDir, relPath, body string) {
t.Helper()
full := filepath.Join(brainDir, filepath.FromSlash(relPath))
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
require.NoError(t, os.WriteFile(full, []byte(body), 0o644))
}
func TestIndexDoc_UpsertsEntityAndEdges(t *testing.T) {
tmp := t.TempDir()
writeBrain(t, tmp, "wiki/concepts/foo.md", `---
title: Foo
---
# Foo
Linking to [[bar]] and [[baz|Baz]].
`)
fs := newFakeStore()
require.NoError(t, IndexDoc(context.Background(), fs, tmp, "wiki/concepts/foo.md"))
require.Len(t, fs.upserts, 1)
assert.Equal(t, "foo", fs.upserts[0].Slug)
assert.Equal(t, "concept", fs.upserts[0].Type)
edges := fs.replaces["wiki/concepts/foo.md"]
require.Len(t, edges, 2)
assert.Equal(t, "bar", edges[0].DstSlug)
assert.Equal(t, "baz", edges[1].DstSlug)
}
func TestIndexDoc_NoopOnNilStore(t *testing.T) {
require.NoError(t, IndexDoc(context.Background(), nil, "anywhere", "foo.md"))
}
func TestIndexDoc_NoopOnEmptyRelPath(t *testing.T) {
fs := newFakeStore()
require.NoError(t, IndexDoc(context.Background(), fs, "anywhere", ""))
assert.Empty(t, fs.upserts)
}
func TestIndexDoc_ErrorsOnMissingFile(t *testing.T) {
fs := newFakeStore()
err := IndexDoc(context.Background(), fs, t.TempDir(), "wiki/nope.md")
require.Error(t, err)
}
func TestIndexDoc_SurfacesStoreFailure(t *testing.T) {
tmp := t.TempDir()
writeBrain(t, tmp, "wiki/concepts/boom.md", "# Boom\n")
fs := newFakeStore()
fs.failOn = "boom"
err := IndexDoc(context.Background(), fs, tmp, "wiki/concepts/boom.md")
require.Error(t, err)
}
func TestBackfillFromBrainDir_WalksWikiAndKnowledge(t *testing.T) {
tmp := t.TempDir()
writeBrain(t, tmp, "wiki/concepts/foo.md", "# Foo\n[[bar]]\n")
writeBrain(t, tmp, "wiki/entities/bar.md", "# Bar\n")
writeBrain(t, tmp, "knowledge/legacy.md", "# Legacy [[foo]]\n")
// non-markdown file should be skipped
writeBrain(t, tmp, "wiki/concepts/skip.txt", "ignore me")
fs := newFakeStore()
n, err := BackfillFromBrainDir(context.Background(), fs, tmp)
require.NoError(t, err)
assert.Equal(t, 3, n)
assert.Len(t, fs.upserts, 3)
}
func TestBackfillFromBrainDir_TolerantOfMissingDirs(t *testing.T) {
tmp := t.TempDir()
fs := newFakeStore()
n, err := BackfillFromBrainDir(context.Background(), fs, tmp)
require.NoError(t, err)
assert.Equal(t, 0, n)
}
func TestBackfillFromBrainDir_NilStoreNoop(t *testing.T) {
n, err := BackfillFromBrainDir(context.Background(), nil, t.TempDir())
require.NoError(t, err)
assert.Equal(t, 0, n)
}

@@ -0,0 +1,29 @@
package llm
import (
"context"
"fmt"
)
// Router calls Primary first; on any error it falls back to Fallback.
// Primary must be non-nil. Fallback may be nil, in which case the primary
// error is returned wrapped with a "primary llm" prefix.
type Router struct {
Primary *Client
Fallback *Client
}
// Complete implements pipeline.CompleteFunc, routing through Primary then Fallback.
func (r *Router) Complete(ctx context.Context, system, user string) (string, error) {
out, err := r.Primary.Complete(ctx, system, user)
if err == nil {
return out, nil
}
if r.Fallback == nil {
return "", fmt.Errorf("primary llm: %w", err)
}
out, err2 := r.Fallback.Complete(ctx, system, user)
if err2 != nil {
return "", fmt.Errorf("primary llm: %w; fallback llm: %v", err, err2)
}
return out, nil
}
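
The primary-then-fallback pattern can be exercised without HTTP servers by routing over an interface instead of `*Client`. The following is a sketch under assumptions, not the package's design: `completer`, `stub`, `router`, `route`, and `wrapsBoth` are hypothetical names, and the combined error uses two `%w` verbs, which requires Go 1.20 or newer (the original wraps only the primary error and formats the fallback error with `%v`).

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// completer is a hypothetical interface matching Client.Complete's shape.
type completer interface {
	Complete(ctx context.Context, system, user string) (string, error)
}

// stub returns a canned result, standing in for an LLM client.
type stub struct {
	out string
	err error
}

func (s stub) Complete(context.Context, string, string) (string, error) {
	return s.out, s.err
}

var (
	errPrimary  = errors.New("primary unavailable")
	errFallback = errors.New("fallback unavailable")
)

type router struct {
	Primary  completer
	Fallback completer // may be nil
}

// Complete tries Primary, then Fallback; when both fail, both errors
// stay reachable through errors.Is / errors.As.
func (r router) Complete(ctx context.Context, system, user string) (string, error) {
	out, err := r.Primary.Complete(ctx, system, user)
	if err == nil {
		return out, nil
	}
	if r.Fallback == nil {
		return "", fmt.Errorf("primary llm: %w", err)
	}
	out, err2 := r.Fallback.Complete(ctx, system, user)
	if err2 != nil {
		return "", fmt.Errorf("primary llm: %w; fallback llm: %w", err, err2)
	}
	return out, nil
}

// route is a small helper so callers need not build a context.
func route(primary, fallback completer) (string, error) {
	return router{Primary: primary, Fallback: fallback}.Complete(context.Background(), "sys", "user")
}

// wrapsBoth reports whether err carries both sentinel errors.
func wrapsBoth(err error) bool {
	return errors.Is(err, errPrimary) && errors.Is(err, errFallback)
}

func main() {
	out, err := route(stub{err: errPrimary}, stub{out: "from-fallback"})
	fmt.Println(out, err)
	_, err = route(stub{err: errPrimary}, stub{err: errFallback})
	fmt.Println(err, wrapsBoth(err))
}
```

Wrapping both errors lets a caller distinguish, for example, a cancelled context on the fallback from a plain upstream failure, at the cost of requiring a newer Go toolchain.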

@@ -0,0 +1,71 @@
package llm
import (
"context"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestRouter_PrimarySucceeds(t *testing.T) {
primary := mockServer(t, "from-primary")
defer primary.Close()
fallback := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
t.Error("fallback must not be called when primary succeeds")
}))
defer fallback.Close()
r := &Router{
Primary: New(primary.URL, "", "m", time.Second),
Fallback: New(fallback.URL, "", "m", time.Second),
}
out, err := r.Complete(context.Background(), "sys", "user")
require.NoError(t, err)
assert.Equal(t, "from-primary", out)
}
func TestRouter_FallsBackOnPrimaryError(t *testing.T) {
primary := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
http.Error(w, "unavailable", http.StatusServiceUnavailable)
}))
defer primary.Close()
fallback := mockServer(t, "from-fallback")
defer fallback.Close()
r := &Router{
Primary: New(primary.URL, "", "m", time.Second),
Fallback: New(fallback.URL, "", "m", time.Second),
}
out, err := r.Complete(context.Background(), "sys", "user")
require.NoError(t, err)
assert.Equal(t, "from-fallback", out)
}
func TestRouter_BothFail(t *testing.T) {
fail := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
http.Error(w, "err", http.StatusBadGateway)
}))
defer fail.Close()
r := &Router{
Primary: New(fail.URL, "", "m", time.Second),
Fallback: New(fail.URL, "", "m", time.Second),
}
_, err := r.Complete(context.Background(), "sys", "user")
assert.Error(t, err)
}
func TestRouter_NilFallback(t *testing.T) {
fail := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
http.Error(w, "err", http.StatusBadGateway)
}))
defer fail.Close()
r := &Router{Primary: New(fail.URL, "", "m", time.Second)}
_, err := r.Complete(context.Background(), "sys", "user")
assert.Error(t, err)
}

@@ -4,12 +4,15 @@ import (
"context"
"encoding/json"
"fmt"
"log/slog"
"path/filepath"
"strings"
"time"
"github.com/mathiasbq/hyperguild/ingestion/internal/api"
"github.com/mathiasbq/hyperguild/ingestion/internal/brain"
"github.com/mathiasbq/hyperguild/ingestion/internal/extract"
"github.com/mathiasbq/hyperguild/ingestion/internal/graphsync"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
"github.com/mathiasbq/hyperguild/ingestion/internal/session"
@@ -24,6 +27,10 @@ func (s *Server) tools() []map[string]any {
int_ := func(desc string) map[string]any {
return map[string]any{"type": "integer", "description": desc}
}
enum := func(desc string, vals ...string) map[string]any {
return map[string]any{"type": "string", "description": desc, "enum": vals}
}
halls := []string{"facts", "decisions", "failures", "hypotheses", "sources"}
schema := func(required []string, props map[string]any) json.RawMessage {
b, _ := json.Marshal(map[string]any{
"type": "object", "required": required, "properties": props,
@@ -34,20 +41,39 @@ func (s *Server) tools() []map[string]any {
return []map[string]any{
{
"name": "brain_query",
"description": "BM25 full-text search across brain/knowledge/ and brain/wiki/ markdown files.",
"description": "BM25 full-text search across brain/knowledge/ and brain/wiki/ markdown files. Optionally scope by wing (topic domain) and hall (memory type).",
"inputSchema": schema([]string{"query"}, map[string]any{
"query": str("search terms"),
"limit": int_("max results, default 5"),
"wing": str("optional wing to scope to, e.g. jepa-fx"),
"hall": enum("optional hall to scope to (requires wing)", halls...),
}),
},
{
"name": "brain_write",
"description": "Write a raw knowledge note to brain/knowledge/.",
"description": "Write a markdown note to the brain. With wing+hall set, routes to brain/wiki/<wing>/<hall>/ with wing/hall/created_at frontmatter; otherwise writes to brain/knowledge/ (legacy).",
"inputSchema": schema([]string{"content"}, map[string]any{
"content": str("markdown content"),
"filename": str("optional filename"),
"type": str("optional frontmatter type"),
"domain": str("optional frontmatter domain"),
"filename": str("optional filename or slug"),
"type": str("optional frontmatter type (legacy)"),
"domain": str("optional frontmatter domain (legacy)"),
"wing": str("optional topic domain, e.g. jepa-fx"),
"hall": enum("optional memory type (requires wing)", halls...),
}),
},
{
"name": "brain_tunnel",
"description": "Create an explicit bidirectional [[wikilink]] between two notes in different wings. Idempotent.",
"inputSchema": schema([]string{"source", "target"}, map[string]any{
"source": str("path of source note relative to brain dir, e.g. wiki/jepa-fx/decisions/val-vol.md"),
"target": str("path of target note (must be in a different wing)"),
}),
},
{
"name": "brain_index",
"description": "Regenerate _index.md (Map of Content) for one or all wings under brain/wiki/. Auto-called after brain_write with wing+hall.",
"inputSchema": schema([]string{}, map[string]any{
"wing": str("optional wing to index; if absent, rebuilds every wing"),
}),
},
{
@@ -69,6 +95,46 @@ func (s *Server) tools() []map[string]any {
"dry_run": map[string]any{"type": "boolean"},
}),
},
{
"name": "brain_answer",
"description": "Retrieve relevant brain content via BM25 and synthesize a coherent answer using an LLM.",
"inputSchema": schema([]string{"query"}, map[string]any{
"query": str("question to answer"),
}),
},
{
"name": "brain_classify",
"description": "Classify raw text into doc type, title, and tags using an LLM.",
"inputSchema": schema([]string{"text"}, map[string]any{
"text": str("raw document text to classify (first 3000 chars used)"),
}),
},
{
"name": "brain_graph",
"description": "Query the brain knowledge graph (entities + wikilink edges). Op selects the traversal: neighbors (1-hop outgoing from slug), subgraph (every reachable slug within depth hops), or path (shortest directed path src→dst). Returns slug + entity metadata + edge_type + hop distance.",
"inputSchema": schema([]string{"op"}, map[string]any{
"op": enum("traversal kind", "neighbors", "subgraph", "path"),
"slug": str("origin slug for op=neighbors or op=subgraph"),
"src": str("source slug for op=path"),
"dst": str("destination slug for op=path"),
"edge_type": str("optional edge type filter for op=neighbors (e.g. wikilink); empty matches all"),
"limit": int_("max neighbors to return for op=neighbors, default 25"),
"depth": int_("max traversal depth for op=subgraph (default 2, clamped to 6) and op=path (default 4, clamped to 8)"),
}),
},
{
"name": "brain_context",
"description": "Return top-N relevant brain entries for a project context. Use at session start or before a complex task to load prior decisions, corrections, and surprises.",
"inputSchema": schema([]string{"project_root"}, map[string]any{
"project_root": str("absolute path to the project root"),
"recent_files": map[string]any{
"type": "array",
"items": map[string]any{"type": "string"},
"description": "optional: recent file paths in the project to bias relevance",
},
"limit": int_("max entries to return, default 10"),
}),
},
{
"name": "session_log",
"description": "Append a structured entry to brain/sessions/<session_id>.jsonl.",
@@ -77,7 +143,7 @@ func (s *Server) tools() []map[string]any {
"skill": str("skill name"),
"phase": str("phase within the skill"),
"project_root": str("absolute project root"),
"final_status": str("ok | error | skipped"),
"final_status": str("pass | fail | skip (legacy: ok | error | skipped also accepted)"),
"file_path": str("optional file produced"),
"model_used": str("optional model identifier"),
"duration_ms": int_("optional duration in ms"),
@@ -90,6 +156,8 @@ func (s *Server) tools() []map[string]any {
type brainQueryArgs struct {
Query string `json:"query"`
Limit int `json:"limit,omitempty"`
Wing string `json:"wing,omitempty"`
Hall string `json:"hall,omitempty"`
}
func (s *Server) brainQuery(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
@@ -103,7 +171,14 @@ func (s *Server) brainQuery(ctx context.Context, args json.RawMessage) (json.Raw
if a.Limit == 0 {
a.Limit = 5
}
results, err := search.Query(s.brainDir, a.Query, a.Limit)
results, err := search.QueryContext(ctx, s.brainDir, search.QueryOptions{
Query: a.Query,
Limit: a.Limit,
Wing: a.Wing,
Hall: a.Hall,
Vector: s.vector,
Embedder: s.embedder,
})
if err != nil {
return nil, fmt.Errorf("search: %w", err)
}
@@ -115,6 +190,8 @@ type brainWriteArgs struct {
Filename string `json:"filename,omitempty"`
Type string `json:"type,omitempty"`
Domain string `json:"domain,omitempty"`
Wing string `json:"wing,omitempty"`
Hall string `json:"hall,omitempty"`
}
func (s *Server) brainWrite(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
@@ -122,13 +199,89 @@ func (s *Server) brainWrite(ctx context.Context, args json.RawMessage) (json.Raw
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
relPath, err := api.WriteNote(s.brainDir, a.Content, a.Filename, a.Type, a.Domain)
relPath, err := api.WriteNote(s.brainDir, api.WriteNoteOptions{
Content: a.Content,
Filename: a.Filename,
Type: a.Type,
Domain: a.Domain,
Wing: a.Wing,
Hall: a.Hall,
})
if err != nil {
return nil, err
}
// Auto-regenerate the wing _index.md when the write landed in the
// structured wiki, and auto-tunnel cross-wing matches. Both are
// best-effort: the note is already written.
if a.Wing != "" && a.Hall != "" {
if err := brain.BuildWingIndex(s.brainDir, a.Wing); err != nil {
slog.Warn("brain_write: auto-index failed", "wing", a.Wing, "err", err)
}
if err := brain.AutoTunnel(s.brainDir, relPath, a.Content); err != nil {
slog.Warn("brain_write: auto-tunnel failed", "src", relPath, "err", err)
}
}
s.indexInGraph(ctx, "brain_write", relPath)
return json.Marshal(map[string]string{"path": relPath})
}
// indexInGraph is a best-effort wrapper around graphsync.IndexDoc that
// logs failures but never propagates them — the underlying write/ingest
// has already succeeded and the graph is an augmentation, not a
// correctness invariant.
func (s *Server) indexInGraph(ctx context.Context, op, relPath string) {
if s.graph == nil || relPath == "" {
return
}
if err := graphsync.IndexDoc(ctx, s.graph, s.brainDir, relPath); err != nil {
slog.Warn(op+": graph index failed", "path", relPath, "err", err)
}
}
type brainTunnelArgs struct {
Source string `json:"source"`
Target string `json:"target"`
}
func (s *Server) brainTunnel(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
var a brainTunnelArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.Source == "" || a.Target == "" {
return nil, fmt.Errorf("source and target are required")
}
if err := brain.WriteTunnel(s.brainDir, a.Source, a.Target); err != nil {
return nil, fmt.Errorf("tunnel: %w", err)
}
s.indexInGraph(ctx, "brain_tunnel", a.Source)
s.indexInGraph(ctx, "brain_tunnel", a.Target)
return json.Marshal(map[string]string{"status": "ok"})
}
type brainIndexArgs struct {
Wing string `json:"wing,omitempty"`
}
func (s *Server) brainIndex(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
var a brainIndexArgs
if len(args) > 0 {
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
}
if a.Wing == "" {
if err := brain.BuildAllWingIndexes(s.brainDir); err != nil {
return nil, fmt.Errorf("index: %w", err)
}
return json.Marshal(map[string]any{"status": "ok", "scope": "all"})
}
if err := brain.BuildWingIndex(s.brainDir, a.Wing); err != nil {
return nil, fmt.Errorf("index: %w", err)
}
return json.Marshal(map[string]any{"status": "ok", "scope": a.Wing})
}
type brainIngestRawArgs struct {
Source string `json:"source"`
Pages []pipeline.RawPage `json:"pages"`
@@ -158,6 +311,11 @@ func (s *Server) brainIngestRaw(ctx context.Context, args json.RawMessage) (json
if warnings == nil {
warnings = []string{}
}
if !a.DryRun {
for _, p := range pages {
s.indexInGraph(ctx, "brain_ingest_raw", p)
}
}
return json.Marshal(map[string]any{"pages": pages, "warnings": warnings})
}
@@ -248,6 +406,11 @@ func (s *Server) runIngest(ctx context.Context, content, source string, dryRun b
if pages == nil {
pages = []string{}
}
if !dryRun {
for _, p := range pages {
s.indexInGraph(ctx, "brain_ingest", p)
}
}
warnings := result.Warnings
if warnings == nil {
warnings = []string{}


@@ -40,7 +40,7 @@ func TestBrainQueryReturnsResults(t *testing.T) {
0o644,
))
srv := mcp.NewServer(brainDir, nil, nil)
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_query", map[string]any{"query": "tdd"})
require.Nil(t, resp["error"])
@@ -53,7 +53,7 @@ func TestBrainQueryReturnsResults(t *testing.T) {
func TestBrainWriteCreatesFile(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "# Test\n\nbody",
@@ -70,9 +70,147 @@ func TestBrainWriteCreatesFile(t *testing.T) {
assert.Contains(t, string(got), "# Test")
}
func TestBrainWriteWingHallRoutesToWiki(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "# Val Vol\n\nbody",
"filename": "val-vol-r2",
"wing": "jepa-fx",
"hall": "decisions",
})
require.Nil(t, resp["error"])
got, err := os.ReadFile(filepath.Join(brainDir, "wiki", "jepa-fx", "decisions", "val-vol-r2.md"))
require.NoError(t, err)
assert.Contains(t, string(got), "wing: jepa-fx")
assert.Contains(t, string(got), "hall: decisions")
assert.Contains(t, string(got), "created_at:")
assert.Contains(t, string(got), "# Val Vol")
}
func TestBrainWriteRejectsInvalidHall(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "x",
"wing": "jepa-fx",
"hall": "garbage",
})
require.NotNil(t, resp["error"])
}
func TestBrainQueryWingScope(t *testing.T) {
brainDir := t.TempDir()
for _, p := range []struct{ rel, body string }{
{"wiki/jepa-fx/facts/x.md", "---\nwing: jepa-fx\nhall: facts\n---\nfoo keyword.\n"},
{"wiki/other/facts/y.md", "---\nwing: other\nhall: facts\n---\nfoo keyword.\n"},
} {
full := filepath.Join(brainDir, p.rel)
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
require.NoError(t, os.WriteFile(full, []byte(p.body), 0o644))
}
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_query", map[string]any{
"query": "foo",
"wing": "jepa-fx",
})
require.Nil(t, resp["error"])
text := resp["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
assert.Contains(t, text, "wiki/jepa-fx/facts/x.md")
assert.NotContains(t, text, "wiki/other/facts/y.md")
}
func TestBrainWriteAutoTunnelsOnExactMatch(t *testing.T) {
brainDir := t.TempDir()
// Seed a pre-existing note in wing "other".
existing := filepath.Join(brainDir, "wiki/other/facts/widget.md")
require.NoError(t, os.MkdirAll(filepath.Dir(existing), 0o755))
require.NoError(t, os.WriteFile(existing,
[]byte("---\nwing: other\nhall: facts\ntitle: Widget\n---\nbody.\n"), 0o644))
srv := mcp.NewServer(brainDir, nil, nil, nil)
// Write a new note in a *different* wing whose content references "Widget".
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "# Notes\n\nThis note discusses the Widget concept.\n",
"filename": "notes",
"wing": "jepa-fx",
"hall": "facts",
})
require.Nil(t, resp["error"])
newNote := filepath.Join(brainDir, "wiki/jepa-fx/facts/notes.md")
got, err := os.ReadFile(newNote)
require.NoError(t, err)
assert.Contains(t, string(got), "[[other/facts/widget]]", "new note should link to existing")
gotTgt, err := os.ReadFile(existing)
require.NoError(t, err)
assert.Contains(t, string(gotTgt), "[[jepa-fx/facts/notes]]", "existing note should backlink")
}
func TestBrainWriteAutoTunnelSkipsSameWing(t *testing.T) {
brainDir := t.TempDir()
existing := filepath.Join(brainDir, "wiki/jepa-fx/facts/widget.md")
require.NoError(t, os.MkdirAll(filepath.Dir(existing), 0o755))
require.NoError(t, os.WriteFile(existing,
[]byte("---\nwing: jepa-fx\nhall: facts\ntitle: Widget\n---\nbody.\n"), 0o644))
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "Same wing reference to Widget here.\n",
"filename": "notes",
"wing": "jepa-fx",
"hall": "facts",
})
require.Nil(t, resp["error"])
newNote := filepath.Join(brainDir, "wiki/jepa-fx/facts/notes.md")
got, err := os.ReadFile(newNote)
require.NoError(t, err)
assert.NotContains(t, string(got), "[[jepa-fx/facts/widget]]", "same-wing match must not auto-tunnel")
}
func TestBrainTunnelLinksTwoNotes(t *testing.T) {
brainDir := t.TempDir()
for _, p := range []struct{ rel, body string }{
{"wiki/jepa-fx/decisions/val-vol.md", "---\nwing: jepa-fx\nhall: decisions\n---\n# Val Vol\n"},
{"wiki/hyperguild/decisions/routing.md", "---\nwing: hyperguild\nhall: decisions\n---\n# Routing\n"},
} {
full := filepath.Join(brainDir, p.rel)
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
require.NoError(t, os.WriteFile(full, []byte(p.body), 0o644))
}
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_tunnel", map[string]any{
"source": "wiki/jepa-fx/decisions/val-vol.md",
"target": "wiki/hyperguild/decisions/routing.md",
})
require.Nil(t, resp["error"])
src, err := os.ReadFile(filepath.Join(brainDir, "wiki/jepa-fx/decisions/val-vol.md"))
require.NoError(t, err)
assert.Contains(t, string(src), "[[hyperguild/decisions/routing]]")
tgt, err := os.ReadFile(filepath.Join(brainDir, "wiki/hyperguild/decisions/routing.md"))
require.NoError(t, err)
assert.Contains(t, string(tgt), "[[jepa-fx/decisions/val-vol]]")
}
func TestBrainTunnelRejectsMissing(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_tunnel", map[string]any{
"source": "wiki/a/facts/ghost.md",
"target": "wiki/b/facts/ghost.md",
})
require.NotNil(t, resp["error"])
}
func TestBrainWriteRejectsTraversal(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "x",
@@ -83,7 +221,7 @@ func TestBrainWriteRejectsTraversal(t *testing.T) {
func TestBrainWriteAcceptsDoubleDotInName(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_write", map[string]any{
"content": "x",
@@ -98,7 +236,7 @@ func TestBrainWriteAcceptsDoubleDotInName(t *testing.T) {
func TestBrainIngestRawDryRun(t *testing.T) {
brainDir := t.TempDir()
require.NoError(t, os.MkdirAll(filepath.Join(brainDir, "wiki", "concepts"), 0o755))
srv := mcp.NewServer(brainDir, nil, nil)
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_ingest_raw", map[string]any{
"source": "test-source",
@@ -130,7 +268,7 @@ func TestBrainIngestRawDryRun(t *testing.T) {
func TestBrainIngestRejectsBoth(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_ingest", map[string]any{
"content": "x",
@@ -142,7 +280,7 @@ func TestBrainIngestRejectsBoth(t *testing.T) {
func TestBrainIngestRequiresOne(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_ingest", map[string]any{})
require.NotNil(t, resp["error"])
@@ -150,7 +288,7 @@ func TestBrainIngestRequiresOne(t *testing.T) {
func TestBrainIngestRejectsContentWithoutSource(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "brain_ingest", map[string]any{
"content": "x",
@@ -160,7 +298,7 @@ func TestBrainIngestRejectsContentWithoutSource(t *testing.T) {
func TestBrainIngestRequiresLLMConfigured(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil) // nil pipelineCfg → no LLM
srv := mcp.NewServer(brainDir, nil, nil, nil) // nil pipelineCfg → no LLM
resp := toolCall(t, srv, "brain_ingest", map[string]any{
"content": "some content",
@@ -173,7 +311,7 @@ func TestBrainIngestRequiresLLMConfigured(t *testing.T) {
func TestSessionLogAppends(t *testing.T) {
brainDir := t.TempDir()
srv := mcp.NewServer(brainDir, nil, nil)
srv := mcp.NewServer(brainDir, nil, nil, nil)
resp := toolCall(t, srv, "session_log", map[string]any{
"session_id": "session-x",
@@ -190,7 +328,7 @@ func TestSessionLogAppends(t *testing.T) {
}
func TestSessionLogRequiresSessionID(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
resp := toolCall(t, srv, "session_log", map[string]any{"skill": "tdd"})
require.NotNil(t, resp["error"])
}


@@ -14,7 +14,7 @@ import (
)
func TestMCPMountedHandler(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
mux := http.NewServeMux()
mux.Handle("POST /mcp", srv)
@@ -27,7 +27,7 @@ func TestMCPMountedHandler(t *testing.T) {
require.NoError(t, err)
resp, err := http.Post(ts.URL+"/mcp", "application/json", bytes.NewReader(body))
require.NoError(t, err)
defer resp.Body.Close()
defer func() { _ = resp.Body.Close() }()
assert.Equal(t, http.StatusOK, resp.StatusCode)
out, _ := io.ReadAll(resp.Body)


@@ -1,5 +1,7 @@
// Package mcp implements an MCP HTTP handler for the ingestion service.
// Exposed tools: brain_query, brain_write, brain_ingest, brain_ingest_raw, session_log.
// Exposed tools: brain_query, brain_write, brain_index, brain_tunnel,
// brain_ingest, brain_ingest_raw, brain_answer, brain_classify,
// brain_graph, brain_context, session_log.
package mcp
import (
@@ -8,7 +10,11 @@ import (
"fmt"
"net/http"
"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
"github.com/mathiasbq/hyperguild/ingestion/internal/graphsync"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
"github.com/mathiasbq/hyperguild/ingestion/internal/reranker"
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
)
type request struct {
@@ -35,19 +41,70 @@ type Server struct {
brainDir string
pipeline pipeline.Config
llm pipeline.CompleteFunc
answerLLM pipeline.CompleteFunc // nil = brain_answer and brain_classify unavailable
reranker *reranker.Client // nil = no rerank, BM25 top-10 → LLM
vector search.VectorSearcher // nil = BM25-only retrieval
embedder search.Embedder // nil = BM25-only retrieval
graph graphsync.Store // nil = brain_graph and GraphRAG augmentation disabled
}
// NewServer constructs a Server bound to brainDir. pipelineCfg supplies the
// LLM-backed pipeline; llm may be nil for non-LLM tools only.
func NewServer(brainDir string, pipelineCfg *pipeline.Config, llm pipeline.CompleteFunc) *Server {
// answerLLM drives brain_answer and brain_classify; nil disables those tools.
func NewServer(brainDir string, pipelineCfg *pipeline.Config, llm pipeline.CompleteFunc, answerLLM pipeline.CompleteFunc) *Server {
cfg := pipeline.Config{}
if pipelineCfg != nil {
cfg = *pipelineCfg
}
return &Server{brainDir: brainDir, pipeline: cfg, llm: llm}
return &Server{brainDir: brainDir, pipeline: cfg, llm: llm, answerLLM: answerLLM}
}
// WithReranker installs an opt-in cross-encoder reranker. When set,
// brain_answer retrieves a wider BM25 candidate set and prunes it to
// the relevant ones before LLM synthesis. Returns the server for
// fluent chaining.
func (s *Server) WithReranker(r *reranker.Client) *Server {
s.reranker = r
return s
}
// WithHybridRetrieval wires the embedding store and embedder so
// brain_query and brain_answer run BM25 + pgvector merged via RRF
// instead of BM25 alone. Either nil disables hybrid mode.
func (s *Server) WithHybridRetrieval(v search.VectorSearcher, e search.Embedder) *Server {
s.vector = v
s.embedder = e
return s
}
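
The comment on WithHybridRetrieval says BM25 and pgvector results are merged via RRF, but the merge itself is not part of this diff. As a rough illustration only, here is conventional reciprocal rank fusion; the function name `rrf`, the constant k = 60, and the tie-break by path are assumptions, not the project's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// rrf fuses ranked lists with reciprocal rank fusion: each document
// scores the sum of 1/(k+rank) over the lists it appears in, with
// rank starting at 1. Hypothetical helper, not taken from the repo.
func rrf(k float64, lists ...[]string) []string {
	score := map[string]float64{}
	for _, list := range lists {
		for i, doc := range list {
			score[doc] += 1 / (k + float64(i+1))
		}
	}
	out := make([]string, 0, len(score))
	for doc := range score {
		out = append(out, doc)
	}
	// Highest fused score first; ties broken by path for determinism.
	sort.Slice(out, func(i, j int) bool {
		if score[out[i]] != score[out[j]] {
			return score[out[i]] > score[out[j]]
		}
		return out[i] < out[j]
	})
	return out
}

func main() {
	bm25 := []string{"a.md", "b.md", "c.md"}
	vector := []string{"b.md", "d.md", "a.md"}
	fmt.Println(rrf(60, bm25, vector)) // [b.md a.md d.md c.md]
}
```

A document that appears in both lists (a.md, b.md) outranks one that appears in only one, which is the property hybrid retrieval relies on.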
// WithGraph wires the brain entities + edges store so every successful
// brain_write / brain_ingest / brain_tunnel re-indexes its written docs
// into the graph, and so brain_graph + GraphRAG-augmented brain_answer
// are available. nil disables graph features and is the legacy default.
func (s *Server) WithGraph(g *graphstore.PGStore) *Server {
if g == nil {
s.graph = nil
return s
}
s.graph = g
return s
}
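
The explicit nil branch in WithGraph looks redundant but is not: the parameter is a concrete pointer (`*graphstore.PGStore`) while the `graph` field is an interface (`graphsync.Store`). Assigning a nil pointer straight into the interface yields a non-nil interface value, so a later `s.graph == nil` guard such as the one in indexInGraph would not short-circuit. The sketch below reproduces the effect with hypothetical stand-in types `Store` and `pgStore`.

```go
package main

import "fmt"

// Store is a hypothetical stand-in for the graphsync.Store interface.
type Store interface{ Name() string }

// pgStore is a hypothetical stand-in for graphstore.PGStore.
type pgStore struct{}

func (p *pgStore) Name() string { return "pg" }

// wrapNaive assigns the pointer straight into the interface, so a nil
// pointer becomes a non-nil interface holding (*pgStore)(nil).
func wrapNaive(p *pgStore) Store { return p }

// wrapGuarded mirrors the nil check in WithGraph: a nil pointer maps
// to a truly nil interface value.
func wrapGuarded(p *pgStore) Store {
	if p == nil {
		return nil
	}
	return p
}

func main() {
	var p *pgStore
	fmt.Println(wrapNaive(p) == nil)   // false
	fmt.Println(wrapGuarded(p) == nil) // true
}
```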
func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
// MCP streamable HTTP: GET establishes the SSE stream for server-to-client events.
if r.Method == http.MethodGet {
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
w.Header().Set("X-Accel-Buffering", "no")
w.WriteHeader(http.StatusOK)
if f, ok := w.(http.Flusher); ok {
f.Flush()
}
<-r.Context().Done()
return
}
var req request
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
writeError(w, nil, -32700, "parse error")
@@ -120,12 +177,24 @@ func (s *Server) handleCall(ctx context.Context, name string, args json.RawMessa
return s.brainQuery(ctx, args)
case "brain_write":
return s.brainWrite(ctx, args)
case "brain_index":
return s.brainIndex(ctx, args)
case "brain_tunnel":
return s.brainTunnel(ctx, args)
case "brain_ingest_raw":
return s.brainIngestRaw(ctx, args)
case "brain_ingest":
return s.brainIngest(ctx, args)
case "session_log":
return s.sessionLog(ctx, args)
case "brain_answer":
return s.brainAnswer(ctx, args)
case "brain_classify":
return s.brainClassify(ctx, args)
case "brain_graph":
return s.brainGraph(ctx, args)
case "brain_context":
return s.brainContext(ctx, args)
default:
return nil, fmt.Errorf("unknown tool: %s", name)
}


@@ -21,7 +21,7 @@ func body(t *testing.T, v any) *bytes.Buffer {
}
func TestServerInitialize(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "id": 1, "method": "initialize",
@@ -38,7 +38,7 @@ func TestServerInitialize(t *testing.T) {
}
func TestServerToolsList(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "id": 2, "method": "tools/list",
@@ -55,12 +55,15 @@ func TestServerToolsList(t *testing.T) {
names = append(names, t.(map[string]any)["name"].(string))
}
assert.ElementsMatch(t, []string{
"brain_query", "brain_write", "brain_ingest_raw", "brain_ingest", "session_log",
"brain_query", "brain_write", "brain_index", "brain_tunnel",
"brain_ingest_raw", "brain_ingest",
"brain_answer", "brain_classify", "brain_graph", "brain_context",
"session_log",
}, names)
}
func TestServerNotificationGetsNoBody(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "method": "notifications/initialized",
@@ -73,7 +76,7 @@ func TestServerNotificationGetsNoBody(t *testing.T) {
}
func TestServerUnknownMethodReturnsError(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil)
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
req := httptest.NewRequest(http.MethodPost, "/mcp", body(t, map[string]any{
"jsonrpc": "2.0", "id": 3, "method": "unknown/method",


@@ -0,0 +1,199 @@
package mcp
import (
"context"
"encoding/json"
"fmt"
"strings"
"github.com/mathiasbq/hyperguild/ingestion/internal/reranker"
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
)
// rerankResults scores each candidate's excerpt against the query and
// returns up to top results whose score is positive, preserving the
// caller's input order (BM25 rank) within the kept set. The reranker is
// a filter: ties are broken by BM25, not by the reranker's binary score.
func rerankResults(ctx context.Context, rr *reranker.Client, query string, results []search.Result, top int) ([]search.Result, error) {
docs := make([]string, len(results))
for i, r := range results {
docs[i] = r.Excerpt
}
scores, err := rr.Score(ctx, query, docs)
if err != nil {
return nil, err
}
kept := make([]search.Result, 0, top)
for i, r := range results {
if scores[i] > 0 {
kept = append(kept, r)
}
if len(kept) == top {
break
}
}
return kept, nil
}
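
The filtering behaviour of rerankResults can be tried in isolation. The sketch below is a simplified stand-in, not the function itself: it works on plain strings, assumes scores arrive as `[]float64` (the real type returned by `rr.Score` is not shown in this diff), and adds a length check so a short score slice cannot cause an index panic.

```go
package main

import "fmt"

// keepPositive returns up to top items whose score is positive, in
// their original order. It rejects mismatched lengths instead of
// indexing past the end of scores. Hypothetical helper.
func keepPositive(items []string, scores []float64, top int) ([]string, error) {
	if len(scores) != len(items) {
		return nil, fmt.Errorf("got %d scores for %d items", len(scores), len(items))
	}
	kept := make([]string, 0, top)
	for i, it := range items {
		if len(kept) == top {
			break
		}
		if scores[i] > 0 {
			kept = append(kept, it)
		}
	}
	return kept, nil
}

func main() {
	items := []string{"a.md", "b.md", "c.md", "d.md"}
	scores := []float64{1, 0, 1, 1}
	kept, err := keepPositive(items, scores, 2)
	fmt.Println(kept, err) // [a.md c.md] <nil>
}
```

Because the input order is preserved, the reranker only decides which candidates survive; their relative order still comes from the retrieval step.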
const (
answerSystemPrompt = `You are a knowledge assistant. Answer the question using ONLY the provided sources.
Cite source file paths inline when referencing specific content.
If the context does not contain enough information to answer, say so clearly.`
classifySystemPrompt = `Classify the document. Respond with JSON only, no markdown fences.
{"type":"...","title":"...","tags":["..."]}
Valid types: spec, plan, decision, note, wiki, log, code, unknown.`
)
type brainAnswerArgs struct {
Query string `json:"query"`
}
func (s *Server) brainAnswer(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
if s.answerLLM == nil {
return nil, fmt.Errorf("answer LLM not configured: set BRAIN_LLM_PRIMARY_URL")
}
var a brainAnswerArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.Query == "" {
return nil, fmt.Errorf("query is required")
}
// With reranker disabled: BM25 top-10 straight to the LLM.
// With reranker enabled: BM25 top-20 → cross-encoder filter → top-5.
bm25Limit := 10
if s.reranker != nil {
bm25Limit = 20
}
results, err := search.QueryContext(ctx, s.brainDir, search.QueryOptions{
Query: a.Query,
Limit: bm25Limit,
Vector: s.vector,
Embedder: s.embedder,
})
if err != nil {
return nil, fmt.Errorf("search: %w", err)
}
if s.reranker != nil && len(results) > 0 {
results, err = rerankResults(ctx, s.reranker, a.Query, results, 5)
if err != nil {
return nil, fmt.Errorf("rerank: %w", err)
}
}
if len(results) == 0 {
return json.Marshal(map[string]any{
"answer": "No relevant content found in brain.",
"sources": []string{},
})
}
var sb strings.Builder
sources := make([]string, 0, len(results))
for _, r := range results {
fmt.Fprintf(&sb, "<source path=%q>\n%s\n</source>\n\n", r.Path, r.Excerpt)
sources = append(sources, r.Path)
}
// GraphRAG augmentation: when the graph is wired, attach the 1-hop
// outgoing neighbourhood of the top BM25/rerank hit as an extra
// context block. The LLM can ignore it when irrelevant; when the
// neighbour adds signal we don't need a second retrieval pass.
// Failures are silently skipped — graph is augmentation, not
// correctness.
if reader, ok := s.graph.(graphReader); ok && len(results) > 0 {
topSlug := slugFromPath(results[0].Path)
if topSlug != "" {
if ns, gerr := reader.Subgraph(ctx, topSlug, 1); gerr == nil && len(ns) > 0 {
sb.WriteString("<related>\n")
for _, n := range ns {
label := n.Title
if label == "" {
label = n.Slug
}
fmt.Fprintf(&sb, "- %s (%s) at %s\n", label, n.EdgeType, n.DocPath)
}
sb.WriteString("</related>\n\n")
}
}
}
answer, err := s.answerLLM(ctx, answerSystemPrompt, sb.String()+"Question: "+a.Query)
if err != nil {
return nil, fmt.Errorf("llm: %w", err)
}
return json.Marshal(map[string]any{
"answer": answer,
"sources": sources,
})
}
// slugFromPath converts "wiki/concepts/foo.md" → "foo".
// Returns "" when path has no .md suffix or empty basename.
func slugFromPath(path string) string {
if path == "" {
return ""
}
// strip directory
for i := len(path) - 1; i >= 0; i-- {
if path[i] == '/' {
path = path[i+1:]
break
}
}
if !strings.HasSuffix(path, ".md") {
return ""
}
return strings.TrimSuffix(path, ".md")
}
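
A few sample inputs make the slug rule concrete. The helper below, `slugOf`, is an illustrative near-equivalent built on the standard `path` package rather than the function above; it agrees with slugFromPath for the inputs shown but treats a trailing slash differently. Note that the slug is the basename only, so two notes with the same filename in different wings map to the same slug.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// slugOf is an illustrative stand-in for slugFromPath: take the
// basename and drop a ".md" suffix, or return "" when there is none.
func slugOf(p string) string {
	base := path.Base(p) // "" yields ".", which lacks the suffix
	if !strings.HasSuffix(base, ".md") {
		return ""
	}
	return strings.TrimSuffix(base, ".md")
}

func main() {
	for _, p := range []string{
		"wiki/concepts/foo.md",
		"wiki/a/facts/notes.md",
		"wiki/b/facts/notes.md",
		"wiki/readme.txt",
		"",
	} {
		fmt.Printf("%q -> %q\n", p, slugOf(p))
	}
}
```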
type brainClassifyArgs struct {
Text string `json:"text"`
}
type classifyResult struct {
Type string `json:"type"`
Title string `json:"title"`
Tags []string `json:"tags"`
}
func (s *Server) brainClassify(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
if s.answerLLM == nil {
return nil, fmt.Errorf("answer LLM not configured: set BRAIN_LLM_PRIMARY_URL")
}
var a brainClassifyArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.Text == "" {
return nil, fmt.Errorf("text is required")
}
text := a.Text
if len(text) > 3000 {
text = text[:3000]
}
raw, err := s.answerLLM(ctx, classifySystemPrompt, text)
if err != nil {
return nil, fmt.Errorf("llm: %w", err)
}
// Strip markdown fences if model adds them despite the instruction.
raw = strings.TrimSpace(raw)
raw = strings.TrimPrefix(raw, "```json")
raw = strings.TrimPrefix(raw, "```")
raw = strings.TrimSuffix(raw, "```")
raw = strings.TrimSpace(raw)
var cr classifyResult
if err := json.Unmarshal([]byte(raw), &cr); err != nil {
return nil, fmt.Errorf("parse classify response %q: %w", raw, err)
}
if cr.Tags == nil {
cr.Tags = []string{}
}
return json.Marshal(cr)
}
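
The fence-stripping step in brainClassify can be exercised on its own. The sketch below repeats the same trim sequence in a hypothetical helper `stripFences`; the three-backtick marker is spelled with escapes purely to keep the example readable inside a fenced block.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fence is three backticks, written with escapes.
const fence = "\x60\x60\x60"

// stripFences removes a leading fence (with or without a "json" tag)
// and a trailing fence, using the same trim order as brainClassify.
func stripFences(raw string) string {
	raw = strings.TrimSpace(raw)
	raw = strings.TrimPrefix(raw, fence+"json")
	raw = strings.TrimPrefix(raw, fence)
	raw = strings.TrimSuffix(raw, fence)
	return strings.TrimSpace(raw)
}

func main() {
	reply := fence + "json\n{\"type\":\"note\",\"title\":\"T\",\"tags\":[]}\n" + fence
	var out map[string]any
	if err := json.Unmarshal([]byte(stripFences(reply)), &out); err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(out["type"]) // note
}
```

Only a fence at the very start or end is handled; a reply with prose before the fence would still fail to parse, which the caller reports as an error.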


@@ -0,0 +1,155 @@
package mcp_test
import (
"context"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"strings"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/mcp"
"github.com/mathiasbq/hyperguild/ingestion/internal/pipeline"
"github.com/mathiasbq/hyperguild/ingestion/internal/reranker"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func mockAnswerLLM(response string) pipeline.CompleteFunc {
return func(_ context.Context, _, _ string) (string, error) {
return response, nil
}
}
func brainDirWithContent(t *testing.T) string {
t.Helper()
dir := t.TempDir()
wikiDir := filepath.Join(dir, "wiki")
require.NoError(t, os.MkdirAll(wikiDir, 0o755))
require.NoError(t, os.WriteFile(filepath.Join(wikiDir, "test.md"), []byte(
"---\ntitle: Pass-rate Logging\ntype: spec\n---\n\nPass-rate logging tracks skill invocations.",
), 0o644))
return dir
}
func callTool(t *testing.T, ts *httptest.Server, name string, arguments map[string]any) map[string]any {
t.Helper()
req := map[string]any{
"jsonrpc": "2.0", "id": 1, "method": "tools/call",
"params": map[string]any{"name": name, "arguments": arguments},
}
resp, err := http.Post(ts.URL, "application/json", body(t, req))
require.NoError(t, err)
defer resp.Body.Close() //nolint:errcheck
var out map[string]any
require.NoError(t, json.NewDecoder(resp.Body).Decode(&out))
return out
}
func TestBrainAnswer_RerankerFiltersBeforeLLM(t *testing.T) {
brainDir := t.TempDir()
wikiDir := filepath.Join(brainDir, "wiki")
require.NoError(t, os.MkdirAll(wikiDir, 0o755))
// Two notes — both BM25-match the query, but only one is truly relevant.
require.NoError(t, os.WriteFile(filepath.Join(wikiDir, "good.md"), []byte(
"---\ntitle: Pass-rate Logging\n---\nPass-rate logging tracks skill invocations.",
), 0o644))
require.NoError(t, os.WriteFile(filepath.Join(wikiDir, "noise.md"), []byte(
"---\ntitle: Pass-rate Tangent\n---\nPass-rate appears here too but as a tangent.",
), 0o644))
// Fake Ollama reranker: yes only when prompt contains "tracks skill invocations".
rrSrv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
raw, _ := io.ReadAll(r.Body)
yes := strings.Contains(string(raw), "tracks skill invocations")
ans := "no"
if yes {
ans = "yes"
}
_ = json.NewEncoder(w).Encode(map[string]any{"response": ans, "done": true})
}))
defer rrSrv.Close()
// LLM mock captures the rendered sources so we can assert what reached it.
var sawSources string
llm := func(_ context.Context, _, user string) (string, error) {
sawSources = user
return "answer text", nil
}
srv := mcp.NewServer(brainDir, nil, nil, llm).
WithReranker(reranker.New(rrSrv.URL, "qwen3"))
ts := httptest.NewServer(srv)
defer ts.Close()
rpc := callTool(t, ts, "brain_answer", map[string]any{"query": "pass-rate logging"})
require.Nil(t, rpc["error"])
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
var result map[string]any
require.NoError(t, json.Unmarshal([]byte(content), &result))
sources := result["sources"].([]any)
require.Len(t, sources, 1, "reranker should drop noise.md")
assert.Equal(t, "wiki/good.md", sources[0])
assert.Contains(t, sawSources, "good.md")
assert.NotContains(t, sawSources, "noise.md")
}
func TestBrainAnswer_NoLLM(t *testing.T) {
srv := mcp.NewServer(t.TempDir(), nil, nil, nil)
ts := httptest.NewServer(srv)
defer ts.Close()
rpc := callTool(t, ts, "brain_answer", map[string]any{"query": "test"})
assert.NotNil(t, rpc["error"], "expected error when answerLLM is nil")
}
func TestBrainAnswer_Synthesizes(t *testing.T) {
brainDir := brainDirWithContent(t)
srv := mcp.NewServer(brainDir, nil, nil, mockAnswerLLM("Pass-rate logging is described in spec."))
ts := httptest.NewServer(srv)
defer ts.Close()
rpc := callTool(t, ts, "brain_answer", map[string]any{"query": "pass-rate logging"})
require.Nil(t, rpc["error"])
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
var result map[string]any
require.NoError(t, json.Unmarshal([]byte(content), &result))
assert.Equal(t, "Pass-rate logging is described in spec.", result["answer"])
assert.NotEmpty(t, result["sources"])
}
func TestBrainClassify_ReturnsJSON(t *testing.T) {
llmResp := `{"type":"spec","title":"My Spec","tags":["go","mcp"]}`
srv := mcp.NewServer(t.TempDir(), nil, nil, mockAnswerLLM(llmResp))
ts := httptest.NewServer(srv)
defer ts.Close()
rpc := callTool(t, ts, "brain_classify", map[string]any{"text": "# My Spec\n\nThis is a Go MCP spec."})
require.Nil(t, rpc["error"])
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
var result map[string]any
require.NoError(t, json.Unmarshal([]byte(content), &result))
assert.Equal(t, "spec", result["type"])
assert.Equal(t, "My Spec", result["title"])
}
func TestBrainClassify_StripsFences(t *testing.T) {
llmResp := "```json\n{\"type\":\"note\",\"title\":\"T\",\"tags\":[]}\n```"
srv := mcp.NewServer(t.TempDir(), nil, nil, mockAnswerLLM(llmResp))
ts := httptest.NewServer(srv)
defer ts.Close()
rpc := callTool(t, ts, "brain_classify", map[string]any{"text": "some text"})
require.Nil(t, rpc["error"])
content := rpc["result"].(map[string]any)["content"].([]any)[0].(map[string]any)["text"].(string)
var result map[string]any
require.NoError(t, json.Unmarshal([]byte(content), &result))
assert.Equal(t, "note", result["type"])
}


@@ -0,0 +1,202 @@
package mcp
import (
"context"
"encoding/json"
"fmt"
"os"
"path/filepath"
"sort"
"strings"
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
)
// brainContextArgs is the input shape of brain_context. project_root is
// required; recent_files biases ranking when provided; limit caps the
// returned set (default 10).
type brainContextArgs struct {
ProjectRoot string `json:"project_root"`
RecentFiles []string `json:"recent_files,omitempty"`
Limit int `json:"limit,omitempty"`
}
// contextEntry is one returned brain entry: the slug, its title,
// frontmatter-stripped excerpt, source (bm25|graph), and a final score
// used for ranking before truncation to Limit.
type contextEntry struct {
Slug string `json:"slug"`
Title string `json:"title"`
DocPath string `json:"doc_path"`
Excerpt string `json:"excerpt"`
EdgeType string `json:"edge_type"`
Score float64 `json:"score"`
}
// brainContext returns top-N brain entries relevant to a project context.
// It runs a BM25 query against the project name, takes the top-3 hits as
// seeds, expands each seed 2 hops in the brain graph (when configured),
// then merges and deduplicates by slug. recent_files optionally boosts
// entries whose doc_path matches a recent file basename.
func (s *Server) brainContext(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
var a brainContextArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if a.ProjectRoot == "" {
return nil, fmt.Errorf("project_root is required")
}
limit := a.Limit
if limit <= 0 {
limit = 10
}
projectName := filepath.Base(strings.TrimRight(a.ProjectRoot, "/"))
if projectName == "" || projectName == "." || projectName == "/" {
return nil, fmt.Errorf("project_root has no usable basename: %q", a.ProjectRoot)
}
// Seed BM25 hits on the project name. Take top-3 as graph expansion seeds.
bm25, err := search.QueryContext(ctx, s.brainDir, search.QueryOptions{
Query: projectName,
Limit: 3,
Vector: s.vector,
Embedder: s.embedder,
})
if err != nil {
return nil, fmt.Errorf("search: %w", err)
}
// Dedup by slug while merging BM25 hits and graph neighbours.
bySlug := make(map[string]*contextEntry)
// BM25 score: the top-ranked hit gets the largest score, decaying linearly
// as len(bm25)-i. With three hits, ranks 0/1/2 score 3.0 / 2.0 / 1.0.
for i, r := range bm25 {
slug := slugFromPath(r.Path)
if slug == "" {
continue
}
score := float64(len(bm25) - i)
bySlug[slug] = &contextEntry{
Slug: slug,
Title: r.Title,
DocPath: r.Path,
Excerpt: truncateExcerpt(r.Excerpt, 200),
EdgeType: "bm25",
Score: score,
}
}
// Graph expansion: for each BM25 hit, fetch its 2-hop subgraph and
// merge those neighbours in with a graph score that decays with hop
// distance. Failures are silently dropped — graph augmentation is
// best-effort.
if reader, ok := s.graph.(graphReader); ok {
for _, r := range bm25 {
seed := slugFromPath(r.Path)
if seed == "" {
continue
}
ns, gerr := reader.Subgraph(ctx, seed, 2)
if gerr != nil {
continue
}
for _, n := range ns {
if n.Slug == "" || n.Slug == seed {
continue
}
// Graph score: closer hops carry more signal. Distance 1
// scores 0.6, distance 2 scores 0.3.
gscore := 0.6 / float64(max1(n.Distance))
if existing, ok := bySlug[n.Slug]; ok {
// Already surfaced via BM25 — bump its score so that
// BM25 + graph evidence outranks BM25-only hits.
existing.Score += gscore
continue
}
bySlug[n.Slug] = &contextEntry{
Slug: n.Slug,
Title: n.Title,
DocPath: n.DocPath,
Excerpt: readExcerpt(s.brainDir, n.DocPath, 200),
EdgeType: "graph",
Score: gscore,
}
}
}
}
// Optional recent_files boost: +1 to entries whose doc_path basename
// matches any recent file basename. v1 is intentionally simple.
if len(a.RecentFiles) > 0 {
recent := make(map[string]struct{}, len(a.RecentFiles))
for _, f := range a.RecentFiles {
recent[filepath.Base(f)] = struct{}{}
}
for _, e := range bySlug {
if _, hit := recent[filepath.Base(e.DocPath)]; hit {
e.Score += 1.0
}
}
}
// Flatten and sort by score desc, slug asc as a stable tiebreaker.
entries := make([]contextEntry, 0, len(bySlug))
for _, e := range bySlug {
entries = append(entries, *e)
}
sort.SliceStable(entries, func(i, j int) bool {
if entries[i].Score != entries[j].Score {
return entries[i].Score > entries[j].Score
}
return entries[i].Slug < entries[j].Slug
})
if len(entries) > limit {
entries = entries[:limit]
}
return json.Marshal(map[string]any{"entries": entries})
}
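// truncateExcerpt clamps an already-stripped excerpt to maxLen characters
// without re-running the frontmatter parser. The ellipsis suffix matches
// the convention used in search.excerpt.
func truncateExcerpt(s string, maxLen int) string {
// Count runes, not bytes: slicing s[:maxLen] could cut a multi-byte
// UTF-8 sequence in half and emit invalid text.
runes := []rune(s)
if len(runes) <= maxLen {
return s
}
return string(runes[:maxLen]) + "…"
}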
// readExcerpt loads a doc relative to brainDir, strips its frontmatter,
// and returns the first maxLen chars. Returns "" on any error — the
// excerpt is informational, not load-bearing for correctness.
func readExcerpt(brainDir, relPath string, maxLen int) string {
if relPath == "" {
return ""
}
full := filepath.Join(brainDir, filepath.FromSlash(relPath))
content, err := os.ReadFile(full)
if err != nil {
return ""
}
parts := strings.SplitN(string(content), "---", 3)
body := string(content)
if len(parts) == 3 {
body = strings.TrimSpace(parts[2])
}
// Reuse truncateExcerpt so both excerpt paths share one truncation rule.
return truncateExcerpt(body, maxLen)
}
// max1 returns the maximum of n and 1, used to guard against divide-by-zero
// on graph distance and to give self-references (distance 0) a sensible
// score instead of an infinity.
func max1(n int) int {
if n < 1 {
return 1
}
return n
}
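The scoring in brainContext can be read as three additive signals: a linear BM25 rank score, a hop-decayed graph score, and a flat recent_files boost. The following is a minimal sketch of that arithmetic, not the handler itself; `entry`, `bm25Score`, `graphScore`, and `rank` are hypothetical names introduced for illustration, and the slugs are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical, trimmed-down stand-in for contextEntry.
type entry struct {
	Slug  string
	Score float64
}

// bm25Score restates the linear rank decay: rank 0 of n hits scores n.
func bm25Score(n, rank int) float64 { return float64(n - rank) }

// graphScore restates the hop decay; distances below 1 are clamped to 1.
func graphScore(distance int) float64 {
	if distance < 1 {
		distance = 1
	}
	return 0.6 / float64(distance)
}

// rank sorts by score descending, then slug ascending as the tiebreaker.
func rank(es []entry) []entry {
	sort.SliceStable(es, func(i, j int) bool {
		if es[i].Score != es[j].Score {
			return es[i].Score > es[j].Score
		}
		return es[i].Slug < es[j].Slug
	})
	return es
}

func main() {
	es := []entry{
		{"z-note", bm25Score(2, 0)},       // 2.0: top BM25 hit
		{"a-note", bm25Score(2, 1) + 1.0}, // 1.0 plus the recent_files boost
		{"sepa", graphScore(1)},           // 0.6: graph-only neighbour
	}
	for _, e := range rank(es) {
		fmt.Printf("%s %.1f\n", e.Slug, e.Score)
	}
}
```

With two BM25 hits, the boost lifts the second hit level with the first, and the slug tiebreaker then decides the order, so `a-note` is listed before `z-note`.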

@@ -0,0 +1,212 @@
package mcp
import (
"context"
"encoding/json"
"os"
"path/filepath"
"sort"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/graph"
"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// fakeGraph implements graphsync.Store + graphReader so it can be
// assigned to Server.graph and downcast by brainContext. Only Subgraph
// is exercised by brain_context today; the rest are no-op satisfiers.
type fakeGraph struct {
subgraph map[string][]graphstore.Neighbor
}
func (f *fakeGraph) UpsertEntity(_ context.Context, _ graph.Entity) error { return nil }
func (f *fakeGraph) ReplaceEdgesForDoc(_ context.Context, _ string, _ []graph.Edge) error {
return nil
}
func (f *fakeGraph) DeleteByDoc(_ context.Context, _ string) error { return nil }
func (f *fakeGraph) Neighbors(_ context.Context, slug, _ string, _ int) ([]graphstore.Neighbor, error) {
return f.subgraph[slug], nil
}
func (f *fakeGraph) Subgraph(_ context.Context, origin string, _ int) ([]graphstore.Neighbor, error) {
return f.subgraph[origin], nil
}
func (f *fakeGraph) Path(_ context.Context, _, _ string, _ int) ([]graphstore.PathStep, error) {
return nil, nil
}
func writeNote(t *testing.T, brainDir, relPath, title, body string) {
t.Helper()
full := filepath.Join(brainDir, filepath.FromSlash(relPath))
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
content := "---\ntitle: " + title + "\n---\n\n" + body
require.NoError(t, os.WriteFile(full, []byte(content), 0o644))
}
// callContext runs brainContext directly and decodes the JSON response.
func callContext(t *testing.T, s *Server, args map[string]any) map[string]any {
t.Helper()
raw, err := json.Marshal(args)
require.NoError(t, err)
out, err := s.brainContext(context.Background(), raw)
require.NoError(t, err)
var resp map[string]any
require.NoError(t, json.Unmarshal(out, &resp))
return resp
}
func sortedSlugs(entries []any) []string {
slugs := make([]string, 0, len(entries))
for _, e := range entries {
slugs = append(slugs, e.(map[string]any)["slug"].(string))
}
sort.Strings(slugs)
return slugs
}
func TestBrainContext_RejectsMissingProjectRoot(t *testing.T) {
s := NewServer(t.TempDir(), nil, nil, nil)
_, err := s.brainContext(context.Background(), json.RawMessage(`{}`))
assert.Error(t, err)
}
func TestBrainContext_RejectsUnusableBasename(t *testing.T) {
s := NewServer(t.TempDir(), nil, nil, nil)
_, err := s.brainContext(context.Background(), json.RawMessage(`{"project_root":"/"}`))
assert.Error(t, err)
}
func TestBrainContext_BM25Only_NoGraph(t *testing.T) {
brainDir := t.TempDir()
// Two notes whose body contains the hyphenated project name. BM25
// uses literal substring matching after whitespace tokenisation, so
// the bodies must carry "azure-tiger" verbatim, not "Azure tiger".
writeNote(t, brainDir, "wiki/finance/decisions/azure-tiger-routing.md",
"Azure Tiger Routing", "azure-tiger payment routing decisions.")
writeNote(t, brainDir, "wiki/finance/facts/iso20022.md",
"Azure Tiger ISO 20022 fields", "azure-tiger maps invoice fields to ISO 20022.")
s := NewServer(brainDir, nil, nil, nil)
// graph is nil — only BM25 hits should appear.
resp := callContext(t, s, map[string]any{
"project_root": "/home/mathias/dev/QKX/azure-tiger",
})
entries := resp["entries"].([]any)
require.NotEmpty(t, entries, "expected at least one BM25 hit on project name")
for _, e := range entries {
entry := e.(map[string]any)
assert.Equal(t, "bm25", entry["edge_type"], "no graph configured, every entry must be BM25")
assert.NotEmpty(t, entry["slug"])
assert.NotEmpty(t, entry["doc_path"])
}
}
func TestBrainContext_BM25PlusGraphExpansion(t *testing.T) {
brainDir := t.TempDir()
// BM25 seed — body carries the hyphenated project name verbatim.
writeNote(t, brainDir, "wiki/finance/decisions/azure-tiger-routing.md",
"Azure Tiger Routing", "azure-tiger payment routing decisions.")
// Graph neighbour — does NOT match BM25 on "azure-tiger" so it can
// only arrive via the graph subgraph traversal.
writeNote(t, brainDir, "wiki/finance/facts/sepa-clearing.md",
"SEPA Clearing", "SEPA payment clearing rules and timing windows.")
graphFake := &fakeGraph{
subgraph: map[string][]graphstore.Neighbor{
"azure-tiger-routing": {
{
Slug: "sepa-clearing",
Title: "SEPA Clearing",
DocPath: "wiki/finance/facts/sepa-clearing.md",
EdgeType: "wikilink",
Distance: 1,
},
},
},
}
s := NewServer(brainDir, nil, nil, nil)
s.graph = graphFake
resp := callContext(t, s, map[string]any{
"project_root": "/home/mathias/dev/QKX/azure-tiger",
})
entries := resp["entries"].([]any)
require.GreaterOrEqual(t, len(entries), 2, "expected BM25 seed plus graph neighbour")
slugs := sortedSlugs(entries)
assert.Contains(t, slugs, "azure-tiger-routing", "BM25 seed must appear")
assert.Contains(t, slugs, "sepa-clearing", "graph neighbour must appear")
// Verify the graph-only entry carries edge_type="graph".
var sepaEntry map[string]any
for _, e := range entries {
m := e.(map[string]any)
if m["slug"] == "sepa-clearing" {
sepaEntry = m
break
}
}
require.NotNil(t, sepaEntry)
assert.Equal(t, "graph", sepaEntry["edge_type"])
assert.NotEmpty(t, sepaEntry["excerpt"], "excerpt should be loaded from disk for graph neighbours")
}
func TestBrainContext_LimitClamps(t *testing.T) {
brainDir := t.TempDir()
// Five notes all matching "azure-tiger".
for i, name := range []string{"a", "b", "c", "d", "e"} {
writeNote(t, brainDir,
"wiki/finance/decisions/azure-tiger-"+name+".md",
"Azure Tiger "+name,
"azure-tiger note "+name+" with index "+string(rune('0'+i)))
}
s := NewServer(brainDir, nil, nil, nil)
resp := callContext(t, s, map[string]any{
"project_root": "/home/mathias/dev/QKX/azure-tiger",
"limit": 2,
})
entries := resp["entries"].([]any)
assert.LessOrEqual(t, len(entries), 2)
}
func TestBrainContext_RecentFilesBoost(t *testing.T) {
brainDir := t.TempDir()
// Both notes BM25-match the project name, but azure-tiger-z has
// twice the term frequency so it naturally ranks above azure-tiger-a.
// The recent_files boost on azure-tiger-a should pull it level on
// score; the alphabetical slug tiebreaker (a < z) then promotes it
// to the top — exercising both the boost and the deterministic
// tiebreak.
writeNote(t, brainDir, "wiki/finance/decisions/azure-tiger-a.md",
"A", "azure-tiger note about a.")
writeNote(t, brainDir, "wiki/finance/decisions/azure-tiger-z.md",
"Z", "azure-tiger azure-tiger note about z.")
s := NewServer(brainDir, nil, nil, nil)
// Baseline ranking: azure-tiger-z must lead (higher term frequency).
baseline := callContext(t, s, map[string]any{
"project_root": "/home/mathias/dev/QKX/azure-tiger",
})
baselineEntries := baseline["entries"].([]any)
require.GreaterOrEqual(t, len(baselineEntries), 2)
baselineTop := baselineEntries[0].(map[string]any)
require.Equal(t, "azure-tiger-z", baselineTop["slug"],
"sanity: higher tf must rank first without a boost")
// With boost on azure-tiger-a — boosted entry must now lead.
boosted := callContext(t, s, map[string]any{
"project_root": "/home/mathias/dev/QKX/azure-tiger",
"recent_files": []string{"/some/where/azure-tiger-a.md"},
})
entries := boosted["entries"].([]any)
require.GreaterOrEqual(t, len(entries), 2)
top := entries[0].(map[string]any)
assert.Equal(t, "azure-tiger-a", top["slug"], "recent_files boost must promote the matching doc")
}

@@ -0,0 +1,116 @@
package mcp
import (
"context"
"encoding/json"
"fmt"
"github.com/mathiasbq/hyperguild/ingestion/internal/graphstore"
)
// graphReader is the read-side surface of graphstore.PGStore the
// brain_graph handler needs. Splitting it out (vs. depending on the
// concrete *PGStore) lets tests inject a fake without standing up
// postgres, and keeps the write-side graphsync.Store interface free
// of query concerns.
type graphReader interface {
Neighbors(ctx context.Context, slug, edgeType string, limit int) ([]graphstore.Neighbor, error)
Subgraph(ctx context.Context, origin string, depth int) ([]graphstore.Neighbor, error)
Path(ctx context.Context, src, dst string, maxDepth int) ([]graphstore.PathStep, error)
}
// Compile-time check that *graphstore.PGStore satisfies graphReader.
var _ graphReader = (*graphstore.PGStore)(nil)
type brainGraphArgs struct {
Op string `json:"op"`
Slug string `json:"slug,omitempty"`
Src string `json:"src,omitempty"`
Dst string `json:"dst,omitempty"`
EdgeType string `json:"edge_type,omitempty"`
Limit int `json:"limit,omitempty"`
Depth int `json:"depth,omitempty"`
}
func (s *Server) brainGraph(ctx context.Context, args json.RawMessage) (json.RawMessage, error) {
reader, ok := s.graph.(graphReader)
if s.graph == nil || !ok {
return nil, fmt.Errorf("brain graph not configured: set BRAIN_GRAPH_ENABLED=true")
}
var a brainGraphArgs
if err := json.Unmarshal(args, &a); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
switch a.Op {
case "neighbors":
if a.Slug == "" {
return nil, fmt.Errorf("slug is required for op=neighbors")
}
ns, err := reader.Neighbors(ctx, a.Slug, a.EdgeType, a.Limit)
if err != nil {
return nil, fmt.Errorf("neighbors: %w", err)
}
return json.Marshal(map[string]any{"results": neighborsView(ns)})
case "subgraph":
if a.Slug == "" {
return nil, fmt.Errorf("slug is required for op=subgraph")
}
ns, err := reader.Subgraph(ctx, a.Slug, a.Depth)
if err != nil {
return nil, fmt.Errorf("subgraph: %w", err)
}
return json.Marshal(map[string]any{"results": neighborsView(ns)})
case "path":
if a.Src == "" || a.Dst == "" {
return nil, fmt.Errorf("src and dst are required for op=path")
}
steps, err := reader.Path(ctx, a.Src, a.Dst, a.Depth)
if err != nil {
return nil, fmt.Errorf("path: %w", err)
}
return json.Marshal(map[string]any{"steps": pathView(steps)})
default:
return nil, fmt.Errorf("unknown op %q (want neighbors|subgraph|path)", a.Op)
}
}
type neighborView struct {
Slug string `json:"slug"`
Type string `json:"type,omitempty"`
Wing string `json:"wing,omitempty"`
Hall string `json:"hall,omitempty"`
DocPath string `json:"doc_path,omitempty"`
Title string `json:"title,omitempty"`
EdgeType string `json:"edge_type"`
Distance int `json:"distance"`
}
func neighborsView(ns []graphstore.Neighbor) []neighborView {
out := make([]neighborView, 0, len(ns))
for _, n := range ns {
out = append(out, neighborView{
Slug: n.Slug, Type: n.Type, Wing: n.Wing, Hall: n.Hall,
DocPath: n.DocPath, Title: n.Title,
EdgeType: n.EdgeType, Distance: n.Distance,
})
}
return out
}
type pathStepView struct {
From string `json:"from"`
To string `json:"to"`
EdgeType string `json:"edge_type"`
}
func pathView(steps []graphstore.PathStep) []pathStepView {
out := make([]pathStepView, 0, len(steps))
for _, s := range steps {
out = append(out, pathStepView{From: s.FromSlug, To: s.ToSlug, EdgeType: s.EdgeType})
}
return out
}

@@ -0,0 +1,194 @@
// Package metrics is a tiny Prometheus exposition layer.
//
// Hand-rolled rather than pulling in github.com/prometheus/client_golang
// to keep ingestion's dependency surface minimal (stdlib + jwx + testify
// per the repo CLAUDE.md). The single histogram + counter it emits cover
// the canary alert wired in k3s/apps/monitoring/ — see infra#50.
//
// Wire format is the Prometheus text exposition format (version 0.0.4),
// which Prometheus, and therefore kube-prometheus-stack, accepts on scrape.
package metrics
import (
"fmt"
"net/http"
"sort"
"strings"
"sync"
"sync/atomic"
"time"
)
// histogram buckets in seconds. Tuned for in-cluster HTTP API
// latencies: BM25 query is sub-10ms, hybrid retrieval + LLM-synthesis
// can run into seconds. +Inf catch-all is implicit.
var defaultBuckets = []float64{
0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10,
}
// Registry holds one histogram (request latency) labeled by path + status.
// The request total is exposed as the histogram's _count series rather
// than as a separate counter. Concurrent-safe.
type Registry struct {
mu sync.RWMutex
series map[labelKey]*series
buckets []float64
}
type labelKey struct{ path, status string }
type series struct {
// One atomic counter per bucket (counts of observations ≤ bucket).
// counts[len(buckets)] = +Inf bucket (== total observations).
counts []atomic.Uint64
sumNs atomic.Uint64 // sum of durations in nanoseconds
}
// New returns an empty Registry; the first observation per
// (path, status) lazily creates its series.
func New() *Registry {
return &Registry{
series: make(map[labelKey]*series),
buckets: defaultBuckets,
}
}
// Observe records a single request duration for the given path + status.
func (r *Registry) Observe(path, status string, d time.Duration) {
key := labelKey{path: path, status: status}
r.mu.RLock()
s := r.series[key]
r.mu.RUnlock()
if s == nil {
r.mu.Lock()
s = r.series[key]
if s == nil {
s = &series{counts: make([]atomic.Uint64, len(r.buckets)+1)}
r.series[key] = s
}
r.mu.Unlock()
}
secs := d.Seconds()
for i, b := range r.buckets {
if secs <= b {
s.counts[i].Add(1)
}
}
// +Inf bucket always increments.
s.counts[len(r.buckets)].Add(1)
s.sumNs.Add(uint64(d.Nanoseconds()))
}
// Middleware wraps next, observing every request's duration + status.
// The metric label `path` uses the request's Pattern (set by ServeMux;
// the field exists since Go 1.23), falling back to the URL path if no
// Pattern is set. Pattern keeps cardinality bounded (one series per
// route, not one per unique URL); the fallback does not, so unrouted
// paths can create one series each.
func (r *Registry) Middleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
rec := &statusRecorder{ResponseWriter: w, code: http.StatusOK}
start := time.Now()
next.ServeHTTP(rec, req)
path := req.Pattern
if path == "" {
path = req.URL.Path
}
r.Observe(path, statusClass(rec.code), time.Since(start))
})
}
// Handler exposes /metrics in the Prometheus text exposition format.
func (r *Registry) Handler() http.HandlerFunc {
return func(w http.ResponseWriter, req *http.Request) {
w.Header().Set("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
r.write(w)
}
}
func (r *Registry) write(w http.ResponseWriter) {
r.mu.RLock()
defer r.mu.RUnlock()
_, _ = fmt.Fprintln(w, "# HELP brain_query_duration_seconds Brain HTTP API request latency in seconds.")
_, _ = fmt.Fprintln(w, "# TYPE brain_query_duration_seconds histogram")
// Sort keys for stable output (helps diffing scrape responses).
keys := make([]labelKey, 0, len(r.series))
for k := range r.series {
keys = append(keys, k)
}
sort.Slice(keys, func(i, j int) bool {
if keys[i].path != keys[j].path {
return keys[i].path < keys[j].path
}
return keys[i].status < keys[j].status
})
for _, k := range keys {
s := r.series[k]
labels := fmt.Sprintf(`path=%q,status=%q`, k.path, k.status)
for i, b := range r.buckets {
_, _ = fmt.Fprintf(w, "brain_query_duration_seconds_bucket{%s,le=%q} %d\n",
labels, formatBucket(b), s.counts[i].Load())
}
// +Inf bucket
inf := s.counts[len(r.buckets)].Load()
_, _ = fmt.Fprintf(w, "brain_query_duration_seconds_bucket{%s,le=\"+Inf\"} %d\n", labels, inf)
_, _ = fmt.Fprintf(w, "brain_query_duration_seconds_sum{%s} %s\n",
labels, formatSeconds(s.sumNs.Load()))
_, _ = fmt.Fprintf(w, "brain_query_duration_seconds_count{%s} %d\n", labels, inf)
}
}
func formatBucket(b float64) string {
// %g already drops trailing zeros; whole numbers get a ".0" suffix so
// the le label always reads as a float (1 renders as "1.0").
s := fmt.Sprintf("%g", b)
if !strings.ContainsAny(s, ".e") {
s = s + ".0"
}
return s
}
func formatSeconds(ns uint64) string {
return fmt.Sprintf("%g", float64(ns)/1e9)
}
func statusClass(code int) string {
switch {
case code >= 200 && code < 300:
return "2xx"
case code >= 300 && code < 400:
return "3xx"
case code >= 400 && code < 500:
return "4xx"
case code >= 500 && code < 600:
return "5xx"
default:
return "xxx"
}
}
// statusRecorder captures the response code so middleware can label
// the histogram by status class without buffering the body.
type statusRecorder struct {
http.ResponseWriter
code int
wroteHeader bool
}
func (r *statusRecorder) WriteHeader(code int) {
if r.wroteHeader {
return
}
r.code = code
r.wroteHeader = true
r.ResponseWriter.WriteHeader(code)
}
func (r *statusRecorder) Write(b []byte) (int, error) {
if !r.wroteHeader {
r.WriteHeader(http.StatusOK)
}
return r.ResponseWriter.Write(b)
}

@@ -0,0 +1,119 @@
package metrics
import (
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
)
func TestRegistry_ObserveAndExpose(t *testing.T) {
t.Parallel()
r := New()
// Three observations on the same series; one falls into each
// representative band.
r.Observe("/query", "2xx", 4*time.Millisecond) // ≤ 5ms
r.Observe("/query", "2xx", 20*time.Millisecond) // ≤ 25ms
r.Observe("/query", "2xx", 600*time.Millisecond) // ≤ 1s
req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
rec := httptest.NewRecorder()
r.Handler().ServeHTTP(rec, req)
body := rec.Body.String()
mustContain := []string{
`# TYPE brain_query_duration_seconds histogram`,
`brain_query_duration_seconds_bucket{path="/query",status="2xx",le="0.005"} 1`,
`brain_query_duration_seconds_bucket{path="/query",status="2xx",le="0.025"} 2`,
`brain_query_duration_seconds_bucket{path="/query",status="2xx",le="1.0"} 3`,
`brain_query_duration_seconds_bucket{path="/query",status="2xx",le="+Inf"} 3`,
`brain_query_duration_seconds_count{path="/query",status="2xx"} 3`,
}
for _, want := range mustContain {
if !strings.Contains(body, want) {
t.Errorf("missing line: %q\n--- body ---\n%s", want, body)
}
}
if got := rec.Header().Get("Content-Type"); !strings.HasPrefix(got, "text/plain") {
t.Errorf("content-type = %q, want text/plain prefix", got)
}
}
func TestRegistry_LabelsByStatus(t *testing.T) {
t.Parallel()
r := New()
r.Observe("/query", "2xx", time.Millisecond)
r.Observe("/query", "5xx", time.Millisecond)
r.Observe("/write", "2xx", time.Millisecond)
rec := httptest.NewRecorder()
r.Handler().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
body := rec.Body.String()
for _, want := range []string{
`brain_query_duration_seconds_count{path="/query",status="2xx"} 1`,
`brain_query_duration_seconds_count{path="/query",status="5xx"} 1`,
`brain_query_duration_seconds_count{path="/write",status="2xx"} 1`,
} {
if !strings.Contains(body, want) {
t.Errorf("missing %q in body:\n%s", want, body)
}
}
}
func TestMiddleware_RecordsTiming(t *testing.T) {
t.Parallel()
r := New()
handler := r.Middleware(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
time.Sleep(2 * time.Millisecond)
w.WriteHeader(http.StatusOK)
_, _ = io.WriteString(w, "ok")
}))
srv := httptest.NewServer(handler)
defer srv.Close()
resp, err := http.Get(srv.URL + "/query")
if err != nil {
t.Fatalf("get: %v", err)
}
_ = resp.Body.Close()
if resp.StatusCode != http.StatusOK {
t.Fatalf("status %d, want 200", resp.StatusCode)
}
// Exposition should now include /query.
rec := httptest.NewRecorder()
r.Handler().ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
body := rec.Body.String()
if !strings.Contains(body, `path="/query"`) {
t.Errorf("expected /query series, got body:\n%s", body)
}
if !strings.Contains(body, `status="2xx"`) {
t.Errorf("expected 2xx status class, got body:\n%s", body)
}
}
func TestStatusRecorder_DefaultsTo200(t *testing.T) {
t.Parallel()
r := New()
handler := r.Middleware(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
_, _ = w.Write([]byte("hello"))
}))
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/x", nil))
if rec.Code != http.StatusOK {
t.Errorf("code %d, want 200", rec.Code)
}
}

@@ -0,0 +1,38 @@
// Package oauth implements a minimal OAuth 2.0 client_credentials flow
// for the brain MCP server. Designed for claude.ai's custom MCP integration
// UI, which only supports OAuth (no static-Bearer field). The flow trades
// a registered client_id + client_secret for the existing BRAIN_MCP_TOKEN —
// no JWTs, no expiry, no refresh — so the rest of the auth middleware is
// unchanged.
package oauth
import (
"encoding/json"
"net/http"
"strings"
)
// MetadataHandler serves RFC 8414 authorization-server metadata at
// GET /.well-known/oauth-authorization-server. issuer must be the public
// origin of the brain MCP (e.g. https://brain-mcp.d-ma.be); the handler
// derives the token endpoint from it.
//
// Mount with no auth — discovery must be reachable to anonymous callers.
func MetadataHandler(issuer string) http.HandlerFunc {
issuer = strings.TrimRight(issuer, "/")
body, _ := json.Marshal(struct {
Issuer string `json:"issuer"`
TokenEndpoint string `json:"token_endpoint"`
GrantTypes []string `json:"grant_types_supported"`
TokenEndpointAuthMeth []string `json:"token_endpoint_auth_methods_supported"`
}{
Issuer: issuer,
TokenEndpoint: issuer + "/oauth/token",
GrantTypes: []string{"client_credentials"},
TokenEndpointAuthMeth: []string{"client_secret_post", "client_secret_basic"},
})
return func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write(body)
}
}

@@ -0,0 +1,41 @@
package oauth_test
import (
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/oauth"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestMetadataHandler_ReturnsJSON(t *testing.T) {
h := oauth.MetadataHandler("https://brain-mcp.d-ma.be")
req := httptest.NewRequest(http.MethodGet, "/.well-known/oauth-authorization-server", nil)
rr := httptest.NewRecorder()
h.ServeHTTP(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
assert.Equal(t, "application/json", rr.Header().Get("Content-Type"))
var body map[string]any
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &body))
assert.Equal(t, "https://brain-mcp.d-ma.be", body["issuer"])
assert.Equal(t, "https://brain-mcp.d-ma.be/oauth/token", body["token_endpoint"])
assert.ElementsMatch(t, []any{"client_credentials"}, body["grant_types_supported"])
assert.ElementsMatch(t,
[]any{"client_secret_post", "client_secret_basic"},
body["token_endpoint_auth_methods_supported"])
}
func TestMetadataHandler_StripsTrailingSlashFromIssuer(t *testing.T) {
h := oauth.MetadataHandler("https://brain-mcp.d-ma.be/")
rr := httptest.NewRecorder()
h.ServeHTTP(rr, httptest.NewRequest(http.MethodGet, "/.well-known/oauth-authorization-server", nil))
var body map[string]any
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &body))
assert.Equal(t, "https://brain-mcp.d-ma.be", body["issuer"])
assert.Equal(t, "https://brain-mcp.d-ma.be/oauth/token", body["token_endpoint"])
}

@@ -0,0 +1,87 @@
package oauth
import (
"crypto/subtle"
"encoding/json"
"net/http"
)
// TokenConfig is the static configuration for the token endpoint. All
// three fields are required.
type TokenConfig struct {
// ClientID and ClientSecret are the single accepted credentials.
// claude.ai's custom-MCP UI persists these on its side.
ClientID string
ClientSecret string
// AccessToken is the bearer value handed back on a successful
// exchange. In this deployment it is BRAIN_MCP_TOKEN — the same
// static token the rest of the auth middleware already accepts —
// so no JWT machinery is needed downstream.
AccessToken string
}
// TokenHandler serves POST /oauth/token. Implements the
// client_credentials grant only, with client_secret_post and
// client_secret_basic auth methods (both advertised by MetadataHandler).
// Errors follow RFC 6749 §5.2 — JSON body with an "error" field.
//
// Mount with no auth — credentials live in the request body / header.
func TokenHandler(cfg TokenConfig) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
w.Header().Set("Allow", http.MethodPost)
writeOAuthError(w, http.StatusMethodNotAllowed, "invalid_request", "POST required")
return
}
if err := r.ParseForm(); err != nil {
writeOAuthError(w, http.StatusBadRequest, "invalid_request", "form parse")
return
}
if r.PostForm.Get("grant_type") != "client_credentials" {
writeOAuthError(w, http.StatusBadRequest, "unsupported_grant_type",
"only client_credentials is supported")
return
}
clientID, clientSecret := extractClientCreds(r)
if !constantTimeEqual(clientID, cfg.ClientID) ||
!constantTimeEqual(clientSecret, cfg.ClientSecret) {
writeOAuthError(w, http.StatusUnauthorized, "invalid_client", "bad credentials")
return
}
w.Header().Set("Content-Type", "application/json")
w.Header().Set("Cache-Control", "no-store")
_ = json.NewEncoder(w).Encode(struct {
AccessToken string `json:"access_token"`
TokenType string `json:"token_type"`
}{cfg.AccessToken, "bearer"})
}
}
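// extractClientCreds returns the client_id and client_secret pair from
// either client_secret_basic (HTTP Basic) or client_secret_post (form
// fields). When both are present, Basic wins: RFC 6749 §2.3.1 prefers
// Basic over body parameters, and §2.3 forbids clients from using more
// than one authentication method per request.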
func extractClientCreds(r *http.Request) (string, string) {
if id, secret, ok := r.BasicAuth(); ok {
return id, secret
}
return r.PostForm.Get("client_id"), r.PostForm.Get("client_secret")
}
func constantTimeEqual(a, b string) bool {
if a == "" || b == "" {
return false
}
return subtle.ConstantTimeCompare([]byte(a), []byte(b)) == 1
}
func writeOAuthError(w http.ResponseWriter, status int, code, desc string) {
w.Header().Set("Content-Type", "application/json")
w.Header().Set("Cache-Control", "no-store")
w.WriteHeader(status)
_ = json.NewEncoder(w).Encode(struct {
Error string `json:"error"`
ErrorDescription string `json:"error_description,omitempty"`
}{code, desc})
}
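From the client side, the two advertised auth methods differ only in where the credentials travel. The sketch below builds such a request without sending it; `newTokenRequest` is a hypothetical helper, and `example.invalid` is a placeholder host rather than a real deployment.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newTokenRequest builds a client_credentials request. With useBasic the
// credentials go in the Authorization header (client_secret_basic);
// otherwise they are sent as form fields (client_secret_post).
func newTokenRequest(tokenURL, id, secret string, useBasic bool) (*http.Request, error) {
	form := url.Values{"grant_type": {"client_credentials"}}
	if !useBasic {
		form.Set("client_id", id)
		form.Set("client_secret", secret)
	}
	req, err := http.NewRequest(http.MethodPost, tokenURL, strings.NewReader(form.Encode()))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	if useBasic {
		req.SetBasicAuth(id, secret)
	}
	return req, nil
}

func main() {
	req, err := newTokenRequest("https://example.invalid/oauth/token",
		"the-client", "the-secret", true)
	if err != nil {
		panic(err)
	}
	id, _, ok := req.BasicAuth()
	fmt.Println(id, ok)
}
```

Passing the request to an `http.Client` and decoding `access_token` from the JSON response would complete the exchange; that part is omitted to keep the sketch network-free.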

@@ -0,0 +1,134 @@
package oauth_test
import (
"encoding/base64"
"encoding/json"
"net/http"
"net/http/httptest"
"net/url"
"strings"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/oauth"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func newTokenServer() *httptest.Server {
return httptest.NewServer(oauth.TokenHandler(oauth.TokenConfig{
ClientID: "the-client",
ClientSecret: "the-secret",
AccessToken: "BRAIN_TOKEN_VALUE",
}))
}
func postForm(t *testing.T, srv *httptest.Server, vals url.Values, basic [2]string) *http.Response {
t.Helper()
req, err := http.NewRequest(http.MethodPost, srv.URL+"/oauth/token", strings.NewReader(vals.Encode()))
require.NoError(t, err)
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
if basic[0] != "" {
req.SetBasicAuth(basic[0], basic[1])
}
resp, err := http.DefaultClient.Do(req)
require.NoError(t, err)
return resp
}
func TestTokenHandler_ClientSecretPost_Success(t *testing.T) {
srv := newTokenServer()
defer srv.Close()
resp := postForm(t, srv, url.Values{
"grant_type": {"client_credentials"},
"client_id": {"the-client"},
"client_secret": {"the-secret"},
}, [2]string{})
defer func() { _ = resp.Body.Close() }()
assert.Equal(t, http.StatusOK, resp.StatusCode)
assert.Equal(t, "application/json", resp.Header.Get("Content-Type"))
var body map[string]any
require.NoError(t, json.NewDecoder(resp.Body).Decode(&body))
assert.Equal(t, "BRAIN_TOKEN_VALUE", body["access_token"])
assert.Equal(t, "bearer", body["token_type"])
}
func TestTokenHandler_ClientSecretBasic_Success(t *testing.T) {
srv := newTokenServer()
defer srv.Close()
resp := postForm(t, srv,
url.Values{"grant_type": {"client_credentials"}},
[2]string{"the-client", "the-secret"},
)
defer func() { _ = resp.Body.Close() }()
assert.Equal(t, http.StatusOK, resp.StatusCode)
}
func TestTokenHandler_WrongSecret(t *testing.T) {
srv := newTokenServer()
defer srv.Close()
resp := postForm(t, srv, url.Values{
"grant_type": {"client_credentials"},
"client_id": {"the-client"},
"client_secret": {"wrong"},
}, [2]string{})
defer func() { _ = resp.Body.Close() }()
assert.Equal(t, http.StatusUnauthorized, resp.StatusCode)
var body map[string]any
require.NoError(t, json.NewDecoder(resp.Body).Decode(&body))
assert.Equal(t, "invalid_client", body["error"])
}
func TestTokenHandler_BadGrantType(t *testing.T) {
srv := newTokenServer()
defer srv.Close()
resp := postForm(t, srv, url.Values{
"grant_type": {"password"},
"client_id": {"the-client"},
"client_secret": {"the-secret"},
}, [2]string{})
defer func() { _ = resp.Body.Close() }()
assert.Equal(t, http.StatusBadRequest, resp.StatusCode)
var body map[string]any
require.NoError(t, json.NewDecoder(resp.Body).Decode(&body))
assert.Equal(t, "unsupported_grant_type", body["error"])
}
func TestTokenHandler_RejectsGet(t *testing.T) {
srv := newTokenServer()
defer srv.Close()
resp, err := http.Get(srv.URL + "/oauth/token")
require.NoError(t, err)
defer func() { _ = resp.Body.Close() }()
assert.Equal(t, http.StatusMethodNotAllowed, resp.StatusCode)
}
func TestTokenHandler_BasicMalformed_FallsThrough(t *testing.T) {
srv := newTokenServer()
defer srv.Close()
// Malformed (non-base64) Authorization header — handler should treat
// the request as missing creds, not crash.
req, _ := http.NewRequest(http.MethodPost, srv.URL+"/oauth/token",
strings.NewReader("grant_type=client_credentials"))
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
req.Header.Set("Authorization", "Basic ###not-base64###")
resp, err := http.DefaultClient.Do(req)
require.NoError(t, err)
defer func() { _ = resp.Body.Close() }()
assert.Equal(t, http.StatusUnauthorized, resp.StatusCode)
}
func TestTokenHandler_BasicNoColon(t *testing.T) {
srv := newTokenServer()
defer srv.Close()
// "client-only" base64 — missing the `:secret` half.
enc := base64.StdEncoding.EncodeToString([]byte("the-client"))
req, _ := http.NewRequest(http.MethodPost, srv.URL+"/oauth/token",
strings.NewReader("grant_type=client_credentials"))
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
req.Header.Set("Authorization", "Basic "+enc)
resp, err := http.DefaultClient.Do(req)
require.NoError(t, err)
defer func() { _ = resp.Body.Close() }()
assert.Equal(t, http.StatusUnauthorized, resp.StatusCode)
}

@@ -0,0 +1,119 @@
// Package reranker scores (query, document) pairs against a cross-encoder
// served by an Ollama-compatible backend.
//
// Wire format is Ollama's `/api/generate`. The model is prompted with the
// Qwen3-Reranker yes/no template — the canonical interface the model
// itself was trained against — and the first token of the response is
// treated as a binary relevance vote: "yes" → 1.0, anything else → 0.0.
// Ties are expected to be broken by the caller's primary retrieval score
// (e.g. BM25), so the binary signal is a filter rather than a ranking
// substitute.
package reranker
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"strings"
"time"
)
// Client posts rerank requests to an Ollama-compatible endpoint.
type Client struct {
URL string
Model string
HTTP *http.Client
}
// New constructs a Client. Returns nil when url is empty so callers can
// treat a missing BRAIN_RERANKER_URL as "feature disabled" with a single
// nil check.
func New(url, model string) *Client {
if url == "" {
return nil
}
return &Client{
URL: strings.TrimRight(url, "/"),
Model: model,
HTTP: &http.Client{Timeout: 30 * time.Second},
}
}
// Score returns one [0, 1] relevance score per input document, parallel
// to the input order. Each (query, doc) pair is scored independently —
// Qwen3-Reranker is a cross-encoder and expects per-pair calls.
func (c *Client) Score(ctx context.Context, query string, docs []string) ([]float64, error) {
out := make([]float64, len(docs))
for i, doc := range docs {
s, err := c.scoreOne(ctx, query, doc)
if err != nil {
return nil, fmt.Errorf("rerank doc %d: %w", i, err)
}
out[i] = s
}
return out, nil
}
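The package comment describes the binary vote as a filter, with ties broken by the caller's primary retrieval score. A minimal sketch of that caller-side ordering is shown below. The `candidate` type, its fields, and `orderByVoteThenBM25` are hypothetical names for illustration and are not part of this change.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical caller-side record pairing a primary
// retrieval score with the reranker's binary vote.
type candidate struct {
	Path string
	BM25 float64
	Vote float64 // 1.0 for "yes", 0.0 for anything else
}

// orderByVoteThenBM25 sorts "yes" votes ahead of "no" votes and uses the
// primary score to order candidates that received the same vote.
func orderByVoteThenBM25(cs []candidate) {
	sort.SliceStable(cs, func(i, j int) bool {
		if cs[i].Vote != cs[j].Vote {
			return cs[i].Vote > cs[j].Vote
		}
		return cs[i].BM25 > cs[j].BM25
	})
}

func main() {
	cs := []candidate{
		{Path: "a.md", BM25: 9, Vote: 0},
		{Path: "b.md", BM25: 4, Vote: 1},
		{Path: "c.md", BM25: 7, Vote: 1},
	}
	orderByVoteThenBM25(cs)
	for _, c := range cs {
		fmt.Printf("%s vote=%.0f bm25=%.1f\n", c.Path, c.Vote, c.BM25)
	}
}
```

In this sketch a "no" vote only pushes a candidate below every "yes" candidate; a caller that wants a hard filter could drop zero-vote entries instead.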
func (c *Client) scoreOne(ctx context.Context, query, doc string) (float64, error) {
prompt := buildPrompt(query, doc)
reqBody, _ := json.Marshal(map[string]any{
"model": c.Model,
"prompt": prompt,
"stream": false,
"options": map[string]any{
"num_predict": 4,
"temperature": 0,
},
})
req, err := http.NewRequestWithContext(ctx, http.MethodPost,
c.URL+"/api/generate", bytes.NewReader(reqBody))
if err != nil {
return 0, err
}
req.Header.Set("Content-Type", "application/json")
resp, err := c.HTTP.Do(req)
if err != nil {
return 0, err
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode/100 != 2 {
body, _ := io.ReadAll(resp.Body)
return 0, fmt.Errorf("status %d: %s", resp.StatusCode, string(body))
}
var out struct {
Response string `json:"response"`
}
if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
return 0, err
}
return parseYesNo(out.Response), nil
}
// buildPrompt assembles the Qwen3-Reranker chat template. Kept verbatim
// because the model was trained on this exact wording.
func buildPrompt(query, doc string) string {
return "<|im_start|>system\nJudge whether the Document meets the requirements based on the Query and the Instruct provided. Note that the answer can only be \"yes\" or \"no\".<|im_end|>\n" +
"<|im_start|>user\n<Instruct>: Given a web search query, retrieve relevant passages that answer the query\n" +
"<Query>: " + query + "\n" +
"<Document>: " + doc + "<|im_end|>\n" +
"<|im_start|>assistant\n<think>\n\n</think>\n\n"
}
// parseYesNo inspects the start of response and returns 1.0 when it
// begins with "yes" (case-insensitive), 0.0 otherwise. Leading whitespace
// and any `<think>` block are skipped first; leading punctuation is not
// stripped, so a quoted "yes" scores 0.0.
func parseYesNo(s string) float64 {
s = strings.TrimSpace(s)
// Strip any `<think>…</think>` block the model may emit even with empty thinking.
if idx := strings.Index(s, "</think>"); idx != -1 {
s = strings.TrimSpace(s[idx+len("</think>"):])
}
s = strings.ToLower(s)
if strings.HasPrefix(s, "yes") {
return 1.0
}
return 0.0
}
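To make the parsing rule concrete, here is a small standalone sketch that applies the same steps to a few sample responses. `yesNoScore` is a hypothetical stand-in written for this example, and the sample strings are assumptions rather than real model output.

```go
package main

import (
	"fmt"
	"strings"
)

// yesNoScore is a hypothetical stand-in for the parsing rule: trim
// whitespace, drop everything up to and including a closing </think>
// tag, then test for a case-insensitive "yes" prefix.
func yesNoScore(s string) float64 {
	s = strings.TrimSpace(s)
	if idx := strings.Index(s, "</think>"); idx != -1 {
		s = strings.TrimSpace(s[idx+len("</think>"):])
	}
	if strings.HasPrefix(strings.ToLower(s), "yes") {
		return 1.0
	}
	return 0.0
}

func main() {
	samples := []string{
		"yes",
		"  Yes.\n",
		"<think>\n\n</think>\n\nyes",
		"no",
		"maybe",
		"\"yes\"", // leading quote is not stripped, so this scores 0.0
	}
	for _, s := range samples {
		fmt.Printf("%q -> %.1f\n", s, yesNoScore(s))
	}
}
```

Because this is a prefix check, any response that merely starts with the letters "yes" (for example "yesterday") would also score 1.0. With `num_predict` capped at 4 tokens that is unlikely to matter, but it is worth knowing when reading logs.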

@@ -0,0 +1,119 @@
package reranker_test
import (
"context"
"encoding/json"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/reranker"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// fakeOllama responds to /api/generate based on a per-document
// {needle → answer} map: if the prompt contains the needle, returns
// the mapped answer.
type fakeOllama struct {
t *testing.T
answers map[string]string // needle → "yes" or "no"
calls int
lastBody map[string]any
}
func (f *fakeOllama) handler() http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
require.Equal(f.t, http.MethodPost, r.Method)
require.Equal(f.t, "/api/generate", r.URL.Path)
body, err := io.ReadAll(r.Body)
require.NoError(f.t, err)
var p map[string]any
require.NoError(f.t, json.Unmarshal(body, &p))
f.calls++
f.lastBody = p
prompt := p["prompt"].(string)
answer := "no"
for needle, a := range f.answers {
if strings.Contains(prompt, needle) {
answer = a
break
}
}
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(map[string]any{
"model": p["model"], "response": answer, "done": true,
})
})
}
func TestNew_EmptyURLReturnsNil(t *testing.T) {
assert.Nil(t, reranker.New("", "model"))
}
func TestScore_YesAndNoOrdered(t *testing.T) {
f := &fakeOllama{t: t, answers: map[string]string{
"alpha doc": "yes",
"beta doc": "no",
"gamma doc": "yes",
}}
srv := httptest.NewServer(f.handler())
defer srv.Close()
c := reranker.New(srv.URL, "test-model")
require.NotNil(t, c)
scores, err := c.Score(context.Background(), "what is alpha",
[]string{"alpha doc body", "beta doc body", "gamma doc body"})
require.NoError(t, err)
require.Len(t, scores, 3)
assert.Equal(t, 1.0, scores[0])
assert.Equal(t, 0.0, scores[1])
assert.Equal(t, 1.0, scores[2])
assert.Equal(t, 3, f.calls)
}
func TestScore_SendsCorrectShape(t *testing.T) {
f := &fakeOllama{t: t, answers: map[string]string{"hello": "yes"}}
srv := httptest.NewServer(f.handler())
defer srv.Close()
c := reranker.New(srv.URL, "qwen3-rerank")
_, err := c.Score(context.Background(), "greeting", []string{"hello world"})
require.NoError(t, err)
assert.Equal(t, "qwen3-rerank", f.lastBody["model"])
prompt := f.lastBody["prompt"].(string)
assert.Contains(t, prompt, "greeting")
assert.Contains(t, prompt, "hello world")
assert.Contains(t, prompt, `"yes" or "no"`)
}
func TestScore_HandlesAmbiguousResponse(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
_ = json.NewEncoder(w).Encode(map[string]any{"response": "maybe — unclear", "done": true})
}))
defer srv.Close()
c := reranker.New(srv.URL, "m")
scores, err := c.Score(context.Background(), "q", []string{"d"})
require.NoError(t, err)
// Anything that does not start with "yes" (case-insensitive, after
// whitespace/think trim) is treated as "no" = 0.
assert.Equal(t, []float64{0}, scores)
}
func TestScore_EmptyDocsReturnsEmpty(t *testing.T) {
c := reranker.New("http://127.0.0.1:1", "m")
scores, err := c.Score(context.Background(), "q", nil)
require.NoError(t, err)
assert.Empty(t, scores)
}
func TestScore_UpstreamErrorPropagates(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.WriteHeader(http.StatusInternalServerError)
}))
defer srv.Close()
c := reranker.New(srv.URL, "m")
_, err := c.Score(context.Background(), "q", []string{"d"})
require.Error(t, err)
}

@@ -3,38 +3,117 @@ package search
import (
"bufio"
"context"
"fmt"
"log/slog"
"os"
"path/filepath"
"sort"
"strings"
"github.com/mathiasbq/hyperguild/ingestion/internal/brain"
"github.com/mathiasbq/hyperguild/ingestion/internal/vectorstore"
)
// VectorSearcher returns the top-limit nearest paths by cosine
// distance. The vectorstore package implements this against pgvector.
type VectorSearcher interface {
Search(ctx context.Context, query []float32, limit int) ([]VectorHit, error)
}
// VectorHit is a single path + distance pair from a vector search.
// Re-declared here (rather than reusing vectorstore.Hit) so the
// VectorSearcher interface stays free of vectorstore/embed types and
// stubbing is trivial in tests.
type VectorHit struct {
Path string
Distance float64
}
// Embedder turns a query string into a dense vector. The embed package
// implements this against Ollama's /api/embed.
type Embedder interface {
Embed(ctx context.Context, text string) ([]float32, error)
}
// Result is a single search hit from the brain wiki.
type Result struct {
Path string `json:"path"`
Title string `json:"title"`
Excerpt string `json:"excerpt"`
Score int `json:"score"`
Wing string `json:"wing,omitempty"`
Hall string `json:"hall,omitempty"`
// Tier is the DIKW classification used for retrieval weighting
// (infra#72). Read from frontmatter when present, otherwise
// inferred from the parent directory.
Tier string `json:"tier,omitempty"`
}
// Query searches all .md files under brainDir/wiki/ for pages containing
// any of the whitespace-separated terms in query. Returns up to limit results
// sorted by score descending.
func Query(brainDir, query string, limit int) ([]Result, error) {
if limit <= 0 {
limit = 5
// tierWeight maps the DIKW tier to a score multiplier applied right
// before the final truncation. Knowledge entries (focused lessons that
// age well) get boosted; inbox entries (raw captures, sessions, clips)
// get demoted. Empty / unknown tiers keep the original BM25 score
// (multiplier 1.0). See infra#72 for the failure mode this addresses:
// short focused entries lose to long aggregate dump-files under
// raw BM25 ranking.
func tierWeight(tier string) float64 {
switch tier {
case "knowledge":
return 1.5
case "note":
return 1.0
case "inbox":
return 0.3
default:
return 1.0
}
terms := strings.Fields(strings.ToLower(query))
}
// QueryOptions configures a search.
//
// When Wing is set, the walk is restricted to brain/wiki/<wing>/.
// When Hall is additionally set, the walk is restricted to
// brain/wiki/<wing>/<hall>/. Without either, the legacy walk over
// brain/knowledge/ and brain/wiki/ is used.
//
// When both Vector and Embedder are non-nil and BM25 returned at least
// one candidate, results are hybrid: the BM25 and vector candidate lists
// are merged via Reciprocal Rank Fusion. Otherwise the function falls
// back to BM25 only, keeping behaviour unchanged for callers that have
// not opted in.
type QueryOptions struct {
Query string
Limit int
Wing string
Hall string
Vector VectorSearcher
Embedder Embedder
}
// Query searches the brain. Returns up to opts.Limit results sorted by
// score descending. Empty query returns nil.
func Query(brainDir string, opts QueryOptions) ([]Result, error) {
return QueryContext(context.Background(), brainDir, opts)
}
// QueryContext is the cancellable variant of Query. Hybrid retrieval
// requires a context because both the embedder and the vector store are
// network calls.
func QueryContext(ctx context.Context, brainDir string, opts QueryOptions) ([]Result, error) {
if opts.Limit <= 0 {
opts.Limit = 5
}
terms := strings.Fields(strings.ToLower(opts.Query))
if len(terms) == 0 {
return nil, nil
}
var results []Result
roots, err := resolveRoots(brainDir, opts.Wing, opts.Hall)
if err != nil {
return nil, err
}
for _, subdir := range []string{"knowledge", "wiki"} {
dir := filepath.Join(brainDir, subdir)
var results []Result
for _, dir := range roots {
if _, statErr := os.Stat(dir); os.IsNotExist(statErr) {
continue
}
@@ -46,13 +125,11 @@ func Query(brainDir, query string, limit int) ([]Result, error) {
if d.IsDir() || !strings.HasSuffix(path, ".md") {
return nil
}
content, err := os.ReadFile(path)
if err != nil {
slog.Warn("search: skipping unreadable file", "path", path, "err", err)
return nil
}
lower := strings.ToLower(string(content))
score := 0
for _, term := range terms {
@@ -61,18 +138,21 @@ func Query(brainDir, query string, limit int) ([]Result, error) {
if score == 0 {
return nil
}
rel, err := filepath.Rel(brainDir, path)
if err != nil {
return fmt.Errorf("rel path: %w", err)
}
rel = filepath.ToSlash(rel)
wing, hall := extractWingHall(string(content), rel)
tier := extractTier(string(content), rel)
results = append(results, Result{
Path: rel,
Title: extractTitle(string(content), d.Name()),
Excerpt: excerpt(string(content), 300),
Score: score,
Wing: wing,
Hall: hall,
Tier: tier,
})
return nil
})
@@ -84,12 +164,241 @@ func Query(brainDir, query string, limit int) ([]Result, error) {
sort.Slice(results, func(i, j int) bool {
return results[i].Score > results[j].Score
})
if len(results) > limit {
results = results[:limit]
// Hybrid scoring kicks in only when both the embedder and the
// vector store are wired and BM25 actually returned candidates.
if opts.Vector != nil && opts.Embedder != nil && len(results) > 0 {
merged, err := hybridMerge(ctx, brainDir, opts, results)
if err != nil {
slog.Warn("search: hybrid merge failed, falling back to BM25", "err", err)
} else {
results = merged
}
}
// Tier-weighted final re-rank (infra#72). Knowledge tier entries
// boost ×1.5, inbox demote ×0.3, note stays at ×1.0. Applied after
// hybridMerge so RRF ranking still drives candidate generation;
// the tier weight only re-orders the merged set.
sort.SliceStable(results, func(i, j int) bool {
return float64(results[i].Score)*tierWeight(results[i].Tier) >
float64(results[j].Score)*tierWeight(results[j].Tier)
})
if len(results) > opts.Limit {
results = results[:opts.Limit]
}
return results, nil
}
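As a rough illustration of the tier-weighted re-rank, the sketch below applies the same multipliers to three made-up hits. The `hit` type and the example paths are hypothetical; only the weights (1.5, 1.0, 0.3) come from the code above.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a hypothetical, trimmed-down stand-in for a search result.
type hit struct {
	Path  string
	Score int
	Tier  string
}

// tierWeight mirrors the multipliers used in the diff: knowledge is
// boosted, inbox is demoted, everything else is left unchanged.
func tierWeight(tier string) float64 {
	switch tier {
	case "knowledge":
		return 1.5
	case "inbox":
		return 0.3
	default:
		return 1.0
	}
}

// rerankByTier reorders hits by weighted score without modifying Score.
func rerankByTier(hits []hit) {
	sort.SliceStable(hits, func(i, j int) bool {
		return float64(hits[i].Score)*tierWeight(hits[i].Tier) >
			float64(hits[j].Score)*tierWeight(hits[j].Tier)
	})
}

func main() {
	hits := []hit{
		{Path: "inbox/clip.md", Score: 30, Tier: "inbox"},
		{Path: "wiki/sources/dump.md", Score: 10, Tier: "note"},
		{Path: "knowledge/trap.md", Score: 8, Tier: "knowledge"},
	}
	rerankByTier(hits)
	for _, h := range hits {
		fmt.Printf("%s raw=%d weighted=%.1f\n",
			h.Path, h.Score, float64(h.Score)*tierWeight(h.Tier))
	}
}
```

The weighted values here are 12 for the knowledge entry, 10 for the note, and about 9 for the inbox clip, so the knowledge entry comes first despite having the lowest raw score. The stored `Score` stays raw; the weight only affects ordering.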
// rrfK is the constant in the Reciprocal Rank Fusion formula. 60 is
// the value used by Cormack et al. (2009) and rarely needs tuning in
// practice.
const rrfK = 60.0
// hybridMerge embeds the query, runs a vector search, and merges its
// candidates with the BM25 list via Reciprocal Rank Fusion. Results
// that came only from the vector side are hydrated by reading the
// note's frontmatter for title/wing/hall and excerpting the body.
//
// rrf(d) = sum_r 1 / (k + rank_r(d)) over rankers r ∈ {BM25, vector}.
func hybridMerge(ctx context.Context, brainDir string, opts QueryOptions, bm25 []Result) ([]Result, error) {
q, err := opts.Embedder.Embed(ctx, opts.Query)
if err != nil {
return nil, fmt.Errorf("embed query: %w", err)
}
vectorLimit := opts.Limit * 4
if vectorLimit < 20 {
vectorLimit = 20
}
hits, err := opts.Vector.Search(ctx, q, vectorLimit)
if err != nil {
return nil, fmt.Errorf("vector search: %w", err)
}
rrf := make(map[string]float64)
byPath := make(map[string]Result)
for rank, r := range bm25 {
rrf[r.Path] += 1.0 / (rrfK + float64(rank+1))
byPath[r.Path] = r
}
for rank, h := range hits {
// Vector store keys are chunk paths ("wiki/foo.md#0001"); collapse
// back to the parent so multiple chunk hits from the same file
// score against a single result row.
parent := vectorstore.ParentPath(h.Path)
if opts.Wing != "" && !pathInScope(parent, opts.Wing, opts.Hall) {
continue
}
rrf[parent] += 1.0 / (rrfK + float64(rank+1))
if _, seen := byPath[parent]; !seen {
r, err := hydrate(brainDir, parent)
if err != nil {
slog.Warn("search: hydrate failed for vector hit", "path", parent, "err", err)
continue
}
byPath[parent] = r
}
}
merged := make([]Result, 0, len(byPath))
for p, r := range byPath {
r.Score = int(rrf[p] * 1e6) // scale to int for stable JSON; relative order is what matters
merged = append(merged, r)
}
sort.Slice(merged, func(i, j int) bool {
return merged[i].Score > merged[j].Score
})
return merged, nil
}
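The RRF formula in the comment can be worked through on a tiny example. The sketch below fuses two ranked lists of paths with k = 60; `fuse` and the file names are hypothetical, and it deliberately leaves out the chunk-path collapsing, scoping, and hydration that `hybridMerge` performs.

```go
package main

import (
	"fmt"
	"sort"
)

const rrfK = 60.0

// fuse applies Reciprocal Rank Fusion to any number of ranked lists.
// Each list contributes 1/(k + rank) per path, with rank starting at 1.
func fuse(rankings ...[]string) map[string]float64 {
	scores := make(map[string]float64)
	for _, ranking := range rankings {
		for rank, path := range ranking {
			scores[path] += 1.0 / (rrfK + float64(rank+1))
		}
	}
	return scores
}

func main() {
	bm25 := []string{"foo.md", "bar.md"}
	vector := []string{"semantic.md", "foo.md"}

	scores := fuse(bm25, vector)

	paths := make([]string, 0, len(scores))
	for p := range scores {
		paths = append(paths, p)
	}
	// Sort by fused score, breaking exact ties by path for determinism.
	sort.Slice(paths, func(i, j int) bool {
		if scores[paths[i]] != scores[paths[j]] {
			return scores[paths[i]] > scores[paths[j]]
		}
		return paths[i] < paths[j]
	})
	for _, p := range paths {
		fmt.Printf("%s %.6f\n", p, scores[p])
	}
}
```

Here `foo.md` appears in both lists (1/61 + 1/62), so it outranks `semantic.md` (1/61, vector only) and `bar.md` (1/62, BM25 only). A path that is first in one list and absent from the other still surfaces, which is how a vector-only hit can enter the merged set.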
// pathInScope reports whether a wiki path satisfies the wing/hall filter.
func pathInScope(relPath, wing, hall string) bool {
prefix := "wiki/" + brain.Sanitise(wing) + "/"
if hall != "" {
prefix += hall + "/"
}
return strings.HasPrefix(relPath, prefix)
}
// hydrate reads a single note from disk and returns a Result with title,
// excerpt, wing, and hall populated. Used for paths that surface only
// via vector search.
func hydrate(brainDir, relPath string) (Result, error) {
full := filepath.Join(brainDir, filepath.FromSlash(relPath))
content, err := os.ReadFile(full)
if err != nil {
return Result{}, err
}
wing, hall := extractWingHall(string(content), relPath)
tier := extractTier(string(content), relPath)
return Result{
Path: relPath,
Title: extractTitle(string(content), filepath.Base(relPath)),
Excerpt: excerpt(string(content), 300),
Wing: wing,
Hall: hall,
Tier: tier,
}, nil
}
// resolveRoots returns the directories to walk for the given wing/hall
// filters. Validates hall against the closed vocabulary when set.
func resolveRoots(brainDir, wing, hall string) ([]string, error) {
if hall != "" && !brain.IsValidHall(hall) {
return nil, fmt.Errorf("invalid hall %q", hall)
}
if wing != "" {
w := brain.Sanitise(wing)
if w == "" {
return nil, fmt.Errorf("invalid wing %q", wing)
}
if hall != "" {
return []string{filepath.Join(brainDir, "wiki", w, hall)}, nil
}
return []string{filepath.Join(brainDir, "wiki", w)}, nil
}
if hall != "" {
return nil, fmt.Errorf("hall filter requires wing")
}
return []string{
filepath.Join(brainDir, "knowledge"),
filepath.Join(brainDir, "wiki"),
}, nil
}
// extractTier reads the DIKW tier from frontmatter first, falling back
// to the path prefix mapping (infra#72). Mirrors graph.inferTierFromPath
// so the two callers stay in lockstep — frontmatter is canonical,
// path inference is the migration-window fallback.
func extractTier(content, relPath string) string {
scanner := bufio.NewScanner(strings.NewReader(content))
inFrontmatter := false
for scanner.Scan() {
line := scanner.Text()
if strings.TrimSpace(line) == "---" {
if !inFrontmatter {
inFrontmatter = true
continue
}
break
}
if !inFrontmatter {
continue
}
key, val, ok := strings.Cut(line, ":")
if !ok {
continue
}
if strings.TrimSpace(key) == "tier" {
return strings.Trim(strings.TrimSpace(val), `"'`)
}
}
parts := strings.Split(relPath, "/")
if len(parts) == 0 {
return ""
}
switch parts[0] {
case "inbox", "raw", "sessions", "clips":
return "inbox"
case "notes":
return "note"
case "wiki":
// wiki/entities/ anchor pages map to knowledge (see
// graph.inferTierFromPath for the rationale).
if len(parts) >= 2 && parts[1] == "entities" {
return "knowledge"
}
return "note"
case "knowledge":
return "knowledge"
}
return ""
}
// extractWingHall reads wing/hall from frontmatter first, falling back to
// path segments brain/wiki/<wing>/<hall>/.
func extractWingHall(content, relPath string) (wing, hall string) {
scanner := bufio.NewScanner(strings.NewReader(content))
inFrontmatter := false
for scanner.Scan() {
line := scanner.Text()
if strings.TrimSpace(line) == "---" {
if !inFrontmatter {
inFrontmatter = true
continue
}
break
}
if !inFrontmatter {
continue
}
key, val, ok := strings.Cut(line, ":")
if !ok {
continue
}
v := strings.Trim(strings.TrimSpace(val), `"'`)
switch strings.TrimSpace(key) {
case "wing":
wing = v
case "hall":
hall = v
}
}
if wing != "" && hall != "" {
return wing, hall
}
parts := strings.Split(relPath, "/")
if len(parts) >= 4 && parts[0] == "wiki" {
if wing == "" {
wing = parts[1]
}
if hall == "" && brain.IsValidHall(parts[2]) {
hall = parts[2]
}
}
return wing, hall
}
func extractTitle(content, filename string) string {
scanner := bufio.NewScanner(strings.NewReader(content))
inFrontmatter := false
@@ -113,7 +422,6 @@ func extractTitle(content, filename string) string {
}
func excerpt(content string, maxLen int) string {
// Skip frontmatter, return first maxLen chars of body.
parts := strings.SplitN(content, "---", 3)
body := content
if len(parts) == 3 {

@@ -2,9 +2,11 @@
package search_test
import (
"context"
"fmt"
"os"
"path/filepath"
"strings"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/search"
@@ -12,6 +14,99 @@ import (
"github.com/stretchr/testify/require"
)
type stubEmbedder struct{ vec []float32 }
func (s stubEmbedder) Embed(_ context.Context, _ string) ([]float32, error) { return s.vec, nil }
type stubVector struct{ hits []search.VectorHit }
func (s stubVector) Search(_ context.Context, _ []float32, _ int) ([]search.VectorHit, error) {
return s.hits, nil
}
func TestSearch_HybridRRFPromotesVectorOnlyHit(t *testing.T) {
dir := t.TempDir()
for _, p := range []struct{ rel, body string }{
// BM25-keyword note (matches "lejpa" once)
{"wiki/jepa-fx/facts/foo.md", "---\ntitle: Foo\n---\nlejpa keyword\n"},
// Semantically related note that does NOT contain the keyword.
{"wiki/jepa-fx/facts/semantic.md", "---\ntitle: Semantic\n---\nNo keyword in body.\n"},
} {
full := filepath.Join(dir, p.rel)
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
require.NoError(t, os.WriteFile(full, []byte(p.body), 0o644))
}
embedder := stubEmbedder{vec: []float32{0.1}}
vector := stubVector{hits: []search.VectorHit{
{Path: "wiki/jepa-fx/facts/semantic.md", Distance: 0.05}, // best vector match
{Path: "wiki/jepa-fx/facts/foo.md", Distance: 0.10},
}}
got, err := search.Query(dir, search.QueryOptions{
Query: "lejpa",
Limit: 5,
Vector: vector,
Embedder: embedder,
})
require.NoError(t, err)
require.Len(t, got, 2, "vector-only hit should be hydrated into results")
paths := []string{got[0].Path, got[1].Path}
assert.Contains(t, paths, "wiki/jepa-fx/facts/foo.md")
assert.Contains(t, paths, "wiki/jepa-fx/facts/semantic.md")
}
func TestSearch_HybridDedupesChunkPathsToParent(t *testing.T) {
dir := t.TempDir()
full := filepath.Join(dir, "knowledge", "long.md")
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
// Body contains the BM25 keyword "alpaca" so hybridMerge actually runs
// (it only kicks in when BM25 returns at least one candidate).
require.NoError(t, os.WriteFile(full, []byte("---\ntitle: Long\n---\nalpaca content.\n"), 0o644))
embedder := stubEmbedder{vec: []float32{0.1}}
// Vector store returns three chunk-path hits all pointing at the same
// parent file. The merged result must surface ONE row per parent — not
// three rows with chunk-suffixed paths.
vector := stubVector{hits: []search.VectorHit{
{Path: "knowledge/long.md#0001", Distance: 0.05},
{Path: "knowledge/long.md#0002", Distance: 0.07},
{Path: "knowledge/long.md#0003", Distance: 0.09},
}}
got, err := search.Query(dir, search.QueryOptions{
Query: "alpaca",
Limit: 5,
Vector: vector,
Embedder: embedder,
})
require.NoError(t, err)
require.Len(t, got, 1, "three chunk hits for one parent must merge to one result")
assert.Equal(t, "knowledge/long.md", got[0].Path)
assert.Equal(t, "Long", got[0].Title)
}
func TestSearch_HybridFallsBackOnEmbedderError(t *testing.T) {
dir := t.TempDir()
require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki"), 0o755))
require.NoError(t, os.WriteFile(filepath.Join(dir, "wiki", "x.md"), []byte("keyword foo"), 0o644))
embedder := errorEmbedder{}
vector := stubVector{}
got, err := search.Query(dir, search.QueryOptions{
Query: "keyword", Limit: 5, Vector: vector, Embedder: embedder,
})
require.NoError(t, err)
require.Len(t, got, 1, "BM25 result should still come back when embedder fails")
assert.Equal(t, "wiki/x.md", got[0].Path)
}
type errorEmbedder struct{}
func (errorEmbedder) Embed(_ context.Context, _ string) ([]float32, error) {
return nil, assert.AnError
}
func TestSearch_ReturnsMatchingPages(t *testing.T) {
dir := t.TempDir()
require.NoError(t, os.MkdirAll(filepath.Join(dir, "knowledge"), 0o755))
@@ -27,7 +122,7 @@ func TestSearch_ReturnsMatchingPages(t *testing.T) {
0o644,
))
results, err := search.Query(dir, "retry transient", 5)
results, err := search.Query(dir, search.QueryOptions{Query: "retry transient", Limit: 5})
require.NoError(t, err)
require.Len(t, results, 1)
assert.Equal(t, "knowledge/retry-logic.md", results[0].Path)
@@ -36,6 +131,72 @@ func TestSearch_ReturnsMatchingPages(t *testing.T) {
assert.Contains(t, results[0].Excerpt, "Retry")
}
func TestSearch_TierWeightingReordersResults(t *testing.T) {
dir := t.TempDir()
// A note-tier dump mentions each query term five times (higher raw
// BM25 score); a short knowledge entry mentions each four times.
// Raw BM25 prefers the dump; tier weighting (knowledge ×1.5 vs
// note ×1.0) flips the order because the score gap is within reach.
// note raw = 5 × 2 terms = 10 hits, weight 1.0 → 10
// knowledge raw = 4 × 2 terms = 8 hits, weight 1.5 → 12 (overtakes)
noteBody := "---\ntier: note\n---\n" + strings.Repeat("scram trap. ", 5)
knowledgeBody := "---\ntier: knowledge\n---\n" + strings.Repeat("scram trap. ", 4)
require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki", "sources"), 0o755))
require.NoError(t, os.MkdirAll(filepath.Join(dir, "knowledge"), 0o755))
require.NoError(t, os.WriteFile(filepath.Join(dir, "wiki", "sources", "dump.md"), []byte(noteBody), 0o644))
require.NoError(t, os.WriteFile(filepath.Join(dir, "knowledge", "trap.md"), []byte(knowledgeBody), 0o644))
results, err := search.Query(dir, search.QueryOptions{Query: "scram trap", Limit: 5})
require.NoError(t, err)
require.GreaterOrEqual(t, len(results), 2)
assert.Equal(t, "knowledge/trap.md", results[0].Path, "knowledge tier weight should beat note tier")
assert.Equal(t, "knowledge", results[0].Tier)
assert.Equal(t, "note", results[1].Tier)
}
func TestSearch_WingHallScoping(t *testing.T) {
dir := t.TempDir()
for _, p := range []struct{ rel, body string }{
{"wiki/jepa-fx/decisions/val-vol.md", "---\nwing: jepa-fx\nhall: decisions\n---\nval-vol-r2 keyword.\n"},
{"wiki/jepa-fx/facts/architecture.md", "---\nwing: jepa-fx\nhall: facts\n---\nval-vol-r2 keyword in facts.\n"},
{"wiki/hyperguild/decisions/routing.md", "---\nwing: hyperguild\nhall: decisions\n---\nval-vol-r2 reference.\n"},
{"knowledge/loose.md", "---\n---\nval-vol-r2 in knowledge.\n"},
} {
full := filepath.Join(dir, p.rel)
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
require.NoError(t, os.WriteFile(full, []byte(p.body), 0o644))
}
// No filter: walk both knowledge/ and wiki/ — all 4 match.
got, err := search.Query(dir, search.QueryOptions{Query: "val-vol-r2", Limit: 10})
require.NoError(t, err)
assert.Len(t, got, 4)
// Wing scope: 2 jepa-fx hits, no hyperguild, no knowledge.
got, err = search.Query(dir, search.QueryOptions{Query: "val-vol-r2", Limit: 10, Wing: "jepa-fx"})
require.NoError(t, err)
require.Len(t, got, 2)
for _, r := range got {
assert.Equal(t, "jepa-fx", r.Wing)
}
// Wing+Hall scope: 1 hit.
got, err = search.Query(dir, search.QueryOptions{Query: "val-vol-r2", Limit: 10, Wing: "jepa-fx", Hall: "decisions"})
require.NoError(t, err)
require.Len(t, got, 1)
assert.Equal(t, "jepa-fx", got[0].Wing)
assert.Equal(t, "decisions", got[0].Hall)
assert.Equal(t, "wiki/jepa-fx/decisions/val-vol.md", got[0].Path)
// Invalid hall rejected.
_, err = search.Query(dir, search.QueryOptions{Query: "x", Wing: "jepa-fx", Hall: "garbage"})
require.Error(t, err)
// Hall without wing rejected.
_, err = search.Query(dir, search.QueryOptions{Query: "x", Hall: "facts"})
require.Error(t, err)
}
func TestSearch_RespectsLimit(t *testing.T) {
dir := t.TempDir()
require.NoError(t, os.MkdirAll(filepath.Join(dir, "knowledge"), 0o755))
@@ -46,7 +207,7 @@ func TestSearch_RespectsLimit(t *testing.T) {
0o644,
))
}
results, err := search.Query(dir, "retry", 3)
results, err := search.Query(dir, search.QueryOptions{Query: "retry", Limit: 3})
require.NoError(t, err)
assert.LessOrEqual(t, len(results), 3)
}

@@ -0,0 +1,137 @@
package vectorstore
import (
"fmt"
"strings"
)
// NumberedChunk pairs a chunk's body with the storage path it will use
// in brain_embeddings. Path format: "<parent>#NNNN" where NNNN is the
// 1-based chunk index zero-padded to 4 digits.
type NumberedChunk struct {
Path string
Content string
}
// ParentPath returns the file path with any "#NNNN" chunk suffix removed.
// Inputs without a "#" are returned unchanged. Used by search to dedupe
// chunk-level hits back to a single document per result.
func ParentPath(p string) string {
if i := strings.Index(p, "#"); i >= 0 {
return p[:i]
}
return p
}
// NumberChunks assigns "<parent>#NNNN" storage paths to a slice of chunk
// bodies, indexed from 0001. Empty chunks are dropped.
func NumberChunks(parent string, chunks []string) []NumberedChunk {
out := make([]NumberedChunk, 0, len(chunks))
idx := 1
for _, c := range chunks {
if strings.TrimSpace(c) == "" {
continue
}
out = append(out, NumberedChunk{
Path: fmt.Sprintf("%s#%04d", parent, idx),
Content: c,
})
idx++
}
return out
}
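The chunk path convention round-trips as follows. This is a minimal sketch with hypothetical helper names (`chunkPath`, `parentOf`) that reproduce the format string and the suffix stripping shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkPath builds a "<parent>#NNNN" storage path for a 1-based index.
func chunkPath(parent string, idx int) string {
	return fmt.Sprintf("%s#%04d", parent, idx)
}

// parentOf strips everything from the first "#" onward.
func parentOf(p string) string {
	if i := strings.Index(p, "#"); i >= 0 {
		return p[:i]
	}
	return p
}

func main() {
	for idx := 1; idx <= 3; idx++ {
		cp := chunkPath("knowledge/foo.md", idx)
		fmt.Printf("%s -> %s\n", cp, parentOf(cp))
	}
}
```

Since the parent is recovered by cutting at the first "#", this scheme assumes note paths themselves never contain that character.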
// ChunkMarkdown splits a markdown document into embedding-sized pieces.
// Strategy:
// 1. Split at H1/H2 headings (top-of-line "#" or "##"). The intro before
// the first heading is its own chunk.
// 2. Any section larger than maxBytes is further split at paragraph
// boundaries (blank lines), packing paragraphs greedily under the
// byte budget.
//
// The function aims for "fits comfortably under nomic-embed-text's
// 2048-token context". At ~4 chars/token for English markdown,
// maxBytes ≈ 4000 is a safe call-site default.
func ChunkMarkdown(content string, maxBytes int) []string {
if maxBytes <= 0 {
maxBytes = 4000
}
sections := splitAtHeadings(content)
out := make([]string, 0, len(sections))
for _, s := range sections {
if len(s) <= maxBytes {
out = append(out, s)
continue
}
out = append(out, splitAtParagraphs(s, maxBytes)...)
}
return out
}
// splitAtHeadings cuts content into sections that each start with an
// "# " or "## " line (intro before any heading is the leading section).
func splitAtHeadings(content string) []string {
lines := strings.Split(content, "\n")
var sections []string
var cur strings.Builder
flush := func() {
if cur.Len() == 0 {
return
}
// Trim trailing newlines then re-add a single one so a
// single-paragraph file round-trips to its original content rather
// than accumulating extra newlines from the empty-line split.
s := strings.TrimRight(cur.String(), "\n")
sections = append(sections, s+"\n")
cur.Reset()
}
for _, ln := range lines {
trimmed := strings.TrimLeft(ln, " ")
isH := strings.HasPrefix(trimmed, "# ") || strings.HasPrefix(trimmed, "## ")
if isH && cur.Len() > 0 {
flush()
}
cur.WriteString(ln)
cur.WriteByte('\n')
}
flush()
// Drop empty / whitespace-only trailing section (common when content
// itself ends with a "\n" — Split leaves a final empty element).
if n := len(sections); n > 0 && strings.TrimSpace(sections[n-1]) == "" {
sections = sections[:n-1]
}
return sections
}
// splitAtParagraphs packs paragraphs (blank-line separated blocks) into
// sub-chunks of at most maxBytes. A single paragraph that itself exceeds
// maxBytes is emitted as one over-budget chunk rather than being split
// mid-sentence — better to over-spend a little than truncate prose.
func splitAtParagraphs(section string, maxBytes int) []string {
paras := strings.Split(section, "\n\n")
var out []string
var cur strings.Builder
for _, p := range paras {
if p == "" {
continue
}
// +2 for the "\n\n" rejoin if cur isn't empty
need := len(p)
if cur.Len() > 0 {
need += 2
}
if cur.Len() > 0 && cur.Len()+need > maxBytes {
out = append(out, cur.String())
cur.Reset()
}
if cur.Len() > 0 {
cur.WriteString("\n\n")
}
cur.WriteString(p)
}
if cur.Len() > 0 {
out = append(out, cur.String())
}
return out
}
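A worked example of the greedy packing may help when choosing `maxBytes`. The sketch below repeats the same packing loop under the hypothetical name `pack` and runs it on four short paragraphs with a budget of 10 bytes; the inputs are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// pack greedily joins paragraphs with "\n\n" while the running chunk
// stays within maxBytes. A single paragraph larger than maxBytes is
// emitted on its own rather than being cut.
func pack(paras []string, maxBytes int) []string {
	var out []string
	var cur strings.Builder
	for _, p := range paras {
		if p == "" {
			continue
		}
		need := len(p)
		if cur.Len() > 0 {
			need += 2 // separator cost when appending to a non-empty chunk
		}
		if cur.Len() > 0 && cur.Len()+need > maxBytes {
			out = append(out, cur.String())
			cur.Reset()
		}
		if cur.Len() > 0 {
			cur.WriteString("\n\n")
		}
		cur.WriteString(p)
	}
	if cur.Len() > 0 {
		out = append(out, cur.String())
	}
	return out
}

func main() {
	paras := []string{"aaaa", "bbbb", "cccccccccccc", "dd"}
	for i, c := range pack(paras, 10) {
		fmt.Printf("chunk %d (%d bytes): %q\n", i, len(c), c)
	}
}
```

With these inputs the first two paragraphs fit exactly (4 + 2 + 4 = 10 bytes), the 12-byte paragraph becomes an over-budget chunk of its own, and the last paragraph starts a new chunk. Callers that need a hard upper bound would have to add their own split for oversized paragraphs.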

@@ -0,0 +1,72 @@
package vectorstore_test
import (
"strings"
"testing"
"github.com/mathiasbq/hyperguild/ingestion/internal/vectorstore"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestChunkMarkdown_ShortFileFitsInOne(t *testing.T) {
out := vectorstore.ChunkMarkdown("Just a short paragraph.\n", 4000)
require.Len(t, out, 1)
assert.Equal(t, "Just a short paragraph.\n", out[0])
}
func TestChunkMarkdown_SplitsAtHeadings(t *testing.T) {
src := "# Top\n\nintro\n\n## A\n\nbody a\n\n## B\n\nbody b\n"
out := vectorstore.ChunkMarkdown(src, 50) // tiny limit forces per-section split
assert.GreaterOrEqual(t, len(out), 2, "should split at H2 boundaries")
// Each chunk should start with a heading (top-level intro chunk OK without one)
for i, c := range out {
if i == 0 {
continue
}
assert.True(t, strings.HasPrefix(strings.TrimSpace(c), "#"),
"non-first chunk %d should start with heading: %q", i, c)
}
}
func TestChunkMarkdown_FurtherSplitsOversizedSection(t *testing.T) {
// One H2 section with 4 paragraphs of ~35 chars each, limit 100.
src := "## big\n\n" +
strings.Repeat("paragraph one is moderately long.\n\n", 1) +
strings.Repeat("paragraph two also moderately long.\n\n", 1) +
strings.Repeat("paragraph three is moderately long.\n\n", 1) +
strings.Repeat("paragraph four is moderately long.\n\n", 1)
out := vectorstore.ChunkMarkdown(src, 100)
assert.Greater(t, len(out), 1, "oversized section should sub-split at paragraph boundaries")
for i, c := range out {
assert.LessOrEqual(t, len(c), 200,
"chunk %d exceeds 2x maxBytes: %d", i, len(c))
}
}
func TestChunkMarkdown_PreservesContent(t *testing.T) {
src := "# H1\n\nfirst section body.\n\n## H2a\n\nsecond section body.\n\n## H2b\n\nthird section body.\n"
out := vectorstore.ChunkMarkdown(src, 50)
joined := strings.Join(out, "")
// Spot-check: one representative token from each heading and body must
// appear in the joined output.
for _, token := range []string{"H1", "first", "H2a", "second", "H2b", "third"} {
assert.Contains(t, joined, token, "token %q missing after chunking", token)
}
}
func TestChunkMarkdown_NumberedSuffix(t *testing.T) {
out := vectorstore.NumberChunks("knowledge/foo.md", []string{"a", "b", "c"})
require.Len(t, out, 3)
assert.Equal(t, "knowledge/foo.md#0001", out[0].Path)
assert.Equal(t, "knowledge/foo.md#0002", out[1].Path)
assert.Equal(t, "knowledge/foo.md#0003", out[2].Path)
assert.Equal(t, "a", out[0].Content)
}
func TestParentPath_StripsChunkSuffix(t *testing.T) {
assert.Equal(t, "knowledge/foo.md", vectorstore.ParentPath("knowledge/foo.md#0001"))
assert.Equal(t, "knowledge/foo.md", vectorstore.ParentPath("knowledge/foo.md"))
assert.Equal(t, "wiki/a/b.md", vectorstore.ParentPath("wiki/a/b.md#9999"))
}
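
The two tests above pin down the chunk-path convention: a 1-based, zero-padded `#NNNN` suffix appended to the parent path, and a `ParentPath` that strips it again. A minimal sketch of helpers that would satisfy those assertions is shown here. The lowercase names and the `Chunk` struct are assumptions inferred from the tests, not the package's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk pairs a chunk path with its body (shape inferred from the tests).
type Chunk struct {
	Path    string
	Content string
}

// numberChunks assigns each body a 1-based, 4-digit "#NNNN" suffix on parent.
func numberChunks(parent string, bodies []string) []Chunk {
	out := make([]Chunk, 0, len(bodies))
	for i, b := range bodies {
		out = append(out, Chunk{Path: fmt.Sprintf("%s#%04d", parent, i+1), Content: b})
	}
	return out
}

// parentPath strips everything from the last '#' onward. Assumption: parent
// paths themselves never contain '#'.
func parentPath(p string) string {
	if i := strings.LastIndexByte(p, '#'); i >= 0 {
		return p[:i]
	}
	return p
}

func main() {
	for _, c := range numberChunks("knowledge/foo.md", []string{"a", "b", "c"}) {
		fmt.Println(c.Path, "->", parentPath(c.Path))
	}
}
```

Because the suffix is cut at the last `#`, a path that has no suffix but does contain `#` in its file name would be truncated; a stricter variant could require four trailing digits.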

@@ -0,0 +1,161 @@
// Package vectorstore stores brain note embeddings in pgvector on the
// shared postgres18 instance. One row per chunk path (a markdown path
// plus a #NNNN chunk suffix), cosine-distance indexed via HNSW for fast
// approximate top-k retrieval.
package vectorstore
import (
"context"
"errors"
"fmt"
"strings"
"time"
"github.com/jackc/pgx/v5"
"github.com/jackc/pgx/v5/pgxpool"
)
// Hit is a single result from a cosine-distance search.
type Hit struct {
Path string
Distance float64 // 0 = identical, 2 = opposite
}
// PGStore is a pgvector-backed embeddings store. Construct with New and
// call Init once to create the table + HNSW index. Use Close to release
// the underlying pool.
type PGStore struct {
pool *pgxpool.Pool
}
// New opens a connection pool against dsn (a libpq-style URL). Caller
// owns the resulting *PGStore and must invoke Close.
func New(ctx context.Context, dsn string) (*PGStore, error) {
pool, err := pgxpool.New(ctx, dsn)
if err != nil {
return nil, fmt.Errorf("pgxpool: %w", err)
}
if err := pool.Ping(ctx); err != nil {
pool.Close()
return nil, fmt.Errorf("ping: %w", err)
}
return &PGStore{pool: pool}, nil
}
// Close releases the underlying connection pool.
func (s *PGStore) Close() {
if s.pool != nil {
s.pool.Close()
}
}
// Init creates the brain_embeddings table and its HNSW index if they
// don't already exist. Safe to call on every startup. Assumes the
// `vector` extension is already installed (one-time DBA setup; see
// scripts/brain-embeddings-init.sql).
func (s *PGStore) Init(ctx context.Context) error {
const ddl = `
CREATE TABLE IF NOT EXISTS brain_embeddings (
path TEXT PRIMARY KEY,
embedding vector(768) NOT NULL,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS brain_embeddings_embedding_idx
ON brain_embeddings USING hnsw (embedding vector_cosine_ops);
`
_, err := s.pool.Exec(ctx, ddl)
return err
}
// Upsert inserts or replaces the embedding for path. Embedding must be
// 768-dim (nomic-embed-text). Caller is responsible for normalising
// paths to forward-slash form.
func (s *PGStore) Upsert(ctx context.Context, path string, embedding []float32) error {
if len(embedding) != 768 {
return fmt.Errorf("expected 768-dim embedding, got %d", len(embedding))
}
_, err := s.pool.Exec(ctx, `
INSERT INTO brain_embeddings (path, embedding, updated_at)
VALUES ($1, $2, now())
ON CONFLICT (path) DO UPDATE
SET embedding = EXCLUDED.embedding, updated_at = now()
`, path, vectorLiteral(embedding))
return err
}
// Delete removes the row at path. No-op when the row doesn't exist.
func (s *PGStore) Delete(ctx context.Context, path string) error {
_, err := s.pool.Exec(ctx, `DELETE FROM brain_embeddings WHERE path = $1`, path)
return err
}
// Search returns the top-limit nearest paths by cosine distance.
func (s *PGStore) Search(ctx context.Context, query []float32, limit int) ([]Hit, error) {
if len(query) != 768 {
return nil, fmt.Errorf("expected 768-dim query, got %d", len(query))
}
if limit <= 0 {
limit = 10
}
rows, err := s.pool.Query(ctx, `
SELECT path, embedding <=> $1 AS distance
FROM brain_embeddings
ORDER BY embedding <=> $1
LIMIT $2
`, vectorLiteral(query), limit)
if err != nil {
return nil, fmt.Errorf("query: %w", err)
}
defer rows.Close()
var hits []Hit
for rows.Next() {
var h Hit
if err := rows.Scan(&h.Path, &h.Distance); err != nil {
return nil, fmt.Errorf("scan: %w", err)
}
hits = append(hits, h)
}
if err := rows.Err(); err != nil && !errors.Is(err, pgx.ErrNoRows) {
return nil, err
}
return hits, nil
}
// KnownPathsWithTime returns every embedded chunk path paired with the
// row's updated_at. Sync uses the timestamps to decide whether a file
// has been edited since its chunks were last embedded — when the file's
// mtime exceeds the oldest chunk's updated_at, the file is re-embedded.
func (s *PGStore) KnownPathsWithTime(ctx context.Context) (map[string]time.Time, error) {
rows, err := s.pool.Query(ctx, `SELECT path, updated_at FROM brain_embeddings`)
if err != nil {
return nil, fmt.Errorf("query paths: %w", err)
}
defer rows.Close()
out := make(map[string]time.Time)
for rows.Next() {
var (
p string
t time.Time
)
if err := rows.Scan(&p, &t); err != nil {
return nil, err
}
out[p] = t
}
return out, rows.Err()
}
// vectorLiteral renders a Go float32 slice as the literal representation
// pgvector accepts as a parametric input: `[v1,v2,...,vN]`.
func vectorLiteral(v []float32) string {
var b strings.Builder
b.WriteByte('[')
for i, x := range v {
if i > 0 {
b.WriteByte(',')
}
fmt.Fprintf(&b, "%g", x)
}
b.WriteByte(']')
return b.String()
}
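
The `Distance` field above is documented as ranging from 0 (identical) to 2 (opposite), which follows from cosine distance being one minus cosine similarity. A small pure-Go sketch of that arithmetic is given below for illustration only. `cosineDistance` is a hypothetical helper; in the store itself the value is computed by pgvector's `<=>` operator.

```go
package main

import (
	"fmt"
	"math"
)

// cosineDistance returns 1 - cos(a, b). It assumes a and b have the same
// length and that neither is the zero vector.
func cosineDistance(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	return 1 - dot/(math.Sqrt(na)*math.Sqrt(nb))
}

func main() {
	same := cosineDistance([]float32{1, 0}, []float32{1, 0})
	orthogonal := cosineDistance([]float32{1, 0}, []float32{0, 1})
	opposite := cosineDistance([]float32{1, 0}, []float32{-1, 0})
	fmt.Println(same, orthogonal, opposite)
}
```

Only direction matters, not magnitude, which is why the integration test's all-ones query matches the all-ones row at distance 0 and would sit at distance 2 from the all-minus-ones row.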

@@ -0,0 +1,94 @@
package vectorstore_test
import (
"context"
"os"
"testing"
"time"
"github.com/mathiasbq/hyperguild/ingestion/internal/vectorstore"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// integration tests run against a real postgres18 + pgvector. Gated by
// BRAIN_PG_TEST_DSN so `task check` stays hermetic on hosts without a
// reachable database.
//
// To run:
// BRAIN_PG_TEST_DSN='postgres://brain_app:pwd@127.0.0.1:5432/brain' \
// go test ./internal/vectorstore/... -run Integration
func dsn(t *testing.T) string {
t.Helper()
v := os.Getenv("BRAIN_PG_TEST_DSN")
if v == "" {
t.Skip("BRAIN_PG_TEST_DSN not set; skipping pgvector integration tests")
}
return v
}
func freshStore(t *testing.T) (*vectorstore.PGStore, context.Context) {
t.Helper()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
t.Cleanup(cancel)
s, err := vectorstore.New(ctx, dsn(t))
require.NoError(t, err)
t.Cleanup(s.Close)
require.NoError(t, s.Init(ctx))
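// Clean slate per test. Delete matches on the exact path (no LIKE
// wildcards), so remove each fixture path explicitly.
for _, p := range []string{"wiki/a.md", "wiki/b.md", "wiki/k.md"} {
require.NoError(t, s.Delete(ctx, p))
}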
return s, ctx
}
func vec(dim int, fill float32) []float32 {
v := make([]float32, dim)
for i := range v {
v[i] = fill
}
return v
}
func TestIntegration_UpsertAndSearch(t *testing.T) {
s, ctx := freshStore(t)
// Register cleanup before asserting so the rows are removed even if a
// require below fails.
t.Cleanup(func() {
_ = s.Delete(ctx, "wiki/a.md")
_ = s.Delete(ctx, "wiki/b.md")
})
require.NoError(t, s.Upsert(ctx, "wiki/a.md", vec(768, 1.0)))
require.NoError(t, s.Upsert(ctx, "wiki/b.md", vec(768, -1.0)))
hits, err := s.Search(ctx, vec(768, 1.0), 2)
require.NoError(t, err)
require.GreaterOrEqual(t, len(hits), 1)
assert.Equal(t, "wiki/a.md", hits[0].Path)
assert.InDelta(t, 0.0, hits[0].Distance, 1e-5)
}
func TestIntegration_KnownPathsWithTime(t *testing.T) {
s, ctx := freshStore(t)
before := time.Now()
require.NoError(t, s.Upsert(ctx, "wiki/k.md", vec(768, 0.5)))
t.Cleanup(func() { _ = s.Delete(ctx, "wiki/k.md") })
paths, err := s.KnownPathsWithTime(ctx)
require.NoError(t, err)
at, ok := paths["wiki/k.md"]
require.True(t, ok)
assert.False(t, at.IsZero(), "updated_at must not be zero")
assert.WithinDuration(t, before, at, 5*time.Second, "updated_at must be recent")
}
func TestUpsert_RejectsWrongDimension(t *testing.T) {
s := &vectorstore.PGStore{}
err := s.Upsert(context.Background(), "x", vec(100, 0))
require.Error(t, err)
}
func TestSearch_RejectsWrongDimension(t *testing.T) {
s := &vectorstore.PGStore{}
_, err := s.Search(context.Background(), vec(100, 0), 5)
require.Error(t, err)
}

@@ -0,0 +1,205 @@
package vectorstore
import (
"context"
"fmt"
"log/slog"
"os"
"path/filepath"
"strings"
"time"
)
// Embedder produces dense vectors. The embed package's Client satisfies
// this; it's declared locally so vectorstore doesn't depend on embed.
type Embedder interface {
Embed(ctx context.Context, text string) ([]float32, error)
}
// Store is the subset of PGStore that Sync needs. Lets tests stub it.
type Store interface {
// KnownPathsWithTime returns every embedded chunk path paired with the
// row's updated_at. Sync uses the timestamp to detect edits — a file
// whose mtime is newer than ANY of its chunks' updated_at is re-embedded
// from scratch (old chunks deleted, fresh chunks upserted).
KnownPathsWithTime(ctx context.Context) (map[string]time.Time, error)
Upsert(ctx context.Context, path string, embedding []float32) error
Delete(ctx context.Context, path string) error
}
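// SyncResult tallies what Sync did. Returned for logs / metrics; callers
// generally don't act on the fields directly.
type SyncResult struct {
Added int // chunk rows upserted, including chunks of re-embedded files
Updated int // reserved: Sync does not currently increment this
Deleted int // chunk rows removed because their parent file is gone
Errors []error // per-file and per-chunk failures; they do not abort the run
}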
// scanDirs is the set of brainDir subdirectories whose .md files are
// embedded for vector retrieval. wiki/ holds LLM-extracted entity and
// source pages; knowledge/ holds curated hand-written entries.
var scanDirs = []string{"wiki", "knowledge"}
// maxChunkBytes is the per-chunk byte budget passed to ChunkMarkdown.
// Sized to fit comfortably under nomic-embed-text's 2048-token default
// context (~4 chars/token for English markdown → ~8 KB ceiling; we sit
// at 4 KB to leave headroom for unicode, code blocks, and tokenizer
// variance).
const maxChunkBytes = 4000
// Sync brings the embedding store in line with brain/{wiki,knowledge}/
// on disk:
// - new files (in the tree, not in the store) get embedded + upserted
// - files whose mtime exceeds the store's updated_at get re-embedded
// - files no longer on disk get deleted from the store
//
// Designed to be called on a ticker. Best-effort: per-file errors are
// collected into SyncResult.Errors and do not abort the run.
func Sync(ctx context.Context, brainDir string, store Store, embedder Embedder) (SyncResult, error) {
var res SyncResult
if store == nil || embedder == nil {
return res, nil
}
known, err := store.KnownPathsWithTime(ctx)
if err != nil {
return res, fmt.Errorf("known paths: %w", err)
}
// Group known chunks by parent path and remember the EARLIEST
// updated_at per parent. A file is considered stale if its mtime is
// after the oldest of its chunk rows — i.e. at least one chunk hasn't
// been refreshed since the last edit. Also keep the full chunk-path
// list per parent so we can delete every old chunk before re-embedding
// (handles "file shrunk → fewer chunks → orphan rows" cleanly).
type parentState struct {
minUpdatedAt time.Time
chunkPaths []string
}
parents := make(map[string]*parentState, len(known))
for p, t := range known {
parent := ParentPath(p)
ps, ok := parents[parent]
if !ok {
ps = &parentState{minUpdatedAt: t}
parents[parent] = ps
} else if t.Before(ps.minUpdatedAt) {
ps.minUpdatedAt = t
}
ps.chunkPaths = append(ps.chunkPaths, p)
}
seenParents := make(map[string]struct{})
for _, sub := range scanDirs {
root := filepath.Join(brainDir, sub)
if _, err := os.Stat(root); os.IsNotExist(err) {
continue
}
err = filepath.WalkDir(root, func(path string, d os.DirEntry, err error) error {
if err != nil {
return err
}
if d.IsDir() || !strings.HasSuffix(path, ".md") || d.Name() == "_index.md" {
return nil
}
rel, err := filepath.Rel(brainDir, path)
if err != nil {
return err
}
relSlash := filepath.ToSlash(rel)
seenParents[relSlash] = struct{}{}
if ps, ok := parents[relSlash]; ok {
// File already has chunks in the store. Re-embed only when
// the file has been edited since the oldest chunk was
// written. The comparison is strict: no clock-skew grace
// is applied.
info, statErr := d.Info()
if statErr != nil {
res.Errors = append(res.Errors, fmt.Errorf("stat %s: %w", relSlash, statErr))
return nil
}
if !info.ModTime().After(ps.minUpdatedAt) {
return nil
}
// Stale: delete old chunks before re-embedding so a shrunk
// file doesn't leave orphan rows at higher #NNNN indexes.
for _, oldPath := range ps.chunkPaths {
if delErr := store.Delete(ctx, oldPath); delErr != nil {
res.Errors = append(res.Errors, fmt.Errorf("delete %s for re-embed: %w", oldPath, delErr))
return nil
}
}
}
content, readErr := os.ReadFile(path)
if readErr != nil {
res.Errors = append(res.Errors, fmt.Errorf("read %s: %w", relSlash, readErr))
return nil
}
chunks := NumberChunks(relSlash, ChunkMarkdown(string(content), maxChunkBytes))
for _, ch := range chunks {
vec, embErr := embedder.Embed(ctx, ch.Content)
if embErr != nil {
res.Errors = append(res.Errors, fmt.Errorf("embed %s: %w", ch.Path, embErr))
continue
}
if upErr := store.Upsert(ctx, ch.Path, vec); upErr != nil {
res.Errors = append(res.Errors, fmt.Errorf("upsert %s: %w", ch.Path, upErr))
continue
}
res.Added++
}
return nil
})
if err != nil {
return res, fmt.Errorf("walk %s: %w", sub, err)
}
}
// Drop chunk rows whose parent file is gone.
for path := range known {
if _, ok := seenParents[ParentPath(path)]; ok {
continue
}
if err := store.Delete(ctx, path); err != nil {
res.Errors = append(res.Errors, fmt.Errorf("delete %s: %w", path, err))
continue
}
res.Deleted++
}
return res, nil
}
// StartSync launches Sync on a ticker in a background goroutine. The
// goroutine exits when ctx is cancelled. Failures are logged via slog.
func StartSync(ctx context.Context, brainDir string, store Store, embedder Embedder, interval time.Duration) {
if interval <= 0 {
interval = 5 * time.Minute
}
// runOnce performs a single Sync pass and logs the outcome. It is shared
// by the immediate first run and every tick so both log per-item errors.
runOnce := func() {
r, err := Sync(ctx, brainDir, store, embedder)
if err != nil {
slog.Error("embed sync failed", "err", err)
return
}
if r.Added+r.Deleted > 0 || len(r.Errors) > 0 {
slog.Info("embed sync", "added", r.Added, "deleted", r.Deleted, "errors", len(r.Errors))
for _, e := range r.Errors {
slog.Warn("embed sync item failed", "err", e)
}
}
}
go func() {
t := time.NewTicker(interval)
defer t.Stop()
// Run once immediately so first-boot doesn't wait a full tick.
runOnce()
for {
select {
case <-ctx.Done():
return
case <-t.C:
runOnce()
}
}
}()
}
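
The staleness rule used by Sync above (compare the file's mtime against the oldest `updated_at` among its chunk rows) can be isolated into a small sketch. The helper names `parentOf`, `oldestPerParent`, `isStale`, and `demo` are hypothetical and exist only for this illustration; the timestamps are made up.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parentOf strips a trailing "#NNNN" chunk suffix (assumed path shape).
func parentOf(chunkPath string) string {
	if i := strings.LastIndexByte(chunkPath, '#'); i >= 0 {
		return chunkPath[:i]
	}
	return chunkPath
}

// oldestPerParent maps each parent file to the earliest updated_at among
// its chunk rows.
func oldestPerParent(known map[string]time.Time) map[string]time.Time {
	out := make(map[string]time.Time, len(known))
	for p, t := range known {
		parent := parentOf(p)
		if cur, ok := out[parent]; !ok || t.Before(cur) {
			out[parent] = t
		}
	}
	return out
}

// isStale reports whether a file modified at mtime needs re-embedding.
// The comparison is strict, matching the After check in Sync.
func isStale(mtime, oldestChunk time.Time) bool {
	return mtime.After(oldestChunk)
}

// demo evaluates one file edited after its oldest chunk was written and one
// file untouched since before its only chunk was written.
func demo() (edited, untouched bool) {
	base := time.Date(2026, 5, 1, 12, 0, 0, 0, time.UTC)
	known := map[string]time.Time{
		"wiki/a.md#0001": base,
		"wiki/a.md#0002": base.Add(10 * time.Minute),
		"wiki/b.md#0001": base,
	}
	oldest := oldestPerParent(known)
	edited = isStale(base.Add(5*time.Minute), oldest["wiki/a.md"])
	untouched = isStale(base.Add(-time.Hour), oldest["wiki/b.md"])
	return edited, untouched
}

func main() {
	edited, untouched := demo()
	fmt.Println(edited, untouched)
}
```

Using the minimum rather than the maximum means a file is re-embedded as long as any one of its chunks predates the edit. One consequence worth noting: if a single chunk fails to embed during a re-embed, the surviving chunks carry fresh timestamps, so the file would not look stale again until its next edit.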

@@ -0,0 +1,274 @@
package vectorstore_test
import (
"context"
"errors"
"os"
"path/filepath"
"strings"
"testing"
"time"
"github.com/mathiasbq/hyperguild/ingestion/internal/vectorstore"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
type stubStore struct {
// known maps chunk-path → updated_at. Tests that don't care about
// re-embed-on-mtime leave the value as the zero time.Time, which
// KnownPathsWithTime maps to farFuture so the Sync skip path always
// wins. Tests that do exercise the mtime path set updated_at explicitly.
known map[string]time.Time
upserts map[string][]float32
deletes []string
failNext error
}
// farFuture is "newer than any file mtime", used as the default
// updated_at in stubs that don't care about re-embed behavior.
var farFuture = time.Now().Add(24 * time.Hour)
func (s *stubStore) KnownPathsWithTime(_ context.Context) (map[string]time.Time, error) {
out := make(map[string]time.Time, len(s.known))
for k, t := range s.known {
if t.IsZero() {
t = farFuture
}
out[k] = t
}
return out, nil
}
func (s *stubStore) Upsert(_ context.Context, path string, v []float32) error {
if s.failNext != nil {
err := s.failNext
s.failNext = nil
return err
}
if s.upserts == nil {
s.upserts = make(map[string][]float32)
}
s.upserts[path] = v
return nil
}
func (s *stubStore) Delete(_ context.Context, path string) error {
s.deletes = append(s.deletes, path)
return nil
}
type stubEmbedder struct {
vec []float32
err error
}
func (e stubEmbedder) Embed(_ context.Context, _ string) ([]float32, error) {
return e.vec, e.err
}
func writeNote(t *testing.T, dir, rel, body string) {
t.Helper()
full := filepath.Join(dir, rel)
require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
require.NoError(t, os.WriteFile(full, []byte(body), 0o644))
}
func TestSync_AddsNewFiles(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/jepa-fx/facts/x.md", "body of x")
writeNote(t, dir, "wiki/jepa-fx/facts/y.md", "body of y")
store := &stubStore{known: map[string]time.Time{}}
emb := stubEmbedder{vec: make([]float32, 768)}
res, err := vectorstore.Sync(context.Background(), dir, store, emb)
require.NoError(t, err)
assert.Equal(t, 2, res.Added)
assert.Equal(t, 0, res.Deleted)
assert.Contains(t, store.upserts, "wiki/jepa-fx/facts/x.md#0001")
assert.Contains(t, store.upserts, "wiki/jepa-fx/facts/y.md#0001")
}
func TestSync_SkipsAlreadyKnown(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/a/facts/x.md", "x")
store := &stubStore{known: map[string]time.Time{"wiki/a/facts/x.md#0001": {}}}
emb := stubEmbedder{vec: make([]float32, 768)}
res, err := vectorstore.Sync(context.Background(), dir, store, emb)
require.NoError(t, err)
assert.Equal(t, 0, res.Added)
assert.Empty(t, store.upserts)
}
func TestSync_DeletesDisappearedFiles(t *testing.T) {
dir := t.TempDir()
require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki"), 0o755))
// store has a path that doesn't exist on disk anymore
store := &stubStore{known: map[string]time.Time{"wiki/old/facts/ghost.md#0001": {}}}
res, err := vectorstore.Sync(context.Background(), dir, &stubStoreWithDelete{stubStore: store}, stubEmbedder{vec: make([]float32, 768)})
require.NoError(t, err)
assert.Equal(t, 1, res.Deleted)
}
// stubStoreWithDelete embeds *stubStore and overrides nothing, so every
// Store method (including Delete) is promoted from the embedded stub.
// Passing the bare *stubStore to Sync would behave identically.
type stubStoreWithDelete struct {
*stubStore
}
func TestSync_SkipsIndexFiles(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/a/_index.md", "moc")
writeNote(t, dir, "wiki/a/facts/real.md", "body")
store := &stubStore{known: map[string]time.Time{}}
res, err := vectorstore.Sync(context.Background(), dir, store, stubEmbedder{vec: make([]float32, 768)})
require.NoError(t, err)
assert.Equal(t, 1, res.Added)
assert.NotContains(t, store.upserts, "wiki/a/_index.md#0001")
}
func TestSync_ScansKnowledgeDir(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/a/facts/x.md", "x")
writeNote(t, dir, "knowledge/2026-05-19-koala-gpu-setup.md", "knowledge body")
store := &stubStore{known: map[string]time.Time{}}
emb := stubEmbedder{vec: make([]float32, 768)}
res, err := vectorstore.Sync(context.Background(), dir, store, emb)
require.NoError(t, err)
assert.Equal(t, 2, res.Added)
assert.Contains(t, store.upserts, "wiki/a/facts/x.md#0001")
assert.Contains(t, store.upserts, "knowledge/2026-05-19-koala-gpu-setup.md#0001")
}
func TestSync_ChunksLongFiles(t *testing.T) {
dir := t.TempDir()
// Build a file that's well over the chunk byte budget. Multi-section
// markdown so the chunker has heading boundaries to cut on.
body := "# Doc\n\nintro line.\n\n"
for i := 0; i < 10; i++ {
body += "## Section " + string(rune('A'+i)) + "\n\n"
body += strings.Repeat("This section has a fair amount of content. ", 50) + "\n\n"
}
writeNote(t, dir, "knowledge/long.md", body)
store := &stubStore{known: map[string]time.Time{}}
emb := stubEmbedder{vec: make([]float32, 768)}
res, err := vectorstore.Sync(context.Background(), dir, store, emb)
require.NoError(t, err)
assert.Greater(t, res.Added, 1, "long file should produce multiple chunk rows")
// Every upserted path for this file must be a chunk path.
chunkCount := 0
for p := range store.upserts {
if strings.HasPrefix(p, "knowledge/long.md#") {
chunkCount++
}
}
assert.Equal(t, res.Added, chunkCount, "all rows for long file should be chunk-suffixed")
// The bare parent path must NOT be upserted directly.
assert.NotContains(t, store.upserts, "knowledge/long.md")
}
func TestSync_ShortFileGetsSingleChunkRow(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/short.md", "tiny body\n")
store := &stubStore{known: map[string]time.Time{}}
emb := stubEmbedder{vec: make([]float32, 768)}
res, err := vectorstore.Sync(context.Background(), dir, store, emb)
require.NoError(t, err)
assert.Equal(t, 1, res.Added)
assert.Contains(t, store.upserts, "wiki/short.md#0001")
}
func TestSync_SkipsFileIfAnyChunkAlreadyKnown(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/foo.md", "body\n")
store := &stubStore{known: map[string]time.Time{
"wiki/foo.md#0001": {},
}}
emb := stubEmbedder{vec: make([]float32, 768)}
res, err := vectorstore.Sync(context.Background(), dir, store, emb)
require.NoError(t, err)
assert.Equal(t, 0, res.Added)
assert.Empty(t, store.upserts)
}
func TestSync_DeletesAllChunksOfDisappearedFile(t *testing.T) {
dir := t.TempDir()
require.NoError(t, os.MkdirAll(filepath.Join(dir, "wiki"), 0o755))
store := &stubStore{known: map[string]time.Time{
"wiki/ghost.md#0001": {},
"wiki/ghost.md#0002": {},
"wiki/ghost.md#0003": {},
}}
res, err := vectorstore.Sync(context.Background(), dir, store, stubEmbedder{vec: make([]float32, 768)})
require.NoError(t, err)
assert.Equal(t, 3, res.Deleted)
}
func TestSync_ReembedsFileWhenMtimeNewer(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/edited.md", "original body\n")
// Force the file's mtime ahead of any plausible store updated_at.
future := time.Now().Add(1 * time.Hour)
require.NoError(t, os.Chtimes(filepath.Join(dir, "wiki/edited.md"), future, future))
store := &stubStore{
known: map[string]time.Time{
// Existing chunk row pre-dates the file's mtime.
"wiki/edited.md#0001": time.Now().Add(-1 * time.Hour),
},
}
emb := stubEmbedder{vec: make([]float32, 768)}
res, err := vectorstore.Sync(context.Background(), dir, store, emb)
require.NoError(t, err)
assert.Equal(t, 1, res.Added, "file with newer mtime should be re-embedded")
assert.Contains(t, store.upserts, "wiki/edited.md#0001")
// Old chunks of the same parent must be deleted before re-embed so
// shrunk files don't leave orphan rows at higher #NNNN indexes.
assert.Contains(t, store.deletes, "wiki/edited.md#0001")
}
func TestSync_SkipsFileWhenMtimeOlder(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/stable.md", "body\n")
// Backdate mtime to before the store's recorded updated_at.
past := time.Now().Add(-2 * time.Hour)
require.NoError(t, os.Chtimes(filepath.Join(dir, "wiki/stable.md"), past, past))
store := &stubStore{
known: map[string]time.Time{
"wiki/stable.md#0001": time.Now(),
},
}
emb := stubEmbedder{vec: make([]float32, 768)}
res, err := vectorstore.Sync(context.Background(), dir, store, emb)
require.NoError(t, err)
assert.Equal(t, 0, res.Added)
assert.Empty(t, store.upserts)
assert.Empty(t, store.deletes)
}
func TestSync_NoOpWhenComponentsNil(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/a/facts/x.md", "x")
res, err := vectorstore.Sync(context.Background(), dir, nil, nil)
require.NoError(t, err)
assert.Equal(t, 0, res.Added)
}
func TestSync_CollectsEmbedderErrors(t *testing.T) {
dir := t.TempDir()
writeNote(t, dir, "wiki/a/facts/x.md", "x")
store := &stubStore{known: map[string]time.Time{}}
emb := stubEmbedder{err: errors.New("upstream down")}
res, err := vectorstore.Sync(context.Background(), dir, store, emb)
require.NoError(t, err)
assert.Equal(t, 0, res.Added)
assert.Len(t, res.Errors, 1)
}

84
internal/auth/jwt.go Normal file

@@ -0,0 +1,84 @@
package auth
import (
"context"
"encoding/json"
"fmt"
"net/http"
"time"
"github.com/lestrrat-go/jwx/v2/jwk"
"github.com/lestrrat-go/jwx/v2/jwt"
)
// Validator validates Bearer JWTs issued by a Dex (OIDC) authorization server.
// Audience is optional; leave empty to skip audience validation.
type Validator struct {
issuer string
audience string
jwksURI string
cache *jwk.Cache
}
// NewValidator fetches the OIDC discovery document from issuerURL, extracts
// jwks_uri, seeds the JWKS cache, and returns a ready Validator.
// If DEX_ISSUER_URL is not set, the caller should skip construction entirely
// rather than pass an empty issuerURL.
func NewValidator(issuerURL, audience string) (*Validator, error) {
resp, err := http.Get(issuerURL + "/.well-known/openid-configuration") //nolint:noctx
if err != nil {
return nil, fmt.Errorf("fetch oidc discovery: %w", err)
}
defer resp.Body.Close() //nolint:errcheck
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("oidc discovery: status %d", resp.StatusCode)
}
var doc struct {
JWKSURI string `json:"jwks_uri"`
}
if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
return nil, fmt.Errorf("decode oidc discovery: %w", err)
}
if doc.JWKSURI == "" {
return nil, fmt.Errorf("oidc discovery: empty jwks_uri")
}
ctx := context.Background()
cache := jwk.NewCache(ctx)
if err := cache.Register(doc.JWKSURI, jwk.WithMinRefreshInterval(time.Hour)); err != nil {
return nil, fmt.Errorf("register jwks cache: %w", err)
}
if _, err := cache.Refresh(ctx, doc.JWKSURI); err != nil {
return nil, fmt.Errorf("initial jwks fetch: %w", err)
}
return &Validator{
issuer: issuerURL,
audience: audience,
jwksURI: doc.JWKSURI,
cache: cache,
}, nil
}
// Validate parses and validates rawToken. Returns the subject claim on success.
func (v *Validator) Validate(ctx context.Context, rawToken string) (string, error) {
keySet, err := v.cache.Get(ctx, v.jwksURI)
if err != nil {
return "", fmt.Errorf("get jwks: %w", err)
}
opts := []jwt.ParseOption{
jwt.WithKeySet(keySet),
jwt.WithValidate(true),
jwt.WithIssuer(v.issuer),
}
if v.audience != "" {
opts = append(opts, jwt.WithAudience(v.audience))
}
tok, err := jwt.ParseString(rawToken, opts...)
if err != nil {
return "", fmt.Errorf("validate jwt: %w", err)
}
return tok.Subject(), nil
}
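
A validator like the one above is typically consumed from HTTP middleware that extracts the Bearer token and rejects the request on failure. The sketch below shows one way that wiring could look. Everything in it is an assumption made for illustration: the `tokenValidator` interface (shaped to match `Validate`), the `requireBearer` middleware, the `fakeValidator`, and the `demoStatus` helper are not part of the diff.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// tokenValidator is a hypothetical interface with the same method shape as
// Validator.Validate above.
type tokenValidator interface {
	Validate(ctx context.Context, rawToken string) (string, error)
}

// subjectKey is the context key under which the token subject is stored.
type subjectKey struct{}

// requireBearer rejects requests without a valid "Authorization: Bearer ..."
// header and stores the token subject in the request context otherwise.
// Simplification: the scheme is matched case-sensitively.
func requireBearer(v tokenValidator, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		raw, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		if !ok || raw == "" {
			http.Error(w, "missing bearer token", http.StatusUnauthorized)
			return
		}
		sub, err := v.Validate(r.Context(), raw)
		if err != nil {
			http.Error(w, "invalid token", http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), subjectKey{}, sub)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// fakeValidator stands in for a real JWT validator: it accepts only "good".
type fakeValidator struct{}

func (fakeValidator) Validate(_ context.Context, raw string) (string, error) {
	if raw == "good" {
		return "test-user", nil
	}
	return "", errors.New("bad token")
}

// demoStatus sends one request through the middleware and returns the status.
func demoStatus(authHeader string) int {
	h := requireBearer(fakeValidator{}, http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "hello %v", r.Context().Value(subjectKey{}))
	}))
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if authHeader != "" {
		req.Header.Set("Authorization", authHeader)
	}
	rr := httptest.NewRecorder()
	h.ServeHTTP(rr, req)
	return rr.Code
}

func main() {
	fmt.Println(demoStatus("Bearer good"), demoStatus("Bearer bad"), demoStatus(""))
}
```

Depending on an interface rather than the concrete `*Validator` keeps the middleware testable without an OIDC discovery server, at the cost of one extra type.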

169
internal/auth/jwt_test.go Normal file

@@ -0,0 +1,169 @@
package auth_test
import (
"context"
"crypto/rand"
"crypto/rsa"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/lestrrat-go/jwx/v2/jwa"
"github.com/lestrrat-go/jwx/v2/jwk"
"github.com/lestrrat-go/jwx/v2/jwt"
"github.com/mathiasbq/supervisor/internal/auth"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
type testKeys struct {
priv jwk.Key
pub jwk.Key
}
func generateRSAKeys(t *testing.T) testKeys {
t.Helper()
raw, err := rsa.GenerateKey(rand.Reader, 2048)
require.NoError(t, err)
priv, err := jwk.FromRaw(raw)
require.NoError(t, err)
require.NoError(t, priv.Set(jwk.KeyIDKey, "test-kid"))
require.NoError(t, priv.Set(jwk.AlgorithmKey, jwa.RS256))
pub, err := jwk.PublicKeyOf(priv)
require.NoError(t, err)
return testKeys{priv: priv, pub: pub}
}
func mockOIDCServer(t *testing.T, keys testKeys) *httptest.Server {
t.Helper()
set := jwk.NewSet()
require.NoError(t, set.AddKey(keys.pub))
jwksBytes, err := json.Marshal(set)
require.NoError(t, err)
mux := http.NewServeMux()
var srv *httptest.Server
mux.HandleFunc("/.well-known/openid-configuration", func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_ = json.NewEncoder(w).Encode(map[string]string{
"issuer": srv.URL,
"jwks_uri": srv.URL + "/jwks",
})
})
mux.HandleFunc("/jwks", func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write(jwksBytes)
})
srv = httptest.NewServer(mux)
t.Cleanup(srv.Close)
return srv
}
func signToken(t *testing.T, keys testKeys, issuer, audience, subject string, exp time.Time) string {
t.Helper()
b := jwt.NewBuilder().
Issuer(issuer).
Subject(subject).
Expiration(exp)
if audience != "" {
b = b.Audience([]string{audience})
}
tok, err := b.Build()
require.NoError(t, err)
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
require.NoError(t, err)
return string(signed)
}
func TestValidator(t *testing.T) {
keys := generateRSAKeys(t)
srv := mockOIDCServer(t, keys)
ctx := context.Background()
v, err := auth.NewValidator(srv.URL, "brain")
require.NoError(t, err)
tests := []struct {
name string
token string
wantSub string
wantErr bool
}{
{
name: "valid jwt",
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)),
wantSub: "test-user",
},
{
name: "expired jwt",
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(-time.Hour)),
wantErr: true,
},
{
name: "wrong issuer",
token: signToken(t, keys, "https://evil.example.com", "brain", "test-user", time.Now().Add(time.Hour)),
wantErr: true,
},
{
name: "wrong audience",
token: signToken(t, keys, srv.URL, "other-service", "test-user", time.Now().Add(time.Hour)),
wantErr: true,
},
{
name: "tampered token",
token: signToken(t, keys, srv.URL, "brain", "test-user", time.Now().Add(time.Hour)) + "tampered",
wantErr: true,
},
{
name: "not a jwt",
token: "not-a-jwt",
wantErr: true,
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
sub, err := v.Validate(ctx, tc.token)
if tc.wantErr {
assert.Error(t, err)
assert.Empty(t, sub)
} else {
require.NoError(t, err)
assert.Equal(t, tc.wantSub, sub)
}
})
}
}
func TestNewValidator_NoAudience(t *testing.T) {
keys := generateRSAKeys(t)
srv := mockOIDCServer(t, keys)
ctx := context.Background()
v, err := auth.NewValidator(srv.URL, "")
require.NoError(t, err)
// Token without audience passes when audience validation is disabled.
tok, err := jwt.NewBuilder().
Issuer(srv.URL).
Subject("sub").
Expiration(time.Now().Add(time.Hour)).
Build()
require.NoError(t, err)
signed, err := jwt.Sign(tok, jwt.WithKey(jwa.RS256, keys.priv))
require.NoError(t, err)
sub, err := v.Validate(ctx, string(signed))
require.NoError(t, err)
assert.Equal(t, "sub", sub)
}
func TestNewValidator_BadDiscoveryURL(t *testing.T) {
_, err := auth.NewValidator("http://127.0.0.1:1", "brain")
assert.Error(t, err)
}

@@ -0,0 +1,23 @@
package auth
import (
"encoding/json"
"net/http"
)
// ProtectedResourceHandler returns an RFC 9728 oauth-protected-resource metadata
// handler. Mount at GET /.well-known/oauth-protected-resource (no auth required).
func ProtectedResourceHandler(resourceURL, issuerURL string) http.HandlerFunc {
type metadata struct {
Resource string `json:"resource"`
AuthorizationServers []string `json:"authorization_servers"`
}
body, _ := json.Marshal(metadata{
Resource: resourceURL,
AuthorizationServers: []string{issuerURL},
})
return func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write(body)
}
}

@@ -0,0 +1,28 @@
package auth_test
import (
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/mathiasbq/supervisor/internal/auth"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestProtectedResourceHandler(t *testing.T) {
h := auth.ProtectedResourceHandler("https://brain-mcp.d-ma.be", "https://auth.d-ma.be")
req := httptest.NewRequest(http.MethodGet, "/.well-known/oauth-protected-resource", nil)
rr := httptest.NewRecorder()
h(rr, req)
assert.Equal(t, http.StatusOK, rr.Code)
assert.Equal(t, "application/json", rr.Header().Get("Content-Type"))
var body map[string]any
require.NoError(t, json.Unmarshal(rr.Body.Bytes(), &body))
assert.Equal(t, "https://brain-mcp.d-ma.be", body["resource"])
servers, ok := body["authorization_servers"].([]any)
require.True(t, ok, "authorization_servers must be a JSON array")
require.Len(t, servers, 1)
assert.Equal(t, "https://auth.d-ma.be", servers[0])
}

Some files were not shown because too many files have changed in this diff