feat(brain-mcp): OAuth 2.0 client_credentials flow for claude.ai

Adds a minimal RFC 8414 + RFC 6749 client_credentials flow so claude.ai's custom-MCP integration (no static-Bearer field in the UI) can exchange a client_id + client_secret pair for the existing BRAIN_MCP_TOKEN and use it as a Bearer on /mcp. No JWTs, no refresh, no expiry — the rest of the auth middleware is unchanged. New package ingestion/internal/oauth: - MetadataHandler(issuer): serves /.well-known/oauth-authorization-server with grant_types=[client_credentials] and both token_endpoint_auth_methods (post + basic). - TokenHandler(cfg): serves /oauth/token. Validates client_id and client_secret via constant-time compare; returns BRAIN_MCP_TOKEN as access_token. RFC 6749 §5.2 error JSON on bad grant / bad creds. Wiring in cmd/server/main.go: opt-in by setting both OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET. Setting only one is misconfiguration → exit 1. Mounts both endpoints with no auth; MCP_RESOURCE_URL supplies the issuer. Also pivots issue #8's vector backend from Qdrant to pgvector (see DECISIONS.md 2026-05-18) — Qdrant was never deployed and postgres18 with pgvector already runs as the project default; supersedes 2026-04-08 for this use case. Tests cover post-auth, basic-auth, wrong secret, bad grant, GET rejection, malformed Basic header, and Basic without colon. Closes hyperguild#5.
2026-05-18 22:21:54 +02:00
parent ddd07ae7eb
commit 58c57412a9
6 changed files with 357 additions and 0 deletions
--- a/DECISIONS.md
+++ b/DECISIONS.md
@@ -139,3 +139,31 @@ manifest is a fresh namespace under `k3s/staging/<name>/` — isolated, low
 blast-radius, and Flux will simply recreate it if the file is bad. Manual
 review gating was friction for no compensating safety gain on experiment
 namespaces.
+
+---
+
+## 2026-05-18 — pgvector over Qdrant for brain hybrid retrieval (supersedes 2026-04-08)
+
+**Context:** The 2026-04-08 ADR chose Qdrant for vector store. Since then,
+postgres18 with pgvector has been deployed in the `databases` namespace on
+koala and is already the shared default for the rest of the project
+(CLAUDE.md lists `pgvector (vector), BM25` as the primary search layer and
+Qdrant only as a fallback "when >1M vectors or hybrid retrieval"). Qdrant
+itself has never been deployed — `kubectl get` finds no pod, service, or
+manifest. Standing up a new vector engine for a single consumer is friction
+that the original ADR did not weigh.
+
+**Decision:** Use pgvector for brain hybrid retrieval. Issue #8 — and any
+follow-on embedding work — targets the existing `postgres18` instance:
+
+- one table `brain_embeddings(path TEXT PRIMARY KEY, embedding VECTOR(768), updated_at TIMESTAMPTZ)`,
+  IVFFlat or HNSW index by feel once volume warrants
+- BM25 stays as today (file walk + token frequency); cosine via pgvector
+- hybrid scoring done in SQL or Go; pick once we measure
+- nomic-embed-text on iguana ollama provides 768-dim vectors
+
+**Consequences:** One database engine instead of two. Backups, monitoring,
+and connection pooling already solved. Trade-off: pgvector at >1M vectors
+or under hybrid-search load may underperform Qdrant — revisit only when
+benchmarks hurt. The 2026-04-08 ADR is superseded for the brain use case;
+Qdrant remains the noted fallback path in CLAUDE.md if scale demands it.