feat(brain-mcp): OAuth 2.0 client_credentials flow for claude.ai
Adds a minimal RFC 8414 + RFC 6749 client_credentials flow so claude.ai's custom-MCP integration (no static-Bearer field in the UI) can exchange a client_id + client_secret pair for the existing BRAIN_MCP_TOKEN and use it as a Bearer on /mcp. No JWTs, no refresh, no expiry — the rest of the auth middleware is unchanged. New package ingestion/internal/oauth: - MetadataHandler(issuer): serves /.well-known/oauth-authorization-server with grant_types=[client_credentials] and both token_endpoint_auth_methods (post + basic). - TokenHandler(cfg): serves /oauth/token. Validates client_id and client_secret via constant-time compare; returns BRAIN_MCP_TOKEN as access_token. RFC 6749 §5.2 error JSON on bad grant / bad creds. Wiring in cmd/server/main.go: opt-in by setting both OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET. Setting only one is misconfiguration → exit 1. Mounts both endpoints with no auth; MCP_RESOURCE_URL supplies the issuer. Also pivots issue #8's vector backend from Qdrant to pgvector (see DECISIONS.md 2026-05-18) — Qdrant was never deployed and postgres18 with pgvector already runs as the project default; supersedes 2026-04-08 for this use case. Tests cover post-auth, basic-auth, wrong secret, bad grant, GET rejection, malformed Basic header, and Basic without colon. Closes hyperguild#5.
This commit is contained in:
28
DECISIONS.md
28
DECISIONS.md
@@ -139,3 +139,31 @@ manifest is a fresh namespace under `k3s/staging/<name>/` — isolated, low
|
||||
blast-radius, and Flux will simply recreate it if the file is bad. Manual
|
||||
review gating was friction for no compensating safety gain on experiment
|
||||
namespaces.
|
||||
|
||||
---
|
||||
|
||||
## 2026-05-18 — pgvector over Qdrant for brain hybrid retrieval (supersedes 2026-04-08)
|
||||
|
||||
**Context:** The 2026-04-08 ADR chose Qdrant for vector store. Since then,
|
||||
postgres18 with pgvector has been deployed in the `databases` namespace on
|
||||
koala and is already the shared default for the rest of the project
|
||||
(CLAUDE.md lists `pgvector (vector), BM25` as the primary search layer and
|
||||
Qdrant only as a fallback "when >1M vectors or hybrid retrieval"). Qdrant
|
||||
itself has never been deployed — `kubectl get` finds no pod, service, or
|
||||
manifest. Standing up a new vector engine for a single consumer is friction
|
||||
that the original ADR did not weigh.
|
||||
|
||||
**Decision:** Use pgvector for brain hybrid retrieval. Issue #8 — and any
|
||||
follow-on embedding work — targets the existing `postgres18` instance:
|
||||
|
||||
- one table `brain_embeddings(path TEXT PRIMARY KEY, embedding VECTOR(768), updated_at TIMESTAMPTZ)`,
|
||||
IVFFlat or HNSW index by feel once volume warrants
|
||||
- BM25 stays as today (file walk + token frequency); cosine via pgvector
|
||||
- hybrid scoring done in SQL or Go; pick once we measure
|
||||
- nomic-embed-text on iguana ollama provides 768-dim vectors
|
||||
|
||||
**Consequences:** One database engine instead of two. Backups, monitoring,
|
||||
and connection pooling already solved. Trade-off: pgvector at >1M vectors
|
||||
or under hybrid-search load may underperform Qdrant — revisit only when
|
||||
benchmarks hurt. The 2026-04-08 ADR is superseded for the brain use case;
|
||||
Qdrant remains the noted fallback path in CLAUDE.md if scale demands it.
|
||||
|
||||
Reference in New Issue
Block a user