2 Commits

Author SHA1 Message Date
Mathias
3b79311fdd feat(routing): project_create MCP tool — gitea-first new-project pipeline (#10)
Adds the project_create tool to the routing pod that automates the
"new project" bootstrap end-to-end from claude.ai. Gitea-first
architecture: GitHub receives the repo only via push-mirror, never
via a direct GitHub API call from this server.

Four sequential calls to the gitea-mcp server (configured via
GITEA_MCP_URL):

  1. create_project_from_template — Gitea repo from
     template-go-{agent,web} per the 'stack' arg
  2. repo_mirror_push (action=add) — push-mirror to
     github.com/<GITHUB_OWNER>/<name>.git, interval 8h, sync_on_commit
  3. file_write_branch — k3s/staging/<name>/namespace.yaml committed
     on a staging/<name> branch in the infra repo
  4. issue_create — experiment brief (hypothesis + description + stack
     + provisioning log) on the new repo, returns the issue_url

Returns gitea_url, github_url, issue_url, next_steps. The next_steps
string is the exact shell sequence the operator runs locally to
clone, scaffold via local-dev 'task new-project', and push.

Idempotency: create_project_from_template + repo_mirror_push +
file_write_branch all return JSON-RPC code -32003 (Conflict) when
their target already exists; the orchestrator swallows the conflict
and continues. Re-running on an existing repo restates the brief in
a fresh issue.

Error handling: on any non-conflict downstream failure the response
returns {reached: ["<step>",...], failed_step: "<step>"} alongside
a JSON-RPC error. No rollback — partial state stays so the operator
can resume manually.

New env vars (all optional except GITEA_MCP_URL):
  GITEA_MCP_URL    enables the tool
  GITEA_MCP_TOKEN  bearer auth for gitea-mcp
  GITEA_OWNER      default mathias
  GITHUB_OWNER     default mathiasb
  INFRA_REPO       default infra
  GITHUB_PAT       repo scope, used as mirror remote_password; never logged

Without GITEA_MCP_URL set, the tool is not registered and the
routing pod starts normally (degrades open).
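
As an illustration of the defaults and the opt-in behaviour, the variables could be parsed roughly as follows. The field names match those used in `main.go`, but `loadProjectConfig`, `envOr`, and `Enabled` are assumed helper names for this sketch, not the repository's actual config code.

```go
package main

import (
	"fmt"
	"os"
)

// projectConfig holds the project_create settings listed above.
type projectConfig struct {
	GiteaMCPURL   string
	GiteaMCPToken string
	GiteaOwner    string
	GitHubOwner   string
	InfraRepo     string
	GitHubPAT     string
}

// envOr returns the value for key, or def when the value is empty.
func envOr(lookup func(string) string, key, def string) string {
	if v := lookup(key); v != "" {
		return v
	}
	return def
}

// loadProjectConfig reads the variables through lookup (os.Getenv in
// production, a map-backed func in tests) and applies the documented defaults.
func loadProjectConfig(lookup func(string) string) projectConfig {
	return projectConfig{
		GiteaMCPURL:   lookup("GITEA_MCP_URL"),
		GiteaMCPToken: lookup("GITEA_MCP_TOKEN"),
		GiteaOwner:    envOr(lookup, "GITEA_OWNER", "mathias"),
		GitHubOwner:   envOr(lookup, "GITHUB_OWNER", "mathiasb"),
		InfraRepo:     envOr(lookup, "INFRA_REPO", "infra"),
		GitHubPAT:     lookup("GITHUB_PAT"),
	}
}

// Enabled reports whether the tool should be registered at all.
func (c projectConfig) Enabled() bool { return c.GiteaMCPURL != "" }

func main() {
	cfg := loadProjectConfig(os.Getenv)
	// Report only whether the PAT is set; never print its value.
	fmt.Println("enabled:", cfg.Enabled(), "gitea_owner:", cfg.GiteaOwner,
		"github_pat_set:", cfg.GitHubPAT != "")
}
```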

internal/mcpclient/: new minimal JSON-RPC tools/call client with
bearer auth, used by project_create. Unwraps MCP's
content[0].text envelope and surfaces typed errors via mcpclient.Error.

Tests: table-driven against an httptest fake gitea-mcp covering happy
path (4-step success + correct PATCH-style arg shapes), idempotent
repo-exists, mirror failure (partial-success response with reached=
[create_repo] + failed_step=mirror), infra-commit failure (reached up
to mirror + failed_step=infra_commit), and validation errors.

Closes #10
2026-05-18 11:44:39 +02:00
Mathias
7baf8d7e7a chore: re-sync context adapters from updated root AGENT.md 2026-05-18 11:44:02 +02:00
11 changed files with 1150 additions and 64 deletions


@@ -36,9 +36,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Branch-per-task for multi-agent repos.** When another agent may be active on
the same repo, create a branch (`agent/<description>`), commit there, and open a
PR. Do not merge without explicit instruction from Mathias.
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -49,9 +58,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -61,9 +71,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -71,7 +84,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -103,18 +116,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules


@@ -41,9 +41,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Branch-per-task for multi-agent repos.** When another agent may be active on
the same repo, create a branch (`agent/<description>`), commit there, and open a
PR. Do not merge without explicit instruction from Mathias.
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -54,9 +63,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -66,9 +76,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -76,7 +89,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -108,18 +121,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules


@@ -39,9 +39,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Branch-per-task for multi-agent repos.** When another agent may be active on
the same repo, create a branch (`agent/<description>`), commit there, and open a
PR. Do not merge without explicit instruction from Mathias.
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -52,9 +61,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -64,9 +74,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -74,7 +87,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -106,18 +119,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules


@@ -36,9 +36,18 @@ These rules apply to every task across every project, regardless of harness.
4. **Goal-driven execution.** Define clear success criteria up front for every task.
Loop — implement, verify, refine — until those criteria are met. Don't claim
completion without evidence (tests pass, command output, observed behavior).
5. **Branch-per-task for multi-agent repos.** When another agent may be active on
the same repo, create a branch (`agent/<description>`), commit there, and open a
PR. Do not merge without explicit instruction from Mathias.
5. **Trunk-Based Development — commit directly to main.** Every commit is one
logical change (one tool, one fix, one test) with passing tests. Main is always
deployable. Never create long-lived feature branches.
**Exception — parallel agents on same repo:** If another agent is known to be
actively working on the same repo simultaneously, create a short-lived branch
(`agent/<description>`), finish the task, and merge to main within the same
session. Do not leave agent branches open between sessions.
**Exception — external contributor or client four-eyes requirement:** Use
PR flow only when a human reviewer outside the project is required. Document
the reason in PROJECT.md.
## Default stack
@@ -49,9 +58,10 @@ These rules apply to every task across every project, regardless of harness.
| Build | Task (taskfile.dev) | Make | — |
| Containers | Docker Compose (dev), k3s (prod) | — | — |
| DB | PostgreSQL + sqlc | SQLite | — |
| Search | Qdrant (vector), BM25 | | — |
| Search | pgvector (vector), BM25 | Qdrant (when >1M vectors or hybrid retrieval) | — |
| Logging | slog (structured) | — | — |
| Testing | Table-driven, testify | — | — |
| Agents (Go) | google.golang.org/adk + pkg/litellm adapter | — | — |
Exploratory: Rust, Zig — I'll tell you when I want these.
@@ -61,9 +71,12 @@ Exploratory: Rust, Zig — I'll tell you when I want these.
- **Errors**: `fmt.Errorf("operation: %w", err)` — never naked, never log-and-return
- **Naming**: stdlib conventions, no stuttering
- **Architecture**: prefer stdlib over frameworks, constructor injection, env-var config parsed into typed structs
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), one concern per PR, PR describes *why* not *what*
- **Git**: conventional commits (`feat:`, `fix:`, `chore:`), commit directly to main,
one logical change per commit, CI is the quality gate
- **Never**: long-lived feature branches, PRs for solo work, direct push without
passing `task check` locally first
- **Security**: no secrets in code, govulncheck before adding deps, SOPS for encrypted config
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc are pre-approved; anything else needs justification in the commit message
- **Dependencies**: prefer stdlib. testify, slog, templ, sqlc, google.golang.org/adk (agent projects only) are pre-approved; anything else needs justification in the commit message
## Infrastructure
@@ -71,7 +84,7 @@ Three machines on Tailscale:
| Machine | Role | Key specs |
|---------|------|-----------|
| koala | GPU inference, heavy compute | RTX 5070, runs llama-swap, Qdrant |
| koala | GPU inference, heavy compute | RTX 5070, runs k3s + llama-swap + shared postgres18/pgvector |
| iguana | Services, builds | M2 Ultra Mac |
| flamingo | Daily driver, edge | Mac mini, ~/dev is here |
@@ -103,18 +116,64 @@ See `~/dev/PROJECT_SUMMARY.md` for detailed descriptions of each project.
- **koala-ai-stack** (`AGENTS/`) — local AI server infrastructure management
- **klimatkollen** (`XT/`) — Swedish municipal climate data platform
## Knowledge base
## Knowledge base — actively use it
When available, agents can query the shared knowledge base:
A persistent brain (BM25 search + LLM-synthesised Q&A) survives across sessions,
hosts, and harnesses. It holds 100+ hard-won entries: infra incident postmortems,
Go pitfalls, framework gotchas, design principles, ADRs. **It is not optional
reference material — query it actively, not just when explicitly told.**
- **MCP**: `mcp://hyperguild.<TAILNET>.ts.net:3100/knowledge`
- **HTTP**: `http://hyperguild.<TAILNET>.ts.net:3100/api/v1/search`
### When to query (treat as a reflex)
<!-- TODO: replace <TAILNET> placeholder with the real Tailscale tailnet
name once hyperguild is deployed. Until then, agents that try to
reach the knowledge service on a host where it isn't running will
get DNS NXDOMAIN, which is the desired fail-loudly behavior. -->
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`
- **Before** starting a non-trivial task — search for prior art with the symptom
AND the system component ("how did we solve X in Y?"). 5 seconds beats 5 hours.
- **When debugging** — search for the error string, the stack frame, the affected
service. Past you may have already paid this tax.
- **Before adopting** a pattern, library, framework, or model name — check if it
was tried and rejected, or what the integration footguns are.
- **When making architectural decisions** — search for the domain + "ADR" or
"decision" to find prior reasoning before re-deriving it.
- **When a recommendation feels novel** — challenge yourself: "has this been
documented?" The brain often has it.
### When to write
After you discover something that **future-you would forget** and that **isn't
recoverable from the code, git log, or PR description alone**:
- Bugs whose root cause is non-obvious and generalisable beyond this project.
- Framework / library / model-name quirks that bit you and would bite anyone.
- Design principles validated under fire (e.g. "every `_get` needs a `_list`").
- Postmortems for incidents: what broke, why, how diagnosed, what to do next time.
DON'T write project status, sprint progress, PR summaries, or "what I did this
session" — those rot fast and the originals are in git/gitea anyway. Brain
entries that age well are about *why*, *how to avoid*, and *what to do when*.
### How to access (per harness)
| Harness | Query | Write |
|---------|-------|-------|
| **Claude Code, Claude Desktop** | `brain_query` (BM25), `brain_answer` (LLM-synth + sources) MCP tools | `brain_write` MCP tool |
| **Crush, Pi, Antigravity, other MCP-capable** | same MCP server: `ingestion-brain` (via the `mcp__*_brain__*` namespace once authenticated) | same |
| **Anything HTTP-only (curl, scripts)** | `POST https://brain-mcp.d-ma.be/query` with `{"query":"..."}` (auth via `BRAIN_MCP_TOKEN`) | `POST .../write` with `{"content":"...","filename":"..."}` |
| **Browser / human inspection** | `https://gitea.d-ma.be/mathias/hyperguild` → `knowledge/` and `wiki/` markdown files |
- **Scoping**: defaults to `public` collection; client projects filter to `{client}` + `public`.
- **Routing**: brain_answer's LLM uses berget.ai as primary, iguana ollama as
fallback. Both are configurable in the `supervisor/ingestion-deployment.yaml`
on the koala k3s cluster; don't hardcode local-only model names into the
berget URL (see knowledge entry on namespace mismatches).
### Quick reflex checks
If you find yourself about to say any of these out loud, you owe yourself a brain query first:
- "I think the issue might be..."
- "Let me try X and see..."
- "I'll just write a script to..."
- "This is probably a new bug..."
- "Has anyone done this before?" — *yes, probably, go check.*
## Client work rules


@@ -18,9 +18,11 @@ import (
"github.com/mathiasbq/supervisor/internal/config"
iexec "github.com/mathiasbq/supervisor/internal/exec"
"github.com/mathiasbq/supervisor/internal/mcp"
"github.com/mathiasbq/supervisor/internal/mcpclient"
"github.com/mathiasbq/supervisor/internal/registry"
"github.com/mathiasbq/supervisor/internal/routing"
"github.com/mathiasbq/supervisor/internal/skills/debug"
"github.com/mathiasbq/supervisor/internal/skills/project"
"github.com/mathiasbq/supervisor/internal/skills/retrospective"
"github.com/mathiasbq/supervisor/internal/skills/review"
"github.com/mathiasbq/supervisor/internal/skills/trainer"
@@ -99,6 +101,21 @@ func main() {
CompleteFunc: trainer.CompleteFunc(wrap("trainer")),
}))
if cfg.GiteaMCPURL != "" {
reg.Register(project.New(project.Config{
Client: mcpclient.New(cfg.GiteaMCPURL, cfg.GiteaMCPToken),
GiteaOwner: cfg.GiteaOwner,
GitHubOwner: cfg.GitHubOwner,
GitHubPAT: cfg.GitHubPAT,
InfraRepo: cfg.InfraRepo,
}))
logger.Info("project_create registered", "gitea_mcp_url", cfg.GiteaMCPURL,
"gitea_owner", cfg.GiteaOwner, "github_owner", cfg.GitHubOwner,
"infra_repo", cfg.InfraRepo, "github_pat_set", cfg.GitHubPAT != "")
} else {
logger.Info("project_create skipped — GITEA_MCP_URL not set")
}
var validator *auth.Validator
if dexURL := os.Getenv("DEX_ISSUER_URL"); dexURL != "" {
audience := os.Getenv("MCP_AUDIENCE")
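The registration log line above records `github_pat_set` as a boolean instead of the token itself. A small standalone sketch of that pattern follows. It assumes a `log/slog` text logger, which the diff does not confirm; the helper names `registrationLogLine` and `leaks` are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// registrationLogLine renders the kind of line logged at registration:
// the URL is logged verbatim, the PAT only as a presence flag.
func registrationLogLine(giteaMCPURL, githubPAT string) string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, nil))
	logger.Info("project_create registered",
		"gitea_mcp_url", giteaMCPURL,
		"github_pat_set", githubPAT != "")
	return buf.String()
}

// leaks reports whether needle appears anywhere in the rendered line.
func leaks(line, needle string) bool {
	return needle != "" && strings.Contains(line, needle)
}

func main() {
	line := registrationLogLine("http://koala:30340/mcp", "ghp_example")
	fmt.Print(line)
	fmt.Println("leaks PAT:", leaks(line, "ghp_example"))
}
```

Rendering into a buffer makes the "never logged" promise checkable in a unit test rather than a matter of code review alone.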


@@ -25,6 +25,16 @@ type RoutingConfig struct {
RouteLocalFloor float64 // HYPERGUILD_ROUTE_LOCAL_FLOOR, default 0.90
RouteLocalCeil float64 // HYPERGUILD_ROUTE_LOCAL_CEIL, default 0.70
PassRateTTLSeconds int // HYPERGUILD_PASS_RATE_TTL_SECONDS, default 60
// project_create configuration. Empty GiteaMCPURL disables the
// project_create tool registration so the routing pod still starts
// in environments where it's not wired up.
GiteaMCPURL string // GITEA_MCP_URL, e.g. http://koala:30340/mcp
GiteaMCPToken string // GITEA_MCP_TOKEN, bearer for gitea-mcp
GiteaOwner string // GITEA_OWNER, default mathias
GitHubOwner string // GITHUB_OWNER, default mathiasb
InfraRepo string // INFRA_REPO, default infra
GitHubPAT string // GITHUB_PAT, repo scope; never logged
}
func LoadRouting() (RoutingConfig, error) {
@@ -56,6 +66,13 @@ func LoadRouting() (RoutingConfig, error) {
}
cfg.PassRateTTLSeconds = ttl
cfg.GiteaMCPURL = os.Getenv("GITEA_MCP_URL")
cfg.GiteaMCPToken = os.Getenv("GITEA_MCP_TOKEN")
cfg.GiteaOwner = envOr("GITEA_OWNER", "mathias")
cfg.GitHubOwner = envOr("GITHUB_OWNER", "mathiasb")
cfg.InfraRepo = envOr("INFRA_REPO", "infra")
cfg.GitHubPAT = os.Getenv("GITHUB_PAT")
return cfg, nil
}
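`LoadRouting` relies on an `envOr` helper whose definition is not part of this hunk. A plausible shape is sketched below. Treating an empty value the same as an unset one is an assumption, as is the injected `lookup` parameter, which exists only so the fallback logic can be exercised without touching the process environment.

```go
package main

import (
	"fmt"
	"os"
)

// envOrFrom returns the value lookup yields for key, or def when the
// variable is unset or empty (the empty-means-unset rule is an assumption).
func envOrFrom(lookup func(string) (string, bool), key, def string) string {
	if v, ok := lookup(key); ok && v != "" {
		return v
	}
	return def
}

// envOr is a hypothetical production wrapper over os.LookupEnv.
func envOr(key, def string) string {
	return envOrFrom(os.LookupEnv, key, def)
}

func main() {
	// Prints the value of GITEA_OWNER if set, otherwise the default.
	fmt.Println(envOr("GITEA_OWNER", "mathias"))
}
```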


@@ -0,0 +1,135 @@
// Package mcpclient is a minimal JSON-RPC over HTTP client for talking to
// MCP servers from inside hyperguild components. It only implements
// `tools/call` because that's all consumer skills need today.
package mcpclient
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"time"
)
// Client calls an MCP server over Streamable HTTP / JSON-RPC.
type Client struct {
url string
token string
http *http.Client
}
// New returns a Client. token may be empty for unauthenticated servers.
func New(url, token string) *Client {
return &Client{
url: url,
token: token,
http: &http.Client{Timeout: 60 * time.Second},
}
}
// WithHTTPClient overrides the underlying HTTP client (test injection).
func (c *Client) WithHTTPClient(h *http.Client) *Client {
c.http = h
return c
}
type rpcRequest struct {
JSONRPC string `json:"jsonrpc"`
ID int `json:"id"`
Method string `json:"method"`
Params map[string]any `json:"params"`
}
type rpcError struct {
Code int `json:"code"`
Message string `json:"message"`
}
type rpcResponse struct {
JSONRPC string `json:"jsonrpc"`
ID int `json:"id"`
Result json.RawMessage `json:"result,omitempty"`
Error *rpcError `json:"error,omitempty"`
}
// Error is returned when the remote MCP server signals a typed failure.
// Code follows JSON-RPC conventions; see gitea-mcp internal/mcp/jsonrpc.go
// for the codes the server uses (e.g. -32002 NotFound, -32003 Conflict).
type Error struct {
Code int
Message string
}
func (e *Error) Error() string { return fmt.Sprintf("mcp error %d: %s", e.Code, e.Message) }
// CallTool issues `tools/call`. result is JSON-unmarshalled from the
// server's content[0].text field; pass nil to discard.
func (c *Client) CallTool(ctx context.Context, name string, args any, result any) error {
body, err := json.Marshal(rpcRequest{
JSONRPC: "2.0",
ID: 1,
Method: "tools/call",
Params: map[string]any{
"name": name,
"arguments": args,
},
})
if err != nil {
return fmt.Errorf("marshal request: %w", err)
}
req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.url, bytes.NewReader(body))
if err != nil {
return fmt.Errorf("new request: %w", err)
}
req.Header.Set("Content-Type", "application/json")
if c.token != "" {
req.Header.Set("Authorization", "Bearer "+c.token)
}
resp, err := c.http.Do(req)
if err != nil {
return fmt.Errorf("http: %w", err)
}
defer func() { _ = resp.Body.Close() }()
raw, err := io.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("read body: %w", err)
}
if resp.StatusCode >= 400 {
return fmt.Errorf("mcp http %d: %s", resp.StatusCode, string(raw))
}
var rpc rpcResponse
if err := json.Unmarshal(raw, &rpc); err != nil {
return fmt.Errorf("decode response: %w (body=%s)", err, string(raw))
}
if rpc.Error != nil {
return &Error{Code: rpc.Error.Code, Message: rpc.Error.Message}
}
if result == nil {
return nil
}
// MCP success result shape: { content: [{type:"text", text:"<json>"}] }
var wrap struct {
Content []struct {
Type string `json:"type"`
Text string `json:"text"`
} `json:"content"`
}
if err := json.Unmarshal(rpc.Result, &wrap); err != nil {
return fmt.Errorf("decode wrap: %w (result=%s)", err, string(rpc.Result))
}
if len(wrap.Content) == 0 {
return fmt.Errorf("empty content in tool response")
}
if err := json.Unmarshal([]byte(wrap.Content[0].Text), result); err != nil {
return fmt.Errorf("decode tool result text: %w (text=%s)", err, wrap.Content[0].Text)
}
return nil
}
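`CallTool` decodes twice: first the JSON-RPC envelope, then the JSON document carried as a string in `content[0].text`. The standalone sketch below isolates that double decode. `toolEnvelope` and `decodeToolText` are illustrative names, not part of `mcpclient`, and the sample payload is invented.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// toolEnvelope mirrors the success shape CallTool expects:
// {"result": {"content": [{"type": "text", "text": "<json>"}]}}.
type toolEnvelope struct {
	Result struct {
		Content []struct {
			Type string `json:"type"`
			Text string `json:"text"`
		} `json:"content"`
	} `json:"result"`
}

// decodeToolText unwraps the envelope, then decodes the JSON carried as a
// string in content[0].text into out.
func decodeToolText(raw []byte, out any) error {
	var env toolEnvelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return fmt.Errorf("decode envelope: %w", err)
	}
	if len(env.Result.Content) == 0 {
		return errors.New("empty content in tool response")
	}
	return json.Unmarshal([]byte(env.Result.Content[0].Text), out)
}

func main() {
	raw := []byte(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{\"html_url\":\"http://gitea.example/x\"}"}]}}`)
	var out struct {
		HTMLURL string `json:"html_url"`
	}
	if err := decodeToolText(raw, &out); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out.HTMLURL) // prints http://gitea.example/x
}
```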


@@ -0,0 +1,82 @@
package mcpclient_test
import (
"context"
"encoding/json"
"errors"
"io"
"net/http"
"net/http/httptest"
"testing"
"github.com/mathiasbq/supervisor/internal/mcpclient"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
func TestCallTool_Success(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, http.MethodPost, r.Method)
assert.Equal(t, "Bearer tok", r.Header.Get("Authorization"))
b, _ := io.ReadAll(r.Body)
var got map[string]any
_ = json.Unmarshal(b, &got)
assert.Equal(t, "tools/call", got["method"])
params := got["params"].(map[string]any)
assert.Equal(t, "x_y", params["name"])
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{\"ok\":true,\"n\":7}"}]}}`))
}))
defer srv.Close()
c := mcpclient.New(srv.URL, "tok")
var out struct {
OK bool `json:"ok"`
N int `json:"n"`
}
err := c.CallTool(context.Background(), "x_y", map[string]any{"a": 1}, &out)
require.NoError(t, err)
assert.True(t, out.OK)
assert.Equal(t, 7, out.N)
}
func TestCallTool_RPCError(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"error":{"code":-32003,"message":"already exists"}}`))
}))
defer srv.Close()
c := mcpclient.New(srv.URL, "")
err := c.CallTool(context.Background(), "x", nil, nil)
require.Error(t, err)
var me *mcpclient.Error
require.True(t, errors.As(err, &me))
assert.Equal(t, -32003, me.Code)
assert.Contains(t, me.Message, "already exists")
}
func TestCallTool_HTTPError(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.WriteHeader(http.StatusUnauthorized)
_, _ = w.Write([]byte(`unauthorized`))
}))
defer srv.Close()
c := mcpclient.New(srv.URL, "")
err := c.CallTool(context.Background(), "x", nil, nil)
require.Error(t, err)
assert.Contains(t, err.Error(), "401")
}
func TestCallTool_NilResult(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"{}"}]}}`))
}))
defer srv.Close()
c := mcpclient.New(srv.URL, "")
require.NoError(t, c.CallTool(context.Background(), "x", nil, nil))
}
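The tests above start a loopback server with `httptest.NewServer`. Where only the handler's response matters, an in-process variant with `httptest.NewRecorder` avoids opening a socket. The sketch below shows that alternative; `conflictHandler` and `rpcErrorCode` are hypothetical helpers and this is not how the commit's tests are written.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// conflictHandler always answers with a JSON-RPC -32003 error, like the
// handler used in TestCallTool_RPCError.
func conflictHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		_, _ = w.Write([]byte(`{"jsonrpc":"2.0","id":1,"error":{"code":-32003,"message":"already exists"}}`))
	})
}

// rpcErrorCode drives h in-process and returns the JSON-RPC error code
// found in its response (0 when the response carries no error).
func rpcErrorCode(h http.Handler) (int, error) {
	req := httptest.NewRequest(http.MethodPost, "/mcp",
		strings.NewReader(`{"jsonrpc":"2.0","id":1,"method":"tools/call"}`))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)

	var resp struct {
		Error *struct {
			Code int `json:"code"`
		} `json:"error"`
	}
	if err := json.Unmarshal(rec.Body.Bytes(), &resp); err != nil {
		return 0, fmt.Errorf("decode response: %w", err)
	}
	if resp.Error == nil {
		return 0, nil
	}
	return resp.Error.Code, nil
}

func main() {
	code, err := rpcErrorCode(conflictHandler())
	fmt.Println(code, err) // prints -32003 <nil>
}
```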


@@ -0,0 +1,265 @@
package project
import (
"context"
"encoding/json"
"errors"
"fmt"
"strings"
"time"
"github.com/mathiasbq/supervisor/internal/mcpclient"
)
type createArgs struct {
Name string `json:"name"`
Description string `json:"description"`
Hypothesis string `json:"hypothesis"`
Folder string `json:"folder"`
Stack string `json:"stack"`
Private bool `json:"private"`
}
type createResult struct {
GiteaURL string `json:"gitea_url"`
GitHubURL string `json:"github_url"`
IssueURL string `json:"issue_url"`
NextSteps string `json:"next_steps"`
// Reached records the steps that completed. Populated on partial failure
// so callers can resume manually instead of guessing what already ran.
Reached []string `json:"reached,omitempty"`
// FailedStep is non-empty when a downstream gitea-mcp call returned an
// error; the error itself is surfaced via the JSON-RPC error response,
// this field tells the operator which step it happened in.
FailedStep string `json:"failed_step,omitempty"`
}
func errUnknownTool(name string) error { return fmt.Errorf("unknown tool: %s", name) }
// step names — must match what we surface in failed_step / reached.
const (
stepCreateRepo = "create_repo"
stepMirror = "mirror"
stepInfraCommit = "infra_commit"
stepIssue = "issue"
)
func (s *Skill) handleCreate(ctx context.Context, raw json.RawMessage) (json.RawMessage, error) {
var args createArgs
if err := json.Unmarshal(raw, &args); err != nil {
return nil, fmt.Errorf("parse args: %w", err)
}
if err := validate(args); err != nil {
return nil, err
}
tmpl := templateFor(args.Stack)
giteaURL := fmt.Sprintf("http://gitea.d-ma.be/%s/%s", s.cfg.GiteaOwner, args.Name)
githubURL := fmt.Sprintf("https://github.com/%s/%s", s.cfg.GitHubOwner, args.Name)
res := createResult{
GiteaURL: giteaURL,
GitHubURL: githubURL,
}
// Step 1: create_project_from_template. If the repo already exists,
// gitea-mcp returns -32003 Conflict; we treat that as idempotent success
// and continue to the next steps so re-running self-heals partial runs.
existed, err := s.callCreateRepo(ctx, args, tmpl)
if err != nil {
return marshalPartial(res, stepCreateRepo, err)
}
res.Reached = append(res.Reached, stepCreateRepo)
// Step 2: configure push mirror to GitHub. Idempotent: if a mirror with
// the same remote already exists, gitea-mcp returns Conflict; we swallow it.
if err := s.callMirror(ctx, args.Name); err != nil {
if !isConflict(err) {
return marshalPartial(res, stepMirror, err)
}
}
res.Reached = append(res.Reached, stepMirror)
// Step 3: commit staging namespace manifest to infra repo. Done before
// the issue so the staging/<name> branch already exists when the issue
// lands; Flux only reconciles the manifest after that branch is merged.
branch := fmt.Sprintf("staging/%s", args.Name)
if err := s.callInfraCommit(ctx, args.Name, branch); err != nil {
if !isConflict(err) {
return marshalPartial(res, stepInfraCommit, err)
}
}
res.Reached = append(res.Reached, stepInfraCommit)
// Step 4: open the experiment-brief issue on the new repo.
issueURL, err := s.callIssue(ctx, args, existed)
if err != nil {
return marshalPartial(res, stepIssue, err)
}
res.IssueURL = issueURL
res.Reached = append(res.Reached, stepIssue)
folder := args.Folder
if folder == "" {
folder = "."
}
res.NextSteps = fmt.Sprintf(
"cd ~/dev/%s/%s && task new-project -- %s personal %s %s && git remote add origin http://gitea.d-ma.be/%s/%s.git && git push -u origin main",
folder, args.Name, args.Name, folder, args.Stack, s.cfg.GiteaOwner, args.Name,
)
return marshalResult(res)
}
// callCreateRepo invokes create_project_from_template. Returns (existed, err)
// where existed=true means the destination was already present and we should
// treat it as a no-op success (idempotency).
func (s *Skill) callCreateRepo(ctx context.Context, args createArgs, template string) (bool, error) {
var out struct {
HTMLURL string `json:"html_url"`
}
err := s.cfg.Client.CallTool(ctx, "create_project_from_template", map[string]any{
"owner": s.cfg.GiteaOwner,
"name": args.Name,
"description": args.Description,
"private": args.Private,
"template_name": template,
}, &out)
if err == nil {
return false, nil
}
if isConflict(err) {
return true, nil
}
return false, err
}
// callMirror configures the push mirror to GitHub.
func (s *Skill) callMirror(ctx context.Context, name string) error {
remote := fmt.Sprintf("https://github.com/%s/%s.git", s.cfg.GitHubOwner, name)
return s.cfg.Client.CallTool(ctx, "repo_mirror_push", map[string]any{
"owner": s.cfg.GiteaOwner,
"name": name,
"action": "add",
"remote_address": remote,
"remote_username": s.cfg.GitHubOwner,
"remote_password": s.cfg.GitHubPAT,
"interval": "8h0m0s",
"sync_on_commit": true,
}, nil)
}
// callInfraCommit writes the staging namespace manifest into the infra repo
// on a dedicated branch. Flux picks it up after merge.
func (s *Skill) callInfraCommit(ctx context.Context, name, branch string) error {
manifest := stagingNamespaceManifest(name, time.Now().UTC().Format(time.RFC3339))
return s.cfg.Client.CallTool(ctx, "file_write_branch", map[string]any{
"owner": s.cfg.GiteaOwner,
"name": s.cfg.InfraRepo,
"path": fmt.Sprintf("k3s/staging/%s/namespace.yaml", name),
"content": manifest,
"branch": branch,
"base": "main",
"message": fmt.Sprintf("feat(staging): add namespace for %s\n\nGenerated by hyperguild project_create.", name),
}, nil)
}
// callIssue opens the experiment-brief issue on the newly-created repo.
// existed=true (repo pre-existed) still posts a new brief — repeated runs
// can intentionally restate intent without colliding.
func (s *Skill) callIssue(ctx context.Context, args createArgs, existed bool) (string, error) {
body := experimentBrief(args, existed)
var out struct {
HTMLURL string `json:"html_url"`
}
err := s.cfg.Client.CallTool(ctx, "issue_create", map[string]any{
"owner": s.cfg.GiteaOwner,
"name": args.Name,
"title": "experiment brief: " + args.Description,
"body": body,
}, &out)
if err != nil {
return "", err
}
return out.HTMLURL, nil
}
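func stagingNamespaceManifest(name, createdAt string) string {
// created-at is an annotation, not a label: an RFC3339 timestamp contains
// ':' characters, which Kubernetes label values do not allow.
return fmt.Sprintf(`apiVersion: v1
kind: Namespace
metadata:
  name: staging-%s
  labels:
    managed-by: hyperguild
    project: %s
  annotations:
    created-at: "%s"
`, name, name, createdAt)
}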
func experimentBrief(args createArgs, existed bool) string {
var b strings.Builder
b.WriteString("## Hypothesis\n\n")
b.WriteString(args.Hypothesis)
b.WriteString("\n\n## Description\n\n")
b.WriteString(args.Description)
b.WriteString("\n\n## Stack\n\n`")
b.WriteString(args.Stack)
b.WriteString("`\n\n## Provisioning\n\n")
b.WriteString("- Repo created from `template-")
b.WriteString(args.Stack)
b.WriteString("` on Gitea.\n")
b.WriteString("- Push-mirror configured to GitHub.\n")
b.WriteString("- Staging namespace manifest committed to infra repo.\n\n")
if existed {
b.WriteString("> Note: this repo already existed when `project_create` ran — provisioning steps were re-applied idempotently.\n")
}
return b.String()
}
func validate(args createArgs) error {
if args.Name == "" {
return errors.New("name is required")
}
if args.Description == "" {
return errors.New("description is required")
}
if args.Hypothesis == "" {
return errors.New("hypothesis is required")
}
if args.Stack != "go-agent" && args.Stack != "go-web" {
return fmt.Errorf("stack must be go-agent or go-web, got %q", args.Stack)
}
return nil
}
func templateFor(stack string) string {
switch stack {
case "go-agent":
return "template-go-agent"
default:
return "template-go-web"
}
}
func isConflict(err error) bool {
var me *mcpclient.Error
if errors.As(err, &me) && me.Code == -32003 {
return true
}
return false
}
func marshalResult(r createResult) (json.RawMessage, error) {
b, err := json.Marshal(r)
if err != nil {
return nil, fmt.Errorf("marshal result: %w", err)
}
return b, nil
}
func marshalPartial(r createResult, step string, inner error) (json.RawMessage, error) {
r.FailedStep = step
b, _ := json.Marshal(r)
return b, fmt.Errorf("project_create step %q failed: %w", step, inner)
}
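The control flow in `handleCreate` can be reduced to a small pattern: run the steps in order, treat a Conflict as success, and on any other error report which steps were reached and which one failed. The sketch below generalises that pattern under stated assumptions: `rpcError` stands in for `mcpclient.Error`, and `step` and `runSteps` are illustrative names. The real handler inlines each step and also tracks `existed` for step 1.

```go
package main

import (
	"errors"
	"fmt"
)

// rpcError stands in for mcpclient.Error in this sketch.
type rpcError struct {
	Code    int
	Message string
}

func (e *rpcError) Error() string { return fmt.Sprintf("mcp error %d: %s", e.Code, e.Message) }

const codeConflict = -32003

// isConflict uses errors.As, so it also matches errors wrapped with %w.
func isConflict(err error) bool {
	var re *rpcError
	return errors.As(err, &re) && re.Code == codeConflict
}

type step struct {
	name string
	run  func() error
}

// runSteps executes steps in order. A conflict counts as success; any other
// error stops the run and is returned with the steps reached so far.
func runSteps(steps []step) (reached []string, failedStep string, err error) {
	for _, st := range steps {
		if e := st.run(); e != nil && !isConflict(e) {
			return reached, st.name, fmt.Errorf("step %q failed: %w", st.name, e)
		}
		reached = append(reached, st.name)
	}
	return reached, "", nil
}

func main() {
	steps := []step{
		{"create_repo", func() error { return &rpcError{codeConflict, "already exists"} }},
		{"mirror", func() error { return &rpcError{-32000, "github unreachable"} }},
		{"infra_commit", func() error { return nil }},
	}
	reached, failed, err := runSteps(steps)
	fmt.Println(reached, failed, err)
}
```

Because nothing is rolled back, the `reached` list is what lets an operator resume from the failed step manually.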


@@ -0,0 +1,244 @@
package project_test
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"sync"
"testing"
"github.com/mathiasbq/supervisor/internal/mcpclient"
"github.com/mathiasbq/supervisor/internal/skills/project"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// fakeGiteaMCP implements just enough of the JSON-RPC tools/call surface
// to drive project_create end-to-end without an actual gitea-mcp server.
type fakeGiteaMCP struct {
mu sync.Mutex
// Recorded calls in order.
Calls []recordedCall
// Per-tool response. Default is a generic success object.
Responses map[string]any
// Per-tool error response, takes precedence over Responses.
Errors map[string]rpcErr
}
type rpcErr struct {
Code int
Message string
}
type recordedCall struct {
Tool string
Args map[string]any
}
func (f *fakeGiteaMCP) handler() http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
var req struct {
ID int `json:"id"`
Params json.RawMessage `json:"params"`
}
_ = json.NewDecoder(r.Body).Decode(&req)
var p struct {
Name string `json:"name"`
Arguments json.RawMessage `json:"arguments"`
}
_ = json.Unmarshal(req.Params, &p)
var args map[string]any
_ = json.Unmarshal(p.Arguments, &args)
f.mu.Lock()
f.Calls = append(f.Calls, recordedCall{Tool: p.Name, Args: args})
errResp, hasErr := f.Errors[p.Name]
var resp any
if r, ok := f.Responses[p.Name]; ok {
resp = r
} else {
resp = map[string]any{"html_url": "http://gitea.example/" + p.Name}
}
f.mu.Unlock()
w.Header().Set("Content-Type", "application/json")
if hasErr {
body, _ := json.Marshal(map[string]any{
"jsonrpc": "2.0",
"id": req.ID,
"error": map[string]any{"code": errResp.Code, "message": errResp.Message},
})
_, _ = w.Write(body)
return
}
respText, _ := json.Marshal(resp)
body, _ := json.Marshal(map[string]any{
"jsonrpc": "2.0",
"id": req.ID,
"result": map[string]any{
"content": []map[string]any{{"type": "text", "text": string(respText)}},
},
})
_, _ = w.Write(body)
})
}
func newSkill(t *testing.T, f *fakeGiteaMCP) *project.Skill {
t.Helper()
srv := httptest.NewServer(f.handler())
t.Cleanup(srv.Close)
return project.New(project.Config{
Client: mcpclient.New(srv.URL, ""),
GiteaOwner: "mathias",
GitHubOwner: "mathiasb",
GitHubPAT: "ghp_test",
InfraRepo: "infra",
})
}
func happyArgs() json.RawMessage {
return json.RawMessage(`{
"name":"my-experiment",
"description":"One-line desc",
"hypothesis":"We believe X produces Y",
"folder":"AGENTS",
"stack":"go-agent",
"private":true
}`)
}
func TestProjectCreate_HappyPath(t *testing.T) {
f := &fakeGiteaMCP{
Responses: map[string]any{
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
},
}
skill := newSkill(t, f)
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.NoError(t, err)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment", res["gitea_url"])
assert.Equal(t, "https://github.com/mathiasb/my-experiment", res["github_url"])
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment/issues/1", res["issue_url"])
assert.Contains(t, res["next_steps"], "cd ~/dev/AGENTS/my-experiment")
assert.Contains(t, res["next_steps"], "git remote add origin")
// All four steps in order.
require.Len(t, f.Calls, 4)
assert.Equal(t, "create_project_from_template", f.Calls[0].Tool)
assert.Equal(t, "repo_mirror_push", f.Calls[1].Tool)
assert.Equal(t, "file_write_branch", f.Calls[2].Tool)
assert.Equal(t, "issue_create", f.Calls[3].Tool)
// template selection wired from stack
assert.Equal(t, "template-go-agent", f.Calls[0].Args["template_name"])
// mirror config
assert.Equal(t, "add", f.Calls[1].Args["action"])
assert.Equal(t, "https://github.com/mathiasb/my-experiment.git", f.Calls[1].Args["remote_address"])
assert.Equal(t, "ghp_test", f.Calls[1].Args["remote_password"])
// infra commit path
assert.Equal(t, "k3s/staging/my-experiment/namespace.yaml", f.Calls[2].Args["path"])
assert.Contains(t, f.Calls[2].Args["content"], "name: staging-my-experiment")
assert.Contains(t, f.Calls[2].Args["content"], "managed-by: hyperguild")
// PAT must NOT appear in the response
assert.NotContains(t, string(out), "ghp_test")
}
func TestProjectCreate_Idempotent_RepoExists(t *testing.T) {
f := &fakeGiteaMCP{
Errors: map[string]rpcErr{
"create_project_from_template": {Code: -32003, Message: "already exists"},
},
Responses: map[string]any{
"issue_create": map[string]any{"html_url": "http://gitea.d-ma.be/mathias/my-experiment/issues/1"},
},
}
skill := newSkill(t, f)
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.NoError(t, err)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment", res["gitea_url"])
assert.Equal(t, "http://gitea.d-ma.be/mathias/my-experiment/issues/1", res["issue_url"])
// Still ran all 4 steps; idempotent flow falls through the conflict.
require.Len(t, f.Calls, 4)
}
func TestProjectCreate_MirrorFails(t *testing.T) {
f := &fakeGiteaMCP{
Errors: map[string]rpcErr{
"repo_mirror_push": {Code: -32000, Message: "github unreachable"},
},
}
skill := newSkill(t, f)
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.Error(t, err)
assert.Contains(t, err.Error(), `"mirror" failed`)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
assert.Equal(t, "mirror", res["failed_step"])
reached := res["reached"].([]any)
assert.Equal(t, []any{"create_repo"}, reached)
// Only steps 1 + 2 actually called.
require.Len(t, f.Calls, 2)
}
func TestProjectCreate_InfraCommitFails(t *testing.T) {
f := &fakeGiteaMCP{
Errors: map[string]rpcErr{
"file_write_branch": {Code: -32000, Message: "write rejected"},
},
}
skill := newSkill(t, f)
out, err := skill.Handle(context.Background(), "project_create", happyArgs())
require.Error(t, err)
var res map[string]any
require.NoError(t, json.Unmarshal(out, &res))
assert.Equal(t, "infra_commit", res["failed_step"])
reached := res["reached"].([]any)
assert.Equal(t, []any{"create_repo", "mirror"}, reached)
require.Len(t, f.Calls, 3)
}
func TestProjectCreate_ValidationErrors(t *testing.T) {
f := &fakeGiteaMCP{}
skill := newSkill(t, f)
cases := []struct {
name string
body string
want string
}{
{"missing name", `{"description":"d","hypothesis":"h","stack":"go-agent"}`, "name"},
{"missing description", `{"name":"x","hypothesis":"h","stack":"go-agent"}`, "description"},
{"missing hypothesis", `{"name":"x","description":"d","stack":"go-agent"}`, "hypothesis"},
{"bad stack", `{"name":"x","description":"d","hypothesis":"h","stack":"python"}`, "stack"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
_, err := skill.Handle(context.Background(), "project_create", json.RawMessage(tc.body))
require.Error(t, err)
assert.True(t, strings.Contains(err.Error(), tc.want), "want %q in %v", tc.want, err)
})
}
assert.Empty(t, f.Calls, "no upstream calls should occur on validation failure")
}
func TestProjectCreate_UnknownTool(t *testing.T) {
f := &fakeGiteaMCP{}
skill := newSkill(t, f)
_, err := skill.Handle(context.Background(), "nope", happyArgs())
require.Error(t, err)
}


@@ -0,0 +1,90 @@
// Package project implements the `project_create` MCP tool: a single-call
// pipeline that creates a Gitea repo from a template, configures push-mirror
// to GitHub, commits a staging namespace manifest to the infra repo, and
// opens an experiment-brief issue on the new repo. See hyperguild gitea
// issue #10 for the design.
package project
import (
"context"
"encoding/json"
"github.com/mathiasbq/supervisor/internal/mcpclient"
"github.com/mathiasbq/supervisor/internal/registry"
)
// Config holds the orchestration dependencies for the project skill.
type Config struct {
// Client talks to the gitea-mcp server. project_create makes 4 sequential
// calls (create_project_from_template, repo_mirror_push, file_write_branch,
// issue_create) through this client.
Client *mcpclient.Client
// GiteaOwner is the org/user that owns the new repo and the infra repo
// the namespace manifest is committed to (typically "mathias").
GiteaOwner string
// GitHubOwner is the GitHub org/user the push-mirror targets
// (typically "mathiasb").
GitHubOwner string
// GitHubPAT is the personal access token used as the push-mirror
// password. Must have `repo` scope. Never logged.
GitHubPAT string
// InfraRepo is the name of the infra repo on Gitea where the
// k3s/staging/<name>/namespace.yaml manifest gets committed
// (typically "infra").
InfraRepo string
}
// Skill exposes project_create as an MCP tool.
type Skill struct{ cfg Config }
// New constructs the project Skill.
func New(cfg Config) *Skill { return &Skill{cfg: cfg} }
// Name returns the skill identifier.
func (s *Skill) Name() string { return "project" }
// Tools returns the MCP tool definitions for this skill.
func (s *Skill) Tools() []registry.ToolDef {
schema, _ := json.Marshal(map[string]any{
"type": "object",
"properties": map[string]any{
"name": map[string]any{
"type": "string",
"pattern": `^[a-z][a-z0-9-]{1,38}[a-z0-9]$`,
"description": "Lowercase repo name. 3-40 chars, must start with a letter.",
},
"description": map[string]any{"type": "string"},
"hypothesis": map[string]any{"type": "string"},
"folder": map[string]any{
"type": "string",
"description": "Informational only — appears in next_steps. Example: AGENTS, AI, QKX.",
},
"stack": map[string]any{
"type": "string",
"enum": []string{"go-agent", "go-web"},
"description": "Selects template-go-agent or template-go-web.",
},
"private": map[string]any{"type": "boolean"},
},
"required": []string{"name", "description", "hypothesis", "stack"},
})
return []registry.ToolDef{
{
Name: "project_create",
Description: "Bootstrap a new project: Gitea repo from template, GitHub push-mirror, staging namespace manifest, experiment-brief issue. The repo, mirror, and manifest steps are idempotent (an existing target is treated as success); re-running on an existing repo opens a fresh brief issue.",
InputSchema: schema,
},
}
}
// Handle dispatches the tool call.
func (s *Skill) Handle(ctx context.Context, tool string, args json.RawMessage) (json.RawMessage, error) {
if tool != "project_create" {
return nil, errUnknownTool(tool)
}
return s.handleCreate(ctx, args)
}
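The input schema declares a `pattern` for `name`, while `validate` in this commit only rejects an empty name. Whether the registry enforces `InputSchema` before `Handle` runs is not visible in this diff. Since `name` is interpolated into the `next_steps` shell string and the manifest path, a server-side check along the following lines could be worth considering. This is a sketch only; `validName` is a hypothetical helper and the pattern is copied from the schema above.

```go
package main

import (
	"fmt"
	"regexp"
)

// namePattern is copied from the project_create input schema.
var namePattern = regexp.MustCompile(`^[a-z][a-z0-9-]{1,38}[a-z0-9]$`)

// validName reports whether name satisfies the schema pattern:
// 3-40 chars, lowercase, starts with a letter, ends with a letter or digit.
func validName(name string) bool {
	return namePattern.MatchString(name)
}

func main() {
	for _, n := range []string{"my-experiment", "ab", "My-Repo", "trailing-", "a;rm -rf"} {
		fmt.Printf("%-17q %v\n", n, validName(n))
	}
}
```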