Implements escalation chains per skill with three-layer priority: 1. Caller override (model param) — no escalation 2. Per-skill chain from models.yaml 3. default_chain fallback New APIs: - Verifier() — fixed verifier for output validation - LlamaSwapURL() — base URL for warm-state probing - ChainFor(skill, override) — ordered model list for escalation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
42 lines
956 B
YAML
42 lines
956 B
YAML
# Model routing chains — three-layer priority:
|
|
# 1. model param in MCP tool call (caller override — collapses to single entry, no escalation)
|
|
# 2. per-skill chain here
|
|
# 3. default_chain fallback
|
|
|
|
verifier: claude-sonnet-4-6 # fixed verifier for all local tiers
|
|
|
|
llama_swap_url: http://koala:8080 # for warm-state probing
|
|
|
|
default_chain:
|
|
- ollama/qwen3-coder-30b-tuned
|
|
- claude-sonnet-4-6
|
|
|
|
skills:
|
|
tdd:
|
|
chain:
|
|
- ollama/qwen3-coder-30b-tuned
|
|
- claude-sonnet-4-6
|
|
review:
|
|
chain:
|
|
- ollama/devstral-tuned
|
|
- ollama/gemma4
|
|
- claude-sonnet-4-6
|
|
debug:
|
|
chain:
|
|
- ollama/deepseek-r1-tuned
|
|
- claude-sonnet-4-6
|
|
spec:
|
|
chain:
|
|
- ollama/phi4
|
|
- ollama/gemma4
|
|
- claude-sonnet-4-6
|
|
- claude-opus-4-6
|
|
retrospective:
|
|
chain:
|
|
- ollama/qwen3-coder-30b-tuned
|
|
- claude-sonnet-4-6
|
|
trainer:
|
|
chain:
|
|
- ollama/qwen3-coder-30b-tuned
|
|
- claude-sonnet-4-6
|