Drop the three-layer Claude subprocess orchestration (local model →
Claude verifier → cloud escalation). Skills now call LiteLLM directly
and return plain text to Claude Code, which decides what to do with it.

- Delete executor, orchestrator, verifier, result, attempts packages
- Simplify LiteLLMExecutor: Run(Request) → Result becomes
  Complete(model, sys, user) → (string, int64, error)
- Replace ExecutorFn with CompleteFunc in all 6 skill configs
- Rewrite all skill handlers to call Complete and return
  {"text", "model", "duration_ms"}
- Simplify config/models: remove Verifier/LlamaSwapURL, add ModelFor
- Bump version to v0.5.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
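
For readers skimming the change, the new handler shape described in the commit message might look roughly like the sketch below. Only the Complete signature and the three JSON keys come from the message; the names skillResult, handleSkill and fakeComplete are hypothetical, and the fake stands in for the real LiteLLM call so the example runs offline.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CompleteFunc follows the signature named in the commit message:
// Complete(model, sys, user) -> (text, duration in ms, error).
type CompleteFunc func(model, system, user string) (string, int64, error)

// skillResult is a hypothetical struct for the {"text","model","duration_ms"} payload.
type skillResult struct {
	Text       string `json:"text"`
	Model      string `json:"model"`
	DurationMS int64  `json:"duration_ms"`
}

// handleSkill is a hypothetical handler: one Complete call, then the plain
// result is returned for the caller (Claude Code) to act on.
func handleSkill(complete CompleteFunc, model, system, user string) ([]byte, error) {
	text, ms, err := complete(model, system, user)
	if err != nil {
		return nil, fmt.Errorf("complete with %s: %w", model, err)
	}
	return json.Marshal(skillResult{Text: text, Model: model, DurationMS: ms})
}

// fakeComplete is a stand-in for the LiteLLM call (assumption: the real
// implementation performs an HTTP request and measures elapsed time).
func fakeComplete(model, system, user string) (string, int64, error) {
	return "echo: " + user, 12, nil
}

func main() {
	out, err := handleSkill(fakeComplete, "ollama/phi4", "You write specs.", "hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Passing the completion function in as a value keeps the handler free of verifier or escalation logic, which matches the stated intent of leaving that decision to Claude Code.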
26 lines
492 B
YAML
# Model selection — first entry per skill is used.
# Override per-call by passing model in the MCP tool args.

default_chain:
  - ollama/qwen3-coder-30b-tuned

skills:
  tdd:
    chain:
      - ollama/qwen3-coder-30b-tuned
  review:
    chain:
      - ollama/devstral-tuned
  debug:
    chain:
      - ollama/deepseek-r1-tuned
  spec:
    chain:
      - ollama/phi4
  retrospective:
    chain:
      - ollama/qwen3-coder-30b-tuned
  trainer:
    chain:
      - ollama/qwen3-coder-30b-tuned
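
The selection rule in the file comments (first entry per skill, with a per-call override) could be implemented along the lines of the sketch below. The commit names ModelFor but does not show its signature, so the signature here, the Models and SkillConfig types, and the fallback to default_chain for an unlisted skill are all assumptions. The config is built as a literal rather than parsed from YAML to keep the example free of third-party packages.

```go
package main

import "fmt"

// SkillConfig and Models are hypothetical types mirroring the YAML layout.
type SkillConfig struct {
	Chain []string
}

type Models struct {
	DefaultChain []string
	Skills       map[string]SkillConfig
}

// ModelFor returns the model to use for a skill (assumed signature).
// Order: explicit per-call override, then the skill's first chain entry,
// then the first default_chain entry (this fallback is an assumption).
func (m Models) ModelFor(skill, override string) string {
	if override != "" {
		return override
	}
	if s, ok := m.Skills[skill]; ok && len(s.Chain) > 0 {
		return s.Chain[0]
	}
	if len(m.DefaultChain) > 0 {
		return m.DefaultChain[0]
	}
	return ""
}

// exampleModels reproduces part of the config above as a Go literal.
func exampleModels() Models {
	return Models{
		DefaultChain: []string{"ollama/qwen3-coder-30b-tuned"},
		Skills: map[string]SkillConfig{
			"review": {Chain: []string{"ollama/devstral-tuned"}},
			"debug":  {Chain: []string{"ollama/deepseek-r1-tuned"}},
			"spec":   {Chain: []string{"ollama/phi4"}},
		},
	}
}

func main() {
	m := exampleModels()
	fmt.Println(m.ModelFor("review", ""))          // skill's first entry
	fmt.Println(m.ModelFor("review", "ollama/phi4")) // per-call override wins
	fmt.Println(m.ModelFor("unknown", ""))         // falls back to default_chain
}
```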