refactor: replace orchestrator/verifier chain with direct LiteLLM calls

Drop the three-layer Claude subprocess orchestration (local model → Claude verifier → cloud escalation). Skills now call LiteLLM directly and return plain text to Claude Code, which decides what to do with it.

- Delete executor, orchestrator, verifier, result, attempts packages
- Simplify LiteLLMExecutor: Run(Request)→Result becomes Complete(model,sys,user)→(string,int64,error)
- Replace ExecutorFn with CompleteFunc in all 6 skill configs
- Rewrite all skill handlers to call Complete and return {"text","model","duration_ms"}
- Simplify config/models: remove Verifier/LlamaSwapURL, add ModelFor
- Bump version to v0.5.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 16:19:09 +02:00
parent 823de23213
commit ce45592730
34 changed files with 266 additions and 1432 deletions
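The handler shape described in the commit message can be sketched roughly as follows. Only the Complete signature and the three JSON keys come from the message; the names `SkillResult`, `HandleSkill`, and `fakeComplete` are hypothetical and exist only for illustration, so the real handlers may be structured differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CompleteFunc follows the signature named in the commit message:
// Complete(model, sys, user) -> (text, duration in ms, error).
type CompleteFunc func(model, sys, user string) (string, int64, error)

// SkillResult is a hypothetical struct for the payload
// {"text","model","duration_ms"} that handlers return.
type SkillResult struct {
	Text       string `json:"text"`
	Model      string `json:"model"`
	DurationMS int64  `json:"duration_ms"`
}

// HandleSkill is a hypothetical handler: it makes one direct call to
// complete and returns the JSON payload, with no verifier or escalation step.
func HandleSkill(complete CompleteFunc, model, sys, user string) ([]byte, error) {
	text, ms, err := complete(model, sys, user)
	if err != nil {
		return nil, fmt.Errorf("complete with %s: %w", model, err)
	}
	return json.Marshal(SkillResult{Text: text, Model: model, DurationMS: ms})
}

// fakeComplete is a stand-in for the LiteLLM-backed implementation
// (an assumption, so the sketch can be exercised without a network call).
func fakeComplete(model, sys, user string) (string, int64, error) {
	return "echo: " + user, 12, nil
}
```

Calling `HandleSkill(fakeComplete, "ollama/phi4", "sys", "hi")` yields a single JSON object with the three keys, which the caller (Claude Code, per the message) is then free to accept or discard.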
--- a/config/models.yaml
+++ b/config/models.yaml
@@ -1,41 +1,25 @@
-# Model routing chains — three-layer priority:
-# 1. model param in MCP tool call (caller override — collapses to single entry, no escalation)
-# 2. per-skill chain here
-# 3. default_chain fallback
-
-verifier: claude-sonnet-4-6   # fixed verifier for all local tiers
-
-llama_swap_url: http://koala:8080   # for warm-state probing
+# Model selection — first entry per skill is used.
+# Override per-call by passing model in the MCP tool args.

 default_chain:
  - ollama/qwen3-coder-30b-tuned
-  - claude-sonnet-4-6

 skills:
  tdd:
    chain:
      - ollama/qwen3-coder-30b-tuned
-      - claude-sonnet-4-6
  review:
    chain:
      - ollama/devstral-tuned
-      - ollama/gemma4
-      - claude-sonnet-4-6
  debug:
    chain:
      - ollama/deepseek-r1-tuned
-      - claude-sonnet-4-6
  spec:
    chain:
      - ollama/phi4
-      - ollama/gemma4
-      - claude-sonnet-4-6
-      - claude-opus-4-6
  retrospective:
    chain:
      - ollama/qwen3-coder-30b-tuned
-      - claude-sonnet-4-6
  trainer:
    chain:
      - ollama/qwen3-coder-30b-tuned
-      - claude-sonnet-4-6
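The selection rule stated in the new config comment (first entry per skill, overridable per call) might look like the sketch below. The commit only names `ModelFor`; its signature, the struct names, and the fallback to `default_chain` when a skill has no chain are all assumptions made for illustration.

```go
package main

// SkillConfig and ModelsConfig are hypothetical in-memory forms of
// config/models.yaml after this change.
type SkillConfig struct {
	Chain []string
}

type ModelsConfig struct {
	DefaultChain []string
	Skills       map[string]SkillConfig
}

// ModelFor picks the model for a skill (assumed signature):
//  1. a non-empty override from the MCP tool args wins,
//  2. otherwise the first entry of the skill's chain,
//  3. otherwise the first entry of default_chain (assumed fallback).
// It returns "" when nothing is configured.
func (c ModelsConfig) ModelFor(skill, override string) string {
	if override != "" {
		return override
	}
	if s, ok := c.Skills[skill]; ok && len(s.Chain) > 0 {
		return s.Chain[0]
	}
	if len(c.DefaultChain) > 0 {
		return c.DefaultChain[0]
	}
	return ""
}
```

Under these assumptions, later entries in a chain would be ignored, which matches the diff trimming each chain to a single model.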