Retry Policy
Every LLM call is wrapped in an exponential-backoff retry loop:
- 4 attempts by default (configurable) - 1 s → 32 s max delay (2ⁿ s × jitter) - Retries on HTTP 429, 500, 503 - Respects `Retry-After` response headers
Circuit Breaker
Per-provider circuit breaker with three states:
| State | Behaviour | |-------|----------| | CLOSED | Normal operation | | OPEN | Fast-fail all requests; no LLM calls | | HALF_OPEN | One probe request; success → CLOSED, failure → OPEN |
Threshold: 5 consecutive failures → OPEN. Cooldown: 60 s.
Supervisor Verdicts
When a supervisor check fails beyond its `retryBudget`, the engine chooses a verdict:
- RETRY — Re-run with injected corrective prompt - HANDOFF — Forward output to a specialist lane - ESCALATE — Halt DAG, fire `human-review-required` event