ADR-004: Budget Enforcement at the Runtime Level¶
Status: Accepted
Date: 2026-06-06
Context¶
Long-running agents can loop indefinitely, exhaust API budgets, or consume unbounded wall-clock time. Responders should not need to self-police — they should focus on domain logic.
Decision¶
The BudgetEnforcer sits inside Runtime and checks four limits after every event:
max_events: total events appended to the ledgermax_llm_cost_usd: cumulative cost tracked by summingcost_usdfromllm.responseeventsmax_seconds: wall-clock elapsed since the run startedmax_depth: maximum causal depth (longest chain ofcauseedges)
When any limit is breached, the runtime emits a BUDGET_EXHAUSTED event, appends it to the ledger, applies it to the world (setting world.context["budget_exhausted"] = True), and halts the event loop. No responder is called after exhaustion.
Consequences¶
Positive:
- Responders are never responsible for their own termination.
- Budget exhaustion is auditable: the BUDGET_EXHAUSTED event is in the ledger with the reason and current counters.
- Callers can inspect world.context.get("budget_exhausted") to determine if a run completed normally or was truncated.
- LLM cost tracking is automatic: any responder that emits llm.response with cost_usd contributes to the budget.
Negative:
- Budget check is per-event, not per-token or per-wall-clock-tick. A single slow LLM call that takes 5 minutes is not interrupted mid-call — the budget is checked when it returns.
- The max_llm_cost_usd limit only works if executors faithfully emit llm.response with accurate cost_usd. A rogue responder that makes direct LLM calls without emitting events bypasses cost tracking.
Defaults¶
Budget() defaults: max_events=1000, max_llm_cost_usd=10.0, max_seconds=300, max_depth=50. All are overridable at Runtime construction time.