Chapter 5 – The Ouroboros Protocol (Why Loops Converge)
Start with a convergence trace:
attempt 1 -> FAIL schema.required_field_missing(path=items[0].country)
attempt 2 -> FAIL scope.out_of_allowlist(path=docs/runbook.md)
attempt 3 -> FAIL unit_test.tax_rounding
attempt 4 -> PASS
That is the whole chapter in four lines. The loop is not just “trying again.” It is tightening against specific deterministic findings until it reaches a pass state or a circuit breaker says stop.
In Part I, you built the loop: propose a change, validate it, and feed deterministic failures into the next attempt.
Chapter 5 explains why that loop can settle instead of wandering. We
call that pattern the Ouroboros Protocol, with a deliberate nod to the
myth of the ouroboros, the serpent that eats its own tail: the loop
feeds its own output back into the next turn and evolves as it goes. It
is not a static circle, but a self-correcting one. Each attempt uses the
last candidate and the last validator findings as input to the next one,
and each Mission still ends in a small set of deterministic outcomes:
PASS and commit, FAIL and revert, or escalate
with evidence.
That is the difference between a useful retry loop and blind sampling. The system is not just “trying again.” It is retrying against a shrinking set of explicit constraints.
Ouroboros Protocol (Write → Judge → Refine)
This is the retry loop behind reliable agent work: produce one bounded candidate, run deterministic checks, then retry only against the exact failure signal. Repeat until the checks pass or the circuit breaker stops the run.
Two properties make this loop more than “retry until it passes”:
Self-reference is a feature. The loop feeds its own artifacts back in: the last diff, the last validator failures, the last scope decision, and the last trace. It is not “try again.” It is “try again with the exact evidence of what failed.”
Maintenance is drift repair. Over time, the Terrain changes and the Map falls behind: docs, inventories, indexes, and rules drift. Ouroboros is the small repair loop that turns that drift into a bounded fix under deterministic checks (Physics), whether the diff touches code (Terrain) or a Map surface.
The core insight: you do not need a deterministic large language model (LLM). You need a deterministic process around a stochastic engine.
Traceability Needs Convergence
The classical V-model gives you traceability between declared intent and the checks that prove correspondence. Ouroboros adds the missing control law: deterministic findings, bounded retries, and explicit stop conditions that turn a stochastic implementation step into a converging process.
That distinction matters because a traceable loop can still thrash. Ouroboros is the mechanism that shrinks failure against exact evidence until the candidate reaches an admissible state or stops with a deterministic reason.
The Core Loop: Write → Judge → Refine
At its heart, Ouroboros is a feedback loop with three steps:
Write: The model gets the mission, scope, and prior findings, then produces one candidate output. Since LLMs are stochastic engines (Chapter 4), this step is inherently probabilistic. Even with identical inputs, the output might vary between runs.
Judge: Run deterministic validators. In this book, the checker stack is Physics and this decision stage is the Judge. It produces structured findings and the next move (commit, refine, revert, or defer).
Refine: Feed the exact findings back into the next attempt, with scope and output constraints unchanged.
This cycle repeats until the Judge reports success, or a
circuit breaker is tripped.
The Loop as a State Machine (what persists, what resets)
Ouroboros sounds mystical in prose. In implementation, it is a small state machine with a short memory.
If you’re implementing this in a real repository, build it in layers:
- One-shot candidate + Physics: generate once, validate once, stop.
- Strict parsing first: fail fast on invalid diff/JSON before expensive checks.
- Structured findings feedback: retry only with specific, machine-readable failures.
- Budgets + escalation: stop deterministically when progress stalls; route hard cases to humans.
flowchart TD
Start([START]) --> Prep["PREP<br/>- build slice<br/>- compile authority<br/>- select validators<br/>- set budgets"]
Prep --> Write["WRITE (Effector)<br/>- propose candidate<br/>(diff / JSON / etc.)"]
Write --> Judge["JUDGE (Physics)<br/>- parse + scope + policy<br/>- run validators"]
Judge --> Pass{PASS?}
Pass -- yes --> Commit["COMMIT + TRACE"] --> End([END])
Pass -- no --> Budgets{"BUDGETS / PROGRESS OK?"}
Budgets -- yes --> Refine["REFINE FEEDBACK<br/>+ UPDATE STATE"] --> Write
Budgets -- no --> Escalate["ESCALATE / DEFER + TRACE"] --> End
Same loop, rendered as implementation pseudocode:
state = init_state(mission, scope, budgets)
while True:
    candidate = write(state.slice, state.mission, state.findings)
    parsed = parse_candidate(candidate)
    if parsed == invalid:
        return escalate("parse_shape_failure", trace=state.trace + [candidate])
    findings = judge(parsed, state.mission, state.scope)
    if findings == []:
        return commit(candidate, trace=state.trace + [candidate])
    if should_stop(state, findings):
        return escalate("non_converging_or_budget_exhausted", findings=findings, trace=state.trace + [candidate])
    state = refine(state, candidate, findings)
State (what the loop remembers)
There are two kinds of memory in Ouroboros:
- Ledger memory (for audit): the full trace (candidates, findings, timings, budgets).
- Working memory (for the next attempt): a compact bundle used to generate the next candidate.
Keep that working memory small and deterministic. Do not feed the entire run history back into the model request.
Practical state bundle per attempt:
- mission: the Mission Object (authority)
- scope: allowlist/denylist + allowed edit regions (policy)
- budgets: remaining attempts, time, spend, diff/scope limits
- candidate: the last candidate artifact (usually a diff, not the whole repo)
- findings: the last structured Judge output (what failed, where, why)
- history_signature: a tiny ring buffer of (candidate_hash, findings_signature) for oscillation detection
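To make the bundle concrete, here is a minimal sketch of that per-attempt working memory as a Python dataclass. The field names mirror the list above; the concrete types and the ring-buffer size are illustrative assumptions, not a fixed contract.

```python
from dataclasses import dataclass, field
from collections import deque

@dataclass
class LoopState:
    """Working memory for one Ouroboros attempt (field types are illustrative)."""
    mission: dict                  # Mission Object (authority)
    scope: dict                    # allowlist/denylist + allowed edit regions (policy)
    budgets: dict                  # remaining attempts, time, spend, diff/scope limits
    candidate: str = ""            # last candidate artifact (usually a diff)
    findings: list = field(default_factory=list)  # last structured Judge output
    # tiny ring buffer of (candidate_hash, findings_signature) for oscillation checks
    history_signature: deque = field(default_factory=lambda: deque(maxlen=4))

state = LoopState(mission={"id": "M-1"},
                  scope={"write_allowlist": ["src/"]},
                  budgets={"attempts": 3})
state.history_signature.append(("abc123", ("unit_test.tax_rounding",)))
```

Everything else about the run (full candidates, timings, budget spend) belongs in the ledger trace, not in this bundle.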
Feedback is constraint (the model isn’t “learning”)
Inside one Ouroboros run, the model is not getting smarter. Its weights do not change. The loop gets better only by adding explicit, deterministic constraints.
The difference between a good loop and a bad one is the feedback contract:
- Good feedback is structured, specific, and bounded (“edit only these files; fix only these findings; output only a diff”).
- Bad feedback is a paste of logs and vibes (“it failed, try again”) that encourages scope leak and churn.
Illustrative feedback template (diff-shaped work):
ROLE: Effector (component, not chat)
OUTPUT: Unified diff only. No prose.
## Mission (authority)
{mission_summary}
## Scope (policy)
- write_allowlist: {allowlist}
- denylist: {denylist}
- allowed_edit_regions: {regions}
## Previous candidate (artifact)
{previous_diff}
## Deterministic findings (Physics)
{findings_json}
## Instructions
- Fix ONLY the recorded findings.
- Do not change files outside the allowlist.
- Keep the diff minimal and localized.
If your work is JSON-only, swap “Unified diff only” for “Valid JSON only” and make strict parsing the first gate.
Convergence vs. Thrashing
Convergence means the failing set shrinks to zero and the loop exits
with PASS. Thrashing means the loop keeps moving without
getting closer: signatures repeat or scope expands while the same
failures remain.
Convergence signals in practice
In real traces, convergence usually has three visible signals:
- the failing validator set shrinks across attempts
- the diff gets more localized (fewer files, smaller edit regions)
- retries shift from structural fixes to small semantic fixes
You do not need perfectly monotonic progress every attempt, but you should see net progress inside a bounded window.
A common starting point is a 3-attempt progress window: require either a smaller failing-code set or a smaller authorized diff surface inside that window. If neither changes, classify the run as non-converging and escalate.
Attractors: the region that counts as “done”
Convergence is not luck. It comes from defining what a successful candidate looks like.
An attractor is the set of candidates your checks would accept. In this book’s terms, it is the region where the candidate parses cleanly, stays in scope, respects budgets, and passes every validator.
You can picture it as a valley in the space of possible diffs: once a candidate lands inside it, the validators stop pushing it back out.
If you prefer a less poetic definition:
admissible(candidate) =
    parse_ok
    AND in_scope
    AND budgets_ok
    AND all_validators_pass
That boolean predicate defines the attractor. The loop settles when successive candidates land in the same admissible region and stop moving.
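As runnable code, the predicate is only a few lines. This sketch assumes the Judge has already produced a small report dict; the field names are illustrative, and "all validators pass" is expressed as an empty failing set.

```python
def admissible(report: dict) -> bool:
    """The attractor as a boolean predicate over one Judge report (illustrative fields)."""
    return (report["parse_ok"]
            and report["in_scope"]
            and report["budgets_ok"]
            and not report["failing_validator_codes"])  # empty set == all validators pass

report = {"parse_ok": True, "in_scope": True, "budgets_ok": True,
          "failing_validator_codes": set()}
assert admissible(report)

report["failing_validator_codes"] = {"unit_test.tax_rounding"}
assert not admissible(report)
```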
This framing is useful because it turns “the model is being weird” into a design diagnosis:
Attractor too flat (constraints too loose): too many candidates are admissible, but there is no canonical shape. The loop wanders: formatting churn, reordering, and needless variation. Fix it by tightening the pass region: smaller allowed edit regions, stricter output schemas, canonical formatting, and stronger validators that reject variance you do not want to review.
No attractor (constraints too tight or contradictory): no candidate can satisfy all gates at once. The loop cycles until a circuit breaker fires. Fix it by making PASS reachable: correct conflicting rules, include the missing contract in the slice (schema, interface, policy), or split the Mission into smaller steps.
When you debug thrash, ask one question first: did I define
an attractor that actually exists? In plain language: is there
a reachable PASS state inside the declared scope and
budgets?
Thrash debug checklist (Attractor):
- Reachability: can any candidate pass all gates inside the declared scope + budgets? (Look for contradictory rules, missing contracts in the slice, or an impossible acceptance test.)
- Flatness: are too many candidates admissible? (Tighten the allowed edit region, output schema, canonical formatting, and validators that reject churn.)
- Signal quality: does the Judge output point to specific files/lines/checks? (Vague errors produce wander; fix validators to emit structured failures.)
- Circuit breakers: are iteration/time budgets and “minimum progress” checks enforced?
Convergence criteria (heuristics you can implement)
PASS is the termination condition, but it is not enough
to steer a loop. For guidance you need heuristics: small, deterministic
checks that answer a simpler question: “are we getting closer?”
This is the Ratchet Principle inside Ouroboros: once you can measure progress or safety, you stop letting it slip. (Chapter 11 applies the same idea to repo-wide quality metrics.)
1) Reachability: prove PASS is possible
Before you spend retries, ask if the attractor exists at all. In
plain language: can any candidate actually reach PASS under
the declared scope and budgets?
- Spec feasibility: does the Mission describe an achievable change inside the declared scope and budgets?
- Contract presence: does the slice include the schema/interface/policy the loop must satisfy, or is it guessing?
- Human reachability check: can a human-written candidate pass the same gates without changing the gates?
If the answer is “no,” retries are just burning budget. Split the Mission, fix the slice, or fix the conflicting rules.
2) Minimum progress: require monotonic improvement (or stop)
Progress doesn’t have to be strictly monotonic every iteration, but it should be monotonic within a window.
A practical “distance to attractor” signature:
- parse_ok (bool)
- in_scope (bool)
- policy_ok (bool)
- failing_validator_codes (set)
- diff_size (lines changed, or a coarse bucket)
Then enforce a simple ratchet:
- Never accept a candidate that increases blast radius (new files, new directories, broader scope) unless the Mission explicitly authorizes it.
- Require the failing set to shrink (or change in a clearly “closer” direction) within W attempts.
- If you keep seeing the same failure signature, stop and escalate: you don’t have a signal-rich path to convergence.
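A minimal sketch of the windowed progress check, assuming you record the failing validator set and diff size per attempt; the 3-attempt window and the "smaller failing set or smaller diff" rule are illustrative defaults.

```python
def window_progress(failing_sets, diff_sizes, window=3):
    """Net progress inside a bounded window: the failing set shrank
    or the diff surface shrank. Returns False when the loop has stalled."""
    if len(failing_sets) < window:
        return True  # not enough history yet to classify the run
    recent_fail = failing_sets[-window:]
    recent_diff = diff_sizes[-window:]
    return (len(recent_fail[-1]) < len(recent_fail[0])
            or recent_diff[-1] < recent_diff[0])

# converging: failing set shrinks 5 -> 3 -> 1
assert window_progress([{"a", "b", "c", "d", "e"}, {"a", "b", "c"}, {"a"}],
                       [40, 20, 10])
# stalled: same failures, same diff size -> stop and escalate
assert not window_progress([{"a"}, {"a"}, {"a"}], [10, 10, 10])
```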
3) Oscillation detection: catch A↔B ping-pong
Thrash is often a two-state loop: fix A, break B, fix B, break A.
Detect it mechanically by keeping a tiny ring buffer of recent signatures:
signature = (diff_hash, sorted(failing_validator_codes))
if signature repeats within last K attempts:
    abort("oscillation detected; split mission or tighten slice")
This turns “it feels stuck” into a deterministic stop condition.
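Here is one way to make that stop condition executable; the hash truncation and the buffer size K are illustrative choices.

```python
import hashlib
from collections import deque

def signature(diff_text, failing_codes):
    """(diff_hash, sorted failing codes): the thrash fingerprint for one attempt."""
    diff_hash = hashlib.sha256(diff_text.encode()).hexdigest()[:12]
    return (diff_hash, tuple(sorted(failing_codes)))

class OscillationDetector:
    def __init__(self, k=4):
        self.recent = deque(maxlen=k)  # ring buffer of the last K signatures

    def record(self, sig):
        """Return True when this signature repeats within the last K attempts."""
        repeated = sig in self.recent
        self.recent.append(sig)
        return repeated

det = OscillationDetector(k=4)
sig_a = signature("fix A", {"test_b"})
sig_b = signature("fix B", {"test_a"})
assert det.record(sig_a) is False
assert det.record(sig_b) is False
assert det.record(sig_a) is True  # A -> B -> A ping-pong detected
```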
4) Failure classes: don’t treat all failures as equal
Not all FAIL signals deserve the same response.
- Parse/shape failures (invalid diff/JSON): tighten the output contract and fail fast; don’t run expensive validators.
- Scope/policy failures (out-of-allowlist, touched protected paths): treat as governance, not “try again.”
- Semantic failures (violated invariants): retry only if the findings are specific enough to fix deterministically; otherwise escalate.
- Infrastructure failures (rate limits, timeouts): backoff and retry, but do not conflate these with “candidate quality.”
The goal is a loop that is strict about boundaries, cheap about early failure, and honest about when a human needs to intervene.
Failure routing matrix (fast policy)
Codify the first response per failure class so retries stay deterministic:
| Failure class | First response | Retry policy |
|---|---|---|
| Parse/shape (invalid diff, invalid JSON) | tighten output contract, fail fast | retry after contract tightening |
| Scope/policy (out_of_allowlist) | reject candidate, keep mission scope fixed | retry only with stricter scope |
| Semantic validator failures | feed exact findings back to Refine | retry within progress window |
| Infra/transient (timeout, rate limit) | backoff, preserve last good state | retry with bounded attempts |
This prevents a common anti-pattern: treating every FAIL
as equivalent and blindly retrying.
Practical stop defaults
If you do not have better repo-specific numbers yet, start here:
| Control | Default | Why |
|---|---|---|
| max_iterations | 3 | Enough to use structured feedback once or twice without hiding thrash |
| progress window | 3 attempts | Long enough to see net improvement, short enough to stop budget burn |
| max_files_changed | 3 for bounded missions | Keeps blast radius small while the loop is immature |
| max_lines_changed | 120 | Forces splitting before changes become hard to review |
| repeated failure signature | 2 identical signatures | Signals the loop is stuck, not learning |
| infra retry budget | 2 retries with backoff | Enough for transient noise without masking system issues |
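These defaults can live in one config dict checked before every retry; this is a sketch, with illustrative field names and signature.

```python
DEFAULTS = {
    "max_iterations": 3,
    "progress_window": 3,
    "max_files_changed": 3,
    "max_lines_changed": 120,
    "repeated_signature_limit": 2,
    "infra_retry_budget": 2,
}

def breaker_tripped(attempt, files_changed, lines_changed, signature_repeats,
                    cfg=DEFAULTS):
    """Return the name of the first tripped circuit breaker, or None to continue."""
    if attempt > cfg["max_iterations"]:
        return "max_iterations"
    if files_changed > cfg["max_files_changed"]:
        return "max_files_changed"
    if lines_changed > cfg["max_lines_changed"]:
        return "max_lines_changed"
    if signature_repeats >= cfg["repeated_signature_limit"]:
        return "repeated_failure_signature"
    return None

assert breaker_tripped(2, 1, 40, 0) is None
assert breaker_tripped(4, 1, 40, 0) == "max_iterations"
```

The returned breaker name goes straight into the escalation trace, so every stop has a deterministic reason.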
Loop latency: fast local Judges vs. slow remote Judges
This chapter can sound unrealistically snappy if your full integration suite takes 20-30 minutes. In many real systems, that is normal. If every retry waits on the slowest gate, the loop dies of latency.
The fix is not to drop the deep checks. The fix is to tier the Judge:
- Fast local Judge: parse/shape checks, scope/policy checks, lint, types, unit tests, narrow contract checks. These are the signals you can afford to run on every retry.
- Slow remote Judge: integration suites, staging smoke tests, performance/security scans, or checks that require real infrastructure. These are promotion gates, not inner-loop feedback on every attempt.
This is also one reason the book keeps pushing bounded units. Breaking a monolith into smaller, explicit surfaces is not only a context optimization. It is also a testing and verification optimization: smaller units give you smaller validator surfaces, faster local checks, and a much better chance that the inner loop can run at human-useful speed.
One practical pattern:
- Retry only against the fast local Judge.
- Run the slow remote Judge only after the candidate is locally admissible.
- If the slow Judge fails, feed back one structured summary, then either retry with a narrower slice or escalate.
- If the same slow gate keeps failing, stop pretending it is a cheap retry loop and change the slice, the Mission, or the validator layout.
In other words: fast local validation keeps the loop alive; slow remote validation keeps the system honest. Later chapters split this out into governance and CI policy. The key point here is simpler: Ouroboros survives enterprise latency only when the Judge is staged, not monolithic.
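A minimal sketch of a staged Judge, assuming each check is a callable paired with a finding code; real gates would run parsers, linters, and test suites instead of the lambdas used here.

```python
def tiered_judge(candidate, fast_checks, slow_checks):
    """Run cheap local checks on every retry; pay for the slow gates only
    once the candidate is locally admissible. Returns a list of finding codes."""
    findings = [code for check, code in fast_checks if not check(candidate)]
    if findings:
        return findings  # inner-loop feedback: never hit the slow gates yet
    return [code for check, code in slow_checks if not check(candidate)]

# illustrative stand-ins for real validators
fast = [(lambda c: c.startswith("diff"), "parse.not_a_diff")]
slow = [(lambda c: "tax" in c, "integration.tax_suite")]

assert tiered_judge("not a diff", fast, slow) == ["parse.not_a_diff"]
assert tiered_judge("diff --git tax fix", fast, slow) == []
```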
Thrashing: deterministic stop signals
Treat thrash as a state classification, not a vibe:
- repeated (diff_hash, failing_validator_codes) signatures
- no shrinking failure set within your progress window
- expanding blast radius without better Judge output
- recurring scope/policy failures
When these signals appear, abort and escalate. Re-running unchanged inputs is budget burn, not progress. Typical fixes are: tighten the slice, tighten the acceptance criteria, or split the Mission into smaller units.
Escalate immediately when
- the same findings signature repeats without a smaller failing set
- the candidate violates scope or protected-path policy more than once
- the loop needs a constraint or validator that does not exist yet
- the diff keeps expanding while the Judge output stays equally vague
- a human approval surface is touched and the mission does not explicitly allow it
Circuit Breakers: Guardrails for Stochasticity
A loop that can retry can also thrash. Circuit breakers make failure cheap and deterministic.
- Iteration limit: stop after N attempts.
- Scope + diff budgets: stop when the candidate touches protected paths or exceeds file/line limits.
- Cost + time budgets: stop when you hit spend or wall-clock limits.
- Minimum progress: stop when the Judge signal isn’t improving (same errors, same churn).
- Review throughput limits (humans are scarce): cap how many agent PRs can be open / created per day so governance stays real.
- Deployment guardrails (when loops ship): validate in a sandbox, canary first, ship behind flags, and automate rollback on regressions.
Dry Runs (Plan Mode)
Before you call the model, run Prep only. A dry run
should print:
- the slice (what evidence is in-bounds)
- the validators that will gate the change
- an estimate of token/cost budget
The companion repo (github.com/kjwise/aoi_code) includes
a mission-dry-run target that demonstrates this. It reads a
Mission Object and prints the slice, validators, and budgets without
actually calling a model.
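If you want to build your own dry run before adopting the companion repo's target, a sketch like this is enough; the Mission Object field names and the tokens ≈ chars/4 estimate are illustrative assumptions, not the repo's actual schema.

```python
def dry_run(mission):
    """Prep only: report what the loop would use, without calling a model."""
    slice_files = mission.get("slice", {})  # path -> evidence text
    return {
        "slice": sorted(slice_files),
        "validators": mission.get("validators", []),
        # crude budget estimate: roughly 4 characters per token
        "budget_estimate_tokens": sum(len(t) // 4 for t in slice_files.values()),
    }

report = dry_run({"slice": {"src/tax.py": "def tax(x): ..."},
                  "validators": ["schema", "unit_test.tax_rounding"]})
assert report["slice"] == ["src/tax.py"]
assert report["validators"] == ["schema", "unit_test.tax_rounding"]
```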
Precedence Order (Cheap Checks First)
A practical check order:
- Parse/structure: if you can’t parse the candidate, you can’t judge it.
- Policy: scope allowlists, protected paths, diff budgets, review limits.
- Validation: run the Judge and collect signal-rich failures.
- Stop conditions: max iterations, time/cost limits, minimum-progress windows.
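The precedence order can be encoded as an ordered gate list that short-circuits on the first failing class, so expensive checks never run against malformed or out-of-scope candidates; the gate callables here are illustrative stand-ins.

```python
def judge_in_order(candidate, gates):
    """Cheap checks first: stop at the first gate class that produces findings."""
    for name, check in gates:  # gates are pre-sorted from cheapest to most expensive
        findings = check(candidate)
        if findings:
            return name, findings
    return "pass", []

gates = [
    ("parse",      lambda c: [] if isinstance(c, dict) else ["parse.invalid"]),
    ("policy",     lambda c: [] if c.get("path", "").startswith("src/")
                             else ["scope.out_of_allowlist"]),
    ("validation", lambda c: [] if c.get("tests_pass") else ["unit_test.failure"]),
]

assert judge_in_order("not json", gates) == ("parse", ["parse.invalid"])
assert judge_in_order({"path": "src/a.py", "tests_pass": True}, gates) == ("pass", [])
```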
The Economics of Determinism (Why Loop Budgets Matter)
The book has talked about ROI for building loops. You also need ROI for running them.
The right comparison is not “a loop costs tokens.” It is “a bounded retry loop costs tokens” versus “a human debugging session costs time” versus “a production mistake costs money and trust.”
Illustrative cost comparison (order of magnitude; plug in your numbers):
| Event | Typical cost |
|---|---|
| N retries inside a constrained loop | ~$0.50–$2.00 |
| One human debugging session | ~$100–$300 |
| One production incident | ~$5,000+ |
This is why circuit breakers matter. You want to spend a small, bounded budget before merge so you do not spend an unbounded budget after deploy.
Now widen the lens: your inference cost model is a strategic choice. It determines whether you can afford retries, strict gating, and background maintenance at scale.
| Inference model | Cost structure | Moat durability |
|---|---|---|
| Pay-per-token API | Variable, provider-controlled | Fragile |
| Fixed-cost SLA | Predictable, negotiated | Moderate |
| Self-hosted / owned inference | CapEx + energy, you control | Strong |
State the thesis explicitly:
A constrained loop with a weaker model can beat an unconstrained flagship model. Teams that control inference costs, including teams running open-weight models behind validated loops, can outcompete teams renting flagship models without constraints. The model is the engine; the loop is the vehicle. The loop is also where the value compounds: once it exists, new tasks can reuse the same scope controls, validators, and audit trail.
“Vehicle” is not poetry here. It cashes out into governance: your loop defines scope, budgets, validation, and audit evidence. That is what lets you scale autonomy without scaling incidents.
Compressed Trace Readout
You can classify loop behavior quickly from a short trace:
| attempt | failing codes | diff scope | classification |
|---|---|---|---|
| 1 | 5 | 4 files | initial |
| 2 | 3 | 2 files | converging |
| 3 | 1 | 1 file | converging |
| 4 | 0 | 1 file | PASS |
A thrashing trace tends to alternate signatures
(A -> B -> A) or keep the same failing set while
scope expands. That is your signal to stop retries and change inputs,
not keep sampling.
Actionable: What you can do this week
Pick a small, schema-driven generation task in your current work (e.g., generating a JSON configuration file, a SQL query based on a schema, or a simple code snippet that must pass a linter).
Define your deterministic checker: Write a simple tool or use an existing validator (like a JSON schema validator, a Kubernetes schema validator such as kubeconform, or a linter/type checker) that can deterministically validate the output of your chosen task. In this chapter, that checker is the Judge, and it should return clear, structured errors.
Build a manual Ouroboros loop:
- Write a Mission Object: an explicit task contract for the model.
- Run your Judge against the output.
- If it fails, paste the Judge output into the next request and ask for a fix that addresses only the recorded failures.
- Repeat until it converges or you hit a manual max_iterations limit.
Debug thrashing as a reachability problem: If you cannot converge, do not “try harder.” Check reachability:
- Can a human produce a PASS artifact under the same scope and budgets?
- Do your Validators conflict (two rules that cannot both be true)?
- Are you missing the contract in your slice (schema/interface/policy), so the loop is guessing?
This exercise makes the core point tangible: deterministic checks drive a stochastic model toward something reviewable and safe.