Chapter 0 – Introduction
Friday, 4:03 PM.
You use an AI assistant to ship a feature fast:
“Add pagination to GET /api/users.”
It updates the handler, tweaks a test, edits the API contract (OpenAPI), and slips in a refactor you did not ask for.
The diff looks reasonable. CI passes. You merge it.
Velocity is high. Friction is low. Your confidence is mostly based on how reasonable the diff looks.
Saturday morning, your mobile app is down.
The assistant introduced contract drift: `limit` became a string in the OpenAPI spec while the rest of the system still treats it as an integer.
--- a/openapi.json
+++ b/openapi.json
@@ -312,7 +312,7 @@
"limit": {
- "type": "integer"
+ "type": "string"
}

Now run the same scenario inside a bounded loop:
$ make ship MISSION="Paginate GET /api/users"
[PREP] Slice: openapi.json, src/users_handler.ts, clients/mobile/*
[WRITE] Attempt 1: proposed patch
[VALIDATE] Contract alignment (Map/Terrain): FAIL (OpenAPI drift: `limit` must remain integer)
[REFINE] Retrying with constraint: keep `limit` as integer everywhere
[WRITE] Attempt 2: proposed patch
[VALIDATE] Contract alignment (Map/Terrain): PASS
[VALIDATE] Generated client Validator: PASS
[LEDGER] build/ledger/2026-02-06T1603Z/ (diff + validator output)
[RESULT] PR ready for review (gates passing)

Illustrative transcript: the commands, paths, and gate shape are representative. Chapter 1 shows the runnable loop, and Appendix A marks what is fully runnable versus explanatory.
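A gate like the one in that transcript can be a handful of deterministic lines. The sketch below is illustrative, not the companion repo's actual validator (the function name and traversal are assumptions): it scans a parsed OpenAPI document and fails if any `limit` schema is not an integer.

```python
def validate_limit_type(spec: dict) -> tuple[bool, str]:
    """Deterministic gate: every `limit` schema in the spec must stay an integer."""
    failures = []

    def walk(node, path):
        # Recursively visit every dict/list in the parsed spec.
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "limit" and isinstance(value, dict):
                    if value.get("type") != "integer":
                        failures.append(
                            f"{path}/limit is {value.get('type')!r}, expected 'integer'"
                        )
                walk(value, f"{path}/{key}")
        elif isinstance(node, list):
            for i, item in enumerate(node):
                walk(item, f"{path}/{i}")

    walk(spec, "")
    return (False, "FAIL: " + "; ".join(failures)) if failures else (True, "PASS")

# The Friday drift fails the gate:
ok, report = validate_limit_type({"properties": {"limit": {"type": "string"}}})
print(report)  # -> FAIL: /properties/limit is 'string', expected 'integer'
```

The check is boring on purpose: it has no opinion about the patch, only about the invariant, so it returns the same verdict every time it runs.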
That is the whole wager of this book: do not trust raw generation. Engineer the loop that decides what ships.
Executive Summary
This book is about using AI to speed up software development without giving up control.
The core move is simple: do not trust raw generation. Put the model inside an engineering loop you can inspect, rerun, and block.
That loop has five parts:
- Narrow context: give the model a small, versioned slice of the real system.
- Clear intent: describe the requested change in a form that is specific and checkable.
- Hard checks: run validators that return PASS/FAIL before merge.
- Bounded retries: let the system try again, but with scope limits and retry limits.
- Evidence: keep the diff, logs, and decisions so you can explain what happened later.
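Sketched in code, those five parts form a single bounded loop. Everything here is schematic rather than the book's actual runtime; the function names and dictionary shape are invented for illustration:

```python
def run_mission(mission, propose, validators, max_attempts=3):
    """Bounded write/validate loop: retry on FAIL, keep evidence, stop at the cap."""
    ledger = []                                   # evidence: every attempt, every verdict
    constraints = list(mission["constraints"])    # clear, checkable intent
    for attempt in range(1, max_attempts + 1):
        patch = propose(mission["goal"], constraints)        # narrow context in, diff out
        failures = [msg for check in validators
                    if (msg := check(patch)) is not None]    # hard PASS/FAIL checks
        ledger.append({"attempt": attempt, "patch": patch, "failures": failures})
        if not failures:
            return {"status": "PASS", "patch": patch, "ledger": ledger}
        constraints.extend(failures)              # tighten: findings become constraints
    return {"status": "FAIL", "patch": None, "ledger": ledger}
```

On FAIL, validator findings are appended to the constraints, so each retry is narrower than the last; the attempt cap guarantees the loop stops with evidence either way.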
What this buys you:
- Faster iteration without turning speed into drift.
- Changes you can review, replay, and audit.
- A pattern that starts with one file and scales to teams and organization-level change control.
How to read this book:
- Part I gets a working loop running first.
- Later parts explain why it works and how to scale it.
The Software Development as Code (SDaC) Stack (One Diagram)
This is the full operating stack used throughout the book:
flowchart TB
S["Substrate"] --> L["Loop"]
L --> G["Governance"]
G --> M["Maintenance"]
M --> R["Reflection"]
Read top to bottom: build on a concrete substrate, run bounded loops, enforce governance, sustain through maintenance, and improve through reflection.
The Shape of All Work
Every productive process follows the same pattern at every scale: P(Zₙ) = Zₙ₊₁. You start somewhere, run a process, and end up somewhere else. Sometimes the process improves too.
Standard mode: faster work without better guardrails. Speed can amplify mistakes.
The same pattern shows up at every scale.
Whether you type one character or steer a company, you still transform a current state into a next state through a process. What changes is the time constant and impact radius.
The process can get better too.
Real systems improve indirectly: you execute, collect evidence, and use that evidence to improve the way the work gets done.
Acceleration without guardrails is chaos.
AI speeds up the loop. SDaC adds explicit intent, scoped writes, and deterministic checks so higher speed produces better outcomes instead of faster drift.
Every productive process has the same shape:
P(Zₙ) = Zₙ₊₁
Start with a state (Zₙ). Run a process (P). You get the next state (Zₙ₊₁).
That could be a code edit. A sprint. A release. A quarter. The pattern is the same.
The hope is always that the new state is better:
Zₙ₊₁ ≻ Zₙ
Here “≻” just means “better than,” by whatever measure matters: closer to the goal, more stable, more complete.
That measure is intent. It answers:
- What are we optimizing for?
- What is allowed to change?
- What must never drift?
- What must stay true at every layer?
SDaC is what happens when you write those answers down in artifacts and gates instead of leaving them as shared intuition.
The important twist is that the process can change too.
Pₖ(Zₙ) → (Pₖ₊₁, Zₙ₊₁)
k indexes the process itself over time. Each iteration does not just produce a new state. It can also improve the process that produced it.
You write code, and you also improve how you write code. You ship a feature, and you also learn how to choose and validate features better. The system that does the work becomes part of the work.
If you prefer pseudocode, it’s the same idea:
next_state = process(current_state)
process = improve(process, evidence)
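As a deliberately toy, runnable stand-in (all numbers and functions invented for illustration): the process is parameterized by a step size, the evidence is the remaining distance to a goal, and improvement halves the step once it starts overshooting.

```python
def make_process(step):
    """The current process: move the state toward the goal by a fixed step."""
    def process(state, goal):
        return state + step if state < goal else state - step
    return process

def improve(step, evidence):
    """Improve the process: shrink the step when the remaining error is smaller than it."""
    return step / 2 if evidence < step else step

state, goal, step = 0.0, 10.0, 8.0
for _ in range(20):
    process = make_process(step)             # current process
    state = process(state, goal)             # new state from the process
    step = improve(step, abs(goal - state))  # evidence improves the process itself
```

The arithmetic is beside the point; what matters is that each iteration updates both the state and the process that produces it, which is the same two-track shape the formula describes.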
The pattern repeats at every scale:
| Scale | State (Z) | Process (P) | Iteration |
|---|---|---|---|
| Keystroke | Buffer content | Editor + habits | ~100ms |
| Function | Implementation | Developer + tools | ~minutes |
| Feature | Codebase | Team + practices | ~days |
| Product | System | Organization + strategy | ~months |
| Company | Market position | Culture + leadership | ~years |
Each level sits on top of faster loops beneath it. That is why small improvements compound, and why small failures eventually surface at higher levels.
The Velocity Trap
AI makes each loop cheaper and faster.
Work that took a day can now take minutes. Work that took a sprint can now happen in a day. In some cases, work that needed a small team can now be pushed forward by one engineer and an agent.
That is useful only if the loop stays sane. AI changes the clock speed; the trap is thinking speed alone is progress.
A faster loop that drifts is faster drift. A faster loop that breaks is faster breakage. Speed amplifies what is already there: good practices get stronger, bad practices get more expensive.
This is the question SDaC answers: How do you harness accelerated iteration without accelerating into a wall?
The answer is simple: engineer the loop, not just the output.
You design, validate, and improve the loop that produces output. If the loop is sound, the outputs get better. If the loop is broken, no amount of model cleverness saves you.
In this book, SDaC (Software Development as Code) names the mechanisms that make accelerated iteration safe: explicit intent, bounded writes, deterministic validators, and loops that converge.
If the process is validated, then P(Zₙ) = Zₙ₊₁ becomes trustworthy.
Part I builds a loop first. Once you can trust the loop, you can use it on many kinds of work.
The same structure means this scales.
A validated loop at the function level follows the same pattern as a validated loop at the organization level. The artifacts differ. The clock speed differs. The shape does not.
You scale by composition: function-level loops roll up into feature loops, feature loops roll up into service loops, and service loops roll up into org-level change control—same shape, different artifacts.
SDaC does not give you a magical way to write code. It gives you a safer way to iterate.
The Friday failure is the motivating case for the rest of the book. Everything from here on is a way to make that second transcript routine instead of exceptional.
Who this book is for
This book is for engineers asking “what now?” after the first wave of AI assistants, especially people building or governing AI-assisted development who want mechanisms, not hype.
If you want to build immediately, start with Part I. It gives you a working loop first, then earns the theory by explaining why it works.
What SDaC is (Quickstart)
This book is not about sprinkling AI into CI. It is about turning the software development process itself into something explicit, versioned, and enforceable: Software Development as Code.
CI compiles the program. SDaC compiles the development of the program.
SDaC in one picture:
Follow the arrows: express intent, slice context, propose a patch, run checks, record evidence, merge. On FAIL, tighten constraints and retry.
flowchart TD
H[Human intent] --> M["Mission Object<br/>(structured request)"]
M --> P["Prep: Context Packet<br/>(the slice)"]
P --> LLM[Model]
LLM --> D["Candidate Diff<br/>(proposed patch)"]
D --> V["Validation: Physics<br/>(checks)"]
V -->|PASS| L["Ledger Evidence<br/>(receipts)"]
L --> R[Review + Merge]
V -->|FAIL| F["Refine / Revert<br/>(tighten + retry)"]
F --> M
Caption: The FAIL path loops back to refine context and retry. This is the core mechanism that makes the loop self-correcting.
From the V-Model to Governed Loops
The classical V-model is not obsolete; it is incomplete for AI-assisted engineering. It still describes the deterministic spine of development: formalize intent on the left, implement at the bottom, prove correspondence on the right. Once a probabilistic model enters implementation, that center must be wrapped in deterministic preparation, validation, governance, and runtime feedback.
In this book, the left side of the V is not just requirements and design documents. It is the formalization chain that makes intent admissible to a loop:
goal -> Mission Object -> scope -> constraints -> acceptance criteria -> context slice -> output contract.
That is how traceability becomes executable instead of aspirational.
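For the Friday scenario, a Mission Object along that chain might look like the sketch below. The field names are hypothetical; the actual schema arrives in Chapter 1.

```python
# Hypothetical Mission Object: field names are illustrative, not the book's schema.
mission = {
    "goal": "Add pagination to GET /api/users",
    "scope": ["openapi.json", "src/users_handler.ts"],  # what the Effector may touch
    "constraints": [
        "keep `limit` as integer everywhere",           # the invariant that failed on Friday
        "no refactors outside the listed files",
    ],
    "acceptance_criteria": [
        "contract alignment check passes",
        "generated client validator passes",
    ],
    "output_contract": "unified diff limited to files in scope",
}
```

Each field is machine-checkable: scope bounds the write, constraints feed the validators, and the output contract tells the parser what shape to reject.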
The bottom of the V is no longer an unexamined implementation step. It is one bounded model call inside the Deterministic Sandwich: Prep -> Model -> Validation. The model is an Effector that proposes a candidate diff. It is not the thing being trusted.
On the right side, proof is layered rather than deferred to one final “test phase”: strict parsing, schema and contract checks, tests and ratchets, scope and policy validators, Judge decisions, Ledger evidence, review, and merge. The classical V-model gives you traceability. The loop in this book adds convergence: when the candidate fails, the system tightens against deterministic findings until it passes or stops with evidence.
| Classic V-model | This book’s canon |
|---|---|
| Requirements | Mission Object |
| System design | Map / Context Architecture |
| Module design | bounded slice / contracts / skeleton |
| Implementation | Effector / candidate diff / model step |
| Unit verification | Validators / tests / types / schemas |
| Integration verification | contract checks / policy / scope gates |
| System validation | Judge / acceptance criteria / review |
| Acceptance / operational validation | Ledger / Dream / Map-Updaters / governance |
Why now (economics)
SDaC becomes practical once the economics of automation change. The boring but necessary scaffolding that used to be expensive to build and maintain, like schemas, extractors, validators, and runners, can now be generated and tightened much faster.
A rough back-of-envelope: a first-pass validator or extractor that used to take 1–3 engineer-days can now often be generated in 15–60 minutes, then tightened over a few iterations. That changes which architectures are worth building early.
At the same time, models are good at the linguistic parts of engineering: decomposition, summarization, classification, and drafting. Generation got cheaper. Governance and review did not. The bottleneck becomes review throughput: how many diffs humans can safely govern.
The litmus test
If someone says “this is just CI,” ask:
- Can you recreate the team’s engineering behavior (planning → implementation → review → tests → release → postmortem → docs) from the repo + compute?
- Are the rules (contracts, validators, policies, merge criteria, risk thresholds) first-class, versioned, and enforceable?
- Is the model an assistant, or an Effector inside a constrained machine?
If the answer is “no,” you do not have SDaC. You have CI-adjacent automation. If the answer is “yes, humans are operators of an executable system,” that is SDaC.
Start here (one loop, one gate)
Go to Chapter 1 and run Minimum Viable Factory (MVF) v0 (“Quick Bootstrap”). You should see a small diff and a deterministic PASS/FAIL gate.
Minimum vocabulary for the first pass:
- Terrain: what runs (real behavior).
- Map: a versioned intent surface (contracts, schemas, docs).
- Ledger: evidence log (diffs + Validator output).
- Sensor: read-only measurement (emits structured signals).
- Effector: proposes a patch (writes diffs).
- Validator: returns PASS/FAIL (tests, checks, invariants).
- Loop: repeat Effector → Validator; commit only if PASS.
If you want a dependency graph, a week-by-week ladder, and a concept map, jump to Appendix A.
The Manifesto
We do not optimize for “smart” models; we optimize for converging systems.
AI is non-deterministic. The same request can produce different answers. We constrain it.
We buy down uncertainty with constraints, logs, and tests – not hope. We treat language as a contract. We do not ship vibes. We ship verifiable state.
We do not trust the model. We trust the loop.
Start reading Friday, ship it Monday.
If you want the longer version (the stance behind the mechanics), it appears at the end of Part I, after you’ve built the first loop.
A Note on Substrate
The principles in this book are portable, but the examples use a simple, familiar setup: standard files, Git for version control, `make` for orchestration, and Python for scripting.
That gives us a practical environment for demonstrating the core ideas without hiding behind framework-specific machinery. The patterns are meant to transfer to other languages, build systems, and version-control setups.
Appendix C includes example validator recipes for other stacks (for example TypeScript/Node). The focus is on the engineering principles, not the specific tooling.
Prerequisites (Assumed Knowledge)
This book assumes you’re already comfortable with day-to-day software engineering.
You should be able to:
- Use a terminal and run commands like `make`, `python3`, and `git`.
- Read a diff and understand what will change before it merges.
- Interpret Validator output (lint errors, schema failures, failing Immune System cases).
- Read and edit plain-text files (Markdown, JSON/YAML, configuration).
You do not need:
- ML theory, model training, or GPU tooling.
- Control theory beyond the intuition we build as we go.
Acknowledgements
Thanks to Abler’s CTO, who implemented Axiom and used it for production-grade code after reading a previous iteration of this book.
I’m also grateful to my co-founders at AIGB and LumiLoop.
About the author
I’m Jóhann Haukur Gunnarsson, a systems architect and senior software engineer. I’ve built and operated reliability-critical systems, including in finance, where failure modes are not theoretical and audit trails matter.
If you want to continue the conversation, you can reach me at:
- Email: johannhaukur@gmail.com
- LinkedIn: https://www.linkedin.com/in/jhaukur/
Actionable: What you can do this week
Set up the substrate so Chapter 1 is frictionless:
- Install `git`, `make`, and Python 3.11+.
- Clone the companion repo (`kjwise/aoi_code`).
- Run its “quick bootstrap” loop and confirm you get a clean PASS/FAIL gate.
git clone https://github.com/kjwise/aoi_code.git
cd aoi_code
make all

These commands are runnable as written in the companion repo. The earlier `make ship` transcript is illustrative; this quickstart is not.