When the model writes the implementation, the specification becomes the artifact you actually version and defend.

There is a line I keep coming back to, one I first started sketching in "The Spec Is the Program: Intent as the new source code," and it goes like this: when the machine writes the implementation, the thing you wrote before the machine runs becomes the thing that matters most. Not just a planning artifact. Not a ticket. The actual source of truth you version, review, and defend.

I have been operating inside that reality at Devlyn for long enough now that I want to say something concrete about it, not in the abstract sense of "AI is changing software," but in the very specific sense of what my senior engineers do differently on a Monday morning than they did three years ago. What they argue about. What they reject. What they own.

The answer, compressed: they own the spec. And the spec is the program now.

Key takeaways

The spec is the artifact you version and defend. When the model writes the implementation, the specification becomes the source of truth you review, version in git, and read first in a postmortem. The code is a projection of it.
You author constraint, not code. A skilled engineer writes structured intent, the model produces a diff, and the engineer evaluates that diff against the spec, then tightens and regenerates. Writing precise constraint is harder than writing the implementation directly.
Specs for model implementers must be explicit. The model fills any gap you leave with something plausible but not necessarily correct, so a good spec states intent, happy path, edge cases, invariants, and what NOT to build.
Test invariants, not mechanism. Behavior-coupled tests stay green across model-generated rewrites and actually mean something when they fail; mechanism-coupled tests break every regeneration.
Spec drift is the new technical debt. When code quietly becomes the implicit new spec, divergence compounds one model-run at a time. The defense is to treat the spec as primary and assume it is right when code disagrees.

What "the machine writes it" actually means in practice

Let me be precise about the claim, because it gets sloppy fast. When I say the machine writes the implementation, I do not mean a junior engineer pastes a vague prompt into a chat window and ships whatever comes out. That is not what is happening in our senior-engineer layer and it is not what I am describing.

What I mean is this: a skilled engineer writes a structured artifact, call it a spec, a design, a declaration of intent, and then a code-generating model produces a diff from that artifact. The engineer does not write the for-loops. The engineer does not write the SQL. The engineer writes the thing that governs the behavior, and the model fills in the mechanism. Then the engineer reads the diff, evaluates it against the spec, and either accepts it or tightens the spec and runs again.

That loop, write intent, generate mechanism, evaluate against intent, tighten, is the new cycle. And the thing that has changed is what the engineer is actually authoring. They are not authoring code anymore. They are authoring constraint.

This sounds cleaner than it is. Writing precise constraint is hard. It is, in many ways, harder than writing the implementation directly, because when you write the implementation you can cut corners in your own head. When you write the spec, you have to externalize those corners. The model will find every gap you leave.
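To make the shape of that loop concrete, here is a minimal sketch in Python. Everything in it is a placeholder I am assuming for illustration: `refine` and the three toy functions are hypothetical, `toy_generate` stands in for the model, and `toy_evaluate` and `toy_tighten` stand in for the engineer's review and spec edits. The one thing it is meant to show is where the fix goes when the diff falls short: into the spec, not into the generated code.

```python
def refine(spec, generate, evaluate, tighten, max_rounds=3):
    """Run the write/generate/evaluate/tighten loop until no gaps remain.

    All callables are placeholders: generate stands in for the model,
    evaluate and tighten stand in for the engineer's review and spec edits.
    """
    for round_number in range(1, max_rounds + 1):
        diff = generate(spec)
        gaps = evaluate(spec, diff)
        if not gaps:
            return spec, diff, round_number
        # The fix goes into the spec, never directly into the generated diff.
        spec = tighten(spec, gaps)
    raise RuntimeError(f"spec still has gaps after {max_rounds} rounds")


# Toy stand-ins so the sketch runs end to end.
def toy_generate(spec):
    # The "model" implements exactly the clauses it was given, nothing more.
    return [f"implements: {clause}" for clause in spec]


def toy_evaluate(spec, diff):
    # The reviewer knows two behaviors are required and checks the diff for both.
    required = ["happy path", "timeout behavior"]
    return [r for r in required if f"implements: {r}" not in diff]


def toy_tighten(spec, gaps):
    # Tightening here just means writing the missing clauses into the spec.
    return spec + gaps


final_spec, final_diff, rounds = refine(
    ["happy path"], toy_generate, toy_evaluate, toy_tighten
)
print(rounds, final_spec)  # 2 ['happy path', 'timeout behavior']
```

In real use the evaluate and tighten steps are human judgment, not functions, so treat this as a picture of the control flow and nothing more.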

The spec as executable artifact

Here is the part that changes how you think about software development. In the old model, the spec was upstream of the code and the code was upstream of the tests. The spec lived in a document somewhere, maybe a Confluence page, maybe a JIRA epic. The code was the thing that ran. The spec was aspirational at best, archaeological at worst, something you read to understand what someone intended six months ago before the implementation drifted.

In the new model, the spec is the thing that runs, not directly, but in the sense that the model will faithfully implement whatever the spec says, including the parts you did not think carefully about. The spec becomes load-bearing. It is no longer aspirational. It is operative.

That is a genuinely different relationship with documentation. When a document can cause code to exist, the document is no longer soft. It has to be precise about behavior, not just intent. It has to specify not just what the system does in the happy path, but what it does at the edges. It has to encode the invariants, the things that must always be true, not just the features.

At Devlyn, we have started treating specs with the same rigor we used to reserve for schemas and contracts. We review them. We version them in git alongside the code they generate. We have spec review as a step in our engineering process, not as a precursor to the real work but as the real work. When someone proposes a change to a system, they submit a spec change first. The code change is downstream of that, and often it is generated.

What a good spec looks like when the model is the implementer

This is where it gets concrete enough to be useful. A spec that is written for human implementers is different from a spec that is written for model implementers, and not in the direction most people expect.

A spec for a human implementer can leave a lot implicit. The human brings contextual judgment. They know the codebase. They will fill in reasonable defaults. They will notice when something seems off and ask. The spec can be high-level and still produce a good implementation because the human bridges the gap between intent and mechanism.

A spec for a model implementer needs to be explicit about things that human implementers would infer. Not because the model is dumb, it is not, but because the model will produce a plausible implementation for any gap you leave, and "plausible" and "correct" are not the same thing. The model does not push back. It fills in. That filling-in is where bugs are born in AI-assisted development, not in the code the model writes for things you specified, but in the code the model writes for things you left unspecified.

So a good spec, in our current practice, has five sections:

Spec: Order eligibility check

Intent

Determine whether a customer order can proceed to fulfillment given current inventory state and any active holds.

Behavior (happy path)

Given a valid order and no holds, return eligible: true with a fulfillment window based on warehouse proximity.

Edge cases / constraints

If any line item is out of stock: return eligible: false, reason: inventory, include which SKUs are blocked.
If account has active fraud hold: return eligible: false, reason: hold, do NOT expose hold details to client response.
Partial availability (some items in stock): eligible: false unless customer has opted into partial fulfillment.
Timeout from inventory service: fail closed (treat the order as ineligible), log for async review, never fail silently.

Invariants (must always be true)

Fraud hold reason must never appear in external API response.
Number of downstream service calls must not depend on number of line items (O(1) calls).
All eligibility decisions must be logged with order ID + reason.

What NOT to build

Do not add caching at this layer; caching lives upstream.
Do not call pricing service; eligibility is inventory + holds only.

That last section, "what not to build", turns out to be one of the most important parts of any spec written for a model implementer. The model has strong priors about what a system like this "usually" includes. It will add things. You have to explicitly exclude the things you do not want, not just specify the things you do.
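One way to stop a section from being quietly skipped is to hold the spec as structured data and lint it before anything is generated. The sketch below is an assumption of mine for illustration, not a description of any particular tool: the `Spec` class, its field names, and `lint_spec` are all hypothetical, and they simply mirror the sections of the example spec above.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """Hypothetical container mirroring the sections of the example spec."""
    title: str
    intent: str
    happy_path: str
    edge_cases: list[str] = field(default_factory=list)
    invariants: list[str] = field(default_factory=list)
    not_to_build: list[str] = field(default_factory=list)


def lint_spec(spec: Spec) -> list[str]:
    """Return a list of problems; an empty list means every section is filled in."""
    problems = []
    if not spec.intent.strip():
        problems.append("missing intent")
    if not spec.happy_path.strip():
        problems.append("missing happy path")
    if not spec.edge_cases:
        problems.append("no edge cases listed")
    if not spec.invariants:
        problems.append("no invariants listed")
    if not spec.not_to_build:
        problems.append("no explicit exclusions (what NOT to build)")
    return problems


# A draft that forgot the exclusions section.
eligibility = Spec(
    title="Order eligibility check",
    intent="Determine whether a customer order can proceed to fulfillment.",
    happy_path="Valid order, no holds: eligible with a fulfillment window.",
    edge_cases=["Out-of-stock line item: ineligible, reason inventory."],
    invariants=["Fraud hold reason never appears in external API response."],
)

print(lint_spec(eligibility))  # ['no explicit exclusions (what NOT to build)']
```

A lint like this only checks that each section exists. Whether the content of a section is precise enough is still a review question.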

Accepting machine-authored diffs is a new engineering skill

There is a skill that nobody trained for that is now table stakes for senior engineers working in this mode, and it is the skill of reading a machine-authored diff with the right mental posture.

The wrong posture is to read the diff like you would read code you wrote yourself, looking for syntax errors and obvious bugs. The model rarely produces syntax errors. It produces plausible code that looks right. The errors sit at a higher level of abstraction: a behavior that is slightly off, an invariant that is almost but not quite maintained, a happy-path implementation that does not handle the one edge case that the spec mentioned but did not fully specify.

The right posture is to read the diff against the spec, clause by clause. Not "does this code look correct" but "does this code implement what the spec says." Those are different questions. The first is a code review. The second is a spec audit. We have shifted almost entirely to the second.

This is also why I believe strongly in property-based thinking, what I sometimes call invariant-first specification. If your spec encodes the invariants, the things that must always be true, you can write tests that check the invariants, not the code paths. And those tests remain valid even after the model regenerates the implementation, because you are testing behavior, not mechanism. You are testing the thing that is supposed to be stable across implementations.

This matters more than it seems. In the old model, tests were tightly coupled to implementation. When you refactored, tests broke. Not because behavior changed but because the mechanism changed and the tests were testing mechanism. In the model-generated world, the mechanism regenerates constantly. If your tests are coupled to mechanism, you will be rewriting tests constantly. If your tests are coupled to behavior, to the invariants in the spec, they stay green across model-generated rewrites, and they actually tell you something when they fail.
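Here is a rough sketch of what a behavior-coupled check might look like for the eligibility spec, in Python. The names and shapes are assumptions for illustration: `check_eligibility` is a hypothetical stand-in for whatever the model generated, the `order` and `stock` dictionaries are invented, and `FRAUD_DETAIL` is a made-up internal hold detail. The checks call the implementation only through its inputs and outputs, so they should survive a regeneration that changes everything inside it.

```python
import random

FRAUD_DETAIL = "chargeback pattern"  # hypothetical internal hold detail


def check_eligibility(order, stock, hold_detail=None):
    """Stand-in for a generated implementation; the checks treat it as a black box."""
    if hold_detail is not None:
        # The internal detail is deliberately dropped from the response.
        return {"eligible": False, "reason": "hold"}
    blocked = [sku for sku in order["skus"] if stock.get(sku, 0) == 0]
    partial_allowed = order["partial_ok"] and len(blocked) < len(order["skus"])
    if blocked and not partial_allowed:
        return {"eligible": False, "reason": "inventory", "blocked_skus": blocked}
    return {"eligible": True}


def check_invariants(impl, trials=500, seed=0):
    """Exercise impl with random inputs and assert the spec's invariants."""
    rng = random.Random(seed)
    skus = ["A", "B", "C", "D"]
    for _ in range(trials):
        order = {
            "skus": rng.sample(skus, rng.randint(1, len(skus))),
            "partial_ok": rng.random() < 0.5,
        }
        stock = {sku: rng.choice([0, 3]) for sku in skus}
        hold = FRAUD_DETAIL if rng.random() < 0.3 else None
        response = impl(order, stock, hold)

        # Invariant: hold details never leak into the response.
        assert FRAUD_DETAIL not in repr(response)
        # Invariant: an active hold always blocks the order.
        if hold is not None:
            assert response["eligible"] is False
        # Invariant: every ineligible decision carries a reason.
        if not response["eligible"]:
            assert response.get("reason") in {"inventory", "hold"}
        # Invariant: blocked SKUs reported to the client are really out of stock.
        if response.get("reason") == "inventory":
            assert response["blocked_skus"]
            assert all(stock[s] == 0 for s in response["blocked_skus"])
    return True


print(check_invariants(check_eligibility))  # True
```

Nothing in `check_invariants` knows how the decision is computed. A property-based testing library such as Hypothesis could replace the hand-rolled random loop; I used `random` here only to keep the sketch dependency-free.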

Spec drift is the new technical debt

In traditional software development, the great enemy of long-term maintainability is code entropy: the codebase drifts from the original design, abstractions collapse, complexity accumulates. You accrue technical debt.

In the model-generated world, there is a new kind of entropy that I think is actually more dangerous, and I call it spec drift. It works like this: the spec says one thing, the generated code does something slightly different because of a gap in the spec, and then the gap quietly becomes the implicit new spec. People start writing the next spec against what the code actually does rather than what the original spec said. The direction of the divergence is what makes it different from ordinary code entropy: first the code diverges from the spec, then the spec gets rewritten to match the code rather than the original intent, and then the next generation of code is generated from a spec that has already absorbed one round of drift.

This compounds. Each generation of code incorporates the drift from all previous generations. Within a few months, you have a system that does something recognizable but not exactly what anyone intended at any point, a sort of evolutionary drift away from the original design, one model-run at a time.

The defense against spec drift is to treat the spec as the primary artifact and the code as a projection of it. When there is a discrepancy between spec and code, the default assumption should be that the spec is right and the code is wrong, not the other way around. When you change the spec, you should change it deliberately, with review, and you should record why. The spec is the commit message that explains the code, but more than a commit message: it is the document that future generations of the model will use to regenerate the code, so it has to stay accurate and intentional.

At Devlyn, we have started doing spec audits, periodic reviews where we read the spec for a system against what the system actually does and flag divergences. It takes time. It is worth it. It is how we keep the model-generated codebase aligned with the decisions that senior engineers made.
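Part of an audit like this can be mechanical. As a minimal sketch, assuming a convention where every generated file carries a fingerprint of the spec it was generated from, a script can flag code that no longer matches the current spec revision. The `# generated-from-spec:` header and the functions `spec_digest`, `stamp`, and `is_stale` are hypothetical names I made up for this example.

```python
import hashlib

HEADER = "# generated-from-spec: "  # hypothetical stamp convention


def spec_digest(spec_text: str) -> str:
    """Stable fingerprint of a spec; trailing whitespace on lines is ignored."""
    normalized = "\n".join(line.rstrip() for line in spec_text.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def stamp(code: str, spec_text: str) -> str:
    """Prefix generated code with the digest of the spec that produced it."""
    return f"{HEADER}{spec_digest(spec_text)}\n{code}"


def is_stale(stamped_code: str, current_spec_text: str) -> bool:
    """True when the code was generated from a different spec revision."""
    first_line = stamped_code.partition("\n")[0]
    recorded = first_line.removeprefix(HEADER)
    return recorded != spec_digest(current_spec_text)


spec_v1 = "Timeout from inventory service: treat as ineligible."
spec_v2 = "Timeout from inventory service: treat as ineligible and log."
code = stamp("def check(): ...", spec_v1)

print(is_stale(code, spec_v1))  # False
print(is_stale(code, spec_v2))  # True
```

This only catches one kind of divergence, a spec that changed without the code being regenerated. It says nothing about code that quietly behaves differently from an unchanged spec, which is the case that still needs a human reading the spec against the system.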

What senior engineers own now

I want to be direct about what this means for engineering roles, because there is a lot of hand-waving about whether AI replaces engineers and I find it mostly useless. The real question is: what does the senior engineer own in the model-generated world?

The answer, in our practice: they own the architecture and the spec. They own production readiness, the judgment calls about failure modes, observability, security posture, and operational behavior that cannot be derived from features alone. They own the invariants. They own the spec review process. They own the decision about when the model's output is good enough to ship and when the spec needs to be tightened.

What they do not own, or own only lightly: the line-by-line implementation. The first-draft code. The boilerplate. The translation from a well-specified behavior into a working function. The model owns that, under supervision.

This is not a diminishment of the senior engineer role. It is, if anything, a sharpening. The model takes away the implementation work that junior engineers used to do as practice and senior engineers used to do as tax. What is left is the work that required senior judgment all along, the design, the constraint, the production posture, except now that work is more visible and more load-bearing because it directly governs what gets built.

I think this is a better deal for senior engineers who embrace it. The frustrating parts of senior engineering, translating a clear design intent into hundreds of lines of boilerplate that mostly express the intent without adding any value, those parts compress. The rewarding parts, making hard architectural calls, writing down the invariants you care about, holding the line on production readiness, those parts expand.

The catch is that you have to actually write the spec. You cannot hold the intent in your head and expect the model to read it. You have to externalize the judgment. That is the discipline. And it is a real discipline. Writing a spec good enough to govern a model-generated implementation requires you to surface assumptions you would otherwise leave implicit, specify behaviors you would otherwise leave to the implementer's judgment, and enumerate the edge cases you would otherwise trust a senior engineer to catch.

The more I work in this mode, the more I think good spec-writing is actually a more rigorous form of engineering than good code-writing. Code lets you hide in the mechanism. A spec has to say what you mean.

Software development when the machine writes it

Step back from the individual engineer and look at what software development looks like when model-generation is the primary path from spec to code. It looks, in structural terms, like this:

The work of a software project is front-loaded into intent. The early phases of a project, what used to be architecture and design phases, often abbreviated or skipped in agile shops, become the primary creative work. The spec for each component is the thing that gets iterated on, reviewed, and refined. The code is generated downstream and evaluated against the spec.

Changes to a system start with the spec, not the code. You do not open the implementation file and start editing. You open the spec and change what you want the system to do. Then you generate a new implementation from the new spec. This is not always clean, there are cases where the generated implementation has been hand-modified and you have to reconcile, but the general direction of change should be spec-first.
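A spec-first rule like this can be nudged along with a small check on each change set. The sketch below assumes a repository layout I invented for the example, with specs at `specs/<component>.md` and implementations under `src/<component>/`. The function name `components_missing_spec_change` is hypothetical, and the result is meant as a prompt for a reviewer, not a hard gate, since hand-modified code sometimes has to be reconciled.

```python
from pathlib import PurePosixPath


def components_missing_spec_change(changed_paths):
    """Return components whose code changed without a matching spec change.

    Assumes a hypothetical layout: specs/<component>.md and src/<component>/...
    """
    code_changed, spec_changed = set(), set()
    for raw in changed_paths:
        parts = PurePosixPath(raw).parts
        if len(parts) >= 3 and parts[0] == "src":
            # src/<component>/<anything>
            code_changed.add(parts[1])
        elif len(parts) == 2 and parts[0] == "specs" and parts[1].endswith(".md"):
            # specs/<component>.md
            spec_changed.add(parts[1][: -len(".md")])
    return sorted(code_changed - spec_changed)


changed = [
    "src/eligibility/check.py",
    "specs/eligibility.md",
    "src/pricing/quote.py",
]
print(components_missing_spec_change(changed))  # ['pricing']
```

The list of changed paths would come from whatever your version control or CI system reports for the change; how it is obtained is left out here on purpose.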

Reviews are behavior reviews, not code reviews. The question is not "is this code well-structured" but "does this code implement the spec." The model can produce well-structured code that does not implement the spec. It can produce ugly code that does. What you care about is spec compliance.

Testing is invariant-testing. Your test suite encodes the properties that must always hold. When the implementation is regenerated, the tests run against the new implementation. If they pass, the invariants hold for everything the suite exercises. You ship. The tests do not care how the invariants are achieved; they care that they are achieved.

All of this is described at much greater length in The AI-Native SDLC, where I go through the full development lifecycle in the model-generated world, from initial design through incident response. But the core move is the one I have been describing here: shift what you author from mechanism to intent, and treat the intent artifact as the primary thing you version and defend.

The artifact you actually defend

There is a question I ask in engineering reviews at Devlyn: "If this system does the wrong thing in production, what document would you read first to understand why?" Three years ago, the answer was always: the code. Now the answer is: the spec. Because the spec is the thing that governed the behavior. The code is the thing that expressed it.

That shift in what you read first is, I think, the deepest change. It means the spec is the artifact you defend in a postmortem. It means the spec is the artifact you update when requirements change. It means the spec is the thing a new engineer reads to understand what the system is supposed to do, not just what it does.

We are still early in this transition. The tooling for managing specs, versioning them alongside code, tracking spec drift, and making spec review a first-class part of the engineering process, all of that is still being built, including by us. The practices are ahead of the tools. That is fine. It has always been fine. The practices are right.

The machine writes the implementation. The engineer writes the spec. The spec is the program now.

Frequently asked questions

What does "the spec is the program" mean?

It means that when a code-generating model writes the implementation, the specification, the structured artifact of intent you wrote before the machine ran, becomes the thing you actually version, review, and defend. The spec is no longer aspirational documentation that drifts from the code. It is operative, because the model faithfully implements whatever it says, including the parts you did not think carefully about. The code becomes a projection of the spec rather than the source of truth.

How is a spec for a model implementer different from one for a human?

A spec for a human can leave a lot implicit, because the human brings contextual judgment, knows the codebase, and asks when something seems off. A model does not push back; it fills any gap you leave with a plausible implementation, and plausible is not the same as correct. So a spec written for a model implementer has to be explicit about the things a human would infer: intent, the happy path, edge cases and constraints, the invariants that must always hold, and an explicit list of what NOT to build.

What is spec drift?

Spec drift is the new technical debt. The spec says one thing, the generated code does something slightly different because of a gap, and that gap quietly becomes the implicit new spec. The next spec gets written against what the code does rather than what was intended, and each generation of code incorporates the drift from all previous generations. The defense is to treat the spec as the primary artifact: when spec and code disagree, assume the spec is right and the code is wrong, and change the spec deliberately, with review.
