Engineers Constrain Agents. Producers Brief Them.

Thirty years of production thinking turns out to be the correct mental model for agentic AI. The research explains why.

May 11, 2026

When something goes wrong in an agent workflow, engineers add a constraint. Producers fix the brief. These are not the same response.
The Claude Code architecture paper validates the producer approach: minimal decision scaffolding, maximal operational infrastructure.
An engineer’s system accumulates constraints over time. A producer’s system accumulates clarity. One gets more rigid; the other gets better.
Vibe-production is not a low-structure mode. It’s what the right mental model looks like in practice.

When something goes wrong in an agent workflow, engineers and producers do different things.

The engineer adds a constraint. A rule, a conditional, a state node, a tighter prompt. The mental model is that the agent is an unreliable function that needs to be pinned down. It did the wrong thing, so you restrict what it can do next time. The fix lives in the code.

The producer asks what the agent didn’t know. Or what it wasn’t clear on. Or whether the scope was tight enough to prevent it from drifting. The mental model is that the agent is capable but underinformed. It did the wrong thing because the conditions weren’t right, not because it needed a shorter leash. The fix lives in the brief.

Those are not just different tactics. They’re different operating philosophies, and they produce structurally different systems over time.

How engineers build agent systems

LangGraph is the clearest expression of the engineering instinct. It models agent workflows as directed graphs: nodes for each state, typed edges for each transition, explicit decision logic at every branch point. The developer defines exactly what the agent can do at each step, what it can do next, and what happens when specific conditions are met. Control and predictability are the design goals. The agent’s autonomy is precisely bounded by the graph.

This is good software engineering. State machines are a solved problem. Finite, auditable, testable. When something breaks you can trace it to a node. When something needs to change you update the edge. The system is explicit about everything.
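To make the graph approach concrete, here is a minimal sketch in plain Python. It is deliberately not LangGraph's API: the node names (`draft`, `review`), the `route` function, and the "approve on the second attempt" rule are all hypothetical, chosen only to show developer-written routing between states.

```python
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]

# Each node is a plain function that reads the shared state and returns an update.
def draft(state: State) -> State:
    attempts = state.get("attempts", 0) + 1
    return {**state, "text": f"draft {attempts} of {state['task']}", "attempts": attempts}

def review(state: State) -> State:
    # Hypothetical rule for illustration: approve on the second attempt.
    return {**state, "approved": state["attempts"] >= 2}

NODES: Dict[str, Callable[[State], State]] = {"draft": draft, "review": review}

def route(node: str, state: State) -> str:
    """Developer-defined edges: the code, not the agent, picks the next node."""
    if node == "draft":
        return "review"
    if node == "review":
        return "END" if state["approved"] else "draft"
    raise ValueError(f"unknown node: {node}")

def run_graph(task: str, max_steps: int = 10) -> Tuple[State, List[str]]:
    state: State = {"task": task}
    node = "draft"
    trace: List[str] = []  # every visited node, so a failure can be traced
    for _ in range(max_steps):
        trace.append(node)
        state = NODES[node](state)
        node = route(node, state)
        if node == "END":
            return state, trace
    raise RuntimeError("graph did not terminate")
```

Every path through the system is visible in `route`, which is what makes this style auditable. It is also why every new failure tends to become another branch in that function.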

It is also, in the framing of the research paper that dissected Claude Code’s source architecture, the opposite of what the most capable production agent systems actually do.

The paper contrasts Claude Code’s approach directly with LangGraph and systems like it. Claude Code uses a single reactive loop. The model reasons about what to do; the surrounding infrastructure handles execution, permission checking, context management, and recovery. There are no explicit state graphs constraining the model’s choices. There is no developer-defined decision logic routing the model from node to node.
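A single reactive loop of this kind can be sketched in a few lines. This is an illustration under stated assumptions, not Claude Code's implementation: `scripted_model` is a hypothetical stand-in for a real model call, and the tool names, the `ALLOWED` permission set, and the context format are invented for the example.

```python
from typing import Any, Callable, Dict, List, Tuple

Action = Dict[str, Any]

def scripted_model(context: List[str]) -> Action:
    """Hypothetical stand-in for a model: chooses the next action from context alone."""
    if not any(line.startswith("result:read_file") for line in context):
        return {"tool": "read_file", "arg": "notes.txt"}
    if not any(line.startswith("denied:delete_file") for line in context):
        return {"tool": "delete_file", "arg": "notes.txt"}
    return {"tool": "finish", "arg": "summarised notes.txt without deleting it"}

# Hypothetical tools; a real harness would touch the file system here.
TOOLS: Dict[str, Callable[[str], str]] = {
    "read_file": lambda path: f"contents of {path}",
    "delete_file": lambda path: f"deleted {path}",
}
ALLOWED = {"read_file"}  # permission rule owned by the harness, not the model

def run_loop(model: Callable[[List[str]], Action], goal: str,
             max_turns: int = 8) -> Tuple[str, List[str]]:
    context = [f"goal:{goal}"]
    for _ in range(max_turns):
        action = model(context)          # the model decides what happens next
        tool, arg = action["tool"], action["arg"]
        if tool == "finish":
            return arg, context
        if tool not in ALLOWED:          # the harness checks permissions
            context.append(f"denied:{tool}")
            continue                     # recovery: report back and let the model adapt
        context.append(f"result:{tool}:{TOOLS[tool](arg)}")  # the harness executes
    raise RuntimeError("turn limit reached")
```

Nothing in `run_loop` routes the model from step to step. The harness only executes, checks permissions, and feeds results back into the context the model reasons over.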

The design bet, stated plainly in the paper: increasingly capable models benefit more from a rich operational environment than from frameworks that constrain their choices.

That is not how an engineer’s instinct reads the problem. And it is exactly how a producer’s does.

How producers build agent systems

A producer’s mental model for a team member is not a state machine.

You don’t manage a designer by defining every possible state they can be in and specifying which transitions are permitted. You brief them. You tell them what good looks like. You give them the context they need to make the right call without asking you first. You define the scope clearly enough that their judgment operates inside the territory you’ve agreed on. When something goes wrong, the first question is whether the brief was clear, not whether you need tighter rules.

This is so instinctive to producers that most don’t think of it as a method. It’s just how you work with people.

It turns out it’s also how you work with agents.

When my agent workflows produce wrong output, the fix is almost never to add a constraint. It’s to look at what the agent was working with. Did it have the right context document? Was the output format specified precisely enough, or was it left to interpretation? Did the scope allow it to make a judgment call where I should have made a decision in advance?

The answer is usually one of those three. I update the brief. The next run is better. I didn’t write any code. I didn’t add a rule. I clarified the conditions.

The engineer’s system grows by accumulating constraints on what the agent can do. The producer’s system grows by accumulating clarity about what good looks like.

Those trajectories diverge fast. The constraint-based system gets more rigid as rules multiply, harder to change, more brittle when the task changes shape. The clarity-based system gets more reliable as the brief sharpens, more adaptable, because clearer conditions work across more situations than narrow rules do.

What the research validates

The Claude Code paper identifies thirteen design principles behind the architecture. One of them is called “minimal scaffolding, maximal operational harness.” The description: invest in the infrastructure that lets the model reason freely, not in scaffolding-side reasoning that constrains its choices.

The paper estimates that roughly 1.6% of Claude Code’s codebase is AI decision logic. The remaining 98.4% is deterministic infrastructure: context assembly, permission rules, recovery logic, tool routing, compaction pipelines. None of that infrastructure tells the model what to decide. It creates the conditions under which the model can decide well.

The contrast with LangGraph is explicit. LangGraph “encodes decision logic as explicit state graphs with typed edges, choosing scaffolding over minimal harness.” Claude Code does the opposite: “the harness creates conditions under which the model can decide well, rather than constraining its choices.”

Anthropic built one of the most widely used production coding agents in the world on the producer’s mental model. Not because producers were consulted. Because the model’s capabilities had advanced to the point where constraint-based design was the wrong tool. The model doesn’t need a shorter leash. It needs better conditions.

This is the structural advantage that producers bring to agentic AI, and most of them don’t know they have it.

Why the difference compounds

The March 2026 Claude Code regression is instructive here, even for producers building their own workflows rather than using Claude Code directly. Anthropic published a postmortem confirming that three infrastructure changes had degraded the system for six weeks. None of them added constraints. They removed conditions: less reasoning time, lost reasoning history, truncated responses. The model’s capabilities hadn’t changed. The environment it was operating in had gotten worse.

Anthropic’s fix was not to add more rules. It was to restore the conditions. Revert the reasoning effort. Fix the caching bug. Remove the verbosity cap. Put back what had been taken away.

The engineering instinct, faced with a system behaving badly, is to add control. The production instinct, faced with a team member behaving badly, is to check the conditions first. Are they missing context? Are they unclear on the scope? Do they understand what good looks like?

Those instincts produce different responses to the same problem. In the Claude Code case, and in most agent workflow failures I’ve seen, the production instinct is the correct one.

What this looks like in practice

I’ve written before about vibe-production as a distinct way of working with agents. The producer-vs-engineer contrast gives me a more precise way to define it.

Vibe-production is not low-structure work. It’s a specific structure. The investment goes into the brief: context documents, output standards, scope definitions, permission models. The agent then operates inside that brief with significant autonomy. You review the output. You refine the brief. You don’t add rules unless a rule is genuinely what the situation requires, which is less often than you’d think.
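One way to treat the brief as a maintained artifact is to give it a shape. The sketch below is an assumption about how that could look, not a prescribed format: the `Brief` class, its field names, and the wording of the gap messages are all hypothetical. The `gaps` method mirrors the questions worth asking before blaming the agent: was there context, was the output format specified, was the scope defined.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Brief:
    """Hypothetical container for what the agent is told before it starts."""
    goal: str
    context_docs: List[str] = field(default_factory=list)
    output_format: str = ""
    in_scope: List[str] = field(default_factory=list)
    permissions: List[str] = field(default_factory=list)

    def gaps(self) -> List[str]:
        """List what is missing from the brief, in a fixed order."""
        found = []
        if not self.context_docs:
            found.append("no context document attached")
        if not self.output_format.strip():
            found.append("output format left to interpretation")
        if not self.in_scope:
            found.append("scope not defined")
        return found

    def render(self) -> str:
        """Turn the brief into plain text that can be handed to an agent."""
        sections = [
            ("Goal", [self.goal]),
            ("Context", self.context_docs),
            ("Output format", [self.output_format] if self.output_format.strip() else []),
            ("In scope", self.in_scope),
            ("Permissions", self.permissions),
        ]
        lines: List[str] = []
        for title, items in sections:
            if items:  # skip empty sections rather than rendering blank headings
                lines.append(f"{title}:")
                lines.extend(f"- {item}" for item in items)
        return "\n".join(lines)
```

When a run goes wrong, `gaps()` is the first thing to look at. Refining the brief then means editing fields on one object, not adding branches to a workflow.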

I direct agents the way I’d direct a senior team member on a tight brief. I don’t specify every step. I specify the destination, the format, the constraints that are genuinely constraints, and the context they need to get there without asking me. When the output is wrong, I look at the brief before I look at the agent.

This works because capable models, briefed well, make good decisions. They don’t need a state graph. They need to know what you actually want.

The producers who haven’t figured this out yet are building agent workflows that feel like managing a difficult contractor: constant correction, constant intervention, outputs that never quite match the spec. The problem is almost never the model. It’s that the brief is too thin for the work it is being asked to do, and the conditions that would let the model succeed on its own haven’t been built.

The producers who have figured it out are running something closer to a well-briefed team on a clear project. The agent executes. They review. They sharpen the brief. The system improves without getting more complicated.
