Inside a Coding Agent: The Loop That Actually Matters

April 2, 2026

After the Claude Code leak made the rounds, a lot of people focused on the prompts. The interesting part is not the prompts. It is the loop.

The shape is small. The user states a goal. The model proposes a tool call: read a file, run a test, edit a region, search the codebase. The executor runs the tool and returns the result. The result is appended to the context. The model reads the new context and proposes the next call. Repeat until the model decides it is done.

Three things make this loop work.

First, context is the state. Nothing persists outside the message log by default. The current file contents, the last test output, what the agent already tried, all of it lives in the conversation. That is why long context windows matter for code: a whole-file edit needs the whole file in view, and a multi-file refactor needs every touched file plus the test that validates them.

Second, tools beat tool calls. A general-purpose run_shell tool will let the model do anything and choose poorly. Specific tools with narrow schemas guide the model toward the right action. A dedicated edit_file tool with exact-string replacement constrains the model into a precise diff. A grep tool that returns line numbers wastes fewer tokens than reading entire files. Tool design is prompt engineering you cannot bypass.

Third, the environment is the judge. The model proposes. The compiler, the type checker, the test suite, the linter, those decide whether the change was correct. A good agent loop keeps that judgment fast and visible: run the test, surface the failure, let the model react. Without that feedback the agent generates code and hopes.

Here is the failure mode I keep seeing in production. A coding agent loops on a failing test. Each iteration it tries an edit, runs the test, reads the failure, and tries again. The context grows. After fifteen rounds it is past the model's working-memory sweet spot. The model forgets which fixes it already tried, repeats a prior bad edit, and the loop never converges. You spend tokens, the test stays red, and the agent looks broken.

The fix is to put memory outside the message log. Maintain an explicit scratchpad: tried edits, failure signatures, ruled-out hypotheses. Summarize at checkpoints, drop stale tool output, and keep the active window focused on the current hypothesis. The model is good at one step at a time. It is your job to remember the rest.

Key takeaway

A coding agent is a control loop, not a prompt. The model proposes, the environment disposes, and the conversation is the only state. Get the loop right and the model performance is almost a footnote.

Originally posted on LinkedIn. View original.