Introduction to Agentic AI
LLM Foundations
The Agent Paradigm
Reasoning and Planning
Memory and Knowledge
Agent Architectures
Safety and Reliability
Coding Agents
Open Claude Code, describe a feature you want to add, and watch what happens. The agent reads your codebase, identifies the relevant files, plans the change, edits the code, runs your tests, and iterates until everything passes. This is not a demo trick. It is the same ReAct loop from earlier in this course, applied to software engineering with a specialized tool set. Coding agents are the most mature and widely deployed AI agents in production today, and they demonstrate every pattern you have learned: tool use, reasoning, context engineering, safety, and error handling.
The ReAct Loop Applied to Code
A coding agent is an LLM in a ReAct loop with tools for file I/O, terminal execution, and code search. The agent observes (reads files, reads test output), reasons (plans what to change), and acts (edits files, runs commands). This is the observe-reason-act cycle you studied earlier, but the tools are specifically designed for software engineering. The agent does not plan the entire change upfront. It reads a file, understands the current code, decides what to change, makes the edit, observes the result, and adjusts. This interleaving of reasoning and action is what allows coding agents to handle unexpected situations. The code does not always look how the agent expects, and the agent adapts in real time.
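The loop described above can be sketched in a few lines. This is a hypothetical skeleton, not any product's actual implementation: `llm_decide` stands in for the model call that picks the next tool, and the `Action` type and tool names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    name: str                       # tool to call, or "finish"
    args: dict = field(default_factory=dict)

def run_agent(task, tools, llm_decide, max_steps=20):
    """Observe-reason-act loop: the model picks one tool call per step,
    the tool result is appended to history, and the next decision sees it."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = llm_decide(history)                # reason: choose next step
        if action.name == "finish":
            return action.args.get("summary")
        result = tools[action.name](**action.args)  # act: run the tool
        history.append((action.name, result))       # observe: feed result back
    raise RuntimeError("step budget exhausted without finishing")
```

The essential property is that each tool result lands in `history` before the next decision, so the agent's plan can change in response to what the code actually looks like.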
Consider the contrast with a one-shot code generator: given a task, it generates code once and hands it to you. If the generated code has a bug, you fix it manually. A coding agent, by contrast, discovers its own bugs through test execution and fixes them autonomously. The difference is the loop: the ability to observe the consequences of actions and adjust based on reality.
This distinction matters because software engineering is inherently iterative. Even experienced human developers do not write perfect code on the first attempt. They write, compile, test, debug, and refine. A coding agent mirrors this workflow. The quality of the final output comes not from getting it right the first time, but from the speed and effectiveness of the iteration cycle.
The Core Tool Set
Every coding agent (Claude Code, GitHub Copilot agent mode, Cursor composer, Windsurf, Cline) has roughly the same core tool set despite different UIs and interaction models. This convergence is not coincidence; the task domain dictates the tools.
File read lets the agent read file contents, specific line ranges, or entire files. This is the observation tool: the agent needs to see the current state of the code before it can reason about changes.
File edit applies diff-based replacements where the agent specifies the exact old string and its replacement, rather than rewriting entire files. This is the action tool: surgically modifying code at specific locations.
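A minimal sketch of such an edit tool shows why the old-string/new-string contract is useful. The helper below is hypothetical; the key behavior is that it fails loudly when the file no longer matches what the agent last read, forcing a re-read instead of a blind overwrite.

```python
from pathlib import Path

def edit_file(path: str, old: str, new: str) -> None:
    """Replace exactly one occurrence of `old` with `new` in the file.
    Illustrative sketch of a diff-based edit tool: the agent must supply
    the exact existing text, so stale or ambiguous edits are rejected."""
    text = Path(path).read_text()
    count = text.count(old)
    if count == 0:
        raise ValueError("old string not found; re-read the file")
    if count > 1:
        raise ValueError("old string is ambiguous; include more context")
    Path(path).write_text(text.replace(old, new, 1))
```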
File search includes glob patterns (find files by name) and grep (search content by regex). This is the navigation tool: finding relevant code in a codebase with thousands of files.
Terminal execution runs shell commands: tests, builds, git operations, linters, package managers. This is the verification tool: executing code to observe whether changes work.
Web search lets the agent look up documentation, error messages, and API references when it encounters something unfamiliar. This is the knowledge tool: extending beyond the codebase to external information.
Together, these five tools form the minimal complete set for autonomous software engineering. Remove any one of them and the agent cannot close the full loop from task to verified solution. An agent without file search cannot find the right code. An agent without terminal execution cannot verify its changes. An agent without file edit can identify problems but cannot fix them. The completeness of the tool set is what makes coding agents autonomous rather than advisory.
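The five tools above can be summarized as the kind of schema a coding agent exposes to its model. The names and parameter fields here are illustrative (a `?` suffix marks an optional parameter), not any particular product's API.

```python
# Hedged sketch of the core tool set as tool schemas; names and
# parameters are assumptions for illustration, not a vendor's API.
CORE_TOOLS = [
    {"name": "read_file",  "params": {"path": "str", "start_line": "int?", "end_line": "int?"}},
    {"name": "edit_file",  "params": {"path": "str", "old": "str", "new": "str"}},
    {"name": "search",     "params": {"pattern": "str", "glob": "str?"}},
    {"name": "run_shell",  "params": {"command": "str", "timeout_s": "int?"}},
    {"name": "web_search", "params": {"query": "str"}},
]
```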
ReAct in Action
The ReAct pattern is visible in every coding agent session. The agent receives a task like "add input validation to the signup endpoint." It does not immediately start writing code. First, it searches for the signup handler with grep. It reads the file to understand the current implementation. It reasons about what validation is needed. It edits the file to add the validation logic. It runs the test suite. If tests fail, it reads the error output, reasons about the cause, and makes another edit.
Each step alternates between thinking and acting. This is why coding agents handle edge cases that a one-shot code generator misses. The agent discovers problems through execution and fixes them through iteration. The key insight is that the agent's plan evolves as it works. It might start with a plan to add three validation checks, but after reading the existing code, it discovers that two of those checks already exist. The plan adapts. A static plan generated upfront would have missed this and potentially created duplicate validation logic.
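The test-fail-fix cycle at the heart of this process can be sketched directly. Here `apply_fix` stands in for the agent's reason-and-edit steps, and `pytest -q` is an assumed default test command; any command whose exit code signals pass or fail would work.

```python
import subprocess

def iterate_until_green(apply_fix, max_attempts=5, test_cmd=("pytest", "-q")):
    """Run the test suite; on failure, hand the output to a fix step and
    retry. Returns the attempt index on which the tests first passed."""
    for attempt in range(max_attempts):
        proc = subprocess.run(test_cmd, capture_output=True, text=True)
        if proc.returncode == 0:
            return attempt                       # tests green: done
        apply_fix(proc.stdout + proc.stderr)     # agent reads errors, edits
    raise RuntimeError("tests still failing after max_attempts")
```

The loop is deliberately dumb: all the intelligence lives in `apply_fix`, which is exactly where the agent's reading of error output and adaptive editing happens.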
Human-in-the-Loop Safety
Human-in-the-loop is built into every production coding agent. Claude Code asks for approval before executing shell commands and can be configured with allowlists for trusted commands (like test runners) that skip approval. Cursor shows diffs before applying edits, letting the developer accept, reject, or modify each change. GitHub Copilot presents changes for review before committing, using the familiar pull request review model.
This is the permission model from the safety section of this course applied to real-world software engineering. The agent proposes, the human approves or rejects. This is not a limitation. It is a design choice that prevents the agent from running destructive commands, committing broken code, or making changes the developer did not intend.
The trust model is graduated: low-risk actions (reading files, searching code) proceed without approval. Medium-risk actions (editing files) may require approval depending on configuration. High-risk actions (running shell commands, pushing to git, deploying) always require explicit human confirmation.
This graduated model balances productivity with safety. The agent can move quickly on safe operations while still requiring human judgment for actions with significant consequences. As users build trust with a specific coding agent over time, they can selectively expand the autonomy boundary, allowing the agent to run tests without approval while still gating deployment commands. This is graduated trust in practice, evolving with experience rather than being fixed at design time.
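The graduated trust model above reduces to a small policy function. The tool names and the allowlist-by-command-prefix scheme are assumptions for illustration; real products implement this with their own configuration formats.

```python
# Illustrative risk-tier policy mirroring the graduated trust model:
# low-risk reads never gate, high-risk actions always gate, and shell
# commands can be selectively allowlisted (e.g. test runners).
LOW_RISK = {"read_file", "search", "web_search"}
HIGH_RISK = {"git_push", "deploy"}

def needs_approval(action, command_allowlist=()):
    """action is a (tool_name, detail) pair; detail is the shell command
    for run_shell, the file path for edit_file, and so on."""
    tool, detail = action
    if tool in LOW_RISK:
        return False                 # reads and searches never gate
    if tool in HIGH_RISK:
        return True                  # push and deploy always gate
    if tool == "run_shell":
        # trusted command prefixes (like a test runner) skip approval
        return not any(detail.startswith(p) for p in command_allowlist)
    return True                      # file edits default to approval here
```

Expanding the autonomy boundary over time is just growing `command_allowlist`; note that `git_push` and `deploy` stay gated regardless, matching the rule that high-risk actions always require confirmation.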
Coding agents are the existence proof that ReAct + tool use + context engineering works at scale. Every concept in this course (tool design, reasoning patterns, memory management, safety, error handling) is demonstrated in a single coding agent session.