Single-Agent Patterns
Every agent, no matter how sophisticated, runs the same core loop: observe the current state, decide what to do, take an action, observe the result, and repeat until the task is done. This is the ReAct (Reason + Act) loop. The LLM receives a prompt with the task description, the conversation history, and the available tools. It reasons about what to do next, emits a tool call (or a final answer), the system executes the tool, appends the result to the conversation, and the loop continues.
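The loop above can be sketched in a few lines. This is a minimal illustration, not a production implementation: `call_llm` and the `TOOLS` registry are hypothetical stand-ins for a real model client and tool set.

```python
# Hypothetical stand-in for a real LLM call. A real implementation would
# hit a model API and return either a tool call or a final answer.
def call_llm(messages):
    return {"answer": "done"}  # placeholder decision

TOOLS = {
    "search": lambda query: f"results for {query!r}",  # stand-in tool
}

def run_agent(task, max_steps=10):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        decision = call_llm(messages)            # reason about next step
        if "answer" in decision:                 # model chose to finish
            return decision["answer"]
        result = TOOLS[decision["tool"]](**decision["args"])  # act
        messages.append({"role": "tool", "content": str(result)})  # observe
    return None  # step cap reached without a final answer
```

Everything interesting lives in `call_llm`; the loop itself is just observe, decide, act, append, repeat.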
The basic agent loop is powerful but expensive. Every iteration costs an LLM call. A 10-step task with GPT-4 might cost $0.30 and take 30 seconds. For simple, predictable tasks, this is overkill. You do not need an agent to extract a name from an email, translate a sentence, or classify a support ticket. These tasks have known steps and known structure.
This is the key insight behind single-agent patterns: most production AI tasks do not need the full agent loop. They need composable building blocks (prompt chaining, routing, parallelization, and evaluator-optimizer loops) that provide structure without sacrificing the LLM's ability to handle language and reasoning. Anthropic's research on building effective agents emphasizes this point: the most successful implementations use simple, composable patterns rather than complex autonomous agents.
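Prompt chaining, the simplest of these building blocks, can be sketched as ordinary function composition. Here `llm` and `triage_ticket` are hypothetical names for illustration, assuming a two-step summarize-then-classify workflow.

```python
# Hypothetical stand-in for a single LLM call.
def llm(prompt):
    return f"[model output for: {prompt.splitlines()[0]}]"  # placeholder

def triage_ticket(ticket_text):
    # Step 1: summarize. Step 2: classify the summary. The steps are
    # fixed and known in advance, so this is a workflow, not an agent.
    summary = llm(f"Summarize this ticket in one sentence:\n{ticket_text}")
    label = llm(f"Classify as billing, technical, or other:\n{summary}")
    return label
```

Each step is one predictable LLM call; there is no loop, no tool selection, and the cost is bounded by construction.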

When to Use the Full Agent Loop
The full agent loop is the right choice when you genuinely cannot predict the steps in advance. A coding assistant that reads files, writes code, runs tests, and fixes errors needs the agent loop because the sequence of steps depends on what the tests reveal. A research agent that searches the web, reads results, formulates follow-up queries, and synthesizes findings needs the loop because each step depends on what was found.
The hallmark of an agent-appropriate task is that the number of steps and the specific tools used vary per execution. If you can draw a flowchart of the steps before runtime, you probably do not need an agent. You need a workflow with LLM steps.
Production Considerations
Running the agent loop in production requires guardrails that are unnecessary in prototypes:
- Max iterations: cap the loop at a reasonable limit (10-20 steps for most tasks) to prevent infinite loops and runaway costs. When the cap is hit, return the best result so far with a warning.
- Timeouts: set per-iteration and total-task timeouts. An LLM call that hangs for 60 seconds should be retried, not waited on indefinitely.
- Cost tracking: log the token count and estimated cost per iteration. Alert when a single task exceeds a cost threshold.
- Observability: log every iteration with the prompt, tool call, tool result, and reasoning. Without this, debugging a 15-step agent failure is nearly impossible.
- Graceful degradation: if a tool call fails (API timeout, rate limit, permission error), the agent should retry with backoff, try an alternative tool, or return a partial result, not crash.
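Several of these guardrails can be combined in a thin wrapper around the loop. The sketch below is illustrative: `step` is a hypothetical callable that runs one iteration and returns `(answer_or_None, tokens_used)`, and the cost figures are placeholder assumptions.

```python
import time

def guarded_loop(step, max_iters=15, task_timeout=120.0, cost_per_1k=0.01):
    deadline = time.monotonic() + task_timeout
    total_cost = 0.0
    for i in range(max_iters):                     # max-iterations cap
        if time.monotonic() > deadline:            # total-task timeout
            break
        answer, tokens = step(i)
        total_cost += tokens / 1000 * cost_per_1k  # cost tracking
        print(f"iter {i}: tokens={tokens}, cost=${total_cost:.4f}")  # log
        if answer is not None:
            return answer, total_cost
    return None, total_cost  # cap or deadline hit: degrade, don't crash
```

Per-iteration timeouts, retries with backoff, and structured (rather than `print`-based) logging would layer on top of this in a real system.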
The most important principle in agent design is to use the simplest pattern that solves the problem. The full agent loop is the most flexible but also the most expensive, slowest, and hardest to debug. Before reaching for an agent, ask: can I solve this with prompt chaining (fixed steps), routing (classify and dispatch), or parallelization (independent subtasks)? If yes, use the simpler pattern.
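The routing alternative mentioned above can be sketched as one cheap classification step followed by a dispatch table. The keyword-based `classify` here is a hypothetical stand-in for a small, fast model call.

```python
# Hypothetical classifier; in practice this would be a cheap LLM call.
def classify(ticket):
    return "billing" if "invoice" in ticket.lower() else "general"

HANDLERS = {
    "billing": lambda t: f"billing team: {t}",
    "general": lambda t: f"general queue: {t}",
}

def route(ticket):
    return HANDLERS[classify(ticket)](ticket)  # classify, then dispatch
```

One classification call replaces an open-ended loop: the cost is fixed, the behavior is testable per route, and failures are easy to localize.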