Planning and Task Decomposition
When you ask an agent to "build a web application with user authentication, a product catalog, and a shopping cart," it cannot do this in one step. The task is too complex, with too many interdependent parts. The agent needs to decompose it into smaller, manageable subtasks that can be executed sequentially or in parallel.

Task decomposition is the first thing skilled developers do when facing a complex problem. They do not start coding immediately. They break the work into pieces, identify dependencies between pieces, and decide on an order. Agent systems do the same thing, except the LLM performs the decomposition.
Top-Down Decomposition
The most common strategy is hierarchical. Start with the high-level goal and break it into major components. Then break each component into specific steps. "Build a web app" becomes "design database schema, create API endpoints, build frontend, write tests." Each of these becomes more specific: "design database schema" becomes "identify entities, define relationships, write CREATE TABLE statements, create indexes."
The LLM is surprisingly good at this when given clear instructions. A prompt like "Break this task into 3-7 subtasks. Each subtask should be completable independently. List them in execution order with dependencies noted" produces useful decompositions for most software engineering tasks.
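As a minimal sketch, the decomposition prompt and a parser for the model's numbered reply might look like the following. The `decomposition_prompt` and `parse_subtasks` helpers and the sample reply are illustrative, not from any particular framework:

```python
def decomposition_prompt(task: str) -> str:
    # Instruction mirroring the prompt pattern described above.
    return (
        "Break this task into 3-7 subtasks. Each subtask should be "
        "completable independently. List them in execution order, one "
        "per line as 'N. description', with dependencies noted.\n\n"
        f"Task: {task}"
    )

def parse_subtasks(response: str) -> list[str]:
    # Pull the numbered lines out of the model's reply.
    subtasks = []
    for line in response.splitlines():
        line = line.strip()
        if line and line[0].isdigit() and "." in line:
            subtasks.append(line.split(".", 1)[1].strip())
    return subtasks

# A plausible model reply for the web-app example:
sample = ("1. Design database schema\n2. Create API endpoints\n"
          "3. Build frontend\n4. Write tests")
print(parse_subtasks(sample))
# → ['Design database schema', 'Create API endpoints', 'Build frontend', 'Write tests']
```

Parsing a constrained output format like a numbered list is deliberately simple; production systems often request JSON instead, at the cost of a slightly heavier prompt.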
Dependency-Aware Decomposition
Not all subtasks are independent. The API endpoints depend on the database schema. The frontend depends on the API endpoints. Tests depend on both. A good decomposition identifies these dependencies explicitly so the agent knows which tasks must complete before others can start.
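Once dependencies are explicit, ordering the subtasks is a standard topological sort. A sketch using Python's standard-library `graphlib`, with a hypothetical dependency graph for the web-app example (each key maps to the subtasks it depends on):

```python
from graphlib import TopologicalSorter

# Hypothetical subtask graph: each key depends on the subtasks in its set.
dependencies = {
    "database schema": set(),
    "api endpoints": {"database schema"},
    "frontend": {"api endpoints"},
    "tests": {"api endpoints", "frontend"},
}

# static_order() yields an execution order that respects every edge,
# and raises CycleError if the decomposition contradicts itself.
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # schema first, tests last
```

A useful side effect: a cycle in the graph (A depends on B, B depends on A) is detected immediately, surfacing a bad decomposition before any step executes.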
Task decomposition is where agent systems gain their biggest advantage over single-shot LLM calls. A single prompt asking for a complete web application overwhelms the model's working memory. Breaking it into subtasks lets the model focus on one piece at a time, bringing the full quality of its reasoning to each step rather than spreading thin across the entire problem.
When Decomposition Goes Wrong
Over-decomposition splits a simple task into too many tiny pieces, adding orchestration overhead without benefit. Under-decomposition leaves subtasks that are still too complex for the agent to handle in one step. The sweet spot is 3-7 subtasks per level, each taking 1-5 agent iterations to complete. If a subtask takes more than 5 iterations, it probably needs further decomposition.
The simplest planning strategy is to generate the complete plan before taking any action. The agent receives a task, produces a numbered list of steps, and then executes each step in order. This is intuitive, easy to implement, and works well for predictable tasks.

How It Works
The orchestrator sends the task to the LLM with a prompt like "Create a detailed plan to accomplish this task. List each step with its expected output." The LLM returns a plan. The orchestrator then iterates through each step, sending it to the LLM as a focused instruction. The output of each step is added to the context for subsequent steps.
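The loop above can be sketched in a few lines. This is a runnable toy, not a real orchestrator: `call_llm` is a hypothetical stand-in for a model client, stubbed here with canned responses:

```python
# Stub standing in for a real model client, so the sketch runs as-is.
def call_llm(prompt: str) -> str:
    if prompt.startswith("Create a detailed plan"):
        return "1. Define models\n2. Create routes\n3. Write tests"
    return f"done: {prompt.splitlines()[-1]}"

def plan_then_execute(task: str) -> list[str]:
    # Phase 1: generate the complete plan before taking any action.
    plan = call_llm("Create a detailed plan to accomplish this task. "
                    "List each step with its expected output.\n\n"
                    f"Task: {task}")
    steps = [ln.split(".", 1)[1].strip()
             for ln in plan.splitlines() if ln.strip()]
    # Phase 2: execute each step as a focused instruction; prior
    # outputs accumulate in the context for later steps.
    context, results = task, []
    for step in steps:
        result = call_llm(f"Context so far:\n{context}\n\n"
                          f"Execute this step:\n{step}")
        context += f"\n{step} -> {result}"
        results.append(result)
    return results

print(plan_then_execute("Build a CRUD API"))
```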
This approach has clear advantages. The user can review the plan before execution begins. The orchestrator knows how many steps remain. Progress tracking is straightforward: 3 of 7 steps completed is about 43% done. The plan itself can be saved, reused, and modified for similar future tasks.
When Plan-Then-Execute Works Well
Tasks with well-known structure. Writing a CRUD API follows a predictable pattern: define models, create routes, add validation, write tests. A plan generated from experience covers the steps reliably.
Tasks where the environment is stable. Deploying to a staging server involves known steps that rarely surprise. The deployment script is essentially a plan-then-execute system.
Tasks where you want human review. When the stakes are high (production database migration, security-critical changes), generating the plan first lets a human review and approve before any changes are made.
Plan-then-execute is the right default when the task is familiar and the environment is predictable. But always build in the ability to abort or modify the plan mid-execution. Even predictable tasks occasionally surprise, and an agent that blindly follows a broken plan does more damage than one that stops and asks for help.
When Plan-Then-Execute Fails
The fundamental weakness is brittleness. The plan is generated based on assumptions about the environment. If those assumptions are wrong (the file does not exist, the API returns an unexpected format, a dependency is missing), the plan breaks. A rigid plan-then-execute agent either fails outright or continues executing steps that no longer make sense.
The second weakness is that the plan cannot account for information discovered during execution. Step 3 might reveal that the database has a different schema than assumed, making steps 4-7 invalid. Without the ability to replan, the agent either proceeds with the wrong plan or gives up entirely.
Adaptive replanning combines the structure of plan-then-execute with the flexibility of ReAct. The agent creates an initial plan but revises it after each step based on what it discovers. This is the most robust planning strategy for real-world tasks where the environment is partially unknown.

How Adaptive Replanning Works
The orchestrator generates an initial plan. Before executing each step, it sends the plan, the results of completed steps, and the current step to the LLM with the instruction: "Given what you have learned so far, is the current plan still valid? If not, revise the remaining steps before executing the current one."
This creates a feedback loop. Step 1 completes and reveals new information. The plan is re-evaluated. If the new information does not affect the plan, execution continues unchanged. If it does, the remaining steps are revised. The plan evolves with the task.
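The feedback loop can be sketched as follows. The callbacks are hypothetical stand-ins for LLM calls: `execute_step` returns a result plus a flag for whether the observation was surprising, and `revise_plan` returns a new list of remaining steps:

```python
def adaptive_execute(task, plan, execute_step, revise_plan):
    """Run `plan`, revising the remaining steps whenever a step's
    observation is surprising. execute_step(step) -> (result, surprising);
    revise_plan(task, done, remaining) -> new remaining steps."""
    done, remaining = [], list(plan)
    while remaining:
        step, remaining = remaining[0], remaining[1:]
        result, surprising = execute_step(step)
        done.append((step, result))
        if surprising and remaining:
            # Fold the new observation back into the plan.
            remaining = revise_plan(task, done, remaining)
    return done

# Toy run: inspecting the schema surprises, so a migration step is
# spliced into the remaining plan before the query is written.
def execute_step(step):
    return f"ok:{step}", step == "inspect schema"

def revise_plan(task, done, remaining):
    return ["migrate schema"] + remaining

trace = adaptive_execute("update db", ["inspect schema", "write query"],
                         execute_step, revise_plan)
print([s for s, _ in trace])
# → ['inspect schema', 'migrate schema', 'write query']
```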
The Replanning Decision
Not every observation warrants replanning. Asking "should I replan?" after every step adds latency and cost. Production systems use heuristics to decide when replanning is warranted:
Step failure: If a step fails (tool error, unexpected result), always replan. The original plan assumed the step would succeed.
New information: If a step reveals something that contradicts a plan assumption (different file structure, missing dependency), replan.
On schedule: If a step completes as expected and reveals nothing surprising, continue without replanning.
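The three heuristics above reduce to a small decision function. The `observation` dict and its flags are hypothetical, something an executor would populate after each step:

```python
def replan_decision(observation: dict) -> str:
    """Map a step observation to one of the three outcomes above.
    `observation` is a hypothetical dict with 'failed' and
    'contradicts_assumption' flags set by the executor."""
    if observation.get("failed"):
        return "replan"        # step failure: the plan assumed success
    if observation.get("contradicts_assumption"):
        return "replan"        # new information invalidates the plan
    return "continue"          # on schedule: skip the extra LLM call

print(replan_decision({"failed": True}))                  # → replan
print(replan_decision({"contradicts_assumption": True}))  # → replan
print(replan_decision({}))                                # → continue
```

The point of routing through a cheap deterministic check is that "continue" costs nothing, so the expensive replanning call is only made when an observation actually warrants it.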
Cost of Replanning
Each replan is an additional LLM call that includes the original task, the current plan, and all observations so far. As the task progresses, this context grows. For a 10-step task, the replanning call at step 8 includes the full history of 7 completed steps. This is expensive both in tokens and in latency. Production systems balance replanning frequency against cost by only replanning on failure or surprise, not on every step.
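A back-of-envelope calculation shows why replanning after every step gets expensive: the context grows linearly per step, so total replanning tokens grow quadratically with plan length. The token counts here (`base` for the task and plan, `s` per completed step's observations) are made-up illustrative values:

```python
# Replanning before step k carries the base context plus k completed
# steps of history, so replanning at every step of an n-step task costs
# roughly n*base + s*n*(n-1)/2 tokens: quadratic in plan length.
base, s, n = 500, 400, 10
total = sum(base + k * s for k in range(n))
print(total)  # → 23000
```

Replanning only on failure or surprise replaces the `n` replanning calls with a handful, which is why production systems prefer it.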
Adaptive replanning is the planning equivalent of eventual consistency in distributed systems. The plan is never perfectly accurate at any single moment. It is always based on possibly-stale assumptions. But it converges toward correctness over time as observations update the assumptions. The key design parameter is how frequently to re-synchronize the plan with reality, which is a trade-off between accuracy (replan often) and efficiency (replan rarely).
Reflection is what separates a capable agent from a mediocre one. After completing a task (or failing at one), a reflective agent reviews its own work, identifies what went well and what went wrong, and uses that assessment to improve its next attempt.

How Reflection Works
The simplest form of reflection is a prompt: "Review what you just produced. Does it fully satisfy the original requirements? Are there any bugs, edge cases, or improvements you missed?" The model re-reads its output with a critical lens, and often catches errors that it missed during generation.
This works because generation and evaluation use different cognitive patterns. When generating code, the model focuses on producing syntactically correct, functionally plausible output. When evaluating code, it focuses on correctness, edge cases, and requirement satisfaction. The same model performs better as a reviewer than as a generator because the evaluation task is simpler: checking whether something is correct is easier than producing something correct from scratch.
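A reflection pass is just a second prompt over the first pass's output. In this sketch, `call_llm` is whatever client the orchestrator already uses, stubbed here with a canned critique so the example runs:

```python
REFLECTION_PROMPT = (
    "Review what you just produced. Does it fully satisfy the original "
    "requirements? Are there any bugs, edge cases, or improvements you "
    "missed?"
)

def reflect(call_llm, requirements: str, output: str) -> str:
    # Second pass: same model, critical lens over its own output.
    return call_llm(
        f"Original requirements:\n{requirements}\n\n"
        f"Produced output:\n{output}\n\n{REFLECTION_PROMPT}"
    )

# Stubbed model so the sketch runs without a real API.
critique = reflect(lambda prompt: "Missing edge case: empty input.",
                   "Parse a CSV line into fields",
                   "def parse(line): return line.split(',')")
print(critique)  # → Missing edge case: empty input.
```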
Self-Correction Loops
Self-correction extends reflection into action. When the agent detects an error (through reflection, test failure, or validation), it reasons about the cause and retries with a different approach.
The pattern is: generate, test, reflect, fix, test again. Code agents like Claude Code use this constantly. The agent writes a function, runs the tests, sees a failure, reads the error message, reasons about the cause, modifies the code, and runs the tests again. This loop continues until the tests pass or the agent determines it cannot fix the issue.
Self-correction can loop indefinitely if the agent keeps trying the same approach. Production systems limit correction attempts (typically 3-5) and require the agent to try a fundamentally different approach on each retry, not just minor variations of the same strategy. If the agent cannot fix the issue within the attempt limit, it escalates to a human with a description of what it tried and why each attempt failed.
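The bounded loop might be sketched like this. All three callbacks are caller-supplied stand-ins for LLM calls and a test runner; the attempt limit and escalation behavior follow the pattern described above:

```python
def self_correct(generate, run_tests, reflect_and_fix, max_attempts=3):
    """Generate-test-reflect-fix loop with a hard attempt limit.
    run_tests(code) -> (passed, error); reflect_and_fix(code, error,
    history) -> revised code. Escalates with the attempt history if the
    limit is exhausted."""
    code = generate()
    attempts = []
    for _ in range(max_attempts):
        passed, error = run_tests(code)
        if passed:
            return code
        attempts.append((code, error))
        # Ask for a fundamentally different approach, not a tweak;
        # passing the history lets the model avoid repeating itself.
        code = reflect_and_fix(code, error, attempts)
    raise RuntimeError("escalate to human; tried "
                       f"{len(attempts)} fixes: {[e for _, e in attempts]}")

# Toy run: the first version fails once, the fix passes.
versions = iter(["v1", "v2"])
result = self_correct(
    generate=lambda: next(versions),
    run_tests=lambda code: (code == "v2",
                            None if code == "v2" else "AssertionError"),
    reflect_and_fix=lambda code, err, history: next(versions),
)
print(result)  # → v2
```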
Inner Loop vs Outer Loop
Production agent systems operate on two levels. The inner loop is ReAct, where the agent reasons about the current subtask, takes actions, and observes results. The outer loop is planning: task decomposition, progress tracking, and replanning.
Reflection happens at both levels. Inner-loop reflection reviews individual actions ("Did this code change fix the bug?"). Outer-loop reflection reviews overall progress ("I have completed 4 of 7 subtasks. Am I on track? Should I adjust the plan?"). The most capable agents use both, with the outer loop catching strategic errors that the inner loop misses.
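The two-level structure can be sketched as an outer loop that delegates each subtask to the inner loop and reflects on progress between subtasks. All three callbacks are hypothetical stand-ins for LLM-backed components:

```python
def run_agent(task, decompose, react_inner_loop, reflect_on_progress):
    """Outer loop: decompose, run each subtask through the inner ReAct
    loop, then reflect on overall progress and optionally revise the
    remaining plan. reflect_on_progress(task, completed, plan) returns
    the (possibly revised) remaining plan."""
    plan = decompose(task)
    completed = []
    while plan:
        subtask, plan = plan[0], plan[1:]
        # Inner loop: reason, act, observe until the subtask is done.
        completed.append(react_inner_loop(subtask))
        # Outer loop: strategic check, catching errors the inner loop misses.
        plan = reflect_on_progress(task, completed, plan)
    return completed

# Toy run with trivial stand-ins for each component.
results = run_agent(
    "ship feature",
    decompose=lambda t: ["write code", "write tests"],
    react_inner_loop=lambda s: f"done:{s}",
    reflect_on_progress=lambda t, completed, plan: plan,
)
print(results)  # → ['done:write code', 'done:write tests']
```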