Learn how AI agents execute multi-step plans sequentially, handle failures gracefully, and adapt when things go wrong. Includes practical Python examples with Claude Sonnet 4.5.

This article is part of the free-to-read AI Agent Handbook
Plan and Execute
In the previous chapter, we learned how to break down complex tasks into manageable steps. Our assistant can now look at a request like "Plan my weekend getaway" and decompose it into a clear sequence: find destinations, check travel options, book accommodations, and plan activities. But having a plan is only half the battle. Now we need to actually execute it.
Think of it like following a recipe. You've read through the instructions and understand the steps, but you still need to gather ingredients, measure them out, and cook them in the right order. If you discover you're out of eggs halfway through, you'll need to adapt. The same goes for our AI agent. This chapter explores how an agent takes its plan and turns it into action, handling each step sequentially while remaining flexible enough to deal with the unexpected.
From Plan to Action
When our assistant has a plan, it needs to work through each step systematically. Let's continue with the weekend getaway example from the previous chapter. The agent created this plan:
- Find potential destinations within 3 hours of the user's location
- Check available flights or train options
- Search for hotels with availability
- Suggest activities at the destination
Now comes the execution phase. The agent starts with step 1, uses the tools at its disposal (perhaps a location search API), gets results, and moves to step 2. This sequential approach ensures that each step builds on the previous one. You can't book a hotel before you know which city you're visiting.
Here's a simple implementation of a plan executor:

# Using Claude Sonnet 4.5 for its planning and tool-use capabilities
import anthropic

# With no arguments, the client reads the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

def execute_plan(plan_steps, tools):
    """Execute a plan step by step, using available tools."""
    results = []

    for i, step in enumerate(plan_steps, 1):
        print(f"\nExecuting step {i}: {step}")

        # Ask the agent to execute this specific step
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1024,
            tools=tools,
            messages=[{
                "role": "user",
                "content": f"Execute this step: {step}\n\nPrevious results: {results}"
            }]
        )

        # Process tool calls if any
        if response.stop_reason == "tool_use":
            # The tool_use block is not always first; a text block may precede it
            tool_use = next(b for b in response.content if b.type == "tool_use")
            # handle_tool_call is assumed to be defined elsewhere; it runs the tool
            result = handle_tool_call(tool_use)
        else:
            result = response.content[0].text

        results.append({"step": step, "result": result})

    return results

# Example plan from previous chapter
plan = [
    "Find destinations within 3 hours",
    "Check travel options to top destination",
    "Search for available hotels",
    "Suggest activities"
]

# Execute the plan (available_tools is your list of tool definitions)
execution_results = execute_plan(plan, available_tools)

Notice how each step receives the results from previous steps. This context is crucial. When the agent searches for hotels in step 3, it needs to know which destination was selected in step 1. The agent maintains this running context throughout execution.
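Dumping the raw results list into the prompt works for short plans, but the prompt grows with every step. One possible refinement is to render earlier results as a compact numbered summary and truncate long outputs. The sketch below assumes results are stored as dicts with "step" and "result" keys, as in the executor; the helper name format_context and the 200-character limit are illustrative choices, not part of the original code.

```python
def format_context(results, max_chars=200):
    """Render prior step results as a compact, numbered summary."""
    if not results:
        return "No previous results."
    lines = []
    for number, entry in enumerate(results, 1):
        text = str(entry["result"])
        # Truncate long tool outputs so the prompt stays small
        if len(text) > max_chars:
            text = text[:max_chars] + "..."
        lines.append(f"{number}. {entry['step']} -> {text}")
    return "\n".join(lines)


results = [
    {"step": "Find destinations within 3 hours",
     "result": "Napa Valley, Lake Tahoe, Monterey"},
    {"step": "Check travel options to top destination",
     "result": "x" * 500},  # stand-in for a very long tool output
]
print(format_context(results))
```

The returned string could then replace {results} in the prompt. Whether truncation is safe depends on the tools involved; a step that needs the full output of an earlier one would need a different strategy.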
Handling Failures Gracefully
Plans rarely execute perfectly. APIs go down, data is missing, or assumptions turn out to be wrong. A robust agent needs to handle these situations without completely falling apart.
Let's say our agent is executing step 2 (checking travel options) and the flight API returns an error. What should happen? The agent has several options:
Option 1: Retry the step. Maybe it was a temporary network issue. Try the same step again, perhaps with a small delay.
Option 2: Use an alternative approach. If the flight API is down, maybe try a train search API instead.
Option 3: Skip and continue. If this step isn't critical, mark it as failed and move on. The user can book their own travel.
Option 4: Re-plan. If the failure makes the rest of the plan impossible, go back to the planning phase and create a new approach.
Here's how we might implement basic error handling:

# Using Claude Sonnet 4.5 for robust error handling in agent workflows
import time

def execute_step_with_retry(step, tools, previous_results, max_retries=2):
    """Execute a single step with retry logic."""

    for attempt in range(max_retries + 1):
        try:
            response = client.messages.create(
                model="claude-sonnet-4-5",
                max_tokens=1024,
                tools=tools,
                messages=[{
                    "role": "user",
                    "content": f"Execute: {step}\n\nContext: {previous_results}"
                }]
            )

            if response.stop_reason == "tool_use":
                # The tool_use block is not always first; a text block may precede it
                tool_use = next(b for b in response.content if b.type == "tool_use")
                return handle_tool_call(tool_use)
            return response.content[0].text

        except Exception as e:
            # Catches API errors as well as exceptions raised by the tool itself
            if attempt < max_retries:
                print(f"Step failed (attempt {attempt + 1}), retrying...")
                time.sleep(1)  # small delay in case the failure was transient
                continue
            # After all retries, ask agent what to do
            print(f"Step failed after {max_retries + 1} attempts")
            return handle_failure(step, str(e), previous_results)

def handle_failure(failed_step, error, context):
    """Ask the agent to decide how to handle a failure."""

    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": f"""This step failed: {failed_step}

Error: {error}

Context so far: {context}

What should we do? Options:
1. Try an alternative approach
2. Skip this step and continue
3. Stop and re-plan

Respond with your choice and reasoning."""
        }]
    )

    return response.content[0].text

This approach gives the agent some autonomy in handling failures. Instead of hard-coding every possible error scenario, we let the agent reason about what makes sense given the specific situation. Note that handle_failure only returns the agent's recommendation as text; the executor still has to interpret that answer and act on it.
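Because the failure handler replies in free text, the executor needs some way to turn that reply into a concrete action. A minimal sketch follows, assuming the reply mentions the chosen option number (1, 2, or 3) before any other standalone digit. The FailureAction enum and parse_failure_decision are hypothetical names introduced here for illustration.

```python
import re
from enum import Enum


class FailureAction(Enum):
    ALTERNATIVE = 1  # try a different tool or approach
    SKIP = 2         # mark the step as failed and continue
    REPLAN = 3       # stop and go back to planning


def parse_failure_decision(reply, default=FailureAction.REPLAN):
    """Map the agent's free-text reply onto one of the three options.

    Picks the first standalone 1, 2, or 3 in the reply and falls back
    to `default` when none is found.
    """
    match = re.search(r"\b([123])\b", reply)
    if match is None:
        return default
    return FailureAction(int(match.group(1)))


print(parse_failure_decision("2. Skip this step; travel can be booked later."))
print(parse_failure_decision("I am not sure what to do."))
```

This heuristic is easy to fool (a reply that mentions "3 hotels" before the option number would be misread), so in practice it helps to ask the model to begin its answer with the option number alone. Defaulting to re-planning is a conservative choice for ambiguous replies.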
Maintaining Flexibility
The best plans are flexible. As the agent executes steps, it might discover new information that suggests a better approach. Maybe step 1 reveals that the user's preferred destination is fully booked this weekend. A rigid executor would continue anyway, trying to book unavailable hotels. A flexible one would recognize the problem and adjust.
We can build this flexibility into our executor by periodically checking if the plan still makes sense:

# Using Claude Sonnet 4.5 for adaptive planning and execution
def execute_with_flexibility(plan, tools):
    """Execute a plan while remaining open to adjustments."""
    results = []
    remaining = list(plan)  # steps still to run; replaced when we re-plan
    step_number = 0

    # A while loop is used because the list of remaining steps can change
    # mid-run; a for loop would keep iterating over the old plan.
    while remaining:
        step = remaining.pop(0)
        step_number += 1
        print(f"\nStep {step_number}: {step}")

        # Execute the step
        result = execute_step_with_retry(step, tools, results)
        results.append({"step": step, "result": result})

        # After each step, check if we should continue as planned
        if remaining and check_if_adjustment_needed(remaining, results):
            print("\nAdjusting plan based on new information...")
            # replan is assumed to be defined elsewhere and to return
            # a new list of remaining steps
            remaining = replan(remaining, results, tools)

    return results

def check_if_adjustment_needed(remaining_steps, results_so_far):
    """Ask the agent if the plan needs adjustment."""

    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=256,
        messages=[{
            "role": "user",
            "content": f"""Given these results: {results_so_far}

And these remaining steps: {remaining_steps}

Should we adjust the plan? Start your answer with yes or no, then give brief reasoning."""
        }]
    )

    # Check only the start of the answer; a substring test would also
    # match words such as "yesterday" inside a "no" answer
    answer = response.content[0].text.strip().lower()
    return answer.startswith("yes")

This creates a feedback loop. The agent executes a step, evaluates whether the plan still makes sense, and adjusts if needed. It's like driving with GPS. The GPS gives you a route, but if you miss a turn or encounter road construction, it recalculates.
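One risk with an execute, evaluate, adjust loop is that it can keep re-planning indefinitely. The sketch below shows the same loop with a re-plan budget. It takes the execute, check, and re-plan functions as parameters so it can be run with stand-ins and no API calls. The name run_with_replanning, the max_replans parameter, and the fake_* helpers are all assumptions made for this illustration.

```python
def run_with_replanning(plan, execute_step, needs_adjustment, replan, max_replans=2):
    """Execute steps in order, re-planning the remainder at most `max_replans` times."""
    results = []
    remaining = list(plan)  # copy so the caller's plan is not mutated
    replans_used = 0

    while remaining:
        step = remaining.pop(0)
        results.append({"step": step, "result": execute_step(step, results)})

        # Only run the check while there is re-plan budget left
        if remaining and replans_used < max_replans and needs_adjustment(remaining, results):
            remaining = list(replan(remaining, results))
            replans_used += 1

    return results


# Stand-in callables so the loop can be exercised without a model or tools
def fake_execute(step, results):
    return "fully booked" if step == "Search hotels in Napa" else "ok"

def fake_needs_adjustment(remaining, results):
    return results[-1]["result"] == "fully booked"

def fake_replan(remaining, results):
    return ["Search hotels in Monterey"] + remaining

plan = ["Find destinations", "Search hotels in Napa", "Suggest activities"]
for entry in run_with_replanning(plan, fake_execute, fake_needs_adjustment, fake_replan):
    print(entry["step"], "->", entry["result"])
```

With the stand-ins above, the failed Napa search triggers one re-plan that inserts a Monterey search before the remaining step. Setting max_replans=0 turns the loop back into a plain sequential executor.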
Bringing It Together: A Complete Example
Let's see a full execution cycle for our weekend getaway assistant. We'll use the plan from the previous chapter and execute it with all the error handling and flexibility we've discussed.

# Using Claude Sonnet 4.5 for end-to-end plan execution
import anthropic
import json

# With no arguments, the client reads the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

# Define our tools (simplified for illustration)
tools = [
    {
        "name": "search_destinations",
        "description": "Find destinations within specified travel time",
        "input_schema": {
            "type": "object",
            "properties": {
                "max_hours": {"type": "number"},
                "origin": {"type": "string"}
            }
        }
    },
    {
        "name": "search_hotels",
        "description": "Search for available hotels in a city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "check_in": {"type": "string"},
                "check_out": {"type": "string"}
            }
        }
    }
]

def simulate_tool_call(name, tool_input):
    """Placeholder that returns canned data; replace with real API calls."""
    if name == "search_destinations":
        return {"destinations": ["Napa Valley", "Lake Tahoe", "Monterey"]}
    if name == "search_hotels":
        return {"city": tool_input.get("city"), "available_hotels": 12}
    return {"error": f"Unknown tool: {name}"}

def execute_complete_plan(user_request, max_turns=10):
    """Full plan-and-execute cycle."""

    # Step 1: Create the plan (from previous chapter)
    plan_response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"""Create a step-by-step plan for: {user_request}

List each step clearly."""
        }]
    )

    plan_text = plan_response.content[0].text
    print(f"Plan created:\n{plan_text}\n")

    # Step 2: Execute each step
    print("Beginning execution...\n")

    messages = [{
        "role": "user",
        "content": f"Execute this plan step by step:\n{plan_text}"
    }]
    final_response = None

    # Execute until the agent finishes or we reach the turn limit
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=2048,
            tools=tools,
            messages=messages
        )

        # Add assistant response to conversation
        messages.append({
            "role": "assistant",
            "content": response.content
        })

        # Check if agent wants to use a tool
        if response.stop_reason == "tool_use":
            # One response can request several tools; each needs its own result
            tool_results = []
            for block in response.content:
                if block.type != "tool_use":
                    continue

                print(f"Using tool: {block.name}")

                # Simulate tool execution (in real code, call actual APIs)
                tool_result = simulate_tool_call(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(tool_result)
                })

            # Add tool results to conversation
            messages.append({
                "role": "user",
                "content": tool_results
            })

        else:
            # Agent is done, or stopped for another reason such as max_tokens
            final_response = next(
                (block.text for block in response.content if block.type == "text"),
                ""
            )
            print(f"\nFinal result:\n{final_response}")
            break

    return final_response

# Example usage
result = execute_complete_plan(
    "Plan a weekend getaway within 3 hours of San Francisco"
)

When you run this, you'll see the agent working through its plan step by step. It might produce output like this:

Plan created:
1. Search for destinations within 3 hours of San Francisco
2. Check weather forecasts for top 3 destinations
3. Search for available hotels in the best option
4. Compile recommendations with activities

Beginning execution...

Using tool: search_destinations
Found destinations: Napa Valley, Lake Tahoe, Monterey

Using tool: search_hotels
Found 12 available hotels in Napa Valley

Final result:
I recommend a weekend in Napa Valley. I found several excellent hotels
with availability, and the weather looks perfect. Here are three options
with different price points, plus suggested wineries and restaurants to visit.

The agent moved through each step, used tools when needed, and synthesized everything into a coherent recommendation. If any step had failed (say, no hotels were available in Napa), the agent could have fallen back to the second destination on its list.
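The fallback behaviour described above can also be made explicit in code rather than left to the model. A minimal sketch, assuming the hotel search is a callable that takes a city name and returns a list of hotels; the function name and the canned availability data (including the hotel names) are hypothetical.

```python
def first_destination_with_hotels(destinations, search_hotels):
    """Return (destination, hotels) for the first destination with availability.

    `search_hotels` is any callable that takes a city name and returns a
    list of hotels (possibly empty). Returns (None, []) if nothing is free.
    """
    for destination in destinations:
        hotels = search_hotels(destination)
        if hotels:
            return destination, hotels
    return None, []


# Hypothetical canned availability, standing in for a real hotel search API
availability = {
    "Napa Valley": [],
    "Lake Tahoe": ["Lakeside Lodge", "Pine Inn"],
    "Monterey": ["Bayview Hotel"],
}

choice, hotels = first_destination_with_hotels(
    ["Napa Valley", "Lake Tahoe", "Monterey"],
    lambda city: availability.get(city, []),
)
print(choice, hotels)  # Lake Tahoe ['Lakeside Lodge', 'Pine Inn']
```

Trying candidates in ranked order keeps the fallback deterministic and cheap, since no extra model call is needed to decide what to try next.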
When to Re-Plan
Sometimes execution reveals that the original plan won't work. Maybe the user's requirements were unclear, or the situation changed. In these cases, the agent needs to go back to the planning phase.
Here are signs that re-planning might be necessary:
- Multiple consecutive failures: If several steps in a row fail, the plan might be fundamentally flawed.
- Changed requirements: The user provides new information that invalidates the plan.
- Impossible constraints: The agent discovers that the goal can't be achieved as originally envisioned.
When re-planning, the agent should use everything it learned from the failed execution. This is valuable information. If the hotel search failed because everywhere is booked, the new plan might include alternative dates or nearby cities.

# Using Claude Sonnet 4.5 for adaptive re-planning
def replan_if_needed(original_plan, execution_results, user_request, max_failures=2):
    """Create a new plan if execution revealed problems."""

    # Analyze the execution results. This assumes the executor records a
    # failed step as a dict with an "error" key, for example
    # {"step": ..., "error": "..."}; the executors above would need to be
    # extended to do that, since they only store "step" and "result".
    failures = [r for r in execution_results if "error" in r]

    if len(failures) > max_failures:  # Multiple failures suggest plan issues
        print("\nMultiple failures detected. Re-planning...\n")

        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": f"""Original request: {user_request}

Original plan: {original_plan}

Execution results: {execution_results}

The original plan encountered problems. Create a new plan that accounts
for what we learned. Be specific about what to do differently."""
            }]
        )

        # Note: the new plan comes back as plain text, not a list of steps
        new_plan = response.content[0].text
        print(f"New plan:\n{new_plan}")
        return new_plan

    return original_plan

This creates a learning loop. The agent tries a plan, sees what works and what doesn't, and adjusts accordingly. The model does not remember earlier runs on its own, but if you keep a record of past failures and include it in future planning prompts, the agent can get better at creating plans that are more likely to succeed on the first try.
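The first warning sign listed earlier is about consecutive failures, which is a slightly different test from counting failures overall. A small sketch of that check follows, under the same assumption that a failed step is recorded as a dict containing an "error" key. The function name and sample history are illustrative.

```python
def longest_failure_streak(execution_results):
    """Return the length of the longest run of consecutive failed steps.

    Assumes a failed step is recorded as a dict containing an "error" key.
    """
    longest = current = 0
    for entry in execution_results:
        if "error" in entry:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a success breaks the streak
    return longest


history = [
    {"step": "Find destinations", "result": "Napa Valley"},
    {"step": "Check flights", "error": "flight API timeout"},
    {"step": "Search hotels", "error": "no availability"},
    {"step": "Suggest activities", "result": "wine tasting"},
]
print(longest_failure_streak(history))  # 2
```

A streak threshold reacts to a plan that has gone off the rails at one point, while a total count also catches plans with scattered, unrelated failures. Which signal fits better depends on how independent the steps are.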
Practical Considerations
As you build plan-and-execute agents, keep these principles in mind:
Start simple. Begin with straightforward sequential execution. Add error handling and flexibility only when you need it. A simple executor that works is better than a complex one that doesn't.
Log everything. When debugging why a plan failed, you'll want to see exactly what happened at each step. Log the plan, each step's input and output, any errors, and the final result.
Set timeouts. Some steps might take a long time or get stuck. Set reasonable timeouts so your agent doesn't wait forever for a response that will never come.
Consider costs. Each step might involve API calls to your language model and external services. If a plan has 20 steps, that's at least 20 model calls, and more once you add retries and a plan check after each step. Sometimes it's worth combining steps or using a simpler approach.
Test edge cases. What happens if step 1 returns zero results? What if a tool returns malformed data? Your executor should handle these gracefully rather than crashing.
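Several of these principles can be combined in one small wrapper around tool calls. The sketch below logs inputs and outputs, enforces a timeout, and reports empty results as failures instead of passing them on silently. It uses only the standard library; run_tool_safely, the logger name, and the 5-second default are assumptions for illustration, and the result shape ({"result": ...} or {"error": ...}) is one possible convention.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("plan_executor")


def run_tool_safely(tool_fn, tool_input, timeout_seconds=5.0):
    """Call `tool_fn(tool_input)` with logging, a timeout, and basic validation.

    Returns {"result": ...} on success or {"error": "..."} on failure.
    """
    logger.info("tool input: %r", tool_input)
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(tool_fn, tool_input)
    try:
        result = future.result(timeout=timeout_seconds)
    except FutureTimeout:
        logger.error("tool timed out after %.1fs", timeout_seconds)
        return {"error": "timeout"}
    except Exception as exc:  # the tool itself raised
        logger.error("tool raised: %s", exc)
        return {"error": str(exc)}
    finally:
        # Do not block on a stuck worker; the thread is abandoned, not killed
        pool.shutdown(wait=False)

    # Edge case: surface empty results as failures the planner can react to
    if result is None or result == [] or result == {}:
        logger.warning("tool returned no data")
        return {"error": "empty result"}

    logger.info("tool output: %r", result)
    return {"result": result}


print(run_tool_safely(lambda query: ["Napa Valley"], {"max_hours": 3}))
print(run_tool_safely(lambda query: [], {"max_hours": 3}))
```

A thread-based timeout only stops the caller from waiting; the underlying call keeps running in the background, so tools that support their own timeout parameter should use that as well.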
Looking Ahead
We now have an agent that can plan and execute multi-step tasks. It breaks down complex requests, works through them systematically, and handles problems along the way. This is a significant capability. Our personal assistant can now tackle requests that would have been impossible with simple question-answering.
But we've been assuming our agent works alone. What if we had multiple agents, each specialized in different tasks? One agent could focus on planning while another handles execution. Or different agents could work on different parts of the plan in parallel. That's the world of multi-agent systems, which we'll explore in the next chapter.
For now, try building a plan-and-execute agent for a domain you care about. Maybe it's a research assistant that plans and executes literature reviews, or a personal finance agent that plans and executes budget analyses. The pattern is the same: break it down, execute step by step, and stay flexible.
Glossary
Execution: The process of carrying out each step in a plan, using available tools and resources to accomplish the intended goal.
Sequential Execution: Performing plan steps one after another in order, where each step can build on the results of previous steps.
Retry Logic: A strategy for handling failures by attempting a failed operation multiple times before giving up or trying an alternative approach.
Graceful Degradation: The ability of a system to continue operating in a limited capacity when parts of it fail, rather than crashing completely.
Re-planning: The process of creating a new plan when execution reveals that the original plan won't work or needs significant adjustment.
Feedback Loop: A cycle where the results of actions inform future decisions, allowing the agent to learn and adapt during execution.
Context Propagation: Passing the results and information from earlier steps to later steps so the agent maintains awareness of what has happened so far.
About the author: Michael Brenndoerfer
All opinions expressed here are my own and do not reflect the views of my employer.
Michael currently works as an Associate Director of Data Science at EQT Partners in Singapore, where he drives AI and data initiatives across private capital investments.
With over a decade of experience spanning private equity, management consulting, and software engineering, he specializes in building and scaling analytics capabilities from the ground up. He has published research in leading AI conferences and holds expertise in machine learning, natural language processing, and value creation through data.