Explore the trade-offs of multi-agent AI systems, from specialization and parallel processing to coordination challenges and complexity management. Learn when to use multiple agents versus a single agent.

This article is part of the free-to-read AI Agent Handbook
Benefits and Challenges of Multi-Agent Systems
You've seen how agents can work together and communicate. You've explored patterns like sequential handoffs, parallel execution, and consensus building. You've implemented communication protocols and message formats. But a crucial question remains: is all this complexity worth it? When should you use multiple agents instead of a single capable agent?
This chapter explores both sides of the multi-agent equation. We'll examine the real benefits that make multi-agent systems powerful, and we'll confront the challenges that come with coordinating multiple AI agents. By the end, you'll have a framework for deciding when to embrace the complexity of multiple agents and when to keep things simple.
The Case for Multiple Agents
Let's start with why you might choose a multi-agent architecture. We've touched on some benefits earlier, but now we'll dive deeper into each one with concrete examples.
Specialization: Experts vs. Generalists
Think about a hospital. You have general practitioners who handle common cases, but you also have cardiologists, neurologists, and oncologists who specialize in specific areas. When you have a heart problem, you want the cardiologist, not someone who knows a little about everything.
AI agents work the same way. A single agent can be a generalist, but specialized agents often perform better in their domains.
Here's a concrete example. Imagine you're building a customer service system. You could create one agent that handles everything:
```python
# Using GPT-5 for a generalist customer service agent
import openai

client = openai.OpenAI(api_key="OPENAI_API_KEY")

def generalist_agent(customer_query):
    """
    A single agent that tries to handle all customer service tasks.
    """
    system_prompt = """You are a customer service agent. Handle:
    - Technical support questions
    - Billing inquiries
    - Product recommendations
    - Returns and refunds
    - Account management

    Be helpful and professional."""

    response = client.chat.completions.create(
        model="gpt-5",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": customer_query}
        ]
    )

    return response.choices[0].message.content

# Example queries
print("Query 1:", generalist_agent("My payment failed but I was still charged"))
print("\nQuery 2:", generalist_agent("Which laptop is best for video editing?"))
print("\nQuery 3:", generalist_agent("How do I reset my password?"))
```

This works, but notice the challenge. The system prompt tries to cover five different domains. The agent needs to handle technical details, understand billing systems, know product specifications, apply return policies, and manage account operations. That's a lot to ask from one prompt.
Now compare with specialized agents:
```python
# Using Claude Sonnet 4.5 for specialized customer service agents
import anthropic

client = anthropic.Anthropic(api_key="ANTHROPIC_API_KEY")

class SpecializedCustomerService:
    """
    Customer service system with specialized agents.
    """
    def __init__(self):
        self.model = "claude-sonnet-4.5"

    def router_agent(self, query):
        """
        Routes queries to the appropriate specialist.
        """
        system_prompt = """You are a routing specialist. Categorize customer queries:
        - technical: password resets, login issues, bugs
        - billing: payments, charges, refunds, invoices
        - products: recommendations, specifications, comparisons
        - returns: return requests, warranty claims, exchanges

        Return only the category name, nothing else."""

        response = client.messages.create(
            model=self.model,
            max_tokens=50,
            system=system_prompt,
            messages=[{"role": "user", "content": query}]
        )

        return response.content[0].text.strip().lower()

    def technical_agent(self, query):
        """
        Specialist in technical support.
        """
        system_prompt = """You are a technical support specialist.
        You have deep knowledge of:
        - Authentication systems and password resets
        - Common technical issues and troubleshooting
        - System requirements and compatibility

        Provide clear, step-by-step technical guidance."""

        response = client.messages.create(
            model=self.model,
            max_tokens=512,
            system=system_prompt,
            messages=[{"role": "user", "content": query}]
        )

        return response.content[0].text

    def billing_agent(self, query):
        """
        Specialist in billing and payments.
        """
        system_prompt = """You are a billing specialist.
        You have deep knowledge of:
        - Payment processing and failed transactions
        - Refund policies and procedures
        - Invoice questions and billing disputes

        Be empathetic and clear about financial matters."""

        response = client.messages.create(
            model=self.model,
            max_tokens=512,
            system=system_prompt,
            messages=[{"role": "user", "content": query}]
        )

        return response.content[0].text

    def product_agent(self, query):
        """
        Specialist in product recommendations.
        """
        system_prompt = """You are a product specialist.
        You have deep knowledge of:
        - Product specifications and features
        - Use case matching and recommendations
        - Competitive comparisons

        Help customers find the right product for their needs."""

        response = client.messages.create(
            model=self.model,
            max_tokens=512,
            system=system_prompt,
            messages=[{"role": "user", "content": query}]
        )

        return response.content[0].text

    def handle_query(self, query):
        """
        Route and handle a customer query.
        """
        # Determine the right specialist
        category = self.router_agent(query)
        print(f"Routing to: {category} specialist")

        # Delegate to the specialist
        if "technical" in category:
            return self.technical_agent(query)
        elif "billing" in category:
            return self.billing_agent(query)
        elif "product" in category:
            return self.product_agent(query)
        else:
            return "I'll connect you with the right specialist."

# Example usage
service = SpecializedCustomerService()

print("=== Customer Service with Specialized Agents ===\n")

queries = [
    "My payment failed but I was still charged",
    "Which laptop is best for video editing?",
    "How do I reset my password?"
]

for query in queries:
    print(f"\nCustomer: {query}")
    response = service.handle_query(query)
    print(f"Agent: {response[:100]}...")  # Truncate for readability
```

The difference is striking. Each specialist agent has a focused system prompt that makes it genuinely expert in its domain. The billing agent knows billing inside and out. The product agent deeply understands products. They don't try to be good at everything; they excel at their specialty.
This specialization brings several advantages:
Deeper Expertise: Each agent can have a more detailed, focused prompt. The technical agent's prompt could include specific troubleshooting procedures. The billing agent could have exact refund policies. There's no need to cram everything into one prompt.
Easier Updates: When your refund policy changes, you update only the billing agent. You don't risk breaking technical support or product recommendations.
Better Performance: Specialized agents often give better answers because they're not spreading their attention across multiple domains. They can reason more deeply about their specific area.
Clearer Debugging: When something goes wrong with billing responses, you know exactly where to look. You debug one agent, not a monolithic system.
Parallel Processing: Speed Through Concurrency
A single agent must work sequentially. It finishes one task before starting the next. Multiple agents can work simultaneously, completing complex requests faster.
Let's see this in action with a travel planning example:
```python
# Using Claude Sonnet 4.5 for parallel agent execution
import anthropic
import concurrent.futures
import time

client = anthropic.Anthropic(api_key="ANTHROPIC_API_KEY")

def flight_agent(destination, dates):
    """
    Researches flight options.
    """
    start = time.time()

    system_prompt = """You are a flight research specialist.
    Find the best flight options considering price, duration, and convenience."""

    query = f"Find flights to {destination} for {dates}"

    response = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=512,
        system=system_prompt,
        messages=[{"role": "user", "content": query}]
    )

    elapsed = time.time() - start
    return {
        "agent": "flights",
        "result": response.content[0].text,
        "time": elapsed
    }

def hotel_agent(destination, dates):
    """
    Researches hotel options.
    """
    start = time.time()

    system_prompt = """You are a hotel research specialist.
    Find the best hotel options considering location, amenities, and value."""

    query = f"Find hotels in {destination} for {dates}"

    response = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=512,
        system=system_prompt,
        messages=[{"role": "user", "content": query}]
    )

    elapsed = time.time() - start
    return {
        "agent": "hotels",
        "result": response.content[0].text,
        "time": elapsed
    }

def activities_agent(destination, interests):
    """
    Researches activities and attractions.
    """
    start = time.time()

    system_prompt = """You are a local activities specialist.
    Recommend activities, restaurants, and attractions based on interests."""

    query = f"Recommend activities in {destination} for someone interested in {interests}"

    response = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=512,
        system=system_prompt,
        messages=[{"role": "user", "content": query}]
    )

    elapsed = time.time() - start
    return {
        "agent": "activities",
        "result": response.content[0].text,
        "time": elapsed
    }

# Sequential execution (single-agent approach)
def plan_trip_sequential(destination, dates, interests):
    """
    Plan a trip with one agent doing everything sequentially.
    """
    print("=== Sequential Planning ===")
    total_start = time.time()

    results = []
    results.append(flight_agent(destination, dates))
    results.append(hotel_agent(destination, dates))
    results.append(activities_agent(destination, interests))

    total_time = time.time() - total_start

    for r in results:
        print(f"{r['agent']}: {r['time']:.2f}s")
    print(f"Total time: {total_time:.2f}s\n")

    return results

# Parallel execution (multi-agent approach)
def plan_trip_parallel(destination, dates, interests):
    """
    Plan a trip with multiple agents working simultaneously.
    """
    print("=== Parallel Planning ===")
    total_start = time.time()

    # Execute all agents concurrently
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        flight_future = executor.submit(flight_agent, destination, dates)
        hotel_future = executor.submit(hotel_agent, destination, dates)
        activities_future = executor.submit(activities_agent, destination, interests)

        # Wait for all to complete
        results = [
            flight_future.result(),
            hotel_future.result(),
            activities_future.result()
        ]

    total_time = time.time() - total_start

    for r in results:
        print(f"{r['agent']}: {r['time']:.2f}s")
    print(f"Total time: {total_time:.2f}s\n")

    return results

# Compare both approaches
sequential_results = plan_trip_sequential("Tokyo", "March 15-22", "food and history")
parallel_results = plan_trip_parallel("Tokyo", "March 15-22", "food and history")

# Calculate speedup
seq_time = sum(r['time'] for r in sequential_results)
par_time = max(r['time'] for r in parallel_results)
speedup = seq_time / par_time

print(f"Speedup: {speedup:.2f}x faster with parallel agents")
```

The parallel approach finishes in roughly the time of the slowest agent, not the sum of all agents. If each agent takes about 3 seconds, the sequential approach takes 9 seconds total, while the parallel approach takes only 3 seconds. That's a 3x speedup.
This matters for user experience. When someone asks your assistant to plan a trip, they don't want to wait 9 seconds. They want an answer as quickly as possible. Parallel agents deliver that speed.
Robustness: Redundancy and Verification
Multiple agents can check each other's work, catching errors that a single agent might miss. This is like having an editor review a writer's work, or a second doctor confirm a diagnosis.
Here's a practical example:
```python
# Using Claude Sonnet 4.5 for agent verification
import anthropic

client = anthropic.Anthropic(api_key="ANTHROPIC_API_KEY")

def research_agent(topic):
    """
    Researches a topic and provides findings.
    """
    system_prompt = """You are a research agent. Research the topic and provide factual information.
    Return your findings as JSON with:
    - claims: list of factual claims you're making
    - confidence: your confidence level (0-1) for each claim
    - sources: where this information comes from"""

    response = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=1024,
        system=system_prompt,
        messages=[{"role": "user", "content": f"Research: {topic}"}]
    )

    return response.content[0].text

def verification_agent(research_findings):
    """
    Verifies research findings for accuracy and completeness.
    """
    system_prompt = """You are a fact-checking agent. Review research findings and:
    - Check if claims are well-supported
    - Identify any potential errors or inconsistencies
    - Suggest additional information needed
    - Rate overall reliability

    Return JSON with:
    - verified_claims: claims that seem accurate
    - questionable_claims: claims that need more verification
    - missing_information: important gaps in the research
    - overall_confidence: your confidence in the research (0-1)"""

    response = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=1024,
        system=system_prompt,
        messages=[{"role": "user", "content": f"Verify this research:\n\n{research_findings}"}]
    )

    return response.content[0].text

def synthesis_agent(research, verification):
    """
    Synthesizes verified information into a final answer.
    """
    system_prompt = """You are a synthesis agent. Combine research and verification to create a final answer.
    Include only well-verified information. Acknowledge uncertainties.
    Be clear about confidence levels."""

    context = f"Research:\n{research}\n\nVerification:\n{verification}"

    response = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=1024,
        system=system_prompt,
        messages=[{"role": "user", "content": f"Synthesize:\n\n{context}"}]
    )

    return response.content[0].text

# Example: Research with verification
def research_with_verification(topic):
    """
    Multi-agent research with built-in verification.
    """
    print(f"=== Researching: {topic} ===\n")

    # Step 1: Initial research
    print("Research Agent working...")
    research = research_agent(topic)
    print(f"Research completed:\n{research[:200]}...\n")

    # Step 2: Verification
    print("Verification Agent checking...")
    verification = verification_agent(research)
    print(f"Verification completed:\n{verification[:200]}...\n")

    # Step 3: Final synthesis
    print("Synthesis Agent combining results...")
    final_answer = synthesis_agent(research, verification)
    print(f"Final Answer:\n{final_answer}")

    return final_answer

# Run the verified research
result = research_with_verification(
    "What are the health benefits of intermittent fasting?"
)
```

This three-agent system is more reliable than a single agent because:
Error Detection: The verification agent can catch mistakes the research agent made. If the research agent misunderstands something or makes an unsupported claim, the verification agent flags it.
Confidence Calibration: The verification step provides a second opinion on how confident we should be in the findings. This helps users understand when information is solid versus when it's uncertain.
Completeness Checking: The verification agent can identify gaps in the research, prompting more thorough investigation.
Final Quality Control: The synthesis agent combines only the verified information, filtering out questionable claims.
This pattern is especially valuable for high-stakes decisions. If you're building a medical information system, legal research tool, or financial advisor, having agents verify each other's work significantly reduces the risk of errors.
Modularity: Build Once, Reuse Everywhere
When agents are specialized and independent, you can reuse them across different applications. The billing agent you built for customer service might also be useful in your accounting system. The research agent might serve both your personal assistant and your content creation tool.
This modularity saves development time and ensures consistency. When you improve the billing agent, all systems using it get better automatically.
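As a rough sketch of what that reuse can look like (the module name billing_agent.py and the surrounding applications are hypothetical, not code from earlier in this chapter), a specialist lives in its own module and is imported wherever it's needed:

```python
# billing_agent.py -- a hypothetical shared module holding one specialist
import anthropic

client = anthropic.Anthropic(api_key="ANTHROPIC_API_KEY")

def billing_agent(query: str) -> str:
    """Answer billing questions; any application can import and call this."""
    response = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=512,
        system="You are a billing specialist. Explain charges, refunds, and invoices clearly.",
        messages=[{"role": "user", "content": query}],
    )
    return response.content[0].text

# A customer service app and an accounting app would both simply do:
#     from billing_agent import billing_agent
# so a prompt or policy update in this one module reaches every system that uses it.
```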
The Challenges of Multi-Agent Systems
Now let's confront the difficulties. Multi-agent systems bring real challenges that you need to understand and plan for.
Coordination Overhead: Keeping Everyone Aligned
The more agents you have, the more coordination you need. Agents must stay synchronized, share information correctly, and avoid conflicts.
Consider a simple example: three agents working on a report.
```python
# Using Claude Sonnet 4.5 to demonstrate coordination challenges
import anthropic
import concurrent.futures

client = anthropic.Anthropic(api_key="ANTHROPIC_API_KEY")

class ReportWritingTeam:
    """
    Three agents collaborating on a report (with potential coordination issues).
    """
    def __init__(self):
        self.model = "claude-sonnet-4.5"
        self.shared_state = {
            "outline": None,
            "sections": {},
            "final_report": None
        }

    def outlining_agent(self, topic):
        """
        Creates a report outline.
        """
        system_prompt = """You are an outlining specialist.
        Create a clear outline for a report on the given topic.
        Return a simple numbered list of sections."""

        response = client.messages.create(
            model=self.model,
            max_tokens=512,
            system=system_prompt,
            messages=[{"role": "user", "content": f"Create outline for: {topic}"}]
        )

        self.shared_state["outline"] = response.content[0].text
        return self.shared_state["outline"]

    def writing_agent(self, section_number):
        """
        Writes a specific section of the report.
        """
        # Problem: What if the outline isn't ready yet?
        outline = self.shared_state.get("outline")
        if not outline:
            return "ERROR: No outline available yet!"

        system_prompt = f"""You are a writing specialist.
        Write section {section_number} based on this outline:\n\n{outline}"""

        response = client.messages.create(
            model=self.model,
            max_tokens=1024,
            system=system_prompt,
            messages=[{"role": "user", "content": f"Write section {section_number}"}]
        )

        section_text = response.content[0].text
        self.shared_state["sections"][section_number] = section_text
        return section_text

    def editing_agent(self):
        """
        Edits and finalizes the report.
        """
        # Problem: What if sections aren't ready yet?
        sections = self.shared_state.get("sections")
        if not sections:
            return "ERROR: No sections to edit yet!"

        combined = "\n\n".join([
            f"Section {num}:\n{text}"
            for num, text in sorted(sections.items())
        ])

        system_prompt = """You are an editing specialist.
        Review and polish this report for clarity and flow."""

        response = client.messages.create(
            model=self.model,
            max_tokens=2048,
            system=system_prompt,
            messages=[{"role": "user", "content": combined}]
        )

        self.shared_state["final_report"] = response.content[0].text
        return self.shared_state["final_report"]

# Example: What happens with poor coordination?
def write_report_poor_coordination(topic):
    """
    Demonstrates coordination problems when agents aren't synchronized.
    """
    team = ReportWritingTeam()

    print("=== Poor Coordination Example ===\n")

    # Problem: Starting all agents at once without coordination
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        # All agents start simultaneously
        outline_future = executor.submit(team.outlining_agent, topic)
        writing_future = executor.submit(team.writing_agent, 1)
        editing_future = executor.submit(team.editing_agent)

        # Results
        outline = outline_future.result()
        section = writing_future.result()
        final = editing_future.result()

    print(f"Outline: {outline[:100]}...")
    print(f"Section: {section[:100]}...")
    print(f"Final: {final[:100]}...")
    print("\nNotice the errors: writing and editing agents failed because they started before outline was ready!")

# Example: Better coordination
def write_report_good_coordination(topic):
    """
    Demonstrates proper coordination with sequencing.
    """
    team = ReportWritingTeam()

    print("\n=== Good Coordination Example ===\n")

    # Step 1: Outline first
    print("Step 1: Creating outline...")
    outline = team.outlining_agent(topic)
    print(f"Outline ready: {outline[:100]}...\n")

    # Step 2: Write sections (could be parallel if multiple sections)
    print("Step 2: Writing sections...")
    section = team.writing_agent(1)
    print(f"Section complete: {section[:100]}...\n")

    # Step 3: Edit the complete report
    print("Step 3: Editing final report...")
    final = team.editing_agent()
    print(f"Final report: {final[:100]}...\n")

    print("Success: Proper sequencing avoided coordination errors!")

# Demonstrate both approaches
write_report_poor_coordination("The Future of Renewable Energy")
write_report_good_coordination("The Future of Renewable Energy")
```

This example shows a fundamental challenge: agents must execute in the right order. The writing agent needs the outline. The editing agent needs the sections. Without proper coordination, agents fail or produce garbage.
Coordination requires:
Dependency Management: Understanding which agents depend on others and enforcing execution order.
State Synchronization: Ensuring all agents see consistent shared state. If Agent A updates a value, Agent B must see that update.
Deadlock Prevention: Making sure agents don't get stuck waiting for each other in a cycle. (Agent A waits for Agent B, which waits for Agent C, which waits for Agent A.)
Resource Contention: Handling cases where multiple agents need the same resource (like a database connection or API quota).
All of this adds complexity. Your code needs to manage these dependencies explicitly, whereas a single agent naturally does things in order.
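To make dependency management concrete, here's a minimal, framework-free sketch: each task declares which tasks it depends on, independent tasks run in parallel, and a task only starts once all of its dependencies have produced results. The run_with_dependencies helper and the lambda stand-ins for the report agents are illustrative assumptions, not code from the example above.

```python
# A minimal sketch of explicit dependency management between agents.
import concurrent.futures

def run_with_dependencies(tasks, dependencies):
    """tasks: name -> callable(results_dict); dependencies: name -> set of task names."""
    results = {}
    remaining = dict(tasks)
    with concurrent.futures.ThreadPoolExecutor() as executor:
        while remaining:
            # A task is ready when every one of its dependencies already has a result.
            ready = [name for name in remaining
                     if dependencies.get(name, set()).issubset(results)]
            if not ready:
                # Nothing can run but work remains: a cycle or missing dependency.
                raise RuntimeError("Deadlock: circular or unsatisfiable dependencies")
            futures = {name: executor.submit(remaining.pop(name), results) for name in ready}
            for name, future in futures.items():
                results[name] = future.result()
    return results

# Example wiring for the report team: the writer needs the outline,
# and the editor needs the written section.
results = run_with_dependencies(
    tasks={
        "outline": lambda r: "1. Intro\n2. Body\n3. Conclusion",
        "section": lambda r: f"Draft based on: {r['outline']}",
        "edit":    lambda r: f"Polished: {r['section']}",
    },
    dependencies={"section": {"outline"}, "edit": {"section"}},
)
print(results["edit"])
```

The same structure also guards against deadlock: if no task is ever ready while work remains, the runner raises instead of hanging.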
Increased Complexity: More Moving Parts
More agents means more code, more potential failure points, and harder debugging.
With a single agent, debugging is straightforward. You look at the input, the prompt, and the output. With ten agents passing messages, you need to trace the entire flow to understand what went wrong.
Let's look at a debugging scenario:
```text
User Question: "What's the weather in Paris next Tuesday?"

Single-Agent System:
- User → Agent → Weather API → Agent → User
- Debug: Check the agent's API call and response

Multi-Agent System:
- User → Router Agent → Intent Agent → Scheduling Agent → Weather Agent → Response Agent → User
- Debug: Which agent failed? What did each agent pass to the next?
- Check the router's categorization
- Check intent extraction
- Check date parsing
- Check the weather API call
- Check response formatting
- Trace message flow between all agents
```

The multi-agent system has more steps where things can go wrong. Each agent is a potential failure point.
This complexity affects:
Development Time: Writing and testing five agents takes longer than writing one.
Maintenance: When requirements change, you might need to update multiple agents and their interactions.
Cognitive Load: Understanding a multi-agent system requires keeping track of multiple components and their relationships.
Operational Costs: Running multiple agent calls costs more in API fees than running one.
Communication Failures: When Agents Misunderstand
We discussed communication protocols in the previous chapter, but even with good protocols, agents can misunderstand each other.
```python
# Using Claude Sonnet 4.5 to demonstrate communication misunderstandings
import anthropic
import json

client = anthropic.Anthropic(api_key="ANTHROPIC_API_KEY")

def data_agent():
    """
    Agent that provides data (but in an ambiguous format).
    """
    system_prompt = """You are a data collection agent.
    Provide the requested data in a clear format."""

    response = client.messages.create(
        model="claude-sonnet-4.5",
        max_tokens=256,
        system=system_prompt,
        messages=[{"role": "user", "content": "Provide the quarterly revenue figures"}]
    )

    return response.content[0].text

def analysis_agent(data):
    """
    Agent that analyzes data (expecting a specific format).
    """
    # This agent assumes the incoming data is JSON with numeric fields q1-q4.
    # Problem: What if the data isn't in the expected format?
    try:
        data_dict = json.loads(data)
        total = sum(data_dict.values())
        average = total / len(data_dict)
        return f"Total: ${total:,.0f}, Average: ${average:,.0f}"
    except (json.JSONDecodeError, TypeError, ZeroDivisionError):
        return "ERROR: Could not parse data. Expected JSON format with quarterly numbers."

# Demonstrate the communication issue
print("=== Communication Misunderstanding ===\n")

data = data_agent()
print(f"Data Agent provided:\n{data}\n")

result = analysis_agent(data)
print(f"Analysis Agent result:\n{result}\n")

print("Problem: If Data Agent didn't return strict JSON, Analysis Agent fails!")
```

Common communication issues include:
Format Mismatches: Agent A sends free-form text, Agent B expects JSON.
Missing Context: Agent B doesn't have information from earlier in the conversation that Agent A assumes it knows.
Ambiguous Messages: Agent A sends "high priority," but Agent B doesn't know if that means "urgent" or just "important."
Version Incompatibility: Agent A uses an updated message format, but Agent B still expects the old format.
These issues require careful protocol design, schema validation, and robust error handling.
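One common mitigation is to validate every inter-agent message against an explicit schema before the receiving agent acts on it. Here's a minimal sketch using only the standard library; the EXPECTED_SCHEMA and validate_message names are illustrative, and in practice you might reach for a library such as jsonschema or Pydantic instead.

```python
# A minimal sketch of schema validation between agents.
import json

# The format the analysis agent expects from the data agent.
EXPECTED_SCHEMA = {"q1": (int, float), "q2": (int, float), "q3": (int, float), "q4": (int, float)}

def validate_message(raw_text):
    """Parse and check an inter-agent message before the next agent consumes it."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as e:
        return None, f"Not valid JSON: {e}"
    missing = [k for k in EXPECTED_SCHEMA if k not in data]
    wrong_type = [k for k, types in EXPECTED_SCHEMA.items()
                  if k in data and not isinstance(data[k], types)]
    if missing or wrong_type:
        return None, f"Schema mismatch. Missing: {missing}, wrong types: {wrong_type}"
    return data, None

# The receiving agent can reject bad messages early, or ask the sender to retry.
message, error = validate_message('{"q1": 120000, "q2": 135000, "q3": 128000, "q4": 150000}')
if error:
    print("Reject and request a resend:", error)
else:
    print("Safe to analyze:", message)
```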
Testing and Validation Difficulties
Testing a single agent is relatively simple: provide inputs, check outputs. Testing a multi-agent system requires testing individual agents, their interactions, and emergent behaviors.
You need to test:
Individual Agent Behavior: Does each agent work correctly in isolation?
Integration: Do agents communicate correctly?
Edge Cases: What happens when an agent fails? When messages arrive out of order?
End-to-End Workflows: Does the entire system produce correct results?
Performance Under Load: What happens when many users make requests simultaneously?
Each layer of testing adds work. A system with five agents might require 5 individual agent tests, 10 integration tests (one for each pair of communicating agents), and multiple end-to-end scenarios.
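Here's a minimal sketch of those layers, using stub agents (router_stub and billing_stub are hypothetical stand-ins) so the tests run with pytest and without any API calls:

```python
# Stub agents so the test layers can run offline; real tests would wrap real agents
# or mock the model client.
def router_stub(query):
    return "billing" if "charge" in query.lower() else "technical"

def billing_stub(query):
    return "Billing answer about: " + query

# Layer 1: individual agent behavior in isolation
def test_router_categorizes_billing():
    assert router_stub("Why was I charged twice?") == "billing"

# Layer 2: integration between a pair of communicating agents
def test_router_hands_off_to_billing():
    query = "My charge looks wrong"
    category = router_stub(query)
    answer = billing_stub(query) if category == "billing" else ""
    assert answer.startswith("Billing answer")

# Layer 3: end-to-end workflow through the whole (tiny) pipeline
def test_end_to_end_billing_flow():
    query = "Please explain this charge"
    category = router_stub(query)
    assert category == "billing"
    assert "charge" in billing_stub(query)
```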
When Multi-Agent Systems Make Sense
Given these challenges, when should you embrace multi-agent complexity?
Use multiple agents when:
1. Specialization Provides Clear Value
If different parts of your task truly benefit from specialized expertise, the complexity is worth it. A customer service system with technical, billing, and product specialists makes sense because each domain is genuinely different.
2. Parallel Execution Matters
If speed is crucial and tasks are independent, parallel agents deliver real user experience improvements. Travel planning with simultaneous flight, hotel, and activity research is a good example.
3. Verification is Critical
For high-stakes domains (medical information, financial advice, legal research), having agents verify each other's work is worth the overhead. The cost of an error outweighs the cost of redundancy.
4. System Will Grow and Evolve
If you're building a platform that will add new capabilities over time, modular agents make evolution easier. You can add a new specialist without rewriting everything.
5. Different Agents Need Different Tools
If your system needs to use many different APIs, databases, or tools, specialized agents that each master their specific tools make sense.
Stick with a single agent when:
1. The Task is Straightforward
If the task doesn't benefit from specialization, keep it simple. A single agent that answers basic questions doesn't need to be split up.
2. Speed Isn't Critical
If users are happy waiting a few extra seconds, sequential processing with one agent is simpler than parallel agents.
3. Coordination Would Be Complex
If agents would need extensive back-and-forth communication, the coordination overhead might outweigh any benefits. Sometimes one agent reasoning through the entire problem is cleaner.
4. You Need Simplicity
For prototypes, MVPs, or learning projects, start with one agent. Add more only when you hit clear limitations.
5. Context Needs to Be Preserved
If maintaining conversation context is crucial and sharing it between agents would be difficult, a single agent that keeps all context is simpler.
Practical Design Principles
If you decide to build a multi-agent system, these principles help manage the complexity:
Start Simple, Add Agents Incrementally
Begin with a single agent. When you hit a clear limitation (one domain needs deep expertise, or speed becomes an issue), split off one specialized agent. Then iterate. Don't start with ten agents; grow into that complexity.
Design Clear Interfaces
Each agent should have a well-defined interface: what inputs it accepts, what outputs it produces, what side effects it might have. Document these interfaces clearly. Good interfaces make agents easier to test, debug, and replace.
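One lightweight way to pin down such an interface in Python is a small request/response contract plus a Protocol. This is a sketch; the field names are one reasonable choice, not a standard.

```python
# A minimal sketch of an explicit agent interface.
from dataclasses import dataclass, field
from typing import Protocol

@dataclass
class AgentRequest:
    task: str                                      # what the caller wants done
    context: dict = field(default_factory=dict)    # any prior results the agent needs

@dataclass
class AgentResponse:
    ok: bool                                       # did the agent succeed?
    content: str = ""                              # the agent's output when ok is True
    error: str = ""                                # a machine-readable reason when ok is False

class Agent(Protocol):
    def handle(self, request: AgentRequest) -> AgentResponse: ...

# Any agent that implements handle() with this signature can be tested,
# mocked, or swapped out without the rest of the system changing.
```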
Minimize Dependencies
The fewer dependencies between agents, the simpler your system. When possible, make agents independent. Prefer message passing over shared state. Avoid circular dependencies.
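As a minimal sketch of message passing instead of shared state (the queue and the message fields "from", "type", and "payload" are an illustrative convention), agents exchange self-contained messages rather than mutating a common dictionary:

```python
# A minimal sketch of queue-based message passing between two agents.
import queue

inbox = queue.Queue()

def producer_agent():
    # Instead of writing into a shared dict, the agent emits a message.
    inbox.put({"from": "research", "type": "findings", "payload": "Key points about solar costs"})

def consumer_agent():
    # The consumer pulls a message when it is ready, so there is no race on shared state.
    message = inbox.get(timeout=1)
    return f"Summarizing {message['from']} findings: {message['payload']}"

producer_agent()
print(consumer_agent())
```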
Invest in Observability
With multiple agents, logging and monitoring become essential. You need to trace messages through the system, measure performance of each agent, and identify bottlenecks. Build this instrumentation from the start.
Plan for Failures
Every agent can fail. Your system should handle failures gracefully. If the weather agent times out, the system should still give the user whatever information it can rather than failing entirely.
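Here's a small sketch of that kind of graceful degradation, assuming a hypothetical helper call_with_fallback that wraps any agent call in a time budget and a fallback value:

```python
# A minimal sketch of degrading gracefully when one agent fails or times out.
import concurrent.futures

def call_with_fallback(fn, *args, timeout=5, fallback="(information unavailable right now)"):
    """Run an agent call with a time budget; return a fallback instead of raising."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fn, *args)
    try:
        return future.result(timeout=timeout)
    except Exception:
        # Covers both timeouts and errors raised inside the agent call.
        return fallback
    finally:
        executor.shutdown(wait=False)

# Example: a failing weather agent degrades to a placeholder,
# while the rest of the plan is still delivered to the user.
def flaky_weather_agent(city):
    raise RuntimeError("upstream weather API is down")

plan = {
    "flights": "Found 3 options under $900",
    "weather": call_with_fallback(flaky_weather_agent, "Paris"),
}
print(plan)
```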
Use Standard Protocols
When possible, use established protocols like the A2A Protocol we discussed earlier. Standards make your agents interoperable and easier to understand.
A Balanced Example
Let's bring this together with an example that shows both the benefits and the complexity management:
```python
# Using Claude Sonnet 4.5 for a well-designed multi-agent system
import anthropic
import json
from datetime import datetime

client = anthropic.Anthropic(api_key="ANTHROPIC_API_KEY")

class BalancedMultiAgentSystem:
    """
    A multi-agent system with clear interfaces and error handling.
    """
    def __init__(self):
        self.model = "claude-sonnet-4.5"
        self.log = []

    def _log_event(self, agent, event, details=None):
        """
        Centralized logging for observability.
        """
        entry = {
            "timestamp": datetime.utcnow().isoformat(),
            "agent": agent,
            "event": event,
            "details": details
        }
        self.log.append(entry)
        print(f"[{agent}] {event}")

    def coordinator(self, user_request):
        """
        Coordinates the workflow with clear error handling.
        """
        self._log_event("coordinator", "Request received", user_request)

        try:
            # Step 1: Understand intent
            intent = self.intent_agent(user_request)
            if intent.get("error"):
                return self._handle_error("intent", intent["error"])

            # Step 2: Gather information
            research = self.research_agent(intent["topic"])
            if research.get("error"):
                return self._handle_error("research", research["error"])

            # Step 3: Formulate response
            response = self.response_agent(research, intent)
            if response.get("error"):
                return self._handle_error("response", response["error"])

            self._log_event("coordinator", "Request completed successfully")
            return response

        except Exception as e:
            self._log_event("coordinator", "Unexpected error", str(e))
            return {"error": "System error occurred", "details": str(e)}

    def intent_agent(self, request):
        """
        Understands user intent with structured output.
        """
        self._log_event("intent_agent", "Processing intent")

        try:
            system_prompt = """Extract the intent from user requests.
            Return JSON with:
            - intent_type: "question", "task", or "command"
            - topic: the main topic
            - details: any specific requirements

            Only return JSON, nothing else."""

            response = client.messages.create(
                model=self.model,
                max_tokens=256,
                system=system_prompt,
                messages=[{"role": "user", "content": request}]
            )

            intent = json.loads(response.content[0].text)
            self._log_event("intent_agent", "Intent extracted", intent.get("intent_type"))
            return intent

        except Exception as e:
            self._log_event("intent_agent", "Failed", str(e))
            return {"error": str(e)}

    def research_agent(self, topic):
        """
        Researches the topic with error handling.
        """
        self._log_event("research_agent", "Researching", topic)

        try:
            system_prompt = """Research the given topic and provide key information.
            Return JSON with:
            - summary: brief overview
            - key_points: list of main points
            - confidence: 0-1 confidence score

            Only return JSON, nothing else."""

            response = client.messages.create(
                model=self.model,
                max_tokens=512,
                system=system_prompt,
                messages=[{"role": "user", "content": f"Research: {topic}"}]
            )

            research = json.loads(response.content[0].text)
            self._log_event("research_agent", "Research completed",
                            f"confidence: {research.get('confidence')}")
            return research

        except Exception as e:
            self._log_event("research_agent", "Failed", str(e))
            return {"error": str(e)}

    def response_agent(self, research, intent):
        """
        Formulates the final response.
        """
        self._log_event("response_agent", "Formulating response")

        try:
            system_prompt = """Create a clear, helpful response based on research and intent.
            Be concise and directly address the user's needs."""

            context = f"Intent: {json.dumps(intent)}\n\nResearch: {json.dumps(research)}"

            response = client.messages.create(
                model=self.model,
                max_tokens=512,
                system=system_prompt,
                messages=[{"role": "user", "content": context}]
            )

            self._log_event("response_agent", "Response created")
            return {"response": response.content[0].text}

        except Exception as e:
            self._log_event("response_agent", "Failed", str(e))
            return {"error": str(e)}

    def _handle_error(self, agent, error):
        """
        Graceful error handling.
        """
        self._log_event("coordinator", f"Handling error from {agent}")
        return {
            "response": "I encountered an issue while processing your request. Could you try rephrasing?",
            "internal_error": error
        }

    def get_log(self):
        """
        Return the execution log for debugging.
        """
        return self.log

# Example usage
system = BalancedMultiAgentSystem()

print("=== Balanced Multi-Agent System ===\n")

result = system.coordinator("What are the main benefits of renewable energy?")
print(f"\nFinal Response: {result.get('response')}")

print("\n=== Execution Log ===")
for entry in system.get_log():
    print(f"{entry['timestamp']} | {entry['agent']}: {entry['event']}")
```

This example demonstrates the key principles:
Clear Interfaces: Each agent has a defined input/output contract.
Error Handling: Every agent catches its own failures and returns a structured error instead of raising an exception.
Observability: Comprehensive logging lets you trace execution.
Coordinator Pattern: One agent manages the workflow.
Structured Communication: All agents use JSON for predictable parsing.
The system is more complex than a single agent, but the complexity is managed. You can test each agent independently. You can trace failures through the logs. You can add new agents without rewriting everything.
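One practical payoff of these contracts is that you can exercise a single agent without touching the coordinator. The sketch below shows one way to check the intent agent on its own; the sample request and the specific assertions are illustrative assumptions, not part of the system above.
## Exercising one agent in isolation (a minimal sketch; the sample request and assertions are illustrative)
def check_intent_agent_contract():
    system = BalancedMultiAgentSystem()

    # Call the intent agent directly, bypassing the coordinator.
    result = system.intent_agent("What are the main benefits of renewable energy?")

    # Contract: the agent returns either parsed intent fields or an explicit error dict.
    if "error" in result:
        print("Handled failure:", result["error"])
    else:
        assert result.get("intent_type") in {"question", "task", "command"}
        assert "topic" in result
        print("Contract satisfied:", result["intent_type"], "->", result["topic"])

check_intent_agent_contract()
Because the coordinator depends only on this contract, you could swap in a different model or prompt for the intent agent and rerun the same check without changing the other agents.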
Looking Ahead
You now understand both the power and the pitfalls of multi-agent systems. Specialization, parallelism, and robustness are genuine benefits. Coordination overhead, increased complexity, and communication challenges are real costs. The key is making informed decisions about when the benefits outweigh the costs.
This completes our exploration of multi-agent systems. You've learned how agents can work together, how they communicate, and when to use multiple agents versus a single agent. These patterns will serve you well as you build more sophisticated AI systems.
In the next chapter, we'll shift our focus to evaluation. How do you know if your agent (or agents) is actually doing a good job? You'll learn systematic approaches for measuring performance, gathering feedback, and continuously improving your AI systems.
Glossary
Coordination Overhead: The additional complexity and effort required to synchronize multiple agents, manage dependencies, and ensure they work together correctly without conflicts.
Deadlock: A situation where agents are stuck waiting for each other in a cycle, preventing any progress. For example, Agent A waits for Agent B, which waits for Agent C, which waits for Agent A.
Dependency Management: The practice of identifying which agents depend on outputs from other agents and ensuring they execute in the correct order to satisfy these dependencies.
Format Mismatch: A communication error where one agent sends data in a format (like plain text) that another agent cannot parse because it expects a different format (like JSON).
Graceful Degradation: The ability of a system to continue functioning, possibly with reduced capabilities, when one or more agents fail, rather than failing completely.
Modularity: The property of a system where components (agents) are independent and reusable, with clear interfaces that allow them to be combined in different ways.
Parallel Processing: The execution of multiple independent tasks simultaneously by different agents, resulting in faster overall completion than sequential execution. A short sketch follows this glossary.
Redundancy: Having multiple agents perform the same or similar tasks to provide verification, error checking, or backup capability, improving overall system reliability.
Shared State: Data or information that multiple agents need to access or modify, requiring synchronization mechanisms to prevent conflicts and ensure consistency.
Specialization: The practice of designing agents with focused expertise in specific domains or tasks, allowing each agent to perform better in its area than a generalist agent could.
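To make the parallel processing entry concrete, here is a minimal sketch that runs two independent research calls at the same time using Python's standard concurrent.futures module. It reuses the BalancedMultiAgentSystem from earlier; the topics are arbitrary examples, and the speedup assumes the calls don't depend on each other. Note that both calls append to the same log, a small taste of the shared state issues defined above.
## Running independent agents in parallel (a minimal sketch; the topics are arbitrary examples)
from concurrent.futures import ThreadPoolExecutor

parallel_system = BalancedMultiAgentSystem()
topics = ["solar energy storage", "wind turbine efficiency"]

# The two research calls are independent, so a thread pool can run them concurrently.
with ThreadPoolExecutor(max_workers=len(topics)) as executor:
    results = list(executor.map(parallel_system.research_agent, topics))

for topic, research in zip(topics, results):
    summary = research.get("summary", research.get("error", "no result"))
    print(f"{topic}: {summary}")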