Reasoning: Teaching AI Agents to Think Step-by-Step with Chain-of-Thought Prompting


Michael Brenndoerfer • November 8, 2025

Learn how to use chain-of-thought prompting to get AI agents to reason through problems step by step, improving accuracy and transparency for complex questions, math problems, and decision-making tasks.

This article is part of the free-to-read AI Agent Handbook.

Reasoning: Teaching the Agent to Think

You've learned how to write clear prompts and use strategies like roles and examples to guide your AI agent. But what happens when you ask a question that requires multiple steps to solve? What if the agent needs to think through a problem rather than just generate an answer?

This is where reasoning comes in. Just like showing your work in math class, we can prompt the AI to break down complex problems into smaller, logical steps. This chapter explores how to get your agent to reason through challenges, leading to more accurate and trustworthy results.

The Problem with Direct Answers

Let's start with a simple example. Ask an AI agent this question:

A farmer has 17 sheep. All but 9 die. How many sheep are left?

If the agent jumps straight to an answer, it might say "8 sheep" by subtracting 9 from 17. But that's wrong. The correct answer is 9, because "all but 9 die" means 9 survive.

The agent made a common mistake: it rushed to calculate without carefully reading the question. This happens because language models are trained to predict the next words, and sometimes the most likely next words aren't the most thoughtful ones.

Now try asking the same question differently:

A farmer has 17 sheep. All but 9 die. How many sheep are left?

Let's think through this step by step.

With that simple addition, the agent is more likely to respond:

Let me work through this carefully:
1. The farmer starts with 17 sheep
2. The phrase "all but 9 die" means that 9 sheep survive
3. If 9 sheep survive, then 9 sheep are left

The answer is 9 sheep.

Much better. By prompting the agent to think step by step, we helped it avoid the trap and arrive at the correct answer. This is the core idea behind reasoning: giving the model space to work through problems methodically.

What Is Chain-of-Thought Prompting?

The technique we just used has a name: chain-of-thought prompting. The idea is simple: instead of asking for just an answer, you ask the model to show its reasoning process. You're essentially saying, "Don't just tell me the answer. Show me how you got there."

This approach works because it changes how the model generates its response. When you ask for step-by-step thinking, the model produces intermediate reasoning steps. Each step builds on the previous one, creating a chain of thought that leads to the final answer.

Think of it like this: if someone asks you a complex question and expects an immediate answer, you might guess or oversimplify. But if they say "take your time and explain your thinking," you'll naturally break the problem down, consider different aspects, and arrive at a more thoughtful response. Chain-of-thought prompting does the same thing for AI agents.
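In code, the change can be as small as appending the cue to whatever question you already have. Here's a minimal sketch (the helper name and the exact wording of the cue are just illustrations):

def with_reasoning(question: str) -> str:
    # Append a chain-of-thought cue so the model works through the problem
    # before committing to an answer.
    return f"{question}\n\nLet's think through this step by step."

prompt = with_reasoning("A farmer has 17 sheep. All but 9 die. How many sheep are left?")
print(prompt)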

When Reasoning Helps

Not every question needs explicit reasoning. If you ask "What's the capital of France?" the agent can answer directly: "Paris." There's no need for step-by-step thinking.

But reasoning becomes valuable when:

The problem has multiple steps: Math problems, logic puzzles, or any task that requires sequential thinking benefits from breaking down the process.

The question is ambiguous: When a question could be interpreted in different ways, reasoning helps the agent clarify what it's solving before jumping to an answer.

Accuracy matters more than speed: If you need a reliable answer and can afford a slightly longer response, reasoning reduces errors.

You need to verify the answer: When the agent shows its work, you can check whether its logic makes sense, even if you don't know the correct answer yourself.

How to Prompt for Reasoning

The good news is that prompting for reasoning is straightforward. You don't need special tools or complex setups. You just need to ask the model to think step by step.

Here are several ways to do this:

Direct Instruction

The simplest approach is to explicitly ask for step-by-step thinking:

Question: [Your question here]

Let's solve this step by step.

Or:

Question: [Your question here]

Think through this carefully, showing your reasoning at each step.

These phrases signal to the model that you want a methodical approach rather than a quick answer.

Example (OpenAI)

Let's see this in action with a math word problem:

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

# Without reasoning prompt
response_direct = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "If a train travels 60 miles in 45 minutes, how far will it travel in 2 hours at the same speed?"}
    ]
)

# With reasoning prompt
response_reasoning = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": """If a train travels 60 miles in 45 minutes, how far will it travel in 2 hours at the same speed?

Let's work through this step by step."""}
    ]
)

print("Direct answer:", response_direct.choices[0].message.content)
print("\nWith reasoning:", response_reasoning.choices[0].message.content)

The direct version might give you the answer (160 miles) without explanation. The reasoning version will show the work:

Let's work through this step by step:

1. First, find the train's speed in miles per hour
   - The train travels 60 miles in 45 minutes
   - 45 minutes = 0.75 hours
   - Speed = 60 miles ÷ 0.75 hours = 80 miles per hour

2. Now calculate the distance for 2 hours
   - Distance = Speed × Time
   - Distance = 80 miles/hour × 2 hours = 160 miles

The train will travel 160 miles in 2 hours.

Both answers are correct, but the reasoning version shows you how the agent arrived at the answer. This transparency builds trust and makes it easier to spot errors if they occur.

Teaching Reasoning Through Examples

Just like with other prompting strategies, you can teach reasoning by showing examples. This is especially useful when you want the agent to follow a specific reasoning format or approach.

Few-Shot Reasoning

Here's how you might use examples to teach a reasoning pattern:

Question: If 5 apples cost $3, how much do 8 apples cost?
Reasoning:
- First, find the cost per apple: $3 ÷ 5 apples = $0.60 per apple
- Then multiply by 8 apples: $0.60 × 8 = $4.80
Answer: $4.80

Question: If 3 books weigh 6 pounds, how much do 7 books weigh?
Reasoning:
- First, find the weight per book: 6 pounds ÷ 3 books = 2 pounds per book
- Then multiply by 7 books: 2 pounds × 7 = 14 pounds
Answer: 14 pounds

Question: If 4 tickets cost $50, how much do 9 tickets cost?

By showing the pattern in the first two examples, you're teaching the agent how to structure its reasoning for the third question. The model will recognize the pattern: find the unit rate, then scale up.

Example (OpenAI)

Here's how this looks in code:

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

prompt = """Solve these word problems by showing your reasoning step by step.

Question: If 5 apples cost $3, how much do 8 apples cost?
Reasoning:
- First, find the cost per apple: $3 ÷ 5 apples = $0.60 per apple
- Then multiply by 8 apples: $0.60 × 8 = $4.80
Answer: $4.80

Question: If 3 books weigh 6 pounds, how much do 7 books weigh?
Reasoning:
- First, find the weight per book: 6 pounds ÷ 3 books = 2 pounds per book
- Then multiply by 7 books: 2 pounds × 7 = 14 pounds
Answer: 14 pounds

Question: If 4 tickets cost $50, how much do 9 tickets cost?
Reasoning:"""

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": prompt}
    ]
)

print(response.choices[0].message.content)

The agent will follow the pattern you've established, breaking down the problem into unit cost and then scaling up.

Beyond Math: Reasoning for Complex Questions

While math problems are a clear use case for reasoning, the technique works for many types of questions. Any time you need the agent to think through multiple considerations, chain-of-thought prompting helps.

Logical Reasoning

Consider this logic puzzle:

All roses are flowers.
Some flowers fade quickly.
Do all roses fade quickly?

Let's reason through this step by step.

The agent might respond:

Let me work through this logically:

1. We know that all roses are flowers (roses are a subset of flowers)
2. We know that some flowers fade quickly (but not necessarily all flowers)
3. The question asks if all roses fade quickly

The key word is "some." Just because some flowers fade quickly doesn't mean all flowers do. Since roses are flowers, and only some flowers fade quickly, we can't conclude that all roses fade quickly. Some roses might fade quickly, but we don't have enough information to say all of them do.

Answer: No, we cannot conclude that all roses fade quickly.

The step-by-step approach helps the agent avoid jumping to conclusions and carefully consider what the premises actually tell us.
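If you want to run the puzzle through the API, the call follows the same pattern as the earlier examples. Here's a minimal sketch:

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

# Same pattern as before: the puzzle plus a step-by-step cue
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": """All roses are flowers.
Some flowers fade quickly.
Do all roses fade quickly?

Let's reason through this step by step."""}
    ]
)

print(response.choices[0].message.content)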

Decision Making

Reasoning also helps with decisions that involve weighing trade-offs:

I need to choose between two laptops:
- Laptop A: $800, 8GB RAM, 256GB storage, 10-hour battery
- Laptop B: $900, 16GB RAM, 512GB storage, 6-hour battery

I mainly use my laptop for web browsing and documents, and I travel frequently. Which should I choose?

Let's think through the key factors step by step.

The agent can reason through:

Let's consider the key factors:

1. Usage needs:
   - Web browsing and documents don't require heavy processing power
   - 8GB RAM is sufficient for these tasks
   - 256GB storage is adequate unless you store many large files

2. Travel considerations:
   - You travel frequently, so battery life is important
   - Laptop A offers 10 hours vs Laptop B's 6 hours
   - This 4-hour difference is significant for travel days

3. Cost difference:
   - Laptop B costs $100 more
   - The extra RAM and storage are nice but not necessary for your use case
   - The shorter battery life is actually a disadvantage for your needs

Recommendation: Laptop A is the better choice. It costs less, has better battery life for travel, and meets your computing needs. The extra specs in Laptop B don't align with your usage patterns and come at the cost of battery life.

By reasoning through the decision, the agent provides not just a recommendation but the logic behind it. This makes the advice more useful and trustworthy.

Checking and Refining Reasoning

One powerful aspect of chain-of-thought prompting is that it makes the agent's thinking visible. This means you can check whether the reasoning makes sense, even if you're not sure about the final answer.

But we can go further. We can actually prompt the agent to check its own work.

Self-Verification

After the agent provides an answer with reasoning, you can ask it to verify:

Question: [Original question]
Answer: [Agent's answer with reasoning]

Now double-check this answer. Is the reasoning sound? Are there any errors?

This prompts the agent to review its work from a critical perspective. Sometimes this catches mistakes that slipped through the first time.

Example (OpenAI)

Here's how you might implement self-verification:

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

# First, get the agent's answer with reasoning
initial_response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": """A store has a 20% off sale. If a shirt originally costs $40, and you have a coupon for an additional $5 off, what's the final price?

Let's calculate this step by step."""}
    ]
)

initial_answer = initial_response.choices[0].message.content
print("Initial answer:", initial_answer)

# Now ask the agent to verify
verification_response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": f"""Here's a solution to a problem:

{initial_answer}

Please review this solution carefully. Is the reasoning correct? Are there any errors? If there are mistakes, provide the corrected solution."""}
    ]
)

print("\nVerification:", verification_response.choices[0].message.content)

This two-step process can catch errors. For instance, the agent might apply the $5 coupon before the 20% discount when the more natural reading is to take the percentage discount first ($40 × 0.80 = $32, then $32 - $5 = $27). The verification step gives it a chance to reconsider.

For intermediate readers: This self-verification technique works because language models are better at evaluating text than generating it perfectly on the first try. When you ask the model to critique its own reasoning, you're essentially running a second inference pass with a different framing. The model can spot inconsistencies or errors in the reasoning chain that it might have missed when generating the original response. This is similar to how humans often catch their own mistakes when proofreading. However, it's not foolproof. The model might reinforce an error if it's confident in the wrong answer. For critical applications, you'd want human review or multiple independent reasoning paths.
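One simple way to get multiple independent reasoning paths is to sample several responses at a higher temperature and compare their final answers. Here's a rough sketch, assuming each response ends with a line that starts with "Answer:" (that formatting instruction and the vote-counting logic are illustrative, not a fixed recipe):

from collections import Counter

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

question = """A store has a 20% off sale. If a shirt originally costs $40, and you have a coupon for an additional $5 off, what's the final price?

Let's calculate this step by step, and finish with a line that starts with "Answer:"."""

# Sample several independent reasoning paths in a single call
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": question}],
    n=3,
    temperature=0.8,
)

# Pull the final "Answer:" line out of each path (simplified parsing)
answers = []
for choice in response.choices:
    for line in choice.message.content.splitlines():
        if line.strip().lower().startswith("answer:"):
            answers.append(line.split(":", 1)[1].strip())

# Keep the answer that the most paths agree on
if answers:
    best_answer, votes = Counter(answers).most_common(1)[0]
    print(f"Most common answer ({votes} of {len(answers)} paths): {best_answer}")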

Practical Tips for Reasoning Prompts

As you incorporate reasoning into your agent, keep these guidelines in mind:

Be explicit about wanting reasoning: Phrases like "Let's think step by step," "Show your work," or "Explain your reasoning" clearly signal what you want.

Use examples for complex reasoning patterns: If you want a specific type of reasoning, show an example or two so the agent understands the format.

Break very complex problems into stages: For multi-part questions, you might prompt for reasoning on each part separately rather than trying to solve everything at once (see the sketch after these tips).

Verify when accuracy matters: For important decisions or calculations, use the self-verification technique or check the reasoning yourself.

Balance reasoning with efficiency: Not every question needs full reasoning. Save it for cases where accuracy and transparency matter.
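For the staged approach, it can be as simple as making one call per part and feeding each answer into the next prompt. Here's a minimal sketch, reusing the train example from earlier (the particular two-stage split is illustrative):

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

# Stage 1: solve just the first part of the problem
stage1 = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": """A train travels 60 miles in 45 minutes.

What is the train's speed in miles per hour? Show your reasoning step by step."""}
    ]
)
speed_reasoning = stage1.choices[0].message.content

# Stage 2: feed the first answer back in and ask the follow-up question
stage2 = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": f"""Here is the first part of a solution:

{speed_reasoning}

Using that speed, how far will the train travel in 2 hours? Show your reasoning step by step."""}
    ]
)

print(stage2.choices[0].message.content)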

Reasoning in Your Personal Assistant

Let's see how reasoning fits into the personal assistant we're building. Imagine a user asks:

I have a meeting at 2 PM that usually runs 90 minutes. I need to leave for the airport by 4:30 PM, and it takes 20 minutes to get back to my office after the meeting. Will I have time to stop by the office before heading to the airport?

Without reasoning, the agent might give a simple yes or no. With reasoning:

Example (OpenAI)

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful personal assistant that thinks through scheduling questions carefully."},
        {"role": "user", "content": """I have a meeting at 2 PM that usually runs 90 minutes. I need to leave for the airport by 4:30 PM, and it takes 20 minutes to get back to my office after the meeting. Will I have time to stop by the office before heading to the airport?

Let's work through the timeline step by step."""}
    ]
)

print(response.choices[0].message.content)

The agent might respond:

Let me work through your timeline step by step:

1. Meeting start: 2:00 PM
2. Meeting duration: 90 minutes (1.5 hours)
3. Meeting end: 2:00 PM + 1:30 = 3:30 PM
4. Travel time to office: 20 minutes
5. Arrival at office: 3:30 PM + 0:20 = 3:50 PM
6. Departure deadline for airport: 4:30 PM
7. Available time at office: 4:30 PM - 3:50 PM = 40 minutes

Yes, you'll have time to stop by the office. You'll have approximately 40 minutes at the office before you need to leave for the airport. However, this assumes the meeting ends exactly on time. If there's a risk of the meeting running over, you might want to plan for a shorter office stop or skip it to ensure you make your airport departure time.

This response is far more useful than a simple "yes." The reasoning shows the user exactly how much time they'll have and even includes a practical consideration about meetings running over.

Common Pitfalls with Reasoning

As you use chain-of-thought prompting, watch out for these issues:

Over-reasoning simple questions: Don't force reasoning onto questions that don't need it. "What's 2 + 2?" doesn't need a multi-step explanation.

Assuming reasoning guarantees correctness: Reasoning helps, but the agent can still make mistakes. The reasoning might be flawed or based on incorrect assumptions. Always check important results.

Making prompts too long: If you're adding extensive examples or instructions for reasoning, you might make your prompts unwieldy. Keep them as concise as possible while still being clear.

Forgetting to use the reasoning: If you prompt for reasoning but then only look at the final answer, you're missing the value. The reasoning itself is useful for verification and understanding.

Building on Reasoning

You now understand how to get your AI agent to think through problems step by step. This capability transforms your agent from a pattern-matching text generator into something that can tackle complex, multi-step challenges.

In the next chapter, we'll explore how to extend your agent's abilities even further by giving it access to external tools. You'll learn how an agent can recognize when it needs help, call a calculator or search function, and integrate that information into its reasoning. The combination of reasoning and tool use creates agents that are both thoughtful and capable.

Key Takeaways

  • Reasoning improves accuracy: Prompting for step-by-step thinking helps the agent avoid rushing to wrong conclusions
  • Chain-of-thought is simple: Just ask the model to "think step by step" or "show your work"
  • Transparency builds trust: When you can see the reasoning, you can verify it makes sense
  • Examples teach patterns: Few-shot prompting works for reasoning just like other tasks
  • Self-verification catches errors: Asking the agent to check its work can improve results
  • Not everything needs reasoning: Save it for complex questions where accuracy matters

With reasoning in your toolkit, your AI agent can handle more sophisticated challenges. The next chapter explores how to expand your agent's capabilities even further through tool use.

Glossary

Chain-of-Thought Prompting: A technique where you prompt the AI to show its reasoning process step by step, rather than jumping directly to an answer. This leads to more accurate and transparent results.

Reasoning: The process of thinking through a problem logically, considering multiple steps or factors before reaching a conclusion. In AI agents, this means generating intermediate thinking steps.

Self-Verification: A technique where you prompt the agent to review and check its own reasoning or answers, potentially catching errors or improving the response.

Step-by-Step Thinking: Breaking down a complex problem into smaller, sequential steps that build on each other, similar to showing your work in math class.

