Build a working calculator tool for your AI agent from scratch. Learn the complete workflow from Python function to tool integration, with error handling and testing examples.

This article is part of the free-to-read AI Agent Handbook
Example: Adding a Calculator to Our Agent
You've learned why agents need tools and how to design tool interfaces. Now let's build something real. We'll add a calculator tool to our personal assistant, transforming it from an agent that guesses at math problems into one that computes exact answers.
This is a complete, working example you can run and modify. We'll start with the simplest possible calculator, then gradually enhance it to handle more complex scenarios. By the end, you'll have a template you can adapt for any tool you want to add to your agent.
The Problem We're Solving
Remember this scenario from earlier? You ask your agent: "What's ?"
Without tools, the agent approximates:
The agent is guessing. The actual answer is . That's an error of over . Not acceptable if you're calculating costs, revenues, or anything that matters.
With a calculator tool, the agent will:
- Recognize this is a math problem
- Call the calculator with the expression
- Return the exact answer:
Let's build it.
Step 1: Create the Calculator Function
First, we need a Python function that performs calculations. We'll use Python's built-in eval() function to evaluate mathematical expressions:
This function takes a string like "1234 * 5678", evaluates it, and returns the result in a structured format. We include error handling so the agent can provide helpful messages if something goes wrong.
Let's test it:
Perfect. Our calculator works. Now we need to connect it to the agent.
Using eval() on untrusted input is dangerous in production. It can execute arbitrary Python code. For this educational example, it's fine since only the agent (not users directly) calls this function. In production, use a safe math parser like ast.literal_eval() or a dedicated math library.
Step 2: Define the Tool Description
The agent needs to understand what this tool does and how to use it. We create a tool description following the format we learned in the previous chapter:
This description tells the agent:
- What the tool does: Perform precise mathematical calculations
- When to use it: For any arithmetic operations
- What input it needs: A mathematical expression as a string
- How to format the input: Examples of valid expressions
The examples are crucial. They help the agent understand how to extract the math problem from the user's question and format it correctly.
Step 3: Build the Agent with Tool Support (Claude Sonnet 4.5)
Now we'll create an agent that can use our calculator. We'll use Claude Sonnet 4.5 because of its superior tool use and agent reasoning capabilities:
This function handles the complete tool use workflow:
- Sends the user's message to Claude with the calculator tool description
- Checks if Claude wants to use the calculator
- If yes, calls the calculator function with the extracted parameters
- Sends the result back to Claude
- Returns Claude's final response to the user
The print statements let us see what's happening under the hood. In production, you'd use proper logging instead.
Step 4: Test the Calculator Agent
Let's try it out with various math problems:
Look at what happened. The agent:
- Understood each question
- Extracted the mathematical expression
- Called the calculator tool
- Received the exact result
- Formulated a natural response
For Test 2, notice how the agent translated "spend \23.50$15.75$ on coffee" into the expression 100 - 23.50 - 15.75. It understood the word problem and converted it into the appropriate calculation.
Step 5: Test When Tools Aren't Needed
An important part of tool use is knowing when not to use a tool. Let's test that:
Notice that the agent didn't use the calculator for these questions. For "What's 2 plus 2?", it answered directly because this is trivial arithmetic. For "What's the capital of France?", it recognized this isn't a math problem at all.
This shows the agent's intelligence. It's not mechanically using tools for everything. It's reasoning about when tools are necessary.
Step 6: Handle Errors Gracefully
What happens when something goes wrong? Let's test error handling:
The agent handled this well. Python's eval() returns a complex number for the square root of , and the agent explained the result in terms the user can understand.
Let's try a truly invalid expression:
Perfect. The calculator returned an error, and the agent explained the problem and provided the correct answer.
Understanding the Tool Use Flow
Let's trace through exactly what happens when you ask "What's ?":
Step 1: User Input
Step 2: Agent Analyzes the Question
Claude receives:
- The user's message
- The calculator tool description
- Instructions to use tools when appropriate
Claude thinks: "This is a multiplication problem with large numbers. I should use the calculate tool to get the exact answer."
Step 3: Agent Requests Tool Use
Claude returns a response indicating it wants to use a tool:
Step 4: We Call the Tool
Our code extracts the tool name and input, then calls:
Which returns:
Step 5: We Send the Result Back
We add the tool result to the conversation and send it back to Claude:
Step 6: Agent Formulates Final Response
Claude receives the tool result and generates a natural response:
This entire flow happens in milliseconds. From the user's perspective, they just asked a question and got an accurate answer.
Extending the Calculator
Our basic calculator works, but we can make it more robust. Here's an enhanced version that handles more scenarios:
This enhanced version:
- Cleans up the expression by removing common phrases
- Replaces word operators ("times" → "*")
- Formats results nicely (removes unnecessary decimals)
- Provides clearer error messages
You can swap this into your agent by updating the tool registry:
Adding Multiple Tools
Now that you have a calculator working, adding more tools follows the same pattern. Let's add a simple unit converter:
Now your agent can handle both calculations and temperature conversions:
The agent automatically chooses the right tool for each question.
Key Takeaways
Let's review what we've accomplished:
We built a complete tool integration. Starting from a simple Python function, we created a calculator tool that the agent can use to perform exact calculations.
We saw the full workflow in action. The agent receives a question, decides whether to use a tool, calls the tool with the right parameters, and incorporates the result into its response.
We handled errors gracefully. When calculations fail, the agent provides helpful error messages instead of crashing.
We learned the pattern for adding any tool. The same approach works for calculators, weather APIs, databases, or any other external capability you want to give your agent.
The pattern is always:
- Create a Python function that does something useful
- Write a clear tool description
- Add it to the tool registry
- Let the agent decide when to use it
Common Pitfalls and Solutions
As you build your own tools, watch out for these common issues:
Pitfall: Tool descriptions are too vague
❌ Bad: "description": "Does math stuff"
✅ Good: "description": "Perform precise mathematical calculations including addition, subtraction, multiplication, division, and exponentiation"
The agent uses the description to decide when to use the tool. Be specific.
Pitfall: Parameter examples are missing
❌ Bad: "expression": {"type": "string", "description": "A math expression"}
✅ Good: "expression": {"type": "string", "description": "Math expression using +, -, *, /, **. Examples: '1234 * 5678', '100 + 50 - 25'"}
Examples help the agent format inputs correctly.
Pitfall: Error handling is missing
❌ Bad: Let exceptions crash the program
✅ Good: Catch exceptions and return structured error information
The agent can work with error messages. It can't work with crashes.
Pitfall: Outputs are unstructured
❌ Bad: return "The answer is 42"
✅ Good: return {"success": True, "result": 42}
Structured outputs give the agent flexibility in how it presents information.
Looking Ahead
You now have a working agent with tool use capabilities. This is a significant milestone. Your agent is no longer limited to its training data. It can:
- Perform exact calculations
- Access external information
- Take actions in the world
In the next chapter, we'll explore memory and retrieval. You'll learn how to give your agent the ability to remember conversations, store facts, and recall information when needed. Combined with tool use, this will make your agent far more capable and useful.
The calculator example we built here is simple, but the pattern scales. Whether you're adding a weather API, a database query tool, or a complex business logic function, the approach is the same. Define the function, describe it clearly, register it, and let the agent decide when to use it.
This is how modern AI agents work. They combine language understanding with the ability to use tools, creating systems that can reason and act in the world.
Glossary
Tool Registry: A dictionary or mapping that connects tool names to their actual Python function implementations, allowing the agent to look up and call tools dynamically.
Tool Use Block: The structured response from the language model indicating it wants to use a tool, including the tool name and the parameters to pass to it.
Stop Reason: A field in the model's response indicating why it stopped generating. A value of "tool_use" means the model wants to call a tool.
Tool Result: The output returned by a tool function after it executes, which is sent back to the agent so it can incorporate the information into its response.
Parameter Extraction: The process by which the agent identifies the necessary values from a user's question and formats them as inputs for a tool function.
Error Handling: The practice of catching and gracefully managing errors in tool execution, returning structured error information that the agent can communicate to the user.
Quiz
Ready to test your understanding? Take this quick quiz to reinforce what you've learned about adding calculator tools to AI agents.






Comments