Adding a Calculator Tool to Your AI Agent: Complete Implementation Guide

Michael Brenndoerfer · June 29, 2025 · 15 min read

Build a working calculator tool for your AI agent from scratch. Learn the complete workflow from Python function to tool integration, with error handling and testing examples.

Example: Adding a Calculator to Our Agent

You've learned why agents need tools and how to design tool interfaces. Now let's build something real. We'll add a calculator tool to our personal assistant, transforming it from an agent that guesses at math problems into one that computes exact answers.

This is a complete, working example you can run and modify. We'll start with the simplest possible calculator, then gradually enhance it to handle more complex scenarios. By the end, you'll have a template you can adapt for any tool you want to add to your agent.

The Problem We're Solving

Remember this scenario from earlier? You ask your agent: "What's 1,234 × 5,678?"

Without tools, the agent approximates:

Let me work through this:
1,234 × 5,678 = approximately 7,010,000

So the answer is about 7,010,000.

The agent is guessing. The actual answer is 7,006,652. That's an error of over 3,000. Not acceptable if you're calculating costs, revenues, or anything that matters.

With a calculator tool, the agent will:

  1. Recognize this is a math problem
  2. Call the calculator with the expression
  3. Return the exact answer: 7,006,652

Let's build it.

Step 1: Create the Calculator Function

First, we need a Python function that performs calculations. We'll use Python's built-in eval() function to evaluate mathematical expressions:

In[3]:
Code
def calculate(expression: str) -> dict:
    """
    Evaluate a mathematical expression and return the result.
    
    Args:
        expression: A math expression like "1234 * 5678" or "100 + 50 - 25"
    
    Returns:
        Dictionary with 'result' or 'error'
    """
    try:
        # Evaluate the expression
        result = eval(expression)
        return {
            "success": True,
            "result": result,
            "expression": expression
        }
    except Exception as e:
        return {
            "success": False,
            "error": str(e),
            "expression": expression
        }

This function takes a string like "1234 * 5678", evaluates it, and returns the result in a structured format. We include error handling so the agent can provide helpful messages if something goes wrong.

Let's test it:

In[4]:
Code
## Test the calculator
print(calculate("1234 * 5678"))
print(calculate("100 + 50 - 25"))
print(calculate("10 / 3"))
Out[4]:
Console
{'success': True, 'result': 7006652, 'expression': '1234 * 5678'}
{'success': True, 'result': 125, 'expression': '100 + 50 - 25'}
{'success': True, 'result': 3.3333333333333335, 'expression': '10 / 3'}

Perfect. Our calculator works. Now we need to connect it to the agent.

Security Note

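Calling eval() on untrusted input is dangerous: it can execute arbitrary Python code. Routing the call through the agent does not make it safe, because the model builds the expression from user text, and a crafted prompt could steer it into passing something malicious. That risk is acceptable for a local, educational example but not for production. Note that ast.literal_eval() is not a substitute here, since it only parses literals and rejects arithmetic such as 1234 * 5678. In production, use a dedicated math expression parser or a restricted evaluator that whitelists arithmetic operators.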
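One way to restrict evaluation to plain arithmetic is to parse the expression with Python's ast module and walk the tree, allowing only numeric literals and a whitelist of operators. The following is a minimal sketch, not a hardened implementation; the names safe_eval, calculate_safe, and MAX_EXPONENT are hypothetical and chosen for illustration.

```python
import ast
import operator

# Whitelisted operators; any other syntax (names, calls, attributes) is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

MAX_EXPONENT = 1000  # assumption: arbitrary cap so "9 ** 9 ** 9" cannot hang the process


def _eval_node(node):
    """Recursively evaluate a parsed node, raising ValueError on anything unexpected."""
    if isinstance(node, ast.Constant):
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
            raise ValueError("Only numeric literals are allowed")
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _eval_node(node.left)
        right = _eval_node(node.right)
        if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
            raise ValueError("Exponent too large")
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError(f"Unsupported expression element: {type(node).__name__}")


def safe_eval(expression: str):
    """Evaluate +, -, *, /, %, ** on numeric literals only."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


def calculate_safe(expression: str) -> dict:
    """Same return shape as calculate(), but without calling eval()."""
    try:
        return {"success": True, "result": safe_eval(expression), "expression": expression}
    except Exception as e:
        return {"success": False, "error": str(e), "expression": expression}


print(calculate_safe("1234 * 5678"))
print(calculate_safe("__import__('os').getcwd()"))
```

The first call succeeds; the second is rejected because a function call is not in the whitelist. Since the return shape matches calculate(), a function like this could be registered under the same tool name.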

Step 2: Define the Tool Description

The agent needs to understand what this tool does and how to use it. We create a tool description following the format we learned in the previous chapter:

In[5]:
Code
calculator_tool = {
    "name": "calculate",
    "description": "Perform precise mathematical calculations. Use this for any arithmetic operations including addition, subtraction, multiplication, division, and exponentiation.",
    "input_schema": {
        "type": "object",
        "properties": {
            "expression": {
                "type": "string",
                "description": "The mathematical expression to evaluate. Use standard operators: + (add), - (subtract), * (multiply), / (divide), ** (power). Examples: '1234 * 5678', '100 + 50 - 25', '2 ** 8'"
            }
        },
        "required": ["expression"]
    }
}

This description tells the agent:

  • What the tool does: Perform precise mathematical calculations
  • When to use it: For any arithmetic operations
  • What input it needs: A mathematical expression as a string
  • How to format the input: Examples of valid expressions

The examples are crucial. They help the agent understand how to extract the math problem from the user's question and format it correctly.

Step 3: Build the Agent with Tool Support (Claude Sonnet 4.5)

Now we'll create an agent that can use our calculator. We'll use Claude Sonnet 4.5 because of its superior tool use and agent reasoning capabilities:

In[6]:
Code
import os
from anthropic import Anthropic

## Using Claude Sonnet 4.5 for its excellent tool use and agent reasoning
client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

def run_agent_with_calculator(user_message: str) -> str:
    """
    Run the agent with calculator tool support.
    
    Args:
        user_message: The user's question or request
    
    Returns:
        The agent's response
    """
    # Tool registry maps tool names to functions
    tool_registry = {
        "calculate": calculate
    }
    
    # Start the conversation
    messages = [{"role": "user", "content": user_message}]
    
    # Initial agent call with tool description
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        tools=[calculator_tool],
        messages=messages
    )
    
    # Check if the agent wants to use a tool
    if response.stop_reason == "tool_use":
        # Extract tool use information
        tool_use_block = next(
            block for block in response.content 
            if block.type == "tool_use"
        )
        
        tool_name = tool_use_block.name
        tool_input = tool_use_block.input
        
        print(f"Agent is using tool: {tool_name}")
        print(f"Tool input: {tool_input}")
        
        # Call the tool
        tool_result = tool_registry[tool_name](**tool_input)
        print(f"Tool result: {tool_result}")
        
        # Send the tool result back to the agent
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": tool_use_block.id,
                "content": str(tool_result)
            }]
        })
        
        # Get the agent's final response
        final_response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1024,
            tools=[calculator_tool],
            messages=messages
        )
        
        return final_response.content[0].text
    
    # No tool needed, return direct response
    return response.content[0].text

This function handles the complete tool use workflow:

  1. Sends the user's message to Claude with the calculator tool description
  2. Checks if Claude wants to use the calculator
  3. If yes, calls the calculator function with the extracted parameters
  4. Sends the result back to Claude
  5. Returns Claude's final response to the user

The print statements let us see what's happening under the hood. In production, you'd use proper logging instead.
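As a rough sketch of what the logging variant might look like, the tool call can be wrapped in a small dispatcher that logs each call and always returns a structured result, even when the model names a tool that doesn't exist or passes the wrong arguments. The helper name execute_tool and the logger name "agent.tools" are assumptions for illustration, not part of the code above.

```python
import logging

logger = logging.getLogger("agent.tools")  # hypothetical logger name


def execute_tool(tool_registry: dict, tool_name: str, tool_input: dict) -> dict:
    """Look up and run a tool, always returning a structured result."""
    tool = tool_registry.get(tool_name)
    if tool is None:
        logger.warning("Unknown tool requested: %s", tool_name)
        return {"success": False, "error": f"Unknown tool: {tool_name}"}

    logger.info("Calling tool %s with input %r", tool_name, tool_input)
    try:
        result = tool(**tool_input)
    except Exception as e:
        # Covers missing or unexpected arguments as well as failures inside the tool
        logger.exception("Tool %s failed", tool_name)
        return {"success": False, "error": f"{tool_name} failed: {e}"}

    logger.info("Tool %s returned %r", tool_name, result)
    return result


# Stand-in registry with a trivial tool, used only to demonstrate the dispatcher
demo_registry = {"add": lambda a, b: {"success": True, "result": a + b}}

print(execute_tool(demo_registry, "add", {"a": 1, "b": 2}))
print(execute_tool(demo_registry, "subtract", {"a": 1, "b": 2}))
print(execute_tool(demo_registry, "add", {"a": 1}))
```

In the agent function, the direct call tool_registry[tool_name](**tool_input) could be swapped for a dispatcher like this so a bad tool request becomes an error message the model can react to rather than an exception.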

Step 4: Test the Calculator Agent

Let's try it out with various math problems:

In[7]:
Code
## Test 1: The original problem
print("Test 1: Large multiplication")
print(run_agent_with_calculator("What's 1234 times 5678?"))
print()

## Test 2: Multiple operations
print("Test 2: Multiple operations")
print(run_agent_with_calculator("If I have $100 and spend $23.50 on lunch and $15.75 on coffee, how much do I have left?"))
print()

## Test 3: Complex calculation
print("Test 3: Complex calculation")
print(run_agent_with_calculator("What's 2 to the power of 16?"))
Out[7]:
Console
Test 1: Large multiplication
Agent is using tool: calculate
Tool input: {'expression': '1234 * 5678'}
Tool result: {'success': True, 'result': 7006652, 'expression': '1234 * 5678'}
1234 times 5678 equals **7,006,652**.

Test 2: Multiple operations
Agent is using tool: calculate
Tool input: {'expression': '100 - 23.50 - 15.75'}
Tool result: {'success': True, 'result': 60.75, 'expression': '100 - 23.50 - 15.75'}
You have **$60.75** left.

Here's the breakdown:
- Starting amount: $100.00
- Lunch: -$23.50
- Coffee: -$15.75
- Remaining: **$60.75**

Test 3: Complex calculation
Agent is using tool: calculate
Tool input: {'expression': '2 ** 16'}
Tool result: {'success': True, 'result': 65536, 'expression': '2 ** 16'}
2 to the power of 16 equals **65,536**.

Look at what happened. The agent:

  • Understood each question
  • Extracted the mathematical expression
  • Called the calculator tool
  • Received the exact result
  • Formulated a natural response

For Test 2, notice how the agent translated "spend $23.50 on lunch and $15.75 on coffee" into the expression 100 - 23.50 - 15.75. It understood the word problem and converted it into the appropriate calculation.

Step 5: Test When Tools Aren't Needed

An important part of tool use is knowing when not to use a tool. Let's test that:

In[8]:
Code
## Test 4: Simple math the agent can handle
print("Test 4: Simple math")
print(run_agent_with_calculator("What's 2 plus 2?"))
print()

## Test 5: General knowledge
print("Test 5: General knowledge")
print(run_agent_with_calculator("What's the capital of France?"))
Out[8]:
Console
Test 4: Simple math
2 plus 2 equals 4.

Test 5: General knowledge
The capital of France is Paris.

Notice that the agent didn't use the calculator for these questions. For "What's 2 plus 2?", it answered directly because this is trivial arithmetic. For "What's the capital of France?", it recognized this isn't a math problem at all.

The agent isn't mechanically using tools for everything; it decides per request whether a tool is needed. That decision is not deterministic, though. On another run the model may call the calculator even for 2 + 2, which costs an extra round trip but still produces the right answer.

Step 6: Handle Errors Gracefully

What happens when something goes wrong? Let's test error handling:

In[9]:
Code
## Test 6: Invalid expression
print("Test 6: Invalid expression")
print(run_agent_with_calculator("What's the square root of -1?"))
Out[9]:
Console
Test 6: Invalid expression
Agent is using tool: calculate
Tool input: {'expression': '(-1) ** 0.5'}
Tool result: {'success': True, 'result': (6.123233995736766e-17+1j), 'expression': '(-1) ** 0.5'}
The square root of -1 is i (the imaginary unit), which in Python's complex number representation is approximately 1j.

The agent handled this well. Python evaluates (-1) ** 0.5 to a complex number, and the agent explained the result in terms the user can understand.

Let's try an expression that looks malformed:

In[10]:
Code
## Test 7: Syntax error
print("Test 7: Syntax error")
print(run_agent_with_calculator("Calculate 5 + + 3"))
Out[10]:
Console
Test 7: Syntax error
Agent is using tool: calculate
Tool input: {'expression': '5 + 3'}
Tool result: {'success': True, 'result': 8, 'expression': '5 + 3'}
The answer is **8**.

If you meant something different, please let me know and I'll recalculate!

This run never reached the error path: the agent cleaned the expression up to 5 + 3 before calling the tool. Even unchanged, 5 + + 3 is valid Python, because the second + parses as a unary plus, so eval() returns 8. When an expression really is malformed (for example "5 * / 3"), calculate() returns success: False with an error message, and the agent can explain the problem to the user instead of crashing.
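The error path can also be exercised directly, without going through the model. The sketch below repeats the Step 1 function in compact form so it runs on its own, then feeds it a few inputs that Python cannot evaluate; the specific test expressions are illustrative choices, and the exact error text may differ between Python versions.

```python
def calculate(expression: str) -> dict:
    """Compact copy of the Step 1 calculator (eval-based, for demonstration only)."""
    try:
        return {"success": True, "result": eval(expression), "expression": expression}
    except Exception as e:
        return {"success": False, "error": str(e), "expression": expression}


# Inputs that fail: two syntax errors and a division by zero
failing_inputs = ["5 * / 3", "2 +", "10 / 0"]

for expr in failing_inputs:
    outcome = calculate(expr)
    print(f"{expr!r}: success={outcome['success']}, error={outcome.get('error')}")
```

Each call returns a dictionary with success set to False instead of raising, which is what lets the agent turn the failure into a readable explanation.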

Understanding the Tool Use Flow

Let's trace through exactly what happens when you ask "What's 1,234 × 5,678?":

Step 1: User Input

User: What's 1234 times 5678?

Step 2: Agent Analyzes the Question

Claude receives:

  • The user's message
  • The calculator tool description
  • Instructions to use tools when appropriate

Claude thinks: "This is a multiplication problem with large numbers. I should use the calculate tool to get the exact answer."

Step 3: Agent Requests Tool Use

Claude returns a response indicating it wants to use a tool:

In[11]:
Code
{
    "stop_reason": "tool_use",
    "content": [
        {
            "type": "tool_use",
            "name": "calculate",
            "input": {"expression": "1234 * 5678"}
        }
    ]
}

Step 4: We Call the Tool

Our code extracts the tool name and input, then calls:

In[12]:
Code
calculate(expression="1234 * 5678")

Which returns:

In[13]:
Code
{'success': True, 'result': 7006652, 'expression': '1234 * 5678'}

Step 5: We Send the Result Back

We add the tool result to the conversation and send it back to Claude:

In[26]:
Code
messages.append({
    "role": "user",
    "content": [{
        "type": "tool_result",
        "tool_use_id": "...",
        "content": "{'success': True, 'result': 7006652, 'expression': '1234 * 5678'}"
    }]
})
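The code in this guide handles one tool call per response and sends the result as a Python repr string. A response can, as far as I know, contain more than one tool_use block, and each one needs a matching tool_result in the next user message. The sketch below builds those result blocks for every tool_use block and serializes them as JSON; build_tool_results is a hypothetical helper, and the SimpleNamespace objects are stand-ins for the SDK's content blocks so the example runs without an API key.

```python
import json
from types import SimpleNamespace


def calculate(expression: str) -> dict:
    """Compact copy of the Step 1 calculator (eval-based, for demonstration only)."""
    try:
        return {"success": True, "result": eval(expression), "expression": expression}
    except Exception as e:
        return {"success": False, "error": str(e), "expression": expression}


def build_tool_results(content_blocks, tool_registry: dict) -> list:
    """Run every tool_use block and return one tool_result block per call."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue  # skip text blocks
        try:
            output = tool_registry[block.name](**block.input)
            is_error = not output.get("success", True)
        except Exception as e:
            # Unknown tool name or arguments the function does not accept
            output = {"success": False, "error": str(e)}
            is_error = True
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            # default=str keeps non-JSON values (e.g. complex numbers) from raising
            "content": json.dumps(output, default=str),
            "is_error": is_error,
        })
    return results


# Stand-in for response.content; the ids are made up for this demo
demo_blocks = [
    SimpleNamespace(type="text", text="I'll compute both values."),
    SimpleNamespace(type="tool_use", id="toolu_demo_1", name="calculate",
                    input={"expression": "2 ** 8"}),
    SimpleNamespace(type="tool_use", id="toolu_demo_2", name="calculate",
                    input={"expression": "1 / 0"}),
]

results = build_tool_results(demo_blocks, {"calculate": calculate})
for item in results:
    print(item)
```

The returned list can be used directly as the content of the follow-up user message, in place of the single tool_result shown above.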

Step 6: Agent Formulates Final Response

Claude receives the tool result and generates a natural response:

1,234 times 5,678 equals 7,006,652.

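The local calculation itself takes well under a millisecond. Nearly all of the latency comes from the two round trips to the model, so the full exchange usually takes on the order of seconds. From the user's perspective, they just asked a question and got an accurate answer.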

Extending the Calculator

Our basic calculator works, but we can make it more robust. Here's an enhanced version that handles more scenarios:

In[14]:
Code
import re

def calculate_enhanced(expression: str) -> dict:
    """
    Enhanced calculator with better error handling and formatting.
    
    Args:
        expression: A math expression
    
    Returns:
        Dictionary with result or error
    """
    try:
        # Clean the expression
        # Remove common words that might appear
        expression = expression.lower()
        expression = re.sub(r'\b(what is|calculate|compute|equals?)\b', '', expression)
        expression = expression.strip()
        
        # Replace common word operators
        expression = expression.replace('times', '*')
        expression = expression.replace('divided by', '/')
        expression = expression.replace('plus', '+')
        expression = expression.replace('minus', '-')
        expression = expression.replace('to the power of', '**')
        
        # Evaluate
        result = eval(expression)
        
        # Format the result nicely
        if isinstance(result, float):
            # Round to reasonable precision
            if result.is_integer():
                result = int(result)
            else:
                result = round(result, 6)
        
        return {
            "success": True,
            "result": result,
            "expression": expression
        }
    except Exception as e:
        return {
            "success": False,
            "error": f"Could not evaluate '{expression}': {str(e)}",
            "expression": expression
        }

This enhanced version:

  • Cleans up the expression by removing common phrases
  • Replaces word operators ("times" → "*")
  • Formats results nicely (removes unnecessary decimals)
  • Provides clearer error messages

You can swap this into your agent by updating the tool registry:

In[15]:
Code
tool_registry = {
    "calculate": calculate_enhanced
}

Adding Multiple Tools

Now that you have a calculator working, adding more tools follows the same pattern. Let's add a simple unit converter:

In[16]:
Code
def convert_temperature(value: float, from_unit: str, to_unit: str) -> dict:
    """Convert temperature between Celsius, Fahrenheit, and Kelvin."""
    try:
        # Convert to Celsius first
        if from_unit.lower() == "fahrenheit":
            celsius = (value - 32) * 5/9
        elif from_unit.lower() == "kelvin":
            celsius = value - 273.15
        else:  # Already Celsius
            celsius = value
        
        # Convert from Celsius to target
        if to_unit.lower() == "fahrenheit":
            result = celsius * 9/5 + 32
        elif to_unit.lower() == "kelvin":
            result = celsius + 273.15
        else:  # Stay in Celsius
            result = celsius
        
        return {
            "success": True,
            "result": round(result, 2),
            "from_value": value,
            "from_unit": from_unit,
            "to_unit": to_unit
        }
    except Exception as e:
        return {
            "success": False,
            "error": str(e)
        }

## Tool description
temperature_tool = {
    "name": "convert_temperature",
    "description": "Convert temperature between Celsius, Fahrenheit, and Kelvin",
    "input_schema": {
        "type": "object",
        "properties": {
            "value": {
                "type": "number",
                "description": "The temperature value to convert"
            },
            "from_unit": {
                "type": "string",
                "description": "The unit to convert from: 'celsius', 'fahrenheit', or 'kelvin'"
            },
            "to_unit": {
                "type": "string",
                "description": "The unit to convert to: 'celsius', 'fahrenheit', or 'kelvin'"
            }
        },
        "required": ["value", "from_unit", "to_unit"]
    }
}

## Update the agent to include both tools
def run_agent_with_tools(user_message: str) -> str:
    """Run agent with calculator and temperature converter."""
    tool_registry = {
        "calculate": calculate,
        "convert_temperature": convert_temperature
    }
    
    messages = [{"role": "user", "content": user_message}]
    
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        tools=[calculator_tool, temperature_tool],
        messages=messages
    )
    
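    # Same tool use logic as before: run the requested tool, if any
    if response.stop_reason == "tool_use":
        tool_use_block = next(
            block for block in response.content
            if block.type == "tool_use"
        )
        tool_result = tool_registry[tool_use_block.name](**tool_use_block.input)

        # Send the tool result back and ask for the final answer
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": tool_use_block.id,
                "content": str(tool_result)
            }]
        })
        response = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1024,
            tools=[calculator_tool, temperature_tool],
            messages=messages
        )

    return response.content[0].text

Now your agent can handle both calculations and temperature conversions:

In[17]:
Code
print(run_agent_with_tools("What's 100 Celsius in Fahrenheit?"))
print(run_agent_with_tools("If I buy 15 items at $12.99 each, what's my total?"))
Out[17]:
Console
100 Celsius is 212 Fahrenheit.
Your total would be $194.85.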

The agent automatically chooses the right tool for each question.
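The converter shown earlier treats any unrecognized unit as Celsius, so a typo or an unsupported unit would silently produce a wrong number. A stricter variant, sketched below, looks units up in a table and returns a structured error for anything it doesn't know. The names UNITS and convert_temperature_strict are hypothetical, and the enum entry in the schema fragment is an optional JSON Schema constraint that is assumed, not confirmed by the text above, to help the model stick to valid values.

```python
# Each unit maps to a pair of converters: (to Celsius, from Celsius)
UNITS = {
    "celsius": (lambda v: v, lambda c: c),
    "fahrenheit": (lambda v: (v - 32) * 5 / 9, lambda c: c * 9 / 5 + 32),
    "kelvin": (lambda v: v - 273.15, lambda c: c + 273.15),
}

# Optional schema fragment: restrict the unit parameters to known values
unit_property = {"type": "string", "enum": sorted(UNITS)}


def convert_temperature_strict(value: float, from_unit: str, to_unit: str) -> dict:
    """Convert between known units; report unknown units instead of guessing."""
    src = from_unit.strip().lower()
    dst = to_unit.strip().lower()
    unknown = [unit for unit in (src, dst) if unit not in UNITS]
    if unknown:
        return {
            "success": False,
            "error": f"Unknown unit(s): {', '.join(unknown)}. Expected one of {sorted(UNITS)}",
        }
    celsius = UNITS[src][0](value)
    result = UNITS[dst][1](celsius)
    return {
        "success": True,
        "result": round(result, 2),
        "from_value": value,
        "from_unit": src,
        "to_unit": dst,
    }


print(convert_temperature_strict(100, "Celsius", "Fahrenheit"))
print(convert_temperature_strict(10, "rankine", "celsius"))
```

Adding a new unit then only requires a new table entry, and the error message tells the agent which values are accepted.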

Key Takeaways

Let's review what we've accomplished:

We built a complete tool integration. Starting from a simple Python function, we created a calculator tool that the agent can use to perform exact calculations.

We saw the full workflow in action. The agent receives a question, decides whether to use a tool, calls the tool with the right parameters, and incorporates the result into its response.

We handled errors gracefully. When calculations fail, the agent provides helpful error messages instead of crashing.

We learned the pattern for adding any tool. The same approach works for calculators, weather APIs, databases, or any other external capability you want to give your agent.

The pattern is always:

  1. Create a Python function that does something useful
  2. Write a clear tool description
  3. Add it to the tool registry
  4. Let the agent decide when to use it
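The first three steps can be tied together so that a function and its description are always registered as a pair. The following is one possible sketch using a decorator; the names tool, TOOL_REGISTRY, TOOL_DESCRIPTIONS, and the word_count example tool are all hypothetical and only illustrate the pattern.

```python
TOOL_REGISTRY = {}      # tool name -> Python function
TOOL_DESCRIPTIONS = []  # descriptions in the shape passed as tools=[...]


def tool(name: str, description: str, input_schema: dict):
    """Decorator that registers a function together with its description."""
    def decorator(func):
        TOOL_REGISTRY[name] = func
        TOOL_DESCRIPTIONS.append({
            "name": name,
            "description": description,
            "input_schema": input_schema,
        })
        return func
    return decorator


@tool(
    name="word_count",
    description="Count the words in a piece of text. Use this when the user asks how long a text is.",
    input_schema={
        "type": "object",
        "properties": {
            "text": {"type": "string", "description": "The text to count words in"}
        },
        "required": ["text"],
    },
)
def word_count(text: str) -> dict:
    """Step 1: a function that does something useful and returns structured output."""
    return {"success": True, "result": len(text.split())}


print(TOOL_REGISTRY["word_count"](text="let the agent decide"))
print([description["name"] for description in TOOL_DESCRIPTIONS])
```

With this arrangement, the agent code would pass TOOL_DESCRIPTIONS as the tools argument and look functions up in TOOL_REGISTRY, so a tool cannot be described without being callable or the other way around.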

Common Pitfalls and Solutions

As you build your own tools, watch out for these common issues:

Pitfall: Tool descriptions are too vague

❌ Bad: "description": "Does math stuff"

✅ Good: "description": "Perform precise mathematical calculations including addition, subtraction, multiplication, division, and exponentiation"

The agent uses the description to decide when to use the tool. Be specific.

Pitfall: Parameter examples are missing

❌ Bad: "expression": {"type": "string", "description": "A math expression"}

✅ Good: "expression": {"type": "string", "description": "Math expression using +, -, *, /, **. Examples: '1234 * 5678', '100 + 50 - 25'"}

Examples help the agent format inputs correctly.

Pitfall: Error handling is missing

❌ Bad: Let exceptions crash the program

✅ Good: Catch exceptions and return structured error information

The agent can work with error messages. It can't work with crashes.

Pitfall: Outputs are unstructured

❌ Bad: return "The answer is 42"

✅ Good: return {"success": True, "result": 42}

Structured outputs give the agent flexibility in how it presents information.

Looking Ahead

You now have a working agent with tool use capabilities. This is a significant milestone. Your agent is no longer limited to its training data. It can:

  • Perform exact calculations
  • Access external information
  • Take actions in the world

In the next chapter, we'll explore memory and retrieval. You'll learn how to give your agent the ability to remember conversations, store facts, and recall information when needed. Combined with tool use, this will make your agent far more capable and useful.

The calculator example we built here is simple, but the pattern scales. Whether you're adding a weather API, a database query tool, or a complex business logic function, the approach is the same. Define the function, describe it clearly, register it, and let the agent decide when to use it.

This is how modern AI agents work. They combine language understanding with the ability to use tools, creating systems that can reason and act in the world.

Glossary

Tool Registry: A dictionary or mapping that connects tool names to their actual Python function implementations, allowing the agent to look up and call tools dynamically.

Tool Use Block: The structured response from the language model indicating it wants to use a tool, including the tool name and the parameters to pass to it.

Stop Reason: A field in the model's response indicating why it stopped generating. A value of "tool_use" means the model wants to call a tool.

Tool Result: The output returned by a tool function after it executes, which is sent back to the agent so it can incorporate the information into its response.

Parameter Extraction: The process by which the agent identifies the necessary values from a user's question and formats them as inputs for a tool function.

Error Handling: The practice of catching and gracefully managing errors in tool execution, returning structured error information that the agent can communicate to the user.

