Deploying Your AI Agent: From Development Script to Production Service


Michael Brenndoerfer • November 10, 2025

Learn how to deploy your AI agent from a local script to a production service. Covers packaging, cloud deployment, APIs, and making your agent accessible to users.


Deploying Your AI Agent

You've built a capable personal assistant that can reason, use tools, remember conversations, and plan complex tasks. Right now, it lives as a Python script on your development machine. But what if you want to access it from your phone? Share it with a colleague? Or run it 24/7 without keeping your laptop open?

That's where deployment comes in. Deployment is the process of taking your agent from a development script to a running service that you (or others) can reliably access. Think of it like the difference between cooking a meal in your kitchen versus opening a restaurant. The core skills are the same, but you need to think about packaging, location, and how people will interact with what you've created.

In this chapter, we'll walk through the practical steps to deploy your agent. We'll start simple with local deployment, then explore how to package your code properly, choose where to run it, and make it accessible to users. By the end, you'll understand the path from prototype to production.

From Script to Service

Right now, your agent probably looks something like this: you open a terminal, run python assistant.py, type some questions, and get responses. This works great for development, but it has limitations. You can only use it when you're at your computer. If you close the terminal, the agent stops. And sharing it with someone else means sending them your code and hoping they can set it up correctly.

A deployed agent, by contrast, runs continuously on a server. It waits for requests, processes them, and sends back responses. You can interact with it through a web interface, a mobile app, or even a messaging platform like Slack. The agent's state persists between sessions, and multiple people can use it simultaneously.

Let's look at what this transformation involves.

Understanding Deployment Options

Before we dive into the mechanics, it helps to understand your deployment options. Where your agent runs determines its accessibility, reliability, and cost.

Local Deployment

The simplest option is to keep your agent running on your own computer or a dedicated machine you control. You might set it up as a background service that starts automatically when the machine boots. This works well for personal use or small-scale testing.

Advantages: You have complete control, no ongoing costs, and your data stays on your machine.

Disadvantages: The agent is only available when that machine is running and connected to the network. If your laptop goes to sleep, the agent stops responding.

Cloud Deployment

For broader access and reliability, you can deploy to a cloud service like AWS, Google Cloud, Azure, or a platform-as-a-service provider like Heroku or Railway. Your agent runs on servers in a data center, accessible from anywhere with an internet connection.

Advantages: High availability (the service keeps running even if your personal computer is off), scalability (you can handle more users by adding resources), and professional infrastructure (backups, monitoring, security).

Disadvantages: Ongoing costs (you pay for the computing resources), and you need to manage API keys and data security carefully since your agent is now on someone else's hardware.

Hybrid Approaches

Some deployments combine both. For example, you might run the agent locally but expose it through a secure tunnel service like ngrok, which gives you a public URL without actually moving the code to the cloud. This can be useful for demos or temporary sharing.
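
For example, if your agent's API listens on port 5000 locally (as the Flask example later in this chapter does), a single command gives it a temporary public URL:

ngrok http 5000

ngrok prints a forwarding address you can share; requests to that URL are tunneled to your machine for as long as the command runs.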

Packaging Your Agent

Before you can deploy anywhere, you need to package your code so it can run reliably in a different environment. This means organizing your files, documenting dependencies, and making sure everything needed to run the agent is included.

Organizing Your Code

Start by structuring your project clearly. A typical layout might look like this:

my-assistant/
├── assistant.py          # Main agent logic
├── tools.py              # Tool implementations
├── memory.py             # Memory management
├── requirements.txt      # Python dependencies
├── .env.example          # Example environment variables
└── README.md             # Setup instructions

This structure makes it clear what each file does and helps others (or future you) understand the project quickly.

Managing Dependencies

Your agent relies on external libraries like the Anthropic SDK, OpenAI client, or other packages. You need to document these so the agent can be set up correctly on a new machine.

Create a requirements.txt file listing all your dependencies with their versions:

anthropic==0.18.0
openai==1.12.0
python-dotenv==1.0.0

Anyone deploying your agent can then install everything with a single command:

pip install -r requirements.txt

For more robust dependency management, you might use tools like uv or poetry, which handle virtual environments and lock files automatically. But for getting started, requirements.txt works well.
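
As a quick sketch of the uv workflow, you can create an isolated environment and install your pinned dependencies like this:

uv venv
source .venv/bin/activate
uv pip install -r requirements.txt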

Handling Configuration

Your agent probably uses API keys and other configuration values. These should never be hardcoded in your source files or committed to version control. Instead, use environment variables.

Create a .env.example file showing what variables are needed:

ANTHROPIC_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
DATABASE_URL=sqlite:///assistant.db

When deploying, you'll create a real .env file with actual values, or set these variables in your deployment platform's configuration.
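
At startup, your code can load these values into the environment. Here's a minimal sketch using python-dotenv (already listed in the requirements.txt above):

from dotenv import load_dotenv
import os

load_dotenv()  # Reads .env into the environment; a no-op if the file is absent

api_key = os.environ["ANTHROPIC_API_KEY"]  # Fail fast if the key is missing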

Running Your Agent as a Service

Once your code is packaged, you need to decide how it will run. For a development script, you manually start it and interact through the terminal. For a deployed agent, you want it to run continuously and accept requests programmatically.

Creating a Simple API

The most common approach is to wrap your agent in a web API. This lets other programs (including web interfaces, mobile apps, or other services) interact with your agent by sending HTTP requests.

Here's a minimal example using Flask, a lightweight Python web framework:

# Using Claude Sonnet 4.5 for its superior agent capabilities
from flask import Flask, request, jsonify
from anthropic import Anthropic
import os

app = Flask(__name__)
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

@app.route('/chat', methods=['POST'])
def chat():
    user_message = request.json.get('message')

    # Call your agent logic here
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": user_message}]
    )

    return jsonify({
        "response": response.content[0].text
    })

if __name__ == '__main__':
    # Read the port from the environment so hosting platforms can set it
    app.run(host='0.0.0.0', port=int(os.environ.get("PORT", 5000)))

Now your agent listens on port 5000. You can send it a message like this:

curl -X POST http://localhost:5000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is the weather like?"}'

This simple API can be extended to handle conversation history, tool calls, and all the other features you've built. The key insight is that you're transforming your agent from an interactive script into a service that responds to requests.
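
As a sketch of one such extension, here's how the Flask app above could keep per-session conversation history in memory. The session_id path parameter and the histories dict are illustrative choices, not a prescribed design; a real deployment would persist history in a database so it survives restarts:

from collections import defaultdict

# In-memory conversation store keyed by a client-chosen session ID
# (illustrative only: lost on restart, not shared across processes)
histories = defaultdict(list)

@app.route('/chat/<session_id>', methods=['POST'])
def chat_with_history(session_id):
    user_message = request.json.get('message')
    histories[session_id].append({"role": "user", "content": user_message})

    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=histories[session_id]
    )

    reply = response.content[0].text
    histories[session_id].append({"role": "assistant", "content": reply})
    return jsonify({"response": reply})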

Running in the Background

On a local machine or server, you want your agent to start automatically and keep running even if you log out. On Linux or macOS, you can use systemd or launchd to create a service. On Windows, you can use Task Scheduler or run it as a Windows service.
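
As a sketch, a systemd unit for the agent might look like this; the service name, user, and paths are assumptions you'd adapt to your machine:

# /etc/systemd/system/my-assistant.service (hypothetical name and paths)
[Unit]
Description=My AI assistant
After=network.target

[Service]
WorkingDirectory=/opt/my-assistant
ExecStart=/usr/bin/python3 /opt/my-assistant/assistant.py
EnvironmentFile=/opt/my-assistant/.env
Restart=on-failure

[Install]
WantedBy=multi-user.target

You'd then enable it with sudo systemctl enable --now my-assistant and check on it with systemctl status my-assistant.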

For a simple approach, you might use a process manager like pm2 (originally for Node.js but works with Python too) or supervisor. These tools restart your agent if it crashes and provide logging.

Example with pm2:

pm2 start assistant.py --name my-assistant
pm2 save
pm2 startup

Now your assistant runs in the background and starts automatically on boot.

Deploying to the Cloud

For cloud deployment, you have many options. Let's walk through a straightforward path using a platform-as-a-service provider, which handles much of the infrastructure complexity for you.

Preparing for Cloud Deployment

First, add a file that tells the platform how to run your application. For example, if using Heroku, you'd create a Procfile:

web: python assistant.py

Or if you're using Docker (a containerization tool that packages your app with all its dependencies), you'd create a Dockerfile:

FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

CMD ["python", "assistant.py"]

This Docker container includes everything your agent needs to run, making deployment consistent across different environments.
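
To verify the image locally before deploying it anywhere, you can build and run it; the my-assistant tag here is just an example name:

docker build -t my-assistant .
docker run --env-file .env -p 5000:5000 my-assistant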

Deploying to a Platform

Most platforms follow a similar pattern:

  1. Create an account and a new project
  2. Connect your code repository (GitHub, GitLab, etc.)
  3. Configure environment variables (your API keys)
  4. Deploy with a single command or button click

For example, with Railway:

railway login
railway init
railway up

The platform builds your application, starts it running, and gives you a URL where it's accessible. Your agent is now live on the internet.

Monitoring and Logs

Once deployed, you need visibility into what's happening. Most platforms provide logs you can view in their dashboard or via command-line tools:

railway logs

These logs show your agent's output, including any errors or debugging information you've added. This is where the observability techniques from Chapter 12 become essential. By logging key decisions and actions, you can diagnose issues even when you can't directly interact with the running agent.
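
Even a minimal setup with Python's standard logging module pays off here. A sketch of what you might add to the Flask app, writing to stdout so the platform captures it:

import logging

# Log to stdout; most platforms collect stdout/stderr automatically
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s"
)
logger = logging.getLogger("assistant")

# Then, inside your endpoint, record key events, for example:
# logger.info("chat request: %d chars", len(user_message))
# logger.exception("model call failed")  # inside an except block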

Making Your Agent Accessible

Now that your agent is running somewhere, how do users interact with it? You have several options, depending on your needs.

Web Interface

The most straightforward approach is to build a simple web page that lets users type messages and see responses. You can use HTML, CSS, and JavaScript to create a chat interface that calls your agent's API.

Here's a minimal example:

<!DOCTYPE html>
<html>
<head>
    <title>My AI Assistant</title>
</head>
<body>
    <div id="chat"></div>
    <input id="message" type="text" placeholder="Type a message...">
    <button onclick="sendMessage()">Send</button>

    <script>
        async function sendMessage() {
            const message = document.getElementById('message').value;
            const response = await fetch('/chat', {
                method: 'POST',
                headers: {'Content-Type': 'application/json'},
                body: JSON.stringify({message: message})
            });
            const data = await response.json();
            document.getElementById('chat').innerHTML +=
                `<p><strong>You:</strong> ${message}</p>
                 <p><strong>Assistant:</strong> ${data.response}</p>`;
        }
    </script>
</body>
</html>

This gives you a basic chat interface. You can enhance it with better styling, conversation history, typing indicators, and other features as needed.

Messaging Platform Integration

Another option is to integrate your agent with an existing messaging platform like Slack, Discord, or Telegram. Users interact with your agent through a familiar interface, and you don't need to build your own UI.

Most platforms provide APIs and webhooks for this. For example, with Slack, you'd create a bot application, configure it to receive messages, and have your agent respond to those messages through the Slack API.
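
Here's an illustrative sketch using Slack's Bolt for Python library in Socket Mode. The SLACK_BOT_TOKEN and SLACK_APP_TOKEN values come from your Slack app's settings, and run_agent is a hypothetical stand-in for your agent's reply logic:

# A minimal Slack bot sketch using slack_bolt (pip install slack_bolt)
import os
from slack_bolt import App
from slack_bolt.adapter.socket_mode import SocketModeHandler

app = App(token=os.environ["SLACK_BOT_TOKEN"])

def run_agent(text: str) -> str:
    # Hypothetical placeholder: call your deployed agent's /chat endpoint
    # or invoke your agent code directly here
    return f"You said: {text}"

@app.event("message")
def handle_message(event, say):
    # Reply in the same channel to each message the bot can see
    say(run_agent(event.get("text", "")))

if __name__ == "__main__":
    SocketModeHandler(app, os.environ["SLACK_APP_TOKEN"]).start()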

API-Only Access

For some use cases, you might not need a user interface at all. If your agent is meant to be used by other programs or services, exposing it as an API is sufficient. Other developers can integrate your agent into their applications by making HTTP requests.

Deployment Checklist

Before you deploy, run through this checklist to make sure you're ready:

Code Organization:

  • All files organized logically
  • Dependencies documented in requirements.txt or equivalent
  • Configuration uses environment variables, not hardcoded values
  • README explains how to set up and run the agent

Security:

  • API keys stored securely (environment variables or secrets manager)
  • No sensitive data committed to version control
  • Input validation to prevent malicious requests
  • Rate limiting to prevent abuse

Reliability:

  • Error handling for API failures and network issues
  • Logging for debugging and monitoring
  • Health check endpoint to verify the agent is running (see the sketch after this checklist)
  • Graceful shutdown handling

Testing:

  • Agent tested with various inputs
  • Edge cases handled (empty messages, very long inputs, etc.)
  • Load testing if expecting multiple users
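
For the health check item above, a sketch is a one-line endpoint added to the Flask app, which uptime monitors and hosting platforms can poll:

@app.route('/health')
def health():
    # Liveness probe: cheap, no model call, returns 200 if the process is up
    return jsonify({"status": "ok"})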

Once you've verified these points, you're ready to deploy. Start with a simple deployment to test everything, then iterate based on what you learn.

What Comes Next

Deploying your agent is a milestone, but it's not the end of the journey. In the next chapter, we'll look at monitoring and reliability: how to keep your agent running smoothly, detect issues early, and ensure users have a good experience. And in Chapter 14.3, we'll discuss maintenance and updates, so your agent can evolve and improve over time.

For now, you have the foundation to take your personal assistant from a development script to a deployed service. Whether you start with a local deployment for personal use or jump straight to the cloud for broader access, you understand the key concepts and steps involved.

Try deploying your agent using one of the approaches we've covered. See how it feels to interact with it through a web interface instead of a terminal. Notice what works well and what could be improved. Deployment transforms your agent from a prototype into something real, and that's an exciting step forward.
