Learn how to implement action restrictions and permissions for AI agents using the principle of least privilege, confirmation steps, and sandboxing to keep your agent powerful but safe.

This article is part of the free-to-read AI Agent Handbook
Action Restrictions and Permissions
In the previous chapter, you learned how to keep your agent's outputs safe through content moderation. But what about its actions? When your agent can send emails, modify files, or make API calls, filtering text isn't enough. You need to control what it's allowed to do in the first place.
Think about how permissions work on your computer. When you install an app, it asks for specific permissions: access to your camera, your files, your location. The app doesn't get unlimited power. It gets exactly what it needs to do its job, and no more. Your AI agent should work the same way.
This chapter explores how to implement action restrictions and permissions for our personal assistant. You'll learn the principle of least privilege (giving the agent only the access it needs), how to add confirmation steps for risky actions, and how to sandbox the agent's environment. By the end, you'll have an agent that's powerful but constrained, capable but safe.
The Problem with Unrestricted Actions
Let's start by understanding what can go wrong. Imagine you've given your assistant the ability to send emails on your behalf. Here's a simple tool implementation:
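The sketch below uses Python's standard smtplib; the server address and sender are placeholders, and in a real deployment you'd use your email provider's client instead.

```python
import smtplib
from email.message import EmailMessage

def send_email(to: str, subject: str, body: str) -> str:
    """Send an email on the user's behalf -- with no restrictions at all."""
    msg = EmailMessage()
    msg["From"] = "assistant@example.com"  # placeholder sender address
    msg["To"] = to
    msg["Subject"] = subject
    msg.set_content(body)
    # "smtp.example.com" is a placeholder for your real mail server.
    with smtplib.SMTP("smtp.example.com") as server:
        server.send_message(msg)
    return f"Email sent to {to}"
```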
This works, but it's dangerous. The agent can now send any email to anyone, at any time, without asking. What if it misunderstands a request? What if a user tries to trick it into spamming someone? What if there's a bug in your code that causes it to send the same email repeatedly?
These aren't hypothetical concerns. When you give an agent the power to take actions in the real world, you need safeguards. Let's explore how to add them.
Principle of Least Privilege
The first rule of action safety is simple: give your agent the minimum permissions it needs to do its job. This is called the principle of least privilege, and it's a fundamental concept in security.
Let's apply this to our email example. Instead of letting the agent email anyone, we might restrict it to only emailing people in your contacts list, or only emailing specific domains. Here's how:
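This sketch layers an allowlist check on top of the unrestricted tool; the contacts and domains are example values you'd replace with your own.

```python
ALLOWED_CONTACTS = {"alice@example.com", "bob@example.com"}  # example values
ALLOWED_DOMAINS = {"mycompany.com"}                          # example values

def send_email_restricted(to: str, subject: str, body: str) -> str:
    """Refuse any recipient outside the approved contacts or domains."""
    domain = to.rsplit("@", 1)[-1].lower()
    if to.lower() not in ALLOWED_CONTACTS and domain not in ALLOWED_DOMAINS:
        return f"Error: {to} is not an approved recipient."
    return send_email(to, subject, body)  # the unrestricted tool from above
```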
Now the agent can only email approved contacts or addresses at approved domains. If it tries to email anyone else, the function returns an error instead of sending. This is a hard constraint that the agent can't bypass, no matter what the user asks for.
You can apply the same principle to other tools:
File access: Instead of giving the agent access to your entire filesystem, restrict it to a specific directory:
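The sketch below assumes a designated SAFE_DIR; the path is an example, and the key trick is resolving the path before checking it.

```python
from pathlib import Path

SAFE_DIR = Path("/home/user/assistant_files").resolve()  # example directory

def read_file(relative_path: str) -> str:
    """Read a file, but only from inside SAFE_DIR."""
    target = (SAFE_DIR / relative_path).resolve()
    # resolve() collapses "../" tricks; then verify we're still inside SAFE_DIR.
    if not target.is_relative_to(SAFE_DIR):
        return "Error: access outside the allowed directory is not permitted."
    return target.read_text()
```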
API calls: Limit which APIs the agent can call and what operations it can perform:
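Here the allowlist maps hosts to permitted HTTP methods; the hosts are placeholders, and the sketch assumes the third-party requests library.

```python
from urllib.parse import urlparse
import requests  # third-party: pip install requests

# Which hosts the agent may call, and with which methods (example values).
ALLOWED_APIS = {
    "api.weather.example.com": {"GET"},
    "api.calendar.example.com": {"GET", "POST"},
}

def call_api(method: str, url: str, **kwargs):
    """Perform an HTTP request only if the host and method are allowlisted."""
    host = urlparse(url).hostname
    if host not in ALLOWED_APIS or method.upper() not in ALLOWED_APIS[host]:
        return {"error": f"{method} requests to {host} are not permitted."}
    response = requests.request(method, url, **kwargs)
    return response.json()
```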
The pattern is consistent: before the tool does anything, it checks whether the action is allowed. If not, it returns an error. The agent sees the error and can explain to the user why the action wasn't possible.
Confirmation Steps for Risky Actions
Some actions are too risky to perform automatically, even if they're technically allowed. For these, you want the agent to ask for confirmation first.
Think about how your smartphone handles this. When an app wants to access your camera, it doesn't just do it. It asks: "Allow this app to access your camera?" You have to explicitly approve.
Your agent should do the same for high-stakes actions. Here's how to implement confirmation for email sending:
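One way to do it is a two-phase tool: the agent can stage an email, but only a separate confirmation call actually sends it. The function names here (prepare_email, confirm_email) are illustrative.

```python
# Pending actions keyed by a simple incrementing id.
pending_actions: dict[str, dict] = {}

def prepare_email(to: str, subject: str, body: str) -> str:
    """Stage an email and describe it to the user instead of sending it."""
    action_id = str(len(pending_actions) + 1)
    pending_actions[action_id] = {"to": to, "subject": subject, "body": body}
    return (f"Ready to send (action {action_id}):\n"
            f"To: {to}\nSubject: {subject}\n\n{body}\n\nSend it? (yes/no)")

def confirm_email(action_id: str, approved: bool) -> str:
    """Execute a staged email only after explicit user approval."""
    action = pending_actions.pop(action_id, None)
    if action is None:
        return "Error: no such pending action."
    if not approved:
        return "Okay, I won't send it."
    return send_email_restricted(action["to"], action["subject"], action["body"])
```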
Let's see this in action:
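A hypothetical exchange, using the sketch above:

```python
print(prepare_email("alice@example.com", "Lunch Friday?",
                    "Are you free for lunch at noon on Friday?"))
# Ready to send (action 1):
# To: alice@example.com
# Subject: Lunch Friday?
#
# Are you free for lunch at noon on Friday?
#
# Send it? (yes/no)

print(confirm_email("1", approved=True))
# Email sent to alice@example.com
```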
The agent prepares the action but doesn't execute it. It shows you exactly what it's about to do and waits for your approval. Only when you say "yes" does the email actually get sent.
This pattern works for any risky action:
- Deleting files: Show which files will be deleted
- Making purchases: Show the item and price
- Posting to social media: Show the exact content to be posted
- Modifying data: Show what will change
The key is to make the confirmation specific. Don't just ask "Is this okay?" Show exactly what will happen so the user can make an informed decision.
Sandboxing the Agent's Environment
Even with restrictions and confirmations, you might want an extra layer of protection: running the agent in a sandbox. A sandbox is a restricted environment where the agent can operate without affecting the rest of your system.
Think of it like a playground with a fence. The agent can do whatever it wants inside the sandbox, but it can't get out and affect anything beyond the fence.
Here's a simple example using Docker to sandbox file operations:
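This sketch shells out to the Docker CLI; it assumes Docker is installed, and the python:3.12-slim base image and resource limits are example choices.

```python
import subprocess

def run_in_sandbox(command: list[str], sandbox_dir: str) -> str:
    """Run a command inside a locked-down Docker container."""
    result = subprocess.run(
        [
            "docker", "run", "--rm",
            "--network", "none",              # no network access
            "--memory", "256m",               # cap memory usage
            "--cpus", "0.5",                  # cap CPU usage
            "-v", f"{sandbox_dir}:/sandbox",  # only this directory is visible
            "--workdir", "/sandbox",
            "python:3.12-slim",               # example base image
            *command,
        ],
        capture_output=True, text=True, timeout=60,
    )
    return result.stdout or result.stderr

# e.g. run_in_sandbox(["python", "script.py"], "/home/user/agent_sandbox")
```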
This sandbox provides several protections:
Isolated filesystem: The agent can only access files in the sandbox directory. It can't read or modify anything else on your system.
No network access: The agent can't make network requests, preventing it from sending data to external servers.
Resource limits: The agent gets limited CPU and memory, preventing it from consuming all your system resources.
Automatic cleanup: When you're done, you can delete the entire sandbox, removing any files the agent created.
For most personal assistants, full Docker sandboxing might be overkill. But the principle is valuable: isolate risky operations so they can't affect the rest of your system.
A lighter-weight approach is to use Python's built-in restrictions:
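Here's a minimal sketch using exec() with a stripped-down builtins namespace; the allowlist of functions is an example, and this is a convenience guard rather than a hardened security boundary.

```python
# Only these builtins are visible to the executed code (example allowlist).
SAFE_BUILTINS = {
    "len": len, "range": range, "print": print,
    "min": min, "max": max, "sum": sum, "abs": abs,
}

def run_restricted(code: str) -> None:
    # Overriding "__builtins__" means open, eval, and __import__ simply
    # don't exist in the executed code's namespace.
    exec(code, {"__builtins__": SAFE_BUILTINS})

run_restricted("print(sum(range(10)))")  # works: prints 45
run_restricted("open('/etc/passwd')")    # fails: NameError: name 'open' is not defined
```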
This approach removes dangerous built-ins like open, eval, and __import__ from the executed code's namespace, so it can't casually read files or import modules. It's a guardrail rather than a hardened boundary (determined code can still escape a restricted namespace), and it's nowhere near as secure as Docker, but it's much simpler and works for many low-stakes use cases.
Designing a Permission System
As your agent grows more capable, you'll want a more structured approach to permissions. Instead of hardcoding restrictions in each tool, you can create a permission system that manages what the agent can do.
Here's a simple permission system:
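The sketch below is one minimal design: an enum of permissions plus a central checker that logs every check. The permission names are examples.

```python
from datetime import datetime, timezone
from enum import Enum, auto

class Permission(Enum):
    READ_FILES = auto()
    SEND_EMAIL = auto()
    CALL_APIS = auto()
    DELETE_FILES = auto()

class PermissionSystem:
    """Central place to grant permissions, check them, and log every check."""

    def __init__(self, granted: set[Permission]):
        self.granted = granted
        self.log: list[dict] = []

    def check(self, permission: Permission) -> bool:
        allowed = permission in self.granted
        self.log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "permission": permission.name,
            "allowed": allowed,
        })
        return allowed
```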
Now you can create agents with different permission levels:
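Continuing the sketch, a read-only agent and a more capable assistant can share the same tool code:

```python
read_only_agent = PermissionSystem({Permission.READ_FILES})
assistant = PermissionSystem({
    Permission.READ_FILES, Permission.SEND_EMAIL, Permission.CALL_APIS,
})

assistant.check(Permission.SEND_EMAIL)        # True, and recorded in the log
read_only_agent.check(Permission.SEND_EMAIL)  # False, and recorded in the log
print(read_only_agent.log[-1])
# {'time': '...', 'permission': 'SEND_EMAIL', 'allowed': False}
```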
This system gives you several benefits:
Centralized control: All permission logic is in one place, making it easy to audit and modify.
Clear permissions: You can see at a glance what each agent is allowed to do.
Audit trail: The permission log shows every action the agent attempted and whether it was allowed.
Flexible configuration: You can easily create agents with different permission levels for different use cases.
Combining Restrictions, Confirmations, and Permissions
The most robust approach combines all three techniques:
- Permissions define what the agent is allowed to do in principle
- Restrictions limit the scope of allowed actions (which files, which recipients, etc.)
- Confirmations require user approval for high-stakes actions
Here's how they work together:
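This sketch chains the three layers, reusing the helpers defined earlier in the chapter:

```python
def agent_send_email(perms: PermissionSystem, to: str,
                     subject: str, body: str) -> str:
    # Layer 1 -- permission: is this capability enabled for this agent at all?
    if not perms.check(Permission.SEND_EMAIL):
        return "Error: this agent does not have email permission."
    # Layer 2 -- restriction: is this recipient within the allowed scope?
    domain = to.rsplit("@", 1)[-1].lower()
    if to.lower() not in ALLOWED_CONTACTS and domain not in ALLOWED_DOMAINS:
        return f"Error: {to} is not an approved recipient."
    # Layer 3 -- confirmation: stage the email for explicit user approval.
    return prepare_email(to, subject, body)
```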
This gives you defense in depth. If one layer fails, the others provide backup protection:
- If the agent somehow bypasses the permission check, the restriction will still block unauthorized recipients
- If the restriction is misconfigured, the confirmation gives the user a chance to catch the error
- If the user accidentally confirms, the permission log provides an audit trail
Practical Guidelines for Action Safety
As you implement action restrictions for your own agent, keep these guidelines in mind:
Start with the minimum: When adding a new tool, give it the most restrictive permissions possible. You can always loosen restrictions later, but it's harder to tighten them once users expect certain capabilities.
Make restrictions visible: When the agent can't do something, make sure it explains why. A message like "I don't have permission to delete files" is much better than a generic error.
Log everything: Keep a record of what actions the agent attempted, which were allowed, and which were blocked. This helps you understand how the agent is being used and whether your restrictions are too tight or too loose.
Test adversarially: Try to trick your agent into doing things it shouldn't. Ask it to email someone outside the allowed list. Try to make it access files outside its sandbox. See where the weaknesses are.
Layer your defenses: Don't rely on a single protection mechanism. Use permissions, restrictions, confirmations, and sandboxing together.
Consider the context: A personal assistant running on your laptop might need different restrictions than one deployed as a service for multiple users. Adjust your safety measures to match the risk level.
When to Use Each Technique
Different situations call for different approaches:
Use permissions when you want to completely disable certain capabilities. If your agent should never send emails, don't give it the permission.
Use restrictions when you want to limit the scope of allowed actions. The agent can send emails, but only to certain people.
Use confirmations when actions are risky but sometimes necessary. The agent can delete files, but only after you approve each deletion.
Use sandboxing when you need strong isolation. If your agent runs untrusted code or processes user-uploaded files, put it in a sandbox.
For our personal assistant, here's a reasonable configuration (sketched in code after this list):
- Permissions: Read files, send emails, make API calls (no delete, no system commands)
- Restrictions: Only read from designated folders, only email known contacts
- Confirmations: Required for sending emails, making purchases, posting publicly
- Sandboxing: Not needed for basic assistant tasks, but useful if adding code execution
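Expressed with the PermissionSystem sketch from earlier (the names are illustrative):

```python
assistant_perms = PermissionSystem({
    Permission.READ_FILES,
    Permission.SEND_EMAIL,
    Permission.CALL_APIS,
    # Deliberately absent: DELETE_FILES and any system-command permission.
})
# Scope restrictions (designated folders, known contacts) and confirmations
# live inside the tools themselves, as in read_file and prepare_email above.
```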
This gives the agent enough power to be useful while keeping risks manageable.
Glossary
Action Restriction: A limit on what an agent can do with a given capability, such as restricting file access to a specific directory or email sending to approved recipients.
Confirmation Step: A safety mechanism where the agent requests explicit user approval before executing a potentially risky action, showing exactly what will happen.
Least Privilege: The security principle of granting an agent only the minimum permissions necessary to perform its intended function, reducing potential harm from errors or misuse.
Permission: An authorization that determines whether an agent is allowed to perform a specific type of action, such as reading files or sending emails.
Permission System: A structured framework for managing and checking what actions an agent is allowed to perform, typically including permission definitions, checks, and audit logging.
Sandbox: An isolated environment where an agent can operate without affecting the broader system, typically with restricted access to files, network, and system resources.
Quiz
Ready to test your understanding? Take this quick quiz to reinforce what you've learned about action restrictions and permissions for AI agents.




