Learn how to implement action restrictions and permissions for AI agents using the principle of least privilege, confirmation steps, and sandboxing to keep your agent powerful but safe.

This article is part of the free-to-read AI Agent Handbook
Action Restrictions and Permissions
In the previous chapter, you learned how to keep your agent's outputs safe through content moderation. But what about its actions? When your agent can send emails, modify files, or make API calls, filtering text isn't enough. You need to control what it's allowed to do in the first place.
Think about how permissions work on your computer. When you install an app, it asks for specific permissions: access to your camera, your files, your location. The app doesn't get unlimited power. It gets exactly what it needs to do its job, and no more. Your AI agent should work the same way.
This chapter explores how to implement action restrictions and permissions for our personal assistant. You'll learn the principle of least privilege (giving the agent only the access it needs), how to add confirmation steps for risky actions, and how to sandbox the agent's environment. By the end, you'll have an agent that's powerful but constrained, capable but safe.
The Problem with Unrestricted Actions
Let's start by understanding what can go wrong. Imagine you've given your assistant the ability to send emails on your behalf. Here's a simple tool implementation:
# Using Claude Sonnet 4.5 for its strong tool use capabilities
import anthropic
import smtplib
from email.mime.text import MIMEText

def send_email(to_address, subject, body):
    """Send an email via SMTP"""
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = 'your.email@example.com'
    msg['To'] = to_address

    with smtplib.SMTP('smtp.example.com', 587) as server:
        server.starttls()
        server.login('your.email@example.com', 'PASSWORD')
        server.send_message(msg)

    return f"Email sent to {to_address}"

# Define the tool for the agent
tools = [{
    "name": "send_email",
    "description": "Send an email to a specified address",
    "input_schema": {
        "type": "object",
        "properties": {
            "to_address": {"type": "string"},
            "subject": {"type": "string"},
            "body": {"type": "string"}
        },
        "required": ["to_address", "subject", "body"]
    }
}]

This works, but it's dangerous. The agent can now send any email to anyone, at any time, without asking. What if it misunderstands a request? What if a user tries to trick it into spamming someone? What if there's a bug in your code that causes it to send the same email repeatedly?
These aren't hypothetical concerns. When you give an agent the power to take actions in the real world, you need safeguards. Let's explore how to add them.
Principle of Least Privilege
The first rule of action safety is simple: give your agent the minimum permissions it needs to do its job. This is called the principle of least privilege, and it's a fundamental concept in security.
Let's apply this to our email example. Instead of letting the agent email anyone, we might restrict it to only emailing people in your contacts list, or only emailing specific domains. Here's how:
# Using Claude Sonnet 4.5 for agent reasoning with restricted tools
import anthropic
import smtplib
from email.mime.text import MIMEText

ALLOWED_DOMAINS = ['example.com', 'trusted-partner.com']
ALLOWED_RECIPIENTS = ['alice@example.com', 'bob@example.com']

def send_email_restricted(to_address, subject, body):
    """Send an email with domain and recipient restrictions"""
    # Check if recipient is in allowed list
    if to_address not in ALLOWED_RECIPIENTS:
        # Check if domain is allowed (the domain is everything after the last '@')
        domain = to_address.rsplit('@', 1)[1] if '@' in to_address else ''
        if domain not in ALLOWED_DOMAINS:
            return f"Error: Cannot send email to {to_address}. Not in allowed recipients or domains."

    # If we get here, the recipient is allowed
    msg = MIMEText(body)
    msg['Subject'] = subject
    msg['From'] = 'your.email@example.com'
    msg['To'] = to_address

    with smtplib.SMTP('smtp.example.com', 587) as server:
        server.starttls()
        server.login('your.email@example.com', 'PASSWORD')
        server.send_message(msg)

    return f"Email sent to {to_address}"

Now the agent can only email specific people or specific domains. If it tries to email someone else, the function returns an error. Because the check runs in your code and not in the model's prompt, it is a hard constraint that the agent can't bypass, no matter what the user asks for.
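Since the restriction is ordinary code, the allow-list decision can also be pulled out into a small pure function and exercised without sending any mail. The sketch below is one possible shape, not the handbook's implementation: the helper name `is_recipient_allowed`, the case-insensitive comparison, and the rejection of addresses with more than one `@` are all assumptions added for illustration.

```python
ALLOWED_DOMAINS = {"example.com", "trusted-partner.com"}
ALLOWED_RECIPIENTS = {"alice@example.com", "bob@example.com"}

def is_recipient_allowed(to_address: str) -> bool:
    """Return True if the address is allow-listed or its domain is (hypothetical helper)."""
    address = to_address.strip().lower()
    if address in ALLOWED_RECIPIENTS:
        return True
    # Split on the last '@'; require text before it and no second '@'
    local, sep, domain = address.rpartition("@")
    if not sep or not local or "@" in local:
        return False
    return domain in ALLOWED_DOMAINS

print(is_recipient_allowed("carol@trusted-partner.com"))  # True
print(is_recipient_allowed("mallory@evil.example"))       # False
print(is_recipient_allowed("example.com"))                # False (no '@')
```

Keeping the decision separate from the SMTP call makes it easy to unit test the edge cases that matter most, such as malformed addresses.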
You can apply the same principle to other tools:
File access: Instead of giving the agent access to your entire filesystem, restrict it to a specific directory:
import os

ALLOWED_DIRECTORY = '/home/user/assistant_workspace'

def read_file_restricted(filename):
    """Read a file only from the allowed directory"""
    # Resolve both paths so symlinks and ".." segments are expanded
    base_dir = os.path.realpath(ALLOWED_DIRECTORY)
    real_path = os.path.realpath(os.path.join(base_dir, filename))

    # Make sure the path is actually inside the allowed directory
    # (prevents tricks like "../../../etc/passwd"). Comparing with commonpath
    # instead of startswith also rejects sibling directories such as
    # "/home/user/assistant_workspace_old".
    if os.path.commonpath([real_path, base_dir]) != base_dir:
        return "Error: Access denied. File is outside allowed directory."

    # Check that the path exists and is a regular file
    if not os.path.isfile(real_path):
        return f"Error: File {filename} not found."

    # Read and return the file
    with open(real_path, 'r') as f:
        return f.read()

API calls: Limit which APIs the agent can call and what operations it can perform:
ALLOWED_API_ENDPOINTS = [
    'https://api.weather.com/current',
    'https://api.calendar.com/events'
]

def make_api_call(endpoint, method='GET', data=None):
    """Make an API call to allowed endpoints only"""
    if endpoint not in ALLOWED_API_ENDPOINTS:
        return f"Error: API endpoint {endpoint} is not allowed."

    if method not in ['GET', 'POST']:
        return f"Error: HTTP method {method} is not allowed."

    # Make the actual API call
    # ... implementation ...

The pattern is consistent: before the tool does anything, it checks whether the action is allowed. If not, it returns an error. The agent sees the error and can explain to the user why the action wasn't possible.
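One way the error can reach the model is as a tool result flagged as an error. The sketch below assumes the `tool_result` content block format of the Anthropic Messages API (including its `is_error` field) and relies on this chapter's convention that a blocked tool returns a string starting with "Error:". The helper name `to_tool_result` and the ID `toolu_example` are hypothetical placeholders.

```python
def to_tool_result(tool_use_id: str, output: str) -> dict:
    """Wrap a tool's string output as a tool_result content block (hypothetical helper)."""
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": output,
        # Assumption: blocked tools return strings that start with "Error:"
        "is_error": output.startswith("Error:"),
    }

blocked = to_tool_result(
    "toolu_example",  # placeholder; in practice, use the id from the model's tool_use block
    "Error: API endpoint https://evil.example is not allowed.",
)

# The block is sent back to the model inside the next user message
next_message = {"role": "user", "content": [blocked]}
print(blocked["is_error"])
```

Marking the result as an error gives the model a clear signal that the action was refused, so it can explain the restriction instead of retrying blindly.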
Confirmation Steps for Risky Actions
Some actions are too risky to perform automatically, even if they're technically allowed. For these, you want the agent to ask for confirmation first.
Think about how your smartphone handles this. When an app wants to access your camera, it doesn't just do it. It asks: "Allow this app to access your camera?" You have to explicitly approve.
Your agent should do the same for high-stakes actions. Here's how to implement confirmation for email sending:
# Using Claude Sonnet 4.5 for multi-turn confirmation flows
import anthropic

class AgentWithConfirmation:
    def __init__(self):
        # With no arguments, the client reads the ANTHROPIC_API_KEY environment variable
        self.client = anthropic.Anthropic()
        self.pending_actions = {}

    def run(self, user_message, conversation_history=None):
        """Run the agent with confirmation support"""
        if conversation_history is None:
            conversation_history = []

        # Add user message to history
        conversation_history.append({
            "role": "user",
            "content": user_message
        })

        # Define tools with confirmation requirement
        tools = [{
            "name": "send_email",
            "description": "Send an email. Requires user confirmation before sending.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "to_address": {"type": "string"},
                    "subject": {"type": "string"},
                    "body": {"type": "string"}
                },
                "required": ["to_address", "subject", "body"]
            }
        }]

        # Call the model
        response = self.client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=2048,
            messages=conversation_history,
            tools=tools
        )

        # Check if agent wants to use a tool
        if response.stop_reason == "tool_use":
            tool_use = next(block for block in response.content if block.type == "tool_use")

            if tool_use.name == "send_email":
                # Store the pending action (a fixed ID keeps the example short;
                # generate a unique ID per action in real use)
                action_id = "email_001"
                self.pending_actions[action_id] = tool_use.input

                # Ask user for confirmation
                confirmation_message = f"""I'm ready to send this email:

To: {tool_use.input['to_address']}
Subject: {tool_use.input['subject']}
Body: {tool_use.input['body']}

Do you want me to send this email? (Reply 'yes' to confirm or 'no' to cancel)"""

                return confirmation_message, conversation_history

        # Return normal response
        return response.content[0].text, conversation_history

    def confirm_action(self, action_id, confirmed):
        """Execute or cancel a pending action based on user confirmation"""
        if action_id not in self.pending_actions:
            return "No pending action found."

        action = self.pending_actions[action_id]

        if confirmed:
            # Execute the action using send_email_restricted from the earlier example
            result = send_email_restricted(
                action['to_address'],
                action['subject'],
                action['body']
            )
            del self.pending_actions[action_id]
            return f"Confirmed. {result}"
        else:
            # Cancel the action
            del self.pending_actions[action_id]
            return "Action cancelled."

Let's see this in action:
User: Send an email to alice@example.com with subject "Meeting Tomorrow"
      and body "Let's meet at 2pm"

Agent: I'm ready to send this email:

To: alice@example.com
Subject: Meeting Tomorrow
Body: Let's meet at 2pm

Do you want me to send this email? (Reply 'yes' to confirm or 'no' to cancel)

User: yes

Agent: Confirmed. Email sent to alice@example.com

The agent prepares the action but doesn't execute it. It shows you exactly what it's about to do and waits for your approval. Only when you say "yes" does the email actually get sent.
This pattern works for any risky action:
- Deleting files: Show which files will be deleted
- Making purchases: Show the item and price
- Posting to social media: Show the exact content to be posted
- Modifying data: Show what will change
The key is to make the confirmation specific. Don't just ask "Is this okay?" Show exactly what will happen so the user can make an informed decision.
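A specific confirmation can be generated from the action's own parameters. The following is a minimal sketch under assumed names: `build_confirmation`, the action labels `delete_files` and `purchase`, and the shape of the `details` dictionary are illustrative and not part of the assistant built so far.

```python
def build_confirmation(action: str, details: dict) -> str:
    """Describe a pending action in concrete terms before asking for approval."""
    if action == "delete_files":
        listing = "\n".join(f"  - {path}" for path in details["paths"])
        summary = f"Delete {len(details['paths'])} file(s):\n{listing}"
    elif action == "purchase":
        summary = f"Buy {details['item']} for {details['price']}"
    else:
        # Fall back to showing every parameter so nothing is hidden from the user
        params = ", ".join(f"{key}={value!r}" for key, value in sorted(details.items()))
        summary = f"{action}: {params}"
    return f"{summary}\nReply 'yes' to confirm or 'no' to cancel."

print(build_confirmation("delete_files", {"paths": ["notes/old.txt", "notes/draft.txt"]}))
```

The fallback branch matters: an action type without a tailored message still shows all of its parameters, so a new tool never gets a vague "Is this okay?" prompt by accident.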
Sandboxing the Agent's Environment
Even with restrictions and confirmations, you might want an extra layer of protection: running the agent in a sandbox. A sandbox is a restricted environment where the agent can operate without affecting the rest of your system.
Think of it like a playground with a fence. The agent can do whatever it wants inside the sandbox, but it can't get out and affect anything beyond the fence.
Here's a simple example using Docker to sandbox file operations:
# Using Claude Sonnet 4.5 with sandboxed file operations
import docker
import os
import shutil
import tempfile

class SandboxedAgent:
    def __init__(self):
        self.client = docker.from_env()
        # Create a temporary directory for the sandbox
        self.sandbox_dir = tempfile.mkdtemp()

    def execute_file_operation(self, code):
        """Execute Python code in a sandboxed Docker container"""
        # Create a Python script with the code
        script_path = os.path.join(self.sandbox_dir, 'script.py')
        with open(script_path, 'w') as f:
            f.write(code)

        # Run the code in a Docker container with limited permissions.
        # With detach=False, run() blocks and returns the container's output as bytes.
        output = self.client.containers.run(
            'python:3.9-slim',
            'python /sandbox/script.py',
            volumes={
                self.sandbox_dir: {'bind': '/sandbox', 'mode': 'rw'}
            },
            network_disabled=True,  # No network access
            mem_limit='256m',       # Limited memory
            cpu_period=100000,
            cpu_quota=50000,        # Limited CPU (half of one core)
            remove=True,
            detach=False
        )

        return output.decode('utf-8')

    def cleanup(self):
        """Clean up the sandbox directory"""
        shutil.rmtree(self.sandbox_dir)

This sandbox provides several protections:
Isolated filesystem: The agent can only access files in the sandbox directory. It can't read or modify anything else on your system.
No network access: The agent can't make network requests, preventing it from sending data to external servers.
Resource limits: The agent gets limited CPU and memory, preventing it from consuming all your system resources.
Automatic cleanup: When you're done, you can delete the entire sandbox, removing any files the agent created.
For most personal assistants, full Docker sandboxing might be overkill. But the principle is valuable: isolate risky operations so they can't affect the rest of your system.
A lighter-weight approach is to restrict which built-in functions the executed code can use:
def run_restricted_code(code):
    """Run Python code with restricted built-ins"""
    # Create a restricted environment
    restricted_globals = {
        '__builtins__': {
            'print': print,
            'len': len,
            'range': range,
            'str': str,
            'int': int,
            'float': float,
            'list': list,
            'dict': dict,
            # Add only safe built-ins
        }
    }

    # Execute the code in the restricted environment
    try:
        exec(code, restricted_globals)
        return "Code executed successfully"
    except Exception as e:
        return f"Error: {str(e)}"

# Example: This will work
result = run_restricted_code("print('Hello, world!')")
print(result)  # "Code executed successfully"

# Example: This will fail (no file access)
result = run_restricted_code("open('/etc/passwd', 'r')")
print(result)  # "Error: name 'open' is not defined"

This approach leaves out dangerous built-ins like open, eval, and __import__, so the code can't directly open files or import modules. It's much simpler than Docker, but it is not a real security boundary: Python's object introspection can often reach functionality that the built-ins dictionary omits. Treat it as a guard against accidents, and use process-level or container-level isolation for code you don't trust.
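One reason this approach is weaker than container isolation is that removing built-ins does not remove Python's object introspection. The sketch below is an illustration, not part of the assistant: it runs a snippet with an empty `__builtins__` dictionary and shows that the snippet can still enumerate loaded classes through attribute access alone. The variable names are illustrative.

```python
restricted_globals = {"__builtins__": {}}

# No built-in functions are available, yet attribute access still works:
# () is a tuple, the base class of tuple is `object`, and
# object.__subclasses__() lists the classes that inherit directly from object.
snippet = "reachable = ().__class__.__base__.__subclasses__()"
exec(snippet, restricted_globals)

reachable = restricted_globals["reachable"]
print(f"{len(reachable)} classes reachable without any built-ins")
```

The exact count depends on the interpreter and on which modules are loaded, but it is never zero, which is why restricted built-ins are better suited to catching mistakes than to containing hostile code.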
Designing a Permission System
As your agent grows more capable, you'll want a more structured approach to permissions. Instead of hardcoding restrictions in each tool, you can create a permission system that manages what the agent can do.
Here's a simple permission system:
# Using Claude Sonnet 4.5 with a structured permission system
import anthropic
from enum import Enum
from typing import Set

class Permission(Enum):
    READ_FILES = "read_files"
    WRITE_FILES = "write_files"
    SEND_EMAIL = "send_email"
    MAKE_API_CALLS = "make_api_calls"
    DELETE_DATA = "delete_data"

class PermissionManager:
    def __init__(self, granted_permissions: Set[Permission]):
        self.granted_permissions = granted_permissions
        self.permission_log = []

    def check_permission(self, permission: Permission, action_details: str) -> bool:
        """Check if a permission is granted and log the check"""
        has_permission = permission in self.granted_permissions

        self.permission_log.append({
            'permission': permission.value,
            'action': action_details,
            'granted': has_permission
        })

        return has_permission

    def require_permission(self, permission: Permission, action_details: str):
        """Raise an error if permission is not granted"""
        if not self.check_permission(permission, action_details):
            raise PermissionError(
                f"Action requires {permission.value} permission: {action_details}"
            )

class PermissionedAgent:
    def __init__(self, permissions: Set[Permission]):
        self.permissions = PermissionManager(permissions)
        # With no arguments, the client reads the ANTHROPIC_API_KEY environment variable
        self.client = anthropic.Anthropic()

    def read_file(self, filename: str) -> str:
        """Read a file if permission is granted"""
        self.permissions.require_permission(
            Permission.READ_FILES,
            f"Reading file: {filename}"
        )

        # Permission granted, proceed with reading
        with open(filename, 'r') as f:
            return f.read()

    def send_email(self, to_address: str, subject: str, body: str) -> str:
        """Send an email if permission is granted"""
        self.permissions.require_permission(
            Permission.SEND_EMAIL,
            f"Sending email to: {to_address}"
        )

        # Permission granted, proceed with sending
        # (send_email_restricted is defined in the earlier example)
        return send_email_restricted(to_address, subject, body)

    def get_permission_log(self) -> list:
        """Get a log of all permission checks"""
        return self.permissions.permission_log

Now you can create agents with different permission levels:
# Create a read-only agent
read_only_agent = PermissionedAgent({
    Permission.READ_FILES,
    Permission.MAKE_API_CALLS
})

# Create an agent with broader access (everything except DELETE_DATA)
full_agent = PermissionedAgent({
    Permission.READ_FILES,
    Permission.WRITE_FILES,
    Permission.SEND_EMAIL,
    Permission.MAKE_API_CALLS
})

# Try to send an email with the read-only agent
try:
    read_only_agent.send_email('alice@example.com', 'Test', 'Hello')
except PermissionError as e:
    print(f"Permission denied: {e}")
    # Output: Permission denied: Action requires send_email permission:
    #         Sending email to: alice@example.com

This system gives you several benefits:
Centralized control: All permission logic is in one place, making it easy to audit and modify.
Clear permissions: You can see at a glance what each agent is allowed to do.
Audit trail: The permission log shows every action the agent attempted and whether it was allowed.
Flexible configuration: You can easily create agents with different permission levels for different use cases.
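The audit trail becomes more useful once it is summarized. As a rough sketch, the snippet below counts denied checks per permission, using sample entries in the same shape that `check_permission` appends to the log. The helper name `summarize_denials` and the sample data are assumptions made for illustration.

```python
from collections import Counter

# Sample entries in the same shape PermissionManager.check_permission appends
permission_log = [
    {"permission": "read_files", "action": "Reading file: notes.txt", "granted": True},
    {"permission": "send_email", "action": "Sending email to: alice@example.com", "granted": False},
    {"permission": "send_email", "action": "Sending email to: bob@example.com", "granted": False},
]

def summarize_denials(log: list) -> Counter:
    """Count denied permission checks per permission name (hypothetical helper)."""
    return Counter(entry["permission"] for entry in log if not entry["granted"])

denials = summarize_denials(permission_log)
for permission, count in denials.most_common():
    print(f"{permission}: {count} denied")  # send_email: 2 denied
```

A permission that is denied often may mean the agent is being asked for something it was never meant to do, or that its grants are too tight for the way it is actually used.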
Combining Restrictions, Confirmations, and Permissions
The most robust approach combines all three techniques:
- Permissions define what the agent is allowed to do in principle
- Restrictions limit the scope of allowed actions (which files, which recipients, etc.)
- Confirmations require user approval for high-stakes actions
Here's how they work together:
# Using Claude Sonnet 4.5 for comprehensive action safety
class SafeAgent:
    def __init__(self, permissions: Set[Permission]):
        self.permissions = PermissionManager(permissions)
        # With no arguments, the client reads the ANTHROPIC_API_KEY environment variable
        self.client = anthropic.Anthropic()
        self.pending_confirmations = {}
        self.next_confirmation_number = 0

    def send_email(self, to_address: str, subject: str, body: str) -> str:
        """Send an email with permission check, restriction, and confirmation"""
        # Step 1: Check permission
        self.permissions.require_permission(
            Permission.SEND_EMAIL,
            f"Sending email to: {to_address}"
        )

        # Step 2: Check restrictions
        if to_address not in ALLOWED_RECIPIENTS:
            domain = to_address.rsplit('@', 1)[1] if '@' in to_address else ''
            if domain not in ALLOWED_DOMAINS:
                return f"Error: Cannot send email to {to_address}. Not in allowed recipients or domains."

        # Step 3: Request confirmation. A running counter keeps IDs unique
        # even after earlier confirmations have been removed.
        confirmation_id = f"email_{self.next_confirmation_number}"
        self.next_confirmation_number += 1
        self.pending_confirmations[confirmation_id] = {
            'action': 'send_email',
            'params': {
                'to_address': to_address,
                'subject': subject,
                'body': body
            }
        }

        return f"""Ready to send email:

To: {to_address}
Subject: {subject}
Body: {body}

Confirm with: agent.confirm('{confirmation_id}')
Cancel with: agent.cancel('{confirmation_id}')"""

    def confirm(self, confirmation_id: str) -> str:
        """Execute a pending action after confirmation"""
        if confirmation_id not in self.pending_confirmations:
            return "No pending action with that ID."

        action_data = self.pending_confirmations.pop(confirmation_id)

        if action_data['action'] == 'send_email':
            params = action_data['params']
            result = send_email_restricted(
                params['to_address'],
                params['subject'],
                params['body']
            )
            return f"Confirmed. {result}"

        # Any other action type is rejected instead of silently returning None
        return f"Error: Unknown action type {action_data['action']}."

    def cancel(self, confirmation_id: str) -> str:
        """Cancel a pending action"""
        if confirmation_id not in self.pending_confirmations:
            return "No pending action with that ID."

        del self.pending_confirmations[confirmation_id]
        return "Action cancelled."

This gives you defense in depth. If one layer fails, the others provide backup protection:
- If the agent somehow bypasses the permission check, the restriction will still block unauthorized recipients
- If the restriction is misconfigured, the confirmation gives the user a chance to catch the error
- If the user accidentally confirms, the permission log provides an audit trail, so you can trace what happened and correct it afterward
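To see why the layers back each other up, here is a minimal, self-contained sketch that models each layer as a function returning a denial reason. The names `permission_layer`, `restriction_layer`, `confirmation_layer`, `run_layers`, `GRANTED`, and `ALLOWED` are hypothetical stand-ins for illustration, not part of the agent class above.

```python
from typing import Callable, List, Optional

# Hypothetical stand-ins for the granted permissions and the recipient allowlist
GRANTED = {"send_email"}
ALLOWED = {"alice@example.com"}

# A layer inspects a request and returns a denial reason, or None to let it through
Layer = Callable[[dict], Optional[str]]


def permission_layer(request: dict) -> Optional[str]:
    if request["action"] not in GRANTED:
        return f"permission: '{request['action']}' is not granted"
    return None


def restriction_layer(request: dict) -> Optional[str]:
    if request["to"] not in ALLOWED:
        return f"restriction: {request['to']} is not an allowed recipient"
    return None


def confirmation_layer(request: dict) -> Optional[str]:
    if not request.get("confirmed", False):
        return "confirmation: waiting for user approval"
    return None


def run_layers(request: dict, layers: List[Layer]) -> str:
    """Run the request through each layer in order; the first denial wins."""
    for layer in layers:
        reason = layer(request)
        if reason is not None:
            return f"BLOCKED by {reason}"
    return "ALLOWED"


full_stack = [permission_layer, restriction_layer, confirmation_layer]

# Simulate a bypassed permission check: the restriction still blocks the stranger
stranger = {"action": "send_email", "to": "stranger@evil.example", "confirmed": True}
print(run_layers(stranger, full_stack[1:]))

# Simulate a missing restriction: the confirmation step still holds the email
unconfirmed = {"action": "send_email", "to": "alice@example.com"}
print(run_layers(unconfirmed, [permission_layer, confirmation_layer]))
```

Removing any single layer from the list still leaves at least one check standing between the request and the action, which is the property the bullets above describe.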
Practical Guidelines for Action Safety
As you implement action restrictions for your own agent, keep these guidelines in mind:
Start with the minimum: When adding a new tool, give it the most restrictive permissions possible. You can always loosen restrictions later, but it's harder to tighten them once users expect certain capabilities.
Make restrictions visible: When the agent can't do something, make sure it explains why. A message like "I don't have permission to delete files" is much better than a generic error.
Log everything: Keep a record of what actions the agent attempted, which were allowed, and which were blocked. This helps you understand how the agent is being used and whether your restrictions are too tight or too loose.
Test adversarially: Try to trick your agent into doing things it shouldn't. Ask it to email someone outside the allowed list. Try to make it access files outside its sandbox. See where the weaknesses are.
Layer your defenses: Don't rely on a single protection mechanism. Use permissions, restrictions, confirmations, and sandboxing together.
Consider the context: A personal assistant running on your laptop might need different restrictions than one deployed as a service for multiple users. Adjust your safety measures to match the risk level.
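Several of these guidelines can be combined in one small check. The sketch below is an assumption-laden illustration, not the chapter's implementation: `check_recipient`, `audit_log`, and the allowlist values are hypothetical. It returns a specific reason for every decision, records every attempt, and is exercised with a few adversarial addresses.

```python
from datetime import datetime, timezone
from typing import List, Tuple

# Hypothetical allowlists for illustration
ALLOWED_RECIPIENTS = {"alice@example.com"}
ALLOWED_DOMAINS = {"mycompany.com"}

# In-memory audit trail; a real deployment would persist this somewhere durable
audit_log: List[dict] = []


def check_recipient(to_address: str) -> Tuple[bool, str]:
    """Return (allowed, reason) and record the attempt in the audit log."""
    address = to_address.strip().lower()
    # rpartition splits on the LAST '@', so extra '@' characters cannot fake a domain
    local, sep, domain = address.rpartition("@")

    if not sep or not local or not domain:
        allowed, reason = False, "address is malformed"
    elif address in ALLOWED_RECIPIENTS:
        allowed, reason = True, "recipient is on the allowlist"
    elif domain in ALLOWED_DOMAINS:
        allowed, reason = True, f"domain {domain} is allowed"
    else:
        allowed, reason = False, f"{address} is not an approved recipient"

    # Log everything: allowed and blocked attempts alike
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "action": "send_email",
        "target": to_address,
        "allowed": allowed,
        "reason": reason,
    })
    return allowed, reason


# Test adversarially: each of these should be rejected with a specific reason
adversarial_inputs = [
    "stranger@evil.example",
    "alice@example.com.evil.example",  # allowed address used as a prefix
    "bob@mycompany.com@evil.example",  # extra '@' to fool first-'@' splitting
    "mycompany.com",                   # no '@' at all
    "@mycompany.com",                  # empty local part
]
for address in adversarial_inputs:
    allowed, reason = check_recipient(address)
    assert not allowed, f"{address} slipped through"
    print(f"blocked {address!r}: {reason}")
```

Returning the reason alongside the decision lets the agent tell the user exactly why an action was refused, and the same record in `audit_log` shows later whether a restriction is too tight or too loose.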
When to Use Each Technique
Different situations call for different approaches:
Use permissions when you want to completely disable certain capabilities. If your agent should never send emails, don't give it the permission.
Use restrictions when you want to limit the scope of allowed actions. The agent can send emails, but only to certain people.
Use confirmations when actions are risky but sometimes necessary. The agent can delete files, but only after you approve each deletion.
Use sandboxing when you need strong isolation. If your agent runs untrusted code or processes user-uploaded files, put it in a sandbox.
For our personal assistant, here's a reasonable configuration:
- Permissions: Read files, send emails, make API calls (no delete, no system commands)
- Restrictions: Only read from designated folders, only email known contacts
- Confirmations: Required for sending emails, making purchases, posting publicly
- Sandboxing: Not needed for basic assistant tasks, but useful if adding code execution
This gives the agent enough power to be useful while keeping risks manageable.
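One way to keep such a configuration explicit is to encode it in a single object. The following is a rough sketch under stated assumptions: the class `AssistantConfig`, the action names, the folder path, and the contact address are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class AssistantConfig:
    # Permissions: what the agent may do at all (no delete, no system commands)
    permissions: FrozenSet[str] = frozenset({"read_file", "send_email", "api_call"})
    # Restrictions: where it may read and whom it may email (placeholder values)
    readable_dirs: Tuple[Path, ...] = (Path("/home/user/assistant_docs"),)
    known_contacts: FrozenSet[str] = frozenset({"alice@example.com"})
    # Confirmations: actions that always need explicit user approval
    confirm_before: FrozenSet[str] = frozenset(
        {"send_email", "make_purchase", "post_publicly"}
    )
    # Sandboxing: off for basic tasks; turn on if code execution is added
    sandbox_enabled: bool = False

    def can(self, action: str) -> bool:
        return action in self.permissions

    def needs_confirmation(self, action: str) -> bool:
        return action in self.confirm_before

    def can_read(self, path: str) -> bool:
        """Allow reads only inside a designated folder, after resolving '..'."""
        if not self.can("read_file"):
            return False
        resolved = Path(path).resolve()
        for directory in self.readable_dirs:
            base = directory.resolve()
            if resolved == base or base in resolved.parents:
                return True
        return False

    def can_email(self, address: str) -> bool:
        return self.can("send_email") and address.lower() in self.known_contacts


config = AssistantConfig()
print(config.can("delete_file"))
print(config.can_read("/home/user/assistant_docs/notes.txt"))
print(config.can_read("/home/user/assistant_docs/../.ssh/id_rsa"))
```

Freezing the dataclass means the agent cannot widen its own permissions at runtime; any change to the configuration has to be made deliberately in code.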
Glossary
Action Restriction: A limit on what an agent can do with a given capability, such as restricting file access to a specific directory or email sending to approved recipients.
Confirmation Step: A safety mechanism where the agent requests explicit user approval before executing a potentially risky action, showing exactly what will happen.
Least Privilege: The security principle of granting an agent only the minimum permissions necessary to perform its intended function, reducing potential harm from errors or misuse.
Permission: An authorization that determines whether an agent is allowed to perform a specific type of action, such as reading files or sending emails.
Permission System: A structured framework for managing and checking what actions an agent is allowed to perform, typically including permission definitions, checks, and audit logging.
Sandbox: An isolated environment where an agent can operate without affecting the broader system, typically with restricted access to files, network, and system resources.
About the author: Michael Brenndoerfer
All opinions expressed here are my own and do not reflect the views of my employer.
Michael currently works as an Associate Director of Data Science at EQT Partners in Singapore, where he drives AI and data initiatives across private capital investments.
With over a decade of experience spanning private equity, management consulting, and software engineering, he specializes in building and scaling analytics capabilities from the ground up. He has published research in leading AI conferences and holds expertise in machine learning, natural language processing, and value creation through data.