Perception and Action: How AI Agents Sense and Respond to Their Environment

Michael BrenndoerferJuly 15, 202513 min read

Learn how AI agents perceive their environment through inputs, tool outputs, and memory, and how they take actions that change the world around them through the perception-action cycle.

Perception and Action

In the previous section, we defined what an environment is for our AI assistant. Now we'll explore the two fundamental ways our agent interacts with that environment: perception and action. Think of it like this: if the environment is the world our agent lives in, then perception is how it senses what's happening, and action is how it makes things happen.

Just like a self-driving car uses sensors to see the road and then steers or brakes to act, our AI assistant "senses" input and then acts by generating responses or using tools. But here's what makes this interesting: the environment isn't static. The agent's perceptions change its internal state, and its actions change the environment, which in turn creates new things to perceive. It's a continuous loop.

Let's break down how this works in practice.

What Is Perception?

For our personal assistant, perception is simpler than you might think. The agent perceives its environment by reading inputs. When you type a message, the agent "hears" it. When a tool returns data, the agent "sees" it. When it checks its memory, it "remembers" previous context.

Here's a concrete example. When you ask your assistant, "What's the weather like today?", several perceptions happen:

  1. User input perception: The agent receives your text query
  2. Context perception: It accesses its memory to understand you're asking about weather
  3. Tool output perception: After calling a weather API, it receives structured data about temperature, conditions, etc.

Each of these is a form of perception. The agent takes information from its environment and incorporates it into its understanding of the current situation.

Perception in Code (Claude Sonnet 4.5)

Let's see how perception works in a simple agent loop:

In[3]:
Code
import os
from anthropic import Anthropic

## Using Claude Sonnet 4.5 for its superior agent reasoning capabilities
client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))

def perceive_user_input(user_message):
    """
    Perception: Agent reads and processes user input
    Returns the perceived information in a structured format
    """
    return {
        "type": "user_message",
        "content": user_message,
        "timestamp": "2025-11-09T10:30:00Z"
    }

def perceive_tool_output(tool_name, tool_result):
    """
    Perception: Agent reads and processes tool output
    """
    return {
        "type": "tool_output",
        "tool": tool_name,
        "content": tool_result
    }

## Example: Agent perceives a user query
perception_1 = perceive_user_input("What's the weather in San Francisco?")
print(f"Perceived: {perception_1}")

## Later, agent perceives tool output
weather_data = {"temperature": 65, "condition": "sunny"}
perception_2 = perceive_tool_output("weather_api", weather_data)
print(f"Perceived: {perception_2}")
Out[3]:
Console
Perceived: {'type': 'user_message', 'content': "What's the weather in San Francisco?", 'timestamp': '2025-11-09T10:30:00Z'}
Perceived: {'type': 'tool_output', 'tool': 'weather_api', 'content': {'temperature': 65, 'condition': 'sunny'}}

This example shows perception as a deliberate process. The agent doesn't just receive data; it structures and interprets what it perceives. This structured perception becomes part of the agent's state, which we covered in Chapter 7.

Types of Perception

Our assistant can perceive different kinds of information:

Direct user input: The most obvious form. When you type a message, the agent perceives your intent, the specific words you used, and any context clues in your phrasing.

Tool responses: When the agent calls a calculator, searches the web, or queries a database, the returned data is a perception. The agent must interpret this data and integrate it into its understanding.

Memory retrieval: When the agent looks up previous conversations or stored facts, it's perceiving information from its own long-term memory. This is like you remembering something from yesterday.

System signals: The agent might perceive metadata like timestamps, user IDs, or error messages. These help it understand the broader context of its environment.

What Is Action?

If perception is input, action is output. But action goes beyond just generating text. When our assistant takes an action, it changes something in its environment.

Here are the main types of actions our assistant can take:

Generating responses: The most common action. The agent produces text that answers your question or continues the conversation.

Calling tools: When the agent invokes a calculator, sends an email, or searches the web, it's taking action that affects the external world.

Updating memory: Saving information for later is an action. When the agent stores "User's birthday is July 20," it's changing its internal environment.

Requesting clarification: Sometimes the best action is to ask for more information. "Did you mean San Francisco, California or San Francisco, Philippines?" is an action that seeks better perception.

Action in Code (Claude Sonnet 4.5)

Let's extend our example to show how the agent takes actions:

In[4]:
Code
def action_generate_response(agent_message):
    """
    Action: Agent generates a text response
    This changes the environment by adding to the conversation
    """
    return {
        "type": "response",
        "content": agent_message,
        "timestamp": "2025-11-09T10:30:05Z"
    }

def action_call_tool(tool_name, tool_params):
    """
    Action: Agent calls an external tool
    This changes the environment by triggering external systems
    """
    print(f"Calling {tool_name} with params: {tool_params}")
    # Simulate tool call
    if tool_name == "weather_api":
        return {"temperature": 65, "condition": "sunny"}
    return None

## Agent decides to take action based on perception
user_query = perceive_user_input("What's the weather in San Francisco?")

## Action 1: Call weather tool
weather_result = action_call_tool("weather_api", {"city": "San Francisco"})

## Perception: Agent perceives the tool output
tool_perception = perceive_tool_output("weather_api", weather_result)

## Action 2: Generate response based on perceptions
response = action_generate_response(
    f"The weather in San Francisco is {weather_result['condition']} "
    f"with a temperature of {weather_result['temperature']}°F."
)

print(f"Agent response: {response['content']}")
Out[4]:
Console
Calling weather_api with params: {'city': 'San Francisco'}
Agent response: The weather in San Francisco is sunny with a temperature of 65°F.

Notice how actions and perceptions alternate. The agent perceives the user query, takes an action (calling a tool), perceives the result, and takes another action (generating a response). This is the perception-action cycle in practice.

The Perception-Action Cycle

Here's where it gets interesting. Perception and action aren't separate processes; they're part of a continuous cycle. Each action creates new things to perceive, and each perception informs the next action.

Let's trace through a more complex example:

User: "Schedule a meeting with Alice next Tuesday at 2pm"

Cycle 1:
  Perception: Agent reads user request
  Action: Agent calls calendar tool to check availability
  
Cycle 2:
  Perception: Agent sees Tuesday 2pm is already booked
  Action: Agent generates clarification request
  
Agent: "Tuesday at 2pm is already taken. Would 3pm work instead?"

Cycle 3:
  Perception: Agent reads user's response "Yes, 3pm works"
  Action: Agent calls calendar tool to create meeting
  
Cycle 4:
  Perception: Agent sees meeting was created successfully
  Action: Agent generates confirmation message
  
Agent: "Meeting with Alice scheduled for Tuesday at 3pm."

Each cycle builds on the previous one. The agent's actions change the environment (checking the calendar, creating a meeting), and these changes create new perceptions (seeing the conflict, seeing the successful creation).

Implementing the Cycle (Claude Sonnet 4.5)

Here's a simplified implementation of the perception-action cycle:

In[5]:
Code
class AgentCycle:
    def __init__(self):
        self.state = {
            "conversation_history": [],
            "current_goal": None
        }
    
    def perceive(self, input_data):
        """Process incoming information"""
        self.state["conversation_history"].append({
            "role": "perception",
            "data": input_data
        })
        return input_data
    
    def decide(self, perception):
        """Decide what action to take based on perception"""
        # In a real agent, this would use the LLM to reason
        if "weather" in perception.lower():
            return {"action": "call_tool", "tool": "weather_api"}
        else:
            return {"action": "respond", "message": "I understand."}
    
    def act(self, decision):
        """Execute the decided action"""
        if decision["action"] == "call_tool":
            # Simulate tool call
            result = {"temperature": 65, "condition": "sunny"}
            return result
        elif decision["action"] == "respond":
            return decision["message"]
    
    def run_cycle(self, user_input):
        """Run one complete perception-action cycle"""
        # Perceive
        perception = self.perceive(user_input)
        
        # Decide
        decision = self.decide(perception)
        
        # Act
        result = self.act(decision)
        
        return result

## Example usage
agent = AgentCycle()
result = agent.run_cycle("What's the weather like?")
print(f"Agent action result: {result}")
Out[5]:
Console
Agent action result: {'temperature': 65, 'condition': 'sunny'}

This example shows the three stages of each cycle: perceive, decide, and act. The agent's state persists across cycles, allowing it to maintain context and build on previous interactions.

How Actions Change the Environment

Let's be specific about what "changing the environment" means for our assistant. Every action has consequences:

Text responses change the conversation state: When the agent replies, the conversation history grows. This changes what the agent will perceive in future cycles.

Tool calls change external systems: Sending an email, creating a calendar event, or updating a database all modify the external environment. These changes persist even after the agent stops running.

Memory updates change the agent's knowledge: When the agent saves information, it changes its own internal environment. Future perceptions will include this stored knowledge.

Failed actions create new perceptions: If a tool call fails, the agent perceives an error. This might trigger a different action, like trying an alternative approach or asking the user for help.

Example: Action Consequences (GPT-5)

Let's see how one action creates ripple effects:

In[6]:
Code
import openai

## Using GPT-5 for this straightforward example
client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def demonstrate_action_consequences():
    """Show how actions change what the agent perceives next"""
    
    # Initial state
    environment = {
        "calendar": [],
        "conversation": []
    }
    
    # Action 1: Agent adds event to calendar
    event = {"title": "Team meeting", "time": "2pm"}
    environment["calendar"].append(event)
    environment["conversation"].append({
        "role": "agent",
        "content": "I've scheduled the team meeting for 2pm."
    })
    
    print("After Action 1:")
    print(f"Calendar: {environment['calendar']}")
    
    # Action 2: User asks about schedule
    # Agent now perceives the event it created
    environment["conversation"].append({
        "role": "user",
        "content": "What's on my calendar?"
    })
    
    # Agent perceives its own previous action
    events = environment["calendar"]
    response = f"You have {len(events)} event(s): {events[0]['title']} at {events[0]['time']}"
    
    print(f"\nAgent perceives its own action: {response}")
    
    return environment

result = demonstrate_action_consequences()
Out[6]:
Console
After Action 1:
Calendar: [{'title': 'Team meeting', 'time': '2pm'}]

Agent perceives its own action: You have 1 event(s): Team meeting at 2pm

The agent's first action (scheduling the meeting) changed the environment. When the agent later perceives the calendar, it sees the result of its own previous action. This is how the perception-action cycle creates continuity.

Perception Limitations and Action Constraints

Our assistant doesn't perceive everything, and it can't do everything. Understanding these boundaries is crucial for building reliable agents.

Perception Limitations

Partial observability: The agent can't see everything in its environment. It only perceives what it explicitly checks. If you have a calendar event but the agent doesn't query the calendar, it won't know about the event.

Noisy perception: Sometimes the agent misinterprets what it perceives. A user's ambiguous question might be understood incorrectly, or a tool might return incomplete data.

Delayed perception: The agent perceives information at specific moments. It doesn't continuously monitor its environment. Between cycles, things might change without the agent knowing.

Action Constraints

Limited capabilities: The agent can only take actions it has tools for. If there's no email tool, it can't send emails, no matter how much it wants to.

Permission boundaries: Even with tools available, the agent might not have permission to use them in all situations. We might restrict it from deleting files or making purchases without confirmation.

Action failures: Tools can fail. APIs go down, databases become unavailable, or operations time out. The agent must handle these failures gracefully.

Handling Limitations in Code (Claude Sonnet 4.5)

Here's how to build robustness into the perception-action cycle:

In[7]:
Code
class RobustAgent:
    def __init__(self, available_tools):
        self.tools = available_tools
        self.perception_history = []
    
    def safe_perceive(self, source, data):
        """Perceive with error handling"""
        try:
            perception = {
                "source": source,
                "data": data,
                "success": True
            }
            self.perception_history.append(perception)
            return perception
        except Exception as e:
            return {
                "source": source,
                "error": str(e),
                "success": False
            }
    
    def safe_act(self, action_type, params):
        """Act with constraints and error handling"""
        # Check if action is allowed
        if action_type not in self.tools:
            return {
                "success": False,
                "error": f"Tool {action_type} not available"
            }
        
        # Try to execute action
        try:
            result = self.tools[action_type](params)
            return {"success": True, "result": result}
        except Exception as e:
            return {
                "success": False,
                "error": f"Action failed: {str(e)}"
            }
    
    def run_with_fallback(self, user_input):
        """Run cycle with fallback strategies"""
        # Perceive input
        perception = self.safe_perceive("user", user_input)
        
        if not perception["success"]:
            return "I'm having trouble understanding that. Could you rephrase?"
        
        # Try primary action
        action = self.safe_act("primary_tool", {"query": user_input})
        
        if not action["success"]:
            # Fallback: Try alternative action
            action = self.safe_act("fallback_tool", {"query": user_input})
            
            if not action["success"]:
                return "I tried to help, but encountered an error. Please try again."
        
        return action["result"]

## Example usage
tools = {
    "primary_tool": lambda x: f"Processed: {x['query']}",
    "fallback_tool": lambda x: f"Alternative processing: {x['query']}"
}

agent = RobustAgent(tools)
result = agent.run_with_fallback("Help me with something")
print(result)
Out[7]:
Console
Processed: Help me with something

This example shows defensive programming. The agent checks whether perceptions succeeded, whether tools are available, and has fallback strategies when primary actions fail.

Bringing It Together: A Complete Example

Let's build a complete example that shows perception and action working together in our personal assistant:

In[8]:
Code
## Using Claude Sonnet 4.5 for comprehensive agent capabilities
import os
from anthropic import Anthropic
import json

class PersonalAssistant:
    def __init__(self):
        self.client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
        self.conversation_history = []
        self.tools = {
            "calculator": self.calculator_tool,
            "memory": self.memory_tool
        }
        self.memory_store = {}
    
    def calculator_tool(self, expression):
        """Simple calculator tool"""
        try:
            return {"result": eval(expression)}
        except:
            return {"error": "Invalid expression"}
    
    def memory_tool(self, action, key=None, value=None):
        """Memory storage tool"""
        if action == "save":
            self.memory_store[key] = value
            return {"status": "saved"}
        elif action == "retrieve":
            return {"value": self.memory_store.get(key, "Not found")}
    
    def perceive_and_act(self, user_message):
        """Complete perception-action cycle"""
        
        # Perception 1: User input
        print(f"\n[PERCEPTION] User says: {user_message}")
        self.conversation_history.append({
            "role": "user",
            "content": user_message
        })
        
        # Perception 2: Check relevant memory
        # (In a real system, this would be more sophisticated)
        print(f"[PERCEPTION] Current memory: {self.memory_store}")
        
        # Decision & Action: Use LLM to decide what to do
        response = self.client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1024,
            messages=self.conversation_history
        )
        
        agent_message = response.content[0].text
        
        # Action: Generate response
        print(f"[ACTION] Agent responds: {agent_message}")
        self.conversation_history.append({
            "role": "assistant",
            "content": agent_message
        })
        
        return agent_message

## Example usage
assistant = PersonalAssistant()

## Cycle 1
assistant.perceive_and_act("Hi! My name is Alex.")

## Cycle 2
assistant.perceive_and_act("What's 15 times 23?")

## Cycle 3
assistant.perceive_and_act("What's my name?")
Out[8]:
Console

[PERCEPTION] User says: Hi! My name is Alex.
[PERCEPTION] Current memory: {}
[ACTION] Agent responds: Hello Alex! It's nice to meet you. How can I help you today?

[PERCEPTION] User says: What's 15 times 23?
[PERCEPTION] Current memory: {}
[ACTION] Agent responds: 15 times 23 is 345.

[PERCEPTION] User says: What's my name?
[PERCEPTION] Current memory: {}
[ACTION] Agent responds: Your name is Alex. You mentioned that at the beginning of our conversation.
'Your name is Alex. You mentioned that at the beginning of our conversation.'

This example demonstrates:

  1. Multiple perceptions: The agent perceives user input and checks its memory
  2. State maintenance: Conversation history persists across cycles
  3. Action variety: The agent can respond, call tools, or update memory
  4. Continuity: Each cycle builds on previous ones

Key Takeaways

Perception and action are the fundamental ways your agent interacts with its environment:

Perception is active: The agent doesn't passively receive information. It actively queries, interprets, and structures what it perceives.

Action has consequences: Every action changes something, whether it's the conversation state, external systems, or the agent's own memory.

The cycle is continuous: Perception leads to action, which creates new perceptions, which lead to new actions. This cycle is how agents accomplish complex, multi-step tasks.

Limitations matter: Understanding what the agent can't perceive and can't do is as important as understanding what it can do. Build in error handling and fallback strategies.

State bridges cycles: The agent's state (which we covered in Chapter 7) is what connects one perception-action cycle to the next. Without state, each cycle would start from scratch.

In the next section, we'll explore environment boundaries and constraints, where we'll look at how to define what your agent should and shouldn't be able to perceive and do. This is crucial for building safe, reliable agents that operate within appropriate limits.

Glossary

Action: Any operation the agent performs that changes its environment, such as generating a response, calling a tool, or updating memory. Actions are the agent's way of affecting the world around it.

Perception: The process by which the agent receives and interprets information from its environment. This includes reading user input, receiving tool outputs, and accessing stored memory.

Perception-Action Cycle: The continuous loop where the agent perceives information from its environment, decides what to do, takes action, and then perceives the results of that action. This cycle repeats throughout the agent's operation.

Partial Observability: The limitation that an agent cannot perceive everything in its environment at once. The agent only knows about what it explicitly checks or queries, not everything that exists.

Action Constraint: A limitation on what actions an agent can take, whether due to lack of tools, insufficient permissions, or environmental restrictions. Constraints help keep agents operating within safe boundaries.

Quiz

Ready to test your understanding? Take this quick quiz to reinforce what you've learned about perception and action in AI agents.

Loading component...

Reference

BIBTEXAcademic
@misc{perceptionandactionhowaiagentssenseandrespondtotheirenvironment, author = {Michael Brenndoerfer}, title = {Perception and Action: How AI Agents Sense and Respond to Their Environment}, year = {2025}, url = {https://mbrenndoerfer.com/writing/ai-agent-perception-action-cycle}, organization = {mbrenndoerfer.com}, note = {Accessed: 2025-12-25} }
APAAcademic
Michael Brenndoerfer (2025). Perception and Action: How AI Agents Sense and Respond to Their Environment. Retrieved from https://mbrenndoerfer.com/writing/ai-agent-perception-action-cycle
MLAAcademic
Michael Brenndoerfer. "Perception and Action: How AI Agents Sense and Respond to Their Environment." 2025. Web. 12/25/2025. <https://mbrenndoerfer.com/writing/ai-agent-perception-action-cycle>.
CHICAGOAcademic
Michael Brenndoerfer. "Perception and Action: How AI Agents Sense and Respond to Their Environment." Accessed 12/25/2025. https://mbrenndoerfer.com/writing/ai-agent-perception-action-cycle.
HARVARDAcademic
Michael Brenndoerfer (2025) 'Perception and Action: How AI Agents Sense and Respond to Their Environment'. Available at: https://mbrenndoerfer.com/writing/ai-agent-perception-action-cycle (Accessed: 12/25/2025).
SimpleBasic
Michael Brenndoerfer (2025). Perception and Action: How AI Agents Sense and Respond to Their Environment. https://mbrenndoerfer.com/writing/ai-agent-perception-action-cycle