Learn how AI agents perceive their environment through inputs, tool outputs, and memory, and how their actions change the world around them, forming the perception-action cycle.

This article is part of the free-to-read AI Agent Handbook.
Perception and Action
In the previous section, we defined what an environment is for our AI assistant. Now we'll explore the two fundamental ways our agent interacts with that environment: perception and action. Think of it like this: if the environment is the world our agent lives in, then perception is how it senses what's happening, and action is how it makes things happen.
Just like a self-driving car uses sensors to see the road and then steers or brakes to act, our AI assistant "senses" input and then acts by generating responses or using tools. But here's what makes this interesting: the environment isn't static. The agent's perceptions change its internal state, and its actions change the environment, which in turn creates new things to perceive. It's a continuous loop.
Let's break down how this works in practice.
What Is Perception?
For our personal assistant, perception is simpler than you might think. The agent perceives its environment by reading inputs. When you type a message, the agent "hears" it. When a tool returns data, the agent "sees" it. When it checks its memory, it "remembers" previous context.
Here's a concrete example. When you ask your assistant, "What's the weather like today?", several perceptions happen:
- User input perception: The agent receives your text query
- Context perception: It accesses its memory to understand you're asking about weather
- Tool output perception: After calling a weather API, it receives structured data about temperature, conditions, etc.
Each of these is a form of perception. The agent takes information from its environment and incorporates it into its understanding of the current situation.
Perception in Code (Claude Sonnet 4.5)
Let's see how perception works in a simple agent loop:
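Below is a minimal sketch in Python. The `perceive_*` helpers and the dictionary-based state are invented for this illustration rather than taken from any particular framework, and memory here is a plain dict standing in for real storage:

```python
from datetime import datetime, timezone


def perceive_user_input(text: str) -> dict:
    """Turn a raw user message into a structured perception."""
    return {
        "source": "user",
        "content": text.strip(),
        "received_at": datetime.now(timezone.utc).isoformat(),
    }


def perceive_memory(memory: dict, topic: str) -> dict:
    """Look up what the agent already knows about a topic."""
    return {"source": "memory", "content": memory.get(topic, "nothing stored")}


# The agent's state accumulates everything it has perceived so far.
state = {"percepts": []}
memory = {"weather": "User asked about the weather in Seattle yesterday."}

# Perception 1: the user's message arrives.
state["percepts"].append(perceive_user_input("What's the weather like today?"))

# Perception 2: the agent deliberately checks its memory for relevant context.
state["percepts"].append(perceive_memory(memory, "weather"))

for percept in state["percepts"]:
    print(f"[{percept['source']}] {percept['content']}")
```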
This example shows perception as a deliberate process. The agent doesn't just receive data; it structures and interprets what it perceives. This structured perception becomes part of the agent's state, which we covered in Chapter 7.
Types of Perception
Our assistant can perceive different kinds of information:
Direct user input: The most obvious form. When you type a message, the agent perceives your intent, the specific words you used, and any context clues in your phrasing.
Tool responses: When the agent calls a calculator, searches the web, or queries a database, the returned data is a perception. The agent must interpret this data and integrate it into its understanding.
Memory retrieval: When the agent looks up previous conversations or stored facts, it's perceiving information from its own long-term memory. This is like you remembering something from yesterday.
System signals: The agent might perceive metadata like timestamps, user IDs, or error messages. These help it understand the broader context of its environment.
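All four kinds can flow through the same plumbing. Here's one illustrative way to represent them uniformly, continuing the dictionary style from the earlier sketch (every value below is made up for the example):

```python
# Each kind of perception carries a "source" tag so downstream code can
# treat them uniformly. The specific values are invented for illustration.
percepts = [
    {"source": "user", "content": "Remind me about Sam's birthday"},
    {"source": "tool", "content": {"tool": "calendar", "result": "no events found"}},
    {"source": "memory", "content": "Sam's birthday is July 20"},
    {"source": "system", "content": {"timestamp": "2025-07-19T09:00:00Z", "user_id": "u_123"}},
]

for percept in percepts:
    print(f"[{percept['source']}] {percept['content']}")
```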
What Is Action?
If perception is input, action is output. But action goes beyond just generating text. When our assistant takes an action, it changes something in its environment.
Here are the main types of actions our assistant can take:
Generating responses: The most common action. The agent produces text that answers your question or continues the conversation.
Calling tools: When the agent invokes a calculator, sends an email, or searches the web, it's taking action that affects the external world.
Updating memory: Saving information for later is an action. When the agent stores "User's birthday is July 20," it's changing its internal environment.
Requesting clarification: Sometimes the best action is to ask for more information. "Did you mean San Francisco, California or San Francisco, Philippines?" is an action that seeks better perception.
Action in Code (Claude Sonnet 4.5)
Let's extend our example to show how the agent takes actions:
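The sketch below keeps the same dictionary-based style. `get_weather` is a stand-in for a real weather API, the city is assumed for simplicity, and the final reply is assembled by hand where a language model would normally generate it:

```python
def get_weather(city: str) -> dict:
    """Stand-in for a real weather API call."""
    return {"city": city, "temperature_c": 18, "conditions": "partly cloudy"}


def run_turn(user_message: str) -> str:
    # Perception 1: the user's query arrives.
    percepts = [{"source": "user", "content": user_message}]

    # Action 1: the agent decides it needs external data and calls a tool.
    weather = get_weather("Seattle")

    # Perception 2: the tool's output becomes a new percept.
    percepts.append({"source": "tool", "content": weather})

    # Action 2: the agent generates a response based on everything it perceived.
    # (In a real agent, a language model would produce this text.)
    return (
        f"It's {weather['temperature_c']}°C and {weather['conditions']} "
        f"in {weather['city']} today."
    )


print(run_turn("What's the weather like today?"))
```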
Notice how actions and perceptions alternate. The agent perceives the user query, takes an action (calling a tool), perceives the result, and takes another action (generating a response). This is the perception-action cycle in practice.
The Perception-Action Cycle
Here's where it gets interesting. Perception and action aren't separate processes; they're part of a continuous cycle. Each action creates new things to perceive, and each perception informs the next action.
Let's trace through a more complex example:
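Suppose you ask the assistant, "Schedule a one-hour meeting with Sam tomorrow at 2pm." A plausible trace might look like this:
- Cycle 1: The agent perceives your request, then acts by querying the calendar tool for tomorrow at 2pm.
- Cycle 2: It perceives the calendar result, a conflict with an existing appointment, then acts by proposing 3pm instead.
- Cycle 3: It perceives your reply ("3pm works"), then acts by calling the calendar tool to create the meeting.
- Cycle 4: It perceives the tool's confirmation, then acts by telling you the meeting is booked.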
Each cycle builds on the previous one. The agent's actions change the environment (checking the calendar, creating a meeting), and these changes create new perceptions (seeing the conflict, seeing the successful creation).
Implementing the Cycle (Claude Sonnet 4.5)
Here's a simplified implementation of the perception-action cycle:
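One simplified version in Python might look like the sketch below. The `decide` step is a hard-coded rule to keep the example self-contained; in a real agent that choice would come from a language model, and the weather tool is a stub:

```python
def perceive(state: dict, user_message: str) -> dict:
    """Stage 1: fold new information into the agent's state."""
    state["percepts"].append({"source": "user", "content": user_message})
    return state


def decide(state: dict) -> dict:
    """Stage 2: choose the next action based on everything perceived so far.

    A real agent would ask a language model to make this choice; a simple
    rule keeps the example runnable on its own.
    """
    latest = state["percepts"][-1]["content"]
    if "weather" in latest.lower():
        return {"type": "call_tool", "tool": "weather", "args": {"city": "Seattle"}}
    return {"type": "respond", "text": f"You said: {latest}"}


def act(state: dict, decision: dict) -> str:
    """Stage 3: carry out the action and record its result as a new percept."""
    if decision["type"] == "call_tool":
        result = {"temperature_c": 18, "conditions": "partly cloudy"}  # stubbed tool
        state["percepts"].append({"source": "tool", "content": result})
        return f"It's {result['temperature_c']}°C and {result['conditions']} today."
    return decision["text"]


# State persists across cycles, so each turn builds on the last.
state = {"percepts": []}
for message in ["What's the weather like today?", "Thanks!"]:
    state = perceive(state, message)
    decision = decide(state)
    print(act(state, decision))
```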
This example shows the three stages of each cycle: perceive, decide, and act. The agent's state persists across cycles, allowing it to maintain context and build on previous interactions.
How Actions Change the Environment
Let's be specific about what "changing the environment" means for our assistant. Every action has consequences:
Text responses change the conversation state: When the agent replies, the conversation history grows. This changes what the agent will perceive in future cycles.
Tool calls change external systems: Sending an email, creating a calendar event, or updating a database all modify the external environment. These changes persist even after the agent stops running.
Memory updates change the agent's knowledge: When the agent saves information, it changes its own internal environment. Future perceptions will include this stored knowledge.
Failed actions create new perceptions: If a tool call fails, the agent perceives an error. This might trigger a different action, like trying an alternative approach or asking the user for help.
Example: Action Consequences (GPT-5)
Let's see how one action creates ripple effects:
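In the sketch below, the calendar is just an in-memory list standing in for a real calendar service, so you can watch the consequence of an action without any external setup:

```python
# A toy in-memory calendar standing in for a real calendar service.
calendar_events: list[dict] = []


def create_event(title: str, time: str) -> dict:
    """Action: add an event to the calendar. This changes the environment."""
    event = {"title": title, "time": time}
    calendar_events.append(event)
    return {"status": "created", "event": event}


def list_events() -> list[dict]:
    """Perception: read the current state of the calendar."""
    return list(calendar_events)


# Cycle 1: the agent acts by scheduling the meeting.
result = create_event("Project sync with Sam", "2025-07-21 15:00")
print("Action result:", result["status"])

# A later cycle: the agent perceives the calendar and sees the
# consequence of its own earlier action.
print("Perceived events:", list_events())
```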
The agent's first action (scheduling the meeting) changed the environment. When the agent later perceives the calendar, it sees the result of its own previous action. This is how the perception-action cycle creates continuity.
Perception Limitations and Action Constraints
Our assistant doesn't perceive everything, and it can't do everything. Understanding these boundaries is crucial for building reliable agents.
Perception Limitations
Partial observability: The agent can't see everything in its environment. It only perceives what it explicitly checks. If you have a calendar event but the agent doesn't query the calendar, it won't know about the event.
Noisy perception: Sometimes the agent misinterprets what it perceives. A user's ambiguous question might be understood incorrectly, or a tool might return incomplete data.
Delayed perception: The agent perceives information at specific moments. It doesn't continuously monitor its environment. Between cycles, things might change without the agent knowing.
Action Constraints
Limited capabilities: The agent can only take actions it has tools for. If there's no email tool, it can't send emails, no matter how much it wants to.
Permission boundaries: Even with tools available, the agent might not have permission to use them in all situations. We might restrict it from deleting files or making purchases without confirmation.
Action failures: Tools can fail. APIs go down, databases become unavailable, or operations time out. The agent must handle these failures gracefully.
Handling Limitations in Code (Claude Sonnet 4.5)
Here's how to build robustness into the perception-action cycle:
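The sketch below shows those checks in miniature. The `TOOLS` registry, the flaky `search_web` stub, and the fallback messages are all invented for illustration:

```python
import random


def search_web(query: str) -> str:
    """Stub tool that fails some of the time, like a real API might."""
    if random.random() < 0.3:
        raise TimeoutError("search service did not respond")
    return f"Top result for '{query}'"


# The agent can only act through tools it actually has.
TOOLS = {"search_web": search_web}


def safe_perceive(raw_input: str | None) -> dict:
    """Check that the perception succeeded before trusting it."""
    if not raw_input or not raw_input.strip():
        return {"ok": False, "error": "empty or missing input"}
    return {"ok": True, "content": raw_input.strip()}


def act_with_fallback(tool_name: str, query: str) -> str:
    """Try the primary action, then fall back gracefully."""
    tool = TOOLS.get(tool_name)
    if tool is None:
        return f"I don't have a '{tool_name}' tool, so I can't do that."
    try:
        return tool(query)
    except TimeoutError:
        # Fallback strategy: admit the failure and ask the user how to proceed.
        return "The search tool isn't responding. Want me to try again later?"


percept = safe_perceive("What's new in the Python 3.13 release?")
if percept["ok"]:
    print(act_with_fallback("search_web", percept["content"]))
else:
    print(f"I couldn't understand that: {percept['error']}")
```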
This example shows defensive programming. The agent checks whether perceptions succeeded, whether tools are available, and has fallback strategies when primary actions fail.
Bringing It Together: A Complete Example
Let's build a complete example that shows perception and action working together in our personal assistant:
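The sketch below ties the pieces together in one small class. The decision logic is rule-based purely so the example runs on its own; a real assistant would hand that choice to a language model, and the calculator tool is a stub:

```python
def calculator(expression: str) -> str:
    """Stub tool: evaluate a simple arithmetic expression."""
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        return "I can only handle basic arithmetic."
    return str(eval(expression))  # fine for a toy example; don't use eval in production


class Assistant:
    def __init__(self):
        self.history: list[dict] = []      # conversation state, persists across cycles
        self.memory: dict[str, str] = {}   # long-term facts the agent has saved

    def run_turn(self, user_message: str) -> str:
        # Perception 1: the user's message.
        self.history.append({"role": "user", "content": user_message})

        # Perception 2: check memory for anything relevant to this message.
        remembered = [f"{k} is {v}" for k, v in self.memory.items() if k in user_message.lower()]

        # Decide and act (a rule-based stand-in for a model call).
        if user_message.lower().startswith("remember that "):
            key, _, value = user_message[len("remember that "):].partition(" is ")
            self.memory[key.strip().lower()] = value.strip()   # action: update memory
            reply = f"Got it, I'll remember that {key.strip()} is {value.strip()}."
        elif any(ch.isdigit() for ch in user_message):
            reply = f"That comes to {calculator(user_message.split(':')[-1])}."  # action: call a tool
        elif remembered:
            reply = "Here's what I remember: " + "; ".join(remembered) + "."
        else:
            reply = "Tell me more about what you need."  # action: respond directly

        self.history.append({"role": "assistant", "content": reply})
        return reply


assistant = Assistant()
print(assistant.run_turn("Remember that Sam's birthday is July 20"))
print(assistant.run_turn("Calculate: (18 + 7) * 2"))
print(assistant.run_turn("When is sam's birthday?"))
```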
This example demonstrates:
- Multiple perceptions: The agent perceives user input and checks its memory
- State maintenance: Conversation history persists across cycles
- Action variety: The agent can respond, call tools, or update memory
- Continuity: Each cycle builds on previous ones
Key Takeaways
Perception and action are the fundamental ways your agent interacts with its environment:
Perception is active: The agent doesn't passively receive information. It actively queries, interprets, and structures what it perceives.
Action has consequences: Every action changes something, whether it's the conversation state, external systems, or the agent's own memory.
The cycle is continuous: Perception leads to action, which creates new perceptions, which lead to new actions. This cycle is how agents accomplish complex, multi-step tasks.
Limitations matter: Understanding what the agent can't perceive and can't do is as important as understanding what it can do. Build in error handling and fallback strategies.
State bridges cycles: The agent's state (which we covered in Chapter 7) is what connects one perception-action cycle to the next. Without state, each cycle would start from scratch.
In the next section, we'll explore environment boundaries and constraints, where we'll look at how to define what your agent should and shouldn't be able to perceive and do. This is crucial for building safe, reliable agents that operate within appropriate limits.
Glossary
Action: Any operation the agent performs that changes its environment, such as generating a response, calling a tool, or updating memory. Actions are the agent's way of affecting the world around it.
Perception: The process by which the agent receives and interprets information from its environment. This includes reading user input, receiving tool outputs, and accessing stored memory.
Perception-Action Cycle: The continuous loop where the agent perceives information from its environment, decides what to do, takes action, and then perceives the results of that action. This cycle repeats throughout the agent's operation.
Partial Observability: The limitation that an agent cannot perceive everything in its environment at once. The agent only knows about what it explicitly checks or queries, not everything that exists.
Action Constraint: A limitation on what actions an agent can take, whether due to lack of tools, insufficient permissions, or environmental restrictions. Constraints help keep agents operating within safe boundaries.
Quiz
Ready to test your understanding? Take this quick quiz to reinforce what you've learned about perception and action in AI agents.




