An agent is a non-deterministic state machine made of functions that call LLMs or tools, guided by memory and control logic to reach a goal. Each function is a state, and transitions depend on outputs from LLMs or tools.
Agents:
- Combine decision-making, tool use, and memory
- Use components like LLM calls, tool calls, memory, and branching logic
- Finish when a defined condition is met, which could be binary, subjective, or iterative
Tools let agents trigger real-world actions, such as calling APIs. Tools are often discovered dynamically using the Model Context Protocol (MCP).
Memory can be short-term, long-term, scratchpad, or shared. It is crucial for retaining context across states or agents.
Agent swarms are multiple agents that coordinate on subgoals, similar to how departments function within a company.
The key challenge is orchestration: managing non-determinism, evolving APIs, and error handling.
An agent is a structured, memory-aware workflow that uses LLMs and tools to act toward a goal.
The term "agent" shows up everywhere: agentic workflows, autonomous agents, agent swarms. But if you're not exactly sure what an agent is, you're not alone. This space moves quickly, and terminology can make it harder to grasp the core ideas.
This article provides a clear mental model of agents: what they are, how they work, and why they matter. We will explore their mechanics, design principles, and practical guidance for building agentic systems.
Tip: Before continuing, read this article on Tools and the Model Context Protocol (MCP). It is foundational for understanding how agents interact with their environment.
Agents Are Not Magic#
Online commentary often describes agents as AI teammates capable of reasoning like humans, writing code from scratch, or replacing entire departments.
The truth is more practical.
Agents are structured workflows: programs composed of logic, memory, tools, and language model calls, organized so they can operate with at least partial autonomy toward a specific goal.
So what makes them special? They combine decision-making, tool use, and stateful reasoning. This is something that LLMs alone cannot reliably accomplish.
The Core Metaphor: Agents Are State Machines#
To understand agents, it helps to first understand the concept of a state machine.
What is a State Machine?#
A state machine is a system that:
- Is always in one of a finite number of states
- Moves between states based on inputs
- Executes logic within each state
Deterministic vs. Non-Deterministic#
- Deterministic: A given state and input always leads to exactly one next state, similar to a vending machine.
- Non-Deterministic: A given state and input can lead to multiple possible next states, like a choose-your-own-adventure game.
Agents fall into the second category. They are non-deterministic state machines, because LLMs can produce different results for the same input. This variability adds flexibility, creativity, and adaptability.
If you want to understand why LLMs are non-deterministic, see this article.
Functions as States in an Agent#
Each function in an agent can be thought of as a state.
A function might:
- Call an LLM to interpret a prompt or summarize context
- Use a tool to query a database or trigger an API
- Execute control logic to branch, loop, or exit
Each of these operations represents a discrete state. The result of that function determines the transition to the next state.
This function-to-state mapping is what gives agents their modular structure and power.
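As a rough illustration, here is a minimal sketch of that mapping in Python (the state names and placeholder logic are invented for this example; real states would call an LLM or a tool):

```python
# Each function is a state; its return value names the next state.
def plan(ctx):
    ctx["plan"] = "an LLM call would interpret the request here"       # LLM state
    return "act"

def act(ctx):
    ctx["result"] = "a tool call (API, database, ...) would run here"  # tool state
    return "done"

STATES = {"plan": plan, "act": act}

def run_agent(ctx, state="plan"):
    while state != "done":
        state = STATES[state](ctx)  # the function's output drives the transition
    return ctx
```

Because the LLM inside a state can answer differently on each run, the same input can follow different paths through `STATES`, which is exactly the non-determinism described above.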
Anatomy of an Agent#
Let's examine the components.
Components#
| Component | Role |
|---|---|
| Input | A user message or external event that starts the agent |
| LLM Call | Interprets input, tracks memory, suggests next steps |
| Tool Call | Carries out real-world actions beyond what an LLM can do |
| Control Logic | Drives state transitions through branching, loops, and end checks |
| Memory/State | Holds context, decisions, and history |
Simplified Execution Flow#
User Prompt: "Plan me a vacation to Italy."
- LLM State 1: Ask for travel dates and budget
- Tool State 2: Look up hotels based on input
- LLM State 3: Recommend options
- LLM State 4: Ask for confirmation
- Tool State 5: Book the hotel and complete
Each numbered step is a state implemented as a function. Transitions depend on LLM and tool outputs.
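As a sketch only (the `llm`, `hotel_search`, and `hotel_book` helpers below are hypothetical stubs, not a real API), those five steps map naturally onto five state functions that a dispatcher like the `run_agent` sketch above could drive:

```python
# Hypothetical stubs standing in for a real LLM client and hotel API.
def llm(prompt, ctx): return "yes"
def hotel_search(prefs): return ["Hotel Roma", "Hotel Venezia"]
def hotel_book(hotel): print(f"Booked {hotel}")

def ask_preferences(ctx):   # LLM State 1: ask for travel dates and budget
    ctx["prefs"] = llm("Ask the user for travel dates and budget", ctx)
    return "search_hotels"

def search_hotels(ctx):     # Tool State 2: look up hotels based on input
    ctx["hotels"] = hotel_search(ctx["prefs"])
    return "recommend"

def recommend(ctx):         # LLM State 3: recommend options
    ctx["pick"] = llm(f"Recommend one of {ctx['hotels']}", ctx)
    return "confirm"

def confirm(ctx):           # LLM State 4: ask for confirmation
    return "book_hotel" if llm("Did the user confirm?", ctx) == "yes" else "recommend"

def book_hotel(ctx):        # Tool State 5: book the hotel and complete
    hotel_book(ctx["pick"])
    return "done"

STATES = {f.__name__: f for f in (ask_preferences, search_hotels, recommend, confirm, book_hotel)}
```

Notice that the transition out of `confirm` depends on what the LLM returns, so two runs with the same prompt can take different paths.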
Tools: How Agents Take Action#
LLMs are good at reasoning and generating text, but they cannot directly act in the real world. That is the role of tools.
What Are Tools?#
A tool is a function an LLM can call with structured inputs. For example:
```python
findHotels(
    destination="Italy",
    date="June",
    budget=2000
)
```
The LLM selects this function and fills in the arguments; the agent then executes it (see the sketch after the list below). Tools include:
- APIs for weather, search, reservations
- Database queries
- File system access
- Calculators
- Other agents
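One common pattern, shown here as a generic sketch rather than any specific vendor's API, is to register each tool with a name, a description, and a parameter schema; the LLM picks a tool and fills in the arguments, and the agent executes it:

```python
# A generic tool registry: each entry pairs a parameter schema (what the LLM sees)
# with the Python function that actually performs the action.
def find_hotels(destination: str, date: str, budget: int) -> list[str]:
    return [f"Hotel in {destination}, {date}, under {budget}"]  # placeholder result

TOOLS = {
    "findHotels": {
        "description": "Search for hotels matching the user's trip.",
        "parameters": {
            "destination": {"type": "string"},
            "date": {"type": "string"},
            "budget": {"type": "integer"},
        },
        "function": find_hotels,
    }
}

def execute_tool_call(name: str, arguments: dict):
    """Run whichever tool the LLM selected, with the arguments it filled in."""
    return TOOLS[name]["function"](**arguments)

# If the LLM emits {"name": "findHotels", "arguments": {"destination": "Italy",
# "date": "June", "budget": 2000}}, the agent dispatches it like this:
print(execute_tool_call("findHotels", {"destination": "Italy", "date": "June", "budget": 2000}))
```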
The Role of MCP#
The Model Context Protocol (MCP) allows agents to discover tools at runtime. This means agents can adapt and expand without needing hardcoded logic.
Read this guide on MCP for more detail.
When Does an Agent Finish?#
Determining when an agent finishes is crucial.
Finish Condition Types#
| Type | Example | Outcome |
|---|---|---|
| Binary | "Is Italy nice in summer?" | Yes or No |
| Subjective | "Write a Python script." | May succeed or need review |
| Iterative | "Design an app." | Involves multiple feedback cycles |
Guardrails and Criteria#
To prevent endless loops or runaway compute, set explicit guardrails (a minimal check is sketched after this list):
- Cap the number of LLM or tool calls (for example, at 25)
- Impose a token budget (like 1 million tokens)
- Set a time limit (for example, 10 minutes)
- Include human checkpoints
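A minimal sketch of how those guardrails might wrap the agent loop (the limits mirror the examples above; the token accounting is a simplifying assumption):

```python
import time

MAX_CALLS = 25                # cap on LLM or tool calls
MAX_TOKENS = 1_000_000        # token budget
MAX_SECONDS = 10 * 60         # wall-clock limit

def run_with_guardrails(states, ctx, start="plan"):
    calls, tokens_used, started = 0, 0, time.monotonic()
    state = start
    while state != "done":
        if calls >= MAX_CALLS or tokens_used >= MAX_TOKENS:
            raise RuntimeError("Budget exhausted; hand off to a human checkpoint")
        if time.monotonic() - started > MAX_SECONDS:
            raise RuntimeError("Time limit reached; hand off to a human checkpoint")
        state = states[state](ctx)
        calls += 1
        tokens_used += ctx.get("last_call_tokens", 0)  # assumes each state records its token usage
    return ctx
```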
From Agents to Agent Swarms#
When multiple agents coordinate, share memory, and delegate tasks, we get agent swarms.
| Single Agent | Agent Swarm |
|---|---|
| Linear logic | Concurrent workflows |
| One memory stream | Shared or modular memory |
| One goal | Multiple subgoals across agents |
| Few states | Many dynamic transitions and sub-agents |
You might have:
- A Research Agent
- A Planning Agent
- A Coding Agent
- A Review Agent
These could be directed by a Master Agent, as in the sketch below.
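As an illustration only (the sub-agent names mirror the list above, and each is reduced to a single function), a master agent can itself be a small state machine whose states delegate to sub-agents through shared memory:

```python
# Hypothetical sub-agents: in practice each would be a full agent with its own states.
def research_agent(memory): memory["research"] = "findings"
def planning_agent(memory): memory["plan"] = f"plan based on {memory['research']}"
def coding_agent(memory):   memory["code"] = f"code implementing {memory['plan']}"
def review_agent(memory):   memory["review"] = "approved"

def master_agent():
    shared_memory = {}  # shared memory visible to every agent in the swarm
    for sub_agent in (research_agent, planning_agent, coding_agent, review_agent):
        sub_agent(shared_memory)  # a real master agent would branch, retry, or loop here
    return shared_memory
```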
Understanding Memory in Agents#
Memory is how agents remember context to decide what to do next.
Types of Memory#
| Type | Description |
|---|---|
| Short-term | Stored in the current LLM prompt |
| Long-term | Indexed and retrieved as needed, often from vector stores |
| Scratchpad | Temporary summaries or notes used during transitions |
| Shared memory | Global state accessible to multiple agents |
Implementing Memory#
Memory can be maintained in several ways (a sketch of the first approach follows this list):
- Prompt engineering with structured history
- External stores like Redis or Pinecone
- MCP-based memory tools
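A small sketch of the first approach, keeping short-term memory as structured history and trimming the oldest entries to stay within a prompt budget (the four-characters-per-token estimate is a rough assumption):

```python
class ShortTermMemory:
    """Keeps recent messages as structured history for the next LLM prompt."""
    def __init__(self, max_tokens: int = 4000):
        self.messages: list[dict] = []
        self.max_tokens = max_tokens

    def add(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})
        # Rough estimate (~4 characters per token); drop the oldest messages when over budget.
        while sum(len(m["content"]) // 4 for m in self.messages) > self.max_tokens:
            self.messages.pop(0)

    def as_prompt(self) -> str:
        return "\n".join(f"{m['role']}: {m['content']}" for m in self.messages)

memory = ShortTermMemory()
memory.add("user", "Plan me a vacation to Italy.")
memory.add("assistant", "What are your travel dates and budget?")
print(memory.as_prompt())
```

Long-term and shared memory follow the same idea, with the history living in an external store instead of the prompt.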
Orchestration Is the Hard Part#
LLMs are powerful reasoning engines, but they don't do anything on their own. What turns them into functioning agents is orchestration.
Orchestration is the glue that binds prompts, logic, tools, memory, and decisions into a working system.
It's not just about chaining together API calls. It's about coordinating an evolving, partially deterministic process with tools that weren't designed to behave like functions, and doing so reliably.
What Does Orchestration Actually Involve?#
Think of orchestration as building the runtime brain of the agent. It must:
- Track state across steps
- Route data between prompts, tools, and memory
- Decide what to do next (or when to stop)
- Manage errors and retries
- Optimize for cost, latency, and quality
In practice, this means:
- Structuring workflows as state machines (with branching and looping logic)
- Calling LLMs at the right points with the right context
- Choosing and calling tools based on LLM output
- Persisting intermediate decisions and outputs
- Monitoring for task completion or failure
The moment you go beyond a single prompt-response, you're in orchestration territory.
Why Orchestration Is So Hard#
Here's why building orchestrated agents is orders of magnitude harder than calling an LLM directly:
**Unpredictable Outputs.** LLMs are non-deterministic. The same prompt can yield different outputs. This makes control flow fuzzy: unlike traditional code, you can't always rely on the same state transition.
**Fragile Prompt Logic.** Prompt engineering becomes brittle when prompts act as function calls. A small change in wording can cause an LLM to miss a tool call, misinterpret intent, or generate unexpected output.
**Limited Testing Infrastructure.** You can't write unit tests for every LLM call. Most orchestration frameworks rely on evaluations, heuristics, or spot checks, making regression testing and CI/CD workflows hard to implement.
**Complex Error Handling.** You must build logic for:
- LLM failure modes (nonsense, hallucinations, omissions)
- Tool call errors (timeouts, bad inputs)
- Loops that don't converge
- Ambiguous states
This often means adding layers of fallback logic or human-in-the-loop checkpoints, as in the sketch below.
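A minimal sketch of one such layer: retry a flaky tool call with backoff, then escalate to a human instead of looping forever (the exception types and limits are illustrative):

```python
import time

def call_with_retries(tool, args, max_attempts=3, backoff_seconds=1.0):
    """Retry a flaky tool call, then escalate rather than loop forever."""
    for attempt in range(1, max_attempts + 1):
        try:
            return tool(**args)
        except (TimeoutError, ValueError) as error:  # timeouts, bad inputs, ...
            if attempt == max_attempts:
                return escalate_to_human(tool, args, error)
            time.sleep(backoff_seconds * attempt)  # back off before retrying

def escalate_to_human(tool, args, error):
    # A human-in-the-loop checkpoint: in a real system this might open a ticket
    # or pause the workflow for review instead of raising.
    raise RuntimeError(f"{tool.__name__} failed after retries: {error}")
```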
Frameworks for Agents#
The ecosystem is rapidly maturing, and several frameworks now help abstract away some of the pain of orchestration.
Open-source
- CrewAI (Python): Focuses on team-based agent collaboration. Great for use cases with defined roles and communication patterns.
- Mastra (TypeScript): Built around state-machine-style workflows and browser-based interactions. Designed for TypeScript-heavy stacks.
Cloud Platforms
- AWS Agent Squad: A managed orchestration layer that integrates with Bedrock models, Lambda, and Step Functions.
- Cloudflare Agents SDK: Focused on edge-deployed agents running lightweight, low-latency workflows in serverless environments.
- GCP Vertex AI Agent Builder: Graph-based orchestration with tool and memory integration. Good for enterprise-grade pipelines.
- Azure AI Foundry Agent Service: Combines Azure OpenAI, function calling, and memory modules into a production-oriented orchestration framework.
Why Agents Resemble Organizations#
You may have heard agents compared to organizations. The comparison holds because agents mirror organizational structures:
- Each has a defined role
- Each uses resources and tools
- Each communicates and delegates
- Each follows defined protocols
Part of building agents is mapping your existing workflows: the roles, tools, and communication patterns of your organization. The agents you build will reflect that structure.
Complexity Spectrum#
The terms "agentic workflow," "agent," and "agent swarm" are widely used now, but they lack universally agreed-upon definitions.
However, somewhat of a practical consensus has emerged around their core characteristics and distinctions. My definition below is based on this consensus.
| Level | Complexity | Definition |
|---|---|---|
| Agentic Workflow | Simple | LLM-based workflow with mostly deterministic logic and few branches |
| Agent | Medium | Autonomous system with conditional logic and a defined goal |
| Agent Swarm | High | Multiple agents coordinating with an unclear endpoint |
A Concise Definition#
Here's the most concise definition I am able to come up with:
An agent is a non-deterministic state machine composed of chained function-states. Each function usually invokes an LLM or tool. Memory and context guide the agent toward a goal, until a finish condition is met.
Final Thoughts#
Agents aren't magical. They're structured, context-aware systems that combine reasoning with action. As a model for building with AI, they represent one of the most promising new approaches.
I hope this article helped clarify what agents are and how they work. My goal was to demystify them, not to diminish them: agents offer a significant opportunity to create new kinds of software.