Your thermostat is an AI agent. So is ChatGPT. The difference between them is not complexity -- it is architecture, and understanding that architecture is the single most important decision you will make when building intelligent systems.
In AI, an agent is any system that:
- Perceives its environment through sensors or inputs
- Processes that information using some form of intelligence
- Acts on the environment through actuators or outputs
- Aims to achieve specific goals or maximize outcomes
That definition covers everything from a $30 smart plug to a $30 million autonomous vehicle. The five types of AI agents, first formalized by Stuart Russell and Peter Norvig in their landmark 1995 textbook, represent a clear progression from "react to what you see" to "learn from everything you have ever done."
Pick the wrong type for your problem and no amount of engineering saves you. Pick the right one and the architecture does half the work.
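The perceive-process-act loop from the definition above can be sketched in a few lines. Everything here is illustrative (`Counter`, `observe`, `apply` are invented names, not a real framework):

```python
# Minimal perceive-process-act loop; names are illustrative, not a real API.
def run_agent(environment, decide, goal_reached, max_steps=100):
    """Drive any agent: perceive, decide, act, until the goal is met."""
    for _ in range(max_steps):
        perception = environment.observe()   # perceive via sensors/inputs
        if goal_reached(perception):         # check the goal
            return perception
        action = decide(perception)          # process with some intelligence
        environment.apply(action)            # act through actuators/outputs
    return environment.observe()

# Toy environment: a number the agent nudges toward a target.
class Counter:
    def __init__(self, value):
        self.value = value

    def observe(self):
        return self.value

    def apply(self, delta):
        self.value += delta

state = run_agent(Counter(0), decide=lambda v: 1, goal_reached=lambda v: v >= 5)
print(state)  # 5
```

Every agent type below is a different way of filling in the `decide` step.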
Type 1: Simple Reflex Agents
How They Work
Simple reflex agents are the most basic form of AI agent. They operate on one principle: if condition, then action. No history. No planning. No learning. Just stimulus and response, executed in microseconds.
Architecture
Perception → Condition-Action Rules → Action
The agent receives input, matches it against a set of rules, and executes the corresponding action. No memory, no reasoning -- just direct stimulus-response behavior.
Examples
Thermostat: If temperature drops below 68°F, turn on heat. If it rises above 72°F, turn off heat. Two rules. Zero intelligence. Billions deployed worldwide.
Basic Spam Filter: If email contains "lottery winner", mark as spam. Gmail's earliest spam filter in 2004 was essentially a simple reflex agent with keyword rules. It worked until spammers learned to misspell.
Simple Chatbot Rules: If user says "hello", respond "Hi there!" Rule-based chatbots without NLP are simple reflex agents -- and 73% of chatbots deployed before 2020 worked exactly this way (Drift).
Code Example
```python
class SimpleReflexAgent:
    def __init__(self, rules):
        self.rules = rules  # Dictionary mapping condition (callable) to action

    def perceive(self, environment):
        return environment.get_current_state()

    def act(self, perception):
        for condition, action in self.rules.items():
            if condition(perception):
                return action
        return None  # No matching rule
```
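For concreteness, here is the thermostat from earlier expressed with this class (the class is repeated so the snippet runs standalone; the 68°F/72°F thresholds come from the example above):

```python
class SimpleReflexAgent:
    """Repeated from above so this snippet runs standalone."""
    def __init__(self, rules):
        self.rules = rules  # dict: condition (callable) -> action

    def act(self, perception):
        for condition, action in self.rules.items():
            if condition(perception):
                return action
        return None  # no matching rule

# The thermostat from the example: two rules, zero intelligence.
thermostat = SimpleReflexAgent({
    (lambda t: t < 68): "heat_on",
    (lambda t: t > 72): "heat_off",
})

print(thermostat.act(65))  # heat_on
print(thermostat.act(75))  # heat_off
print(thermostat.act(70))  # None (inside the dead band, do nothing)
```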
Limitations
- Can't handle partially observable environments
- No memory of past states
- Fail with novel situations not covered by rules
- Rules must be hand-coded for every scenario
When to Use
Simple reflex agents work well when:
- The environment is fully observable
- The correct action depends only on the current perception
- The rule set is small and known in advance
Type 2: Model-Based Reflex Agents
How They Work
Simple reflex agents are blind to anything they cannot see right now. Model-based reflex agents fix this by maintaining an internal model of the world -- a mental map that tracks aspects of the environment even when they are not directly observable.
Architecture
Perception → World Model → Condition-Action Rules → Action
    │             ↑
    └─────────────┘  (model updated by perception)
The agent maintains state information and updates its world model based on:
- Current perception
- Knowledge of how the world evolves
- Knowledge of how its own actions affect the world
Examples
Robot Vacuum: iRobot's Roomba j7+ maintains a map of your entire home, tracking which rooms have been cleaned and which have not. It cannot see the kitchen from the bedroom. It does not need to -- the model remembers.
Traffic Light Controller: Modern adaptive traffic systems like SCATS (used in 40,000+ intersections across 27 countries) track traffic density over time, maintaining a model of expected flow patterns even when individual sensors go offline.
Inventory Management System: Amazon's warehouse system tracks 350+ million products across fulfillment centers, maintaining a model of current stock levels, expected deliveries, and consumption patterns without physically scanning every item continuously.
Code Example
```python
class ModelBasedAgent:
    def __init__(self, rules, transition_model):
        self.rules = rules
        self.transition_model = transition_model
        self.world_model = {}    # Internal state
        self.last_action = None  # Needed before the first action is chosen

    def update_model(self, action, perception):
        # Update world model based on expected effects and actual perception
        expected_state = self.transition_model.predict(self.world_model, action)
        self.world_model = self.reconcile(expected_state, perception)

    def act(self, perception):
        self.update_model(self.last_action, perception)
        for condition, action in self.rules.items():
            if condition(self.world_model):
                self.last_action = action
                return action
        return None
```
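To make the world-model idea concrete, here is a stripped-down, runnable sketch (all names invented for illustration) of an agent that remembers which rooms are clean even though it can only observe the room it is currently in, much like the robot vacuum above:

```python
class RoomTracker:
    """Toy model-based agent: remembers cleaned rooms it can no longer see."""
    def __init__(self, rooms):
        self.world_model = {room: "unknown" for room in rooms}

    def update_model(self, room, observed_state):
        # Reconcile the model with the one room we can currently observe.
        self.world_model[room] = observed_state

    def next_target(self):
        # Decide using the model, not current perception alone.
        for room, state in self.world_model.items():
            if state != "clean":
                return room
        return None  # everything clean

vac = RoomTracker(["kitchen", "bedroom", "hall"])
vac.update_model("kitchen", "clean")
vac.update_model("bedroom", "clean")
print(vac.next_target())  # hall -- never directly observed, still remembered
```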
Advantages Over Simple Reflex
- Can handle partially observable environments
- Maintains context over time
- Can reason about things not currently visible
Limitations
- Still uses fixed rules for action selection
- Cannot plan ahead
- Rules must still be pre-programmed
- Model accuracy limits effectiveness
When to Use
Model-based reflex agents work well when:
- The environment is only partially observable
- Past perceptions matter for choosing the current action
- The world's dynamics can be modeled reliably
Type 3: Goal-Based Agents
How They Work
This is where agents start to think. Goal-based agents add explicit goals to the architecture, shifting from "what do I see?" to "what do I want to achieve?" They do not just react. They plan multi-step sequences to reach a desired state.
Architecture
Perception → World Model → Goal → Planning/Search → Action
The agent:
- Maintains a model of the current world state
- Has an explicit representation of goal states
- Searches for action sequences that lead from the current state to the goal
- Executes actions from the plan
Examples
GPS Navigation: Google Maps evaluates 500+ million routes per day. The goal is the destination. The agent models the road network, searches for routes that achieve the goal, and delivers turn-by-turn directions. Change the destination, change the behavior. The planning engine stays the same.
Game-Playing AI: IBM's Deep Blue searched 200 million chess positions per second in 1997. The goal was checkmate. Every calculation served that single objective.
Task Automation: "Download all PDFs from this website" is a goal. The agent plans steps to navigate pages, identify links, and download files -- adapting when pages 404 or structures change.
Code Example
```python
class GoalBasedAgent:
    def __init__(self, transition_model, goal_test):
        self.transition_model = transition_model
        self.goal_test = goal_test
        self.world_model = {}
        self.plan = []

    def search(self, start_state, goal_test):
        # Breadth-first search; swap in A* or another algorithm as needed
        frontier = [(start_state, [])]
        explored = set()
        while frontier:
            state, actions = frontier.pop(0)
            if goal_test(state):
                return actions
            explored.add(state)
            for action in self.get_actions(state):
                next_state = self.transition_model.predict(state, action)
                if next_state not in explored:
                    frontier.append((next_state, actions + [action]))
        return None

    def act(self, perception):
        self.update_model(perception)
        if not self.plan or self.needs_replan():
            self.plan = self.search(self.world_model, self.goal_test)
        if self.plan:
            return self.plan.pop(0)
        return None
```
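The `search` method above leans on undefined helpers (`get_actions`, `update_model`). The same breadth-first idea, fully runnable on a toy road network (node and action names are made up for illustration):

```python
from collections import deque

def bfs_plan(start, goal, neighbors):
    """Breadth-first search: returns the action sequence reaching the goal."""
    frontier = deque([(start, [])])
    explored = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for action, next_state in neighbors(state):
            if next_state not in explored:
                explored.add(next_state)
                frontier.append((next_state, actions + [action]))
    return None  # goal unreachable

# Toy road network: each edge is (action, destination).
roads = {
    "home":     [("take_A", "junction")],
    "junction": [("take_B", "mall"), ("take_C", "office")],
    "mall":     [],
    "office":   [],
}
plan = bfs_plan("home", "office", lambda s: roads[s])
print(plan)  # ['take_A', 'take_C']
```

Change the goal from `"office"` to `"mall"` and the plan changes with it; the planning engine stays the same, exactly as in the GPS example.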
Advantages Over Model-Based Reflex
- Can plan multi-step sequences to achieve objectives
- More flexible -- change the goal, change the behavior
- Can handle novel situations by searching for solutions
Limitations
- Doesn't consider "how good" different goal-achieving paths are
- Binary goal satisfaction (achieved or not)
- May find suboptimal solutions
- Search can be computationally expensive
When to Use
Goal-based agents work well when:
- There is a clear goal but the path to it is complex
- Multi-step planning is required
- Goals change often and behavior must follow
Type 4: Utility-Based Agents
How They Work
Goal-based agents think in binary: achieved or not. Utility-based agents think in gradients. They ask "how good is this outcome?" and assign every possible state a score. A utility function maps states to real numbers representing desirability, enabling the agent to make tradeoffs that goal-based agents cannot.
Architecture
Perception → World Model → Utility Function → Decision Theory → Action
The agent:
- Models the current state and possible future states
- Evaluates each possible outcome using the utility function
- Chooses actions that maximize expected utility
- Handles uncertainty through probability
Examples
Investment Algorithm: Renaissance Technologies' Medallion Fund averaged 66% annual returns from 1988 to 2018. It does not just seek profit (goal). It maximizes risk-adjusted returns (utility), weighing thousands of potential outcomes by probability and desirability simultaneously.
Recommendation System: Netflix's recommendation engine drives 80% of content watched on the platform. It does not just recommend shows users might like (goal). It optimizes for long-term engagement, diversity, and satisfaction across a utility landscape with millions of dimensions.
Autonomous Vehicle: Waymo's self-driving system does not just get from A to B (goal). It minimizes a weighted combination of travel time, energy consumption, safety risk, and passenger comfort -- making 100+ tradeoff decisions per second.
Code Example
```python
class UtilityBasedAgent:
    def __init__(self, transition_model, utility_function):
        self.transition_model = transition_model
        self.utility_function = utility_function
        self.world_model = {}

    def expected_utility(self, state, action):
        outcomes = self.transition_model.get_outcomes(state, action)
        total_utility = 0
        for outcome_state, probability in outcomes:
            total_utility += probability * self.utility_function(outcome_state)
        return total_utility

    def act(self, perception):
        self.update_model(perception)
        best_action = None
        best_utility = float('-inf')
        for action in self.get_actions():
            utility = self.expected_utility(self.world_model, action)
            if utility > best_utility:
                best_utility = utility
                best_action = action
        return best_action
```
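A tiny worked example of the expected-utility calculation, in the spirit of the navigation tradeoff above (the outcome probabilities and utility scores are invented for illustration):

```python
def expected_utility(outcomes, utility):
    """Sum of probability * utility over possible outcome states."""
    return sum(p * utility(state) for state, p in outcomes)

# Invented utility scores for three possible outcomes.
utility = {"arrive_early": 10, "arrive_on_time": 8, "arrive_late": -5}.get

# Highway: usually fast, sometimes jammed. Back roads: reliably on time.
highway    = [("arrive_early", 0.7), ("arrive_late", 0.3)]
back_roads = [("arrive_on_time", 1.0)]

print(expected_utility(highway, utility))     # 0.7*10 + 0.3*(-5) = 5.5
print(expected_utility(back_roads, utility))  # 1.0*8 = 8.0
```

A goal-based agent sees both routes as "goal achieved" and cannot distinguish them; the utility-based agent picks the back roads despite the highway's better best case.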
Advantages Over Goal-Based
- Can choose among multiple goals
- Handles tradeoffs explicitly
- Deals with uncertainty through expected utility
- Finds optimal solutions, not just satisfactory ones
Limitations
- Utility function must be defined (difficult for complex domains)
- Computationally expensive for large state spaces
- Doesn't learn utility functions from experience
- May be sensitive to utility function design
When to Use
Utility-based agents work well when:
- Multiple objectives must be traded off against each other
- Outcomes are uncertain and must be weighed by probability
- "Good enough" is not enough -- you need the best available outcome
Type 5: Learning Agents
How They Work
This is the type that changed everything. Learning agents improve their performance over time based on experience. They have all the capabilities of previous agent types, plus something none of them possess: the ability to rewrite their own decision-making based on feedback. Every mistake makes them better.
Architecture
Perception ─┬─→ Performance Element ──→ Action
            │          ▲         ▲
            │  changes │         │ new problems
            ▼          │         │
          Critic ──→ Learning Element ──→ Problem Generator
        (feedback)            (learning goals)
A learning agent has four components:
- Performance element: selects actions, playing the role of the agent types above
- Critic: compares outcomes against a performance standard and produces feedback
- Learning element: uses that feedback to improve the performance element
- Problem generator: suggests exploratory actions that create new learning opportunities
Examples
Modern LLM Assistants: Claude, GPT-4, and Gemini learn from human feedback (RLHF) across millions of conversations. OpenAI reported that RLHF improved GPT-4's factual accuracy by 40% compared to the base model.
AlphaGo: DeepMind's AlphaGo played 4.9 million games against itself before defeating world champion Lee Sedol in March 2016. Its move 37 in game two was so unexpected that commentators initially called it a mistake. It was a stroke of genius the system had taught itself.
Personalized Recommendations: Spotify's Discover Weekly analyzes 5 billion playlists to learn your preferences, serving 600 million users with each experiencing a product shaped by their own behavior.
Adaptive Spam Filters: Gmail's spam filter learns from 1.8 billion users flagging emails daily, catching 99.9% of spam. Every "mark as spam" click trains the model for everyone.
Code Example
```python
import random

class LearningAgent:
    def __init__(self, initial_policy, learning_rate=0.1,
                 discount=0.9, exploration_rate=0.1):
        self.policy = initial_policy       # Maps states to action probabilities
        self.value_estimates = {}          # Estimated value of (state, action)
        self.learning_rate = learning_rate
        self.discount = discount
        self.exploration_rate = exploration_rate
        self.history = []

    def act(self, state):
        # Epsilon-greedy exploration
        if random.random() < self.exploration_rate:
            return self.explore(state)
        return self.policy.best_action(state)

    def learn(self, state, action, reward, next_state):
        # Q-learning update (simplified)
        old_value = self.value_estimates.get((state, action), 0)
        future_value = max(
            (self.value_estimates.get((next_state, a), 0)
             for a in self.get_actions(next_state)),
            default=0,
        )
        new_value = old_value + self.learning_rate * (
            reward + self.discount * future_value - old_value
        )
        self.value_estimates[(state, action)] = new_value
        self.policy.update(state, action, new_value)

    def feedback(self, reward):
        # Process reward signal and trigger learning
        state, action = self.history[-1]
        next_state = self.current_state
        self.learn(state, action, reward, next_state)
```
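The Q-learning update in `learn` can be watched converging on a deliberately minimal problem. The states, action, and reward here are invented purely to isolate the update rule:

```python
# Minimal tabular Q-learning: one state, one action, reward 1.0 at a terminal
# goal state. The estimate should converge to the true value, 1.0.
alpha, gamma = 0.5, 0.9
q = {}  # (state, action) -> estimated value

def learn(state, action, reward, next_state, next_actions):
    old = q.get((state, action), 0.0)
    future = max((q.get((next_state, a), 0.0) for a in next_actions),
                 default=0.0)  # terminal state: no future value
    q[(state, action)] = old + alpha * (reward + gamma * future - old)

# Repeated episodes: "forward" from "start" reaches "goal" with reward 1.0.
for _ in range(20):
    learn("start", "forward", 1.0, "goal", next_actions=[])

print(round(q[("start", "forward")], 3))  # converges toward 1.0
```

Each episode halves the remaining error (`alpha = 0.5`), which is the "every mistake makes them better" mechanism in miniature.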
Advantages Over Utility-Based
- Adapts to changing environments
- Can discover utility functions from experience
- Improves with exposure to more data
- Handles environments too complex to model explicitly
Limitations
- Requires substantial training data or experience
- May learn incorrect patterns (distribution shift)
- Learning process can be unstable
- Exploration vs. exploitation tradeoff
- Computational cost of training
When to Use
Learning agents work well when:
- The optimal behavior is too complex to specify by hand
- The environment changes over time
- Substantial training data or feedback is available
Comparison Table
| Type | Memory | Goals | Optimization | Learning | Example Use Case |
|---|---|---|---|---|---|
| Simple Reflex | None | None | None | None | Thermostat |
| Model-Based | World model | None | None | None | Robot vacuum |
| Goal-Based | World model | Explicit | Goal satisfaction | None | GPS navigation |
| Utility-Based | World model | Implicit in utility | Maximize utility | None | Investment bot |
| Learning | World model | Can learn | Can learn | Yes | LLM assistant |
Today's LLM-powered agents are typically combinations of these types:
Claude, GPT-4, and similar models are primarily learning agents (trained on trillions of tokens) combined with goal-based behavior (following instructions) and model-based reasoning (maintaining conversation context). They are hybrid architectures, not a single type.
Agentic systems built on LLMs layer additional capabilities on top of that core:
- Retrieval (RAG) and memory, for model-based state
- Explicit goal tracking and instruction following, for goal-based planning
- Feedback loops and task-specific metrics, for utility-based optimization
The five types are not just textbook theory. They are the architectural DNA of every intelligent system shipping today.
Practical Implementation Considerations
When building AI agents, the architecture decision matters more than the implementation details. Get the type right first. Here is a practical framework:
Start Simple
Begin with the simplest agent type that could work. Over-engineering kills more agent projects than under-engineering. Only add complexity when simple reflex rules provably fail.
Match Type to Problem
- Fully observable, simple domain → Simple reflex
- Partial observability → Model-based
- Clear goal, complex path → Goal-based
- Multiple objectives, tradeoffs → Utility-based
- Complex/unknown optimal behavior → Learning
Hybrid Approaches
Real production systems almost always combine types. Tesla's Autopilot uses model-based perception, goal-based planning, and learning-based refinement in a single pipeline. Do not be afraid to mix.
Testing and Evaluation
- Reflex agents: Test rule coverage (aim for 100% of known conditions)
- Model-based: Test model accuracy against ground truth
- Goal-based: Test goal achievement rate across 1,000+ scenarios
- Utility-based: Test utility achieved vs. theoretical optimum
- Learning: Test improvement curves over time
How Clarvia Builds AI Agents
At Clarvia, we build AI agents that combine these theoretical foundations with practical engineering. Our agents typically:
- Use LLM capabilities as a foundation (learning agents)
- Maintain explicit goals and context (goal-based)
- Model the environment through RAG and memory (model-based)
- Optimize for client-specific metrics (utility-based)
This combination delivers agents that are sophisticated yet practical. See our approach in action: Why We Build Products Using Only AI.
The Bottom Line
The five agent types -- simple reflex, model-based, goal-based, utility-based, and learning -- are not academic categories. They are engineering blueprints. Every intelligent system you have ever used maps to one of these architectures, or a combination of them.
The progression is clear: more sophisticated agents handle more complex problems, but at higher computational and design cost. The smartest engineering decision is not always the most advanced agent type. It is the simplest one that solves your problem.
Architecture is destiny. Choose wisely.
Building intelligent agents for your application? Contact Clarvia to discuss how we can help design and implement the right AI agent architecture for your needs.
