AI Development

What Are the 5 Types of AI Agents? A Developer's Guide

Clarvia Team
Oct 30, 2025
9 min read

Your thermostat is an AI agent. So is ChatGPT. The difference between them is not complexity -- it is architecture, and understanding that architecture is the single most important decision you will make when building intelligent systems.

In AI, an agent is any system that:

  1. Perceives its environment through sensors or inputs
  2. Processes that information using some form of intelligence
  3. Acts on the environment through actuators or outputs
  4. Aims to achieve specific goals or maximize outcomes

That definition covers everything from a $30 smart plug to a $30 million autonomous vehicle. The five types of AI agents, first formalized by Stuart Russell and Peter Norvig in their landmark 1995 textbook Artificial Intelligence: A Modern Approach, represent a clear progression from "react to what you see" to "learn from everything you have ever done."

Pick the wrong type for your problem and no amount of engineering saves you. Pick the right one and the architecture does half the work.

Type 1: Simple Reflex Agents

How They Work

Simple reflex agents are the most basic form of AI agent. They operate on one principle: if condition, then action. No history. No planning. No learning. Just stimulus and response, executed in microseconds.

Architecture

Perception → Condition-Action Rules → Action

The agent receives input, matches it against a set of rules, and executes the corresponding action. No memory, no reasoning -- just direct stimulus-response behavior.

Examples

Thermostat: If temperature drops below 68°F, turn on heat. If it rises above 72°F, turn off heat. Two rules. Zero intelligence. Billions deployed worldwide.

Basic Spam Filter: If email contains "lottery winner", mark as spam. Gmail's earliest spam filter in 2004 was essentially a simple reflex agent with keyword rules. It worked until spammers learned to misspell.

Simple Chatbot Rules: If user says "hello", respond "Hi there!" Rule-based chatbots without NLP are simple reflex agents -- and 73% of chatbots deployed before 2020 worked exactly this way (Drift).

Code Example

class SimpleReflexAgent:
    def __init__(self, rules):
        self.rules = rules  # Dictionary of condition: action

    def perceive(self, environment):
        return environment.get_current_state()

    def act(self, perception):
        for condition, action in self.rules.items():
            if condition(perception):
                return action
        return None  # No matching rule
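As a concrete usage sketch of the class above, here is the two-rule thermostat expressed as condition-action pairs. The thresholds are the 68°F/72°F values from the example; the list-of-pairs layout and the action names are illustrative, not from any real device.

```python
# Hypothetical two-rule thermostat: each rule is a (condition, action) pair
# checked in order against the current temperature reading.
rules = [
    (lambda temp: temp < 68, "heat_on"),
    (lambda temp: temp > 72, "heat_off"),
]

def act(perception):
    """Return the action of the first matching rule, or None."""
    for condition, action in rules:
        if condition(perception):
            return action
    return None  # no rule fired: leave the heater as it is

print(act(65))  # heat_on
print(act(75))  # heat_off
print(act(70))  # None
```

Note that the agent has no idea what it did a second ago: every call to `act` starts from scratch, which is exactly what makes simple reflex agents both fast and limited.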

Limitations

  • Can't handle partially observable environments
  • No memory of past states
  • Fail with novel situations not covered by rules
  • Rules must be hand-coded for every scenario

When to Use

Simple reflex agents work well when:

  • The environment is fully observable
  • All relevant conditions can be enumerated
  • Immediate response is more important than optimal response
  • The problem doesn't require learning or adaptation

Type 2: Model-Based Reflex Agents

    How They Work

    Simple reflex agents are blind to anything they cannot see right now. Model-based reflex agents fix this by maintaining an internal model of the world -- a mental map that tracks aspects of the environment even when they are not directly observable.

    Architecture

    Perception → World Model → Condition-Action Rules → Action
         ↑            |
         └────────────┘ (model updated by perception)
    

    The agent maintains state information and updates its world model based on:

    • Current perception
    • Knowledge of how the world evolves
    • Knowledge of how its own actions affect the world

    Examples

    Robot Vacuum: iRobot's Roomba j7+ maintains a map of your entire home, tracking which rooms have been cleaned and which have not. It cannot see the kitchen from the bedroom. It does not need to -- the model remembers.

    Traffic Light Controller: Modern adaptive traffic systems like SCATS (used in 40,000+ intersections across 27 countries) track traffic density over time, maintaining a model of expected flow patterns even when individual sensors go offline.

    Inventory Management System: Amazon's warehouse system tracks 350+ million products across fulfillment centers, maintaining a model of current stock levels, expected deliveries, and consumption patterns without physically scanning every item continuously.

    Code Example

    class ModelBasedAgent:
        def __init__(self, rules, transition_model):
            self.rules = rules
            self.transition_model = transition_model
            self.world_model = {}  # Internal state
            self.last_action = None

        def update_model(self, action, perception):
            # Update world model based on expected effects and actual perception
            expected_state = self.transition_model.predict(self.world_model, action)
            self.world_model = self.reconcile(expected_state, perception)

        def act(self, perception):
            self.update_model(self.last_action, perception)
            for condition, action in self.rules.items():
                if condition(self.world_model):
                    self.last_action = action
                    return action
            return None
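    To make the world-model idea concrete, here is a minimal sketch in the spirit of the robot-vacuum example: the agent perceives only the room it currently occupies, but the model remembers every room's last known status. The room names and the `VacuumModel` class are invented for illustration, not taken from any real Roomba API.

```python
# Illustrative sketch (not a real vacuum API): the agent perceives only the
# room it is in, but its world model tracks every room's last known status.
class VacuumModel:
    def __init__(self, rooms):
        # None = never seen, False = known dirty, True = known clean
        self.state = {room: None for room in rooms}

    def update(self, room, cleaned):
        self.state[room] = cleaned

    def next_target(self):
        # Prefer rooms known to be dirty, then rooms never observed
        for room, status in self.state.items():
            if status is False:
                return room
        for room, status in self.state.items():
            if status is None:
                return room
        return None  # everything is known clean

model = VacuumModel(["kitchen", "bedroom", "hall"])
model.update("kitchen", True)   # cleaned the kitchen just now
model.update("bedroom", False)  # passed through: bedroom is dirty
print(model.next_target())  # bedroom (remembered even when not visible)
```

    The agent cannot see the bedroom from the kitchen. It does not need to: the model answers the question perception cannot.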

    Advantages Over Simple Reflex

    • Can handle partially observable environments
    • Maintains context over time
    • Can reason about things not currently visible

    Limitations

    • Still uses fixed rules for action selection
    • Cannot plan ahead
    • Rules must still be pre-programmed
    • Model accuracy limits effectiveness

    When to Use

    Model-based reflex agents work well when:

  • The environment is partially observable
  • State tracking is important
  • The action space is still well-defined by rules
  • Real-time response is required

Type 3: Goal-Based Agents

    How They Work

    This is where agents start to think. Goal-based agents add explicit goals to the architecture, shifting from "what do I see?" to "what do I want to achieve?" They do not just react. They plan multi-step sequences to reach a desired state.

    Architecture

    Perception → World Model → Goal → Planning/Search → Action
    

    The agent:

    • Maintains a model of the current world state
    • Has an explicit representation of goal states
    • Searches for action sequences that lead from the current state to the goal
    • Executes actions from the plan

    Examples

    GPS Navigation: Google Maps evaluates 500+ million routes per day. The goal is the destination. The agent models the road network, searches for routes that achieve the goal, and delivers turn-by-turn directions. Change the destination, change the behavior. The planning engine stays the same.

    Game-Playing AI: IBM's Deep Blue searched 200 million chess positions per second in 1997. The goal was checkmate. Every calculation served that single objective.

    Task Automation: "Download all PDFs from this website" is a goal. The agent plans steps to navigate pages, identify links, and download files -- adapting when pages 404 or structures change.

    Code Example

    class GoalBasedAgent:
        def __init__(self, transition_model, goal_test):
            self.transition_model = transition_model
            self.goal_test = goal_test
            self.world_model = {}
            self.plan = []

        def search(self, start_state, goal_test):
            # BFS, A*, or another search algorithm
            frontier = [(start_state, [])]
            explored = set()
            while frontier:
                state, actions = frontier.pop(0)
                if goal_test(state):
                    return actions
                explored.add(state)
                for action in self.get_actions(state):
                    next_state = self.transition_model.predict(state, action)
                    if next_state not in explored:
                        frontier.append((next_state, actions + [action]))
            return None

        def act(self, perception):
            self.update_model(perception)
            if not self.plan or self.needs_replan():
                self.plan = self.search(self.world_model, self.goal_test)
            if self.plan:
                return self.plan.pop(0)
            return None
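    The search loop above is plain breadth-first search. As a self-contained usage sketch, here it is run over a toy road network; the node names and adjacency list are invented for illustration.

```python
from collections import deque

# Invented toy road network as an adjacency list; the "goal" is a target node.
graph = {
    "home": ["a", "b"],
    "a": ["b", "office"],
    "b": ["office"],
    "office": [],
}

def bfs_plan(start, goal):
    # Breadth-first search returns the shortest action sequence (node hops)
    frontier = deque([(start, [])])
    explored = {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for nxt in graph[state]:
            if nxt not in explored:
                explored.add(nxt)
                frontier.append((nxt, path + [nxt]))
    return None  # goal unreachable

print(bfs_plan("home", "office"))  # ['a', 'office']
```

    Change `goal` and the behavior changes; the planning engine stays the same, which is the whole point of goal-based design.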

    Advantages Over Model-Based Reflex

    • Can plan multi-step sequences to achieve objectives
    • More flexible -- change the goal, change the behavior
    • Can handle novel situations by searching for solutions

    Limitations

    • Doesn't consider "how good" different goal-achieving paths are
    • Binary goal satisfaction (achieved or not)
    • May find suboptimal solutions
    • Search can be computationally expensive

    When to Use

    Goal-based agents work well when:

  • Tasks have clear success criteria
  • Multi-step planning is required
  • The goal is more stable than the path to achieve it
  • Optimality is less important than goal achievement

Type 4: Utility-Based Agents

    How They Work

    Goal-based agents think in binary: achieved or not. Utility-based agents think in gradients. They ask "how good is this outcome?" and assign every possible state a score. A utility function maps states to real numbers representing desirability, enabling the agent to make tradeoffs that goal-based agents cannot.

    Architecture

    Perception → World Model → Utility Function → Decision Theory → Action
    

    The agent:

    • Models the current state and possible future states
    • Evaluates each possible outcome using the utility function
    • Chooses actions that maximize expected utility
    • Handles uncertainty through probability
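    In symbols, this is standard expected-utility maximization over the transition model's outcome distribution:

```latex
EU(a \mid s) = \sum_{s'} P(s' \mid s, a)\, U(s'),
\qquad
a^{*} = \arg\max_{a} EU(a \mid s)
```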

    Examples

    Investment Algorithm: Renaissance Technologies' Medallion Fund averaged 66% annual returns from 1988 to 2018. It does not just seek profit (goal). It maximizes risk-adjusted returns (utility), weighing thousands of potential outcomes by probability and desirability simultaneously.

    Recommendation System: Netflix's recommendation engine drives 80% of content watched on the platform. It does not just recommend shows users might like (goal). It optimizes for long-term engagement, diversity, and satisfaction across a utility landscape with millions of dimensions.

    Autonomous Vehicle: Waymo's self-driving system does not just get from A to B (goal). It minimizes a weighted combination of travel time, energy consumption, safety risk, and passenger comfort -- making 100+ tradeoff decisions per second.

    Code Example

    class UtilityBasedAgent:
        def __init__(self, transition_model, utility_function):
            self.transition_model = transition_model
            self.utility_function = utility_function
            self.world_model = {}

        def expected_utility(self, state, action):
            outcomes = self.transition_model.get_outcomes(state, action)
            total_utility = 0
            for outcome_state, probability in outcomes:
                total_utility += probability * self.utility_function(outcome_state)
            return total_utility

        def act(self, perception):
            self.update_model(perception)
            best_action = None
            best_utility = float('-inf')
            for action in self.get_actions():
                utility = self.expected_utility(self.world_model, action)
                if utility > best_utility:
                    best_utility = utility
                    best_action = action
            return best_action
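    As a runnable sketch of that loop, consider choosing between two routes with uncertain outcomes. The probabilities and utilities are invented for illustration.

```python
# Invented example: expected-utility choice between two routes.
def expected_utility(outcomes):
    # outcomes: list of (probability, utility) pairs for one action
    return sum(p * u for p, u in outcomes)

actions = {
    "highway": [(0.8, 10), (0.2, -20)],  # usually fast, occasionally jammed
    "backroads": [(1.0, 6)],             # slower but perfectly predictable
}

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # backroads: EU 6.0 beats the highway's 0.8*10 + 0.2*(-20) = 4.0
```

    A goal-based agent would see both routes as "reaches the destination" and have no basis to prefer one. The utility function is what encodes the tradeoff.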

    Advantages Over Goal-Based

    • Can choose among multiple goals
    • Handles tradeoffs explicitly
    • Deals with uncertainty through expected utility
    • Finds optimal solutions, not just satisfactory ones

    Limitations

    • Utility function must be defined (difficult for complex domains)
    • Computationally expensive for large state spaces
    • Doesn't learn utility functions from experience
    • May be sensitive to utility function design

    When to Use

    Utility-based agents work well when:

  • Tradeoffs between competing objectives exist
  • Outcomes are uncertain
  • Optimal (not just satisfactory) solutions are needed
  • The utility function can be well-specified

Type 5: Learning Agents

    How They Work

    This is the type that changed everything. Learning agents improve their performance over time based on experience. They have all the capabilities of previous agent types, plus something none of them possess: the ability to rewrite their own decision-making based on feedback. Every mistake makes them better.

    Architecture

    Perception → Critic ── feedback ──→ Learning Element
                                         │           │
                                 changes │           │ learning goals
                                         ↓           ↓
    Perception → Performance Element ←── Problem Generator
                        │                  (exploratory actions)
                        ↓
                      Action


    A learning agent has:

  • Performance element: Selects actions (like previous agent types)
  • Critic: Evaluates how well the agent is doing
  • Learning element: Modifies the performance element based on feedback
  • Problem generator: Suggests exploratory actions to learn more

    Examples

    Modern LLM Assistants: Claude, GPT-4, and Gemini learn from human feedback (RLHF) across millions of conversations. OpenAI reported that RLHF improved GPT-4's factual accuracy by 40% compared to the base model.

    AlphaGo: DeepMind's AlphaGo played 4.9 million games against itself before defeating world champion Lee Sedol in March 2016. Its move 37 in game two was so unexpected that commentators initially called it a mistake. It was genius that the system taught itself.

    Personalized Recommendations: Spotify's Discover Weekly analyzes 5 billion playlists to learn your preferences, serving 600 million users, each experiencing a product shaped by their own behavior.

    Adaptive Spam Filters: Gmail's spam filter learns from 1.8 billion users flagging emails daily, catching 99.9% of spam. Every "mark as spam" click trains the model for everyone.

    Code Example

    import random

    class LearningAgent:
        def __init__(self, initial_policy, learning_rate=0.1,
                     discount=0.9, exploration_rate=0.1):
            self.policy = initial_policy  # Maps states to action probabilities
            self.value_estimates = {}     # Estimated value of (state, action) pairs
            self.learning_rate = learning_rate
            self.discount = discount
            self.exploration_rate = exploration_rate
            self.history = []

        def act(self, state):
            # Epsilon-greedy exploration
            if random.random() < self.exploration_rate:
                return self.explore(state)
            return self.policy.best_action(state)

        def learn(self, state, action, reward, next_state):
            # Q-learning update (simplified)
            old_value = self.value_estimates.get((state, action), 0)
            future_value = max(
                self.value_estimates.get((next_state, a), 0)
                for a in self.get_actions(next_state)
            )
            new_value = old_value + self.learning_rate * (
                reward + self.discount * future_value - old_value
            )
            self.value_estimates[(state, action)] = new_value
            self.policy.update(state, action, new_value)

        def feedback(self, reward):
            # Process reward signal and trigger learning
            state, action = self.history[-1]
            next_state = self.current_state
            self.learn(state, action, reward, next_state)
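    To see the update rule actually converge, here is a self-contained toy Q-learning run on a three-cell corridor. The environment, reward, and constants are invented for illustration, and `random.seed` makes the run reproducible.

```python
import random

# Invented toy: Q-learning on a 3-cell corridor (states 0-2).
# Reaching cell 2 earns reward 1; each episode starts at cell 0.
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1
ACTIONS = (-1, +1)  # left, right
Q = {(s, a): 0.0 for s in range(3) for a in ACTIONS}

def step(state, action):
    nxt = min(max(state + action, 0), 2)    # walls clamp movement
    return nxt, (1.0 if nxt == 2 else 0.0)  # reward only at the goal cell

random.seed(0)
for _ in range(200):  # episodes
    s = 0
    while s != 2:
        if random.random() < EPSILON:  # occasionally explore
            a = random.choice(ACTIONS)
        else:                          # otherwise act greedily
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# After training, "move right" dominates "move left" in the start state
print(Q[(0, +1)] > Q[(0, -1)])  # True
```

    Nothing told the agent that "right" was correct. The policy emerged entirely from the reward signal, which is the defining property of this agent type.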

    Advantages Over Utility-Based

    • Adapts to changing environments
    • Can discover utility functions from experience
    • Improves with exposure to more data
    • Handles environments too complex to model explicitly

    Limitations

    • Requires substantial training data or experience
    • May learn incorrect patterns (distribution shift)
    • Learning process can be unstable
    • Exploration vs. exploitation tradeoff
    • Computational cost of training

    When to Use

    Learning agents work well when:

  • The optimal policy is unknown or complex
  • The environment provides feedback signals
  • Sufficient training data/experience is available
  • Adaptation to change is important

    Comparison Table

    Type          | Memory      | Goals               | Optimization      | Learning | Example Use Case
    --------------|-------------|---------------------|-------------------|----------|-----------------
    Simple Reflex | None        | None                | None              | None     | Thermostat
    Model-Based   | World model | None                | None              | None     | Robot vacuum
    Goal-Based    | World model | Explicit            | Goal satisfaction | None     | GPS navigation
    Utility-Based | World model | Implicit in utility | Maximize utility  | None     | Investment bot
    Learning      | World model | Can learn           | Can learn         | Yes      | LLM assistant
    Modern LLM-Powered Agents

    Today's LLM-powered agents are typically combinations of these types:

    Claude, GPT-4, and similar models are primarily learning agents (trained on trillions of tokens) combined with goal-based behavior (following instructions) and model-based reasoning (maintaining conversation context). They are hybrid architectures, not a single type.

    Agentic systems built on LLMs layer additional capabilities:

  • Explicit planning capabilities (goal-based)
  • Tool use and environment interaction (model-based)
  • Reward signals and feedback loops (utility-based + learning)
  • Memory systems (model-based)

    The five types are not just textbook theory. They are the architectural DNA of every intelligent system shipping today.

    Practical Implementation Considerations

    When building AI agents, the architecture decision matters more than the implementation details. Get the type right first. Here is a practical framework:

    Start Simple

    Begin with the simplest agent type that could work. Over-engineering kills more agent projects than under-engineering. Only add complexity when simple reflex rules provably fail.

    Match Type to Problem

    • Fully observable, simple domain → Simple reflex
    • Partial observability → Model-based
    • Clear goal, complex path → Goal-based
    • Multiple objectives, tradeoffs → Utility-based
    • Complex/unknown optimal behavior → Learning

    Hybrid Approaches

    Real production systems almost always combine types. Tesla's Autopilot uses model-based perception, goal-based planning, and learning-based refinement in a single pipeline. Do not be afraid to mix.

    Testing and Evaluation

    • Reflex agents: Test rule coverage (aim for 100% of known conditions)
    • Model-based: Test model accuracy against ground truth
    • Goal-based: Test goal achievement rate across 1,000+ scenarios
    • Utility-based: Test utility achieved vs. theoretical optimum
    • Learning: Test improvement curves over time

    How Clarvia Builds AI Agents

    At Clarvia, we build AI agents that combine these theoretical foundations with practical engineering. Our agents typically:

    • Use LLM capabilities as a foundation (learning agents)
    • Maintain explicit goals and context (goal-based)
    • Model the environment through RAG and memory (model-based)
    • Optimize for client-specific metrics (utility-based)

    This combination delivers agents that are sophisticated yet practical. See our approach in action: Why We Build Products Using Only AI.

    The Bottom Line

    The five agent types -- simple reflex, model-based, goal-based, utility-based, and learning -- are not academic categories. They are engineering blueprints. Every intelligent system you have ever used maps to one of these architectures, or a combination of them.

    The progression is clear: more sophisticated agents handle more complex problems, but at higher computational and design cost. The smartest engineering decision is not always the most advanced agent type. It is the simplest one that solves your problem.

    Architecture is destiny. Choose wisely.

    Building intelligent agents for your application? Contact Clarvia to discuss how we can help design and implement the right AI agent architecture for your needs.

