Summary

Game AI agent design is the study of how to make NPCs, enemies, companions, and other autonomous agents behave in ways that are readable, purposeful, and fun to play against. In practice, students usually arrive here through questions like “how do I make enemy AI?”, “should I use behaviour trees or GOAP?”, or “how smart should an NPC be?” The answer is that good game AI is not about maximum intelligence; it is about the right level of intelligence for player experience, production cost, and design clarity. Millington’s production-focused textbook and Yannakakis and Togelius’ field overview together show that the topic spans both classical NPC control and the broader AI-and-games research landscape. (Millington, Artificial Intelligence for Games, see source-artificial-intelligence-for-games; Yannakakis and Togelius, Artificial Intelligence and Games, see source-ai-and-games)

(Prof Charles, CRE341 Wk 5.1, see source-cre341-lectures)


What is AI?

Russell and Norvig group definitions of AI into four approaches:

Frame | Definition | Notes
Thinks like humans | Cognitive science approach; models tested against psychology | Rarely the game goal
Acts like humans | Turing Test (1950): observable behaviour matches human output | Common game goal
Thinks rationally | Aristotelian logical reasoning | Brittle in open environments
Acts rationally | Acts to achieve goals given current beliefs | Most applicable to game agents

Game AI primarily aims to act rationally, but its goal is to entertain the human player rather than simply to win. Rational behaviour in this sense includes deliberately sub-rational play.


Artificial stupidity

“Real people are stupid… sometimes! There can be much humour in a well-designed stupid-bot!” — Falstein, The 400 Project — Rules of Game Design

Game AI design rules (Falstein):

  1. Make the effects of the AI visible to the player — players should understand why the NPC did what it did
  2. Add a small amount of randomness to AI calculations — prevents deterministic exploitation; makes the AI feel alive
  3. Create AI in the mind of the player — the player’s perception of intelligence matters more than the actual algorithm

Design principle: opponents should be beatable. Unpredictability and minor mistakes make agents more entertaining than optimal opponents.
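
One way to read rule 2 and the "beatable" principle together is to add a small random offset to each candidate action's score before picking the best one. The sketch below is only an illustration: the function name `noisy_choice`, the `noise` parameter, and the example scores are assumptions, not something taken from Falstein or the lecture.

```python
import random

def noisy_choice(scores, noise=0.1, rng=random):
    """Return the best-scoring action after adding uniform noise to each score.

    scores: dict mapping action name -> score (hypothetical values).
    noise:  half-width of the random offset; 0.0 gives fully deterministic play.
    rng:    any object with a uniform(a, b) method (the random module by default).
    """
    jittered = {action: score + rng.uniform(-noise, noise)
                for action, score in scores.items()}
    return max(jittered, key=jittered.get)

# Hypothetical scores: "attack" is slightly better than "flank".
scores = {"attack": 0.9, "flank": 0.85, "reload": 0.2}
print(noisy_choice(scores, noise=0.0))  # always "attack"
print(noisy_choice(scores, noise=0.1))  # usually "attack", sometimes "flank"
```

Keeping the noise small means the agent only ever swaps between near-equal options, so its mistakes look plausible rather than random.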


Autonomous agents

“An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future.” (Franklin and Graesser, 1996)

Physical autonomous agent movement has three layers (each can be swapped independently):

Layer | Responsibility | Example
Action Selection | Choose goals; decide which plan to follow | “Attack nearest enemy”
Steering | Calculate desired trajectory to satisfy goal | Reynolds steering behaviours
Locomotion | Execute physical movement from A to B | NavMesh agent, Rigidbody physics

Separating these layers allows independent improvement or replacement of each component. See also steering-behaviours.
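
The three layers can be sketched as three independent functions that only communicate through their return values. This is a minimal 2D illustration under assumed names (`Agent`, `select_target`, `seek`, `locomote`); in Unity the locomotion step would normally be handled by a NavMeshAgent or Rigidbody rather than by hand.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    x: float
    y: float
    max_speed: float = 2.0

def select_target(agent, enemies):
    """Action selection: choose the nearest enemy position as the goal."""
    return min(enemies, key=lambda e: (e[0] - agent.x) ** 2 + (e[1] - agent.y) ** 2)

def seek(agent, target):
    """Steering: desired velocity pointing at the target at max speed."""
    dx, dy = target[0] - agent.x, target[1] - agent.y
    dist = (dx * dx + dy * dy) ** 0.5
    if dist == 0.0:
        return (0.0, 0.0)
    return (dx / dist * agent.max_speed, dy / dist * agent.max_speed)

def locomote(agent, velocity, dt):
    """Locomotion: integrate the velocity to actually move the agent."""
    agent.x += velocity[0] * dt
    agent.y += velocity[1] * dt

agent = Agent(0.0, 0.0)
target = select_target(agent, [(10.0, 0.0), (3.0, 4.0)])  # picks (3.0, 4.0)
locomote(agent, seek(agent, target), dt=0.5)
```

Because each function has a narrow interface, `seek` could be replaced by a different steering behaviour without touching target selection or movement.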


Agent architectures

Utility-based agents

Choose actions based on utility functions that evaluate expected benefit. Each possible action receives a score; the agent selects the highest-scoring action given its current state. Scales well to large action sets but requires careful tuning of utility functions.
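
As a rough sketch of the scoring loop described above, each action can be paired with a function that maps the current state to a number, and the agent takes the maximum. The state keys (`health`, `enemy_visible`) and the three scoring curves are invented for illustration; real utility curves need the tuning mentioned above.

```python
def choose_action(state, scorers):
    """Return the name of the action whose utility function scores highest."""
    return max(scorers, key=lambda name: scorers[name](state))

# Hypothetical utility functions; state values are assumed to lie in 0..1.
SCORERS = {
    "flee":   lambda s: 1.0 - s["health"],               # more urgent when hurt
    "attack": lambda s: s["health"] * s["enemy_visible"],
    "patrol": lambda s: 0.3,                             # constant fallback
}

print(choose_action({"health": 0.9, "enemy_visible": 1.0}, SCORERS))  # attack
print(choose_action({"health": 0.2, "enemy_visible": 1.0}, SCORERS))  # flee
print(choose_action({"health": 0.9, "enemy_visible": 0.0}, SCORERS))  # patrol
```

Adding an action means adding one entry to the table, which is why the approach scales to large action sets; the cost moves into keeping the scores comparable with each other.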

Maslow’s hierarchy applied to agents

Agent needs can be modelled hierarchically (analogous to Maslow’s human needs pyramid):

  • Base needs: survival, health, immediate safety
  • Mid needs: territory, resources, social behaviour
  • Higher needs: goal pursuit, self-improvement

Higher-level behaviours only activate when lower-level needs are satisfied, producing naturalistic prioritisation without complex branching logic.
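
A needs hierarchy of this kind can be sketched as an ordered list that is scanned from the base upward; the first unsatisfied need decides the behaviour. The need names, thresholds, and behaviour names below are assumptions chosen for illustration.

```python
# Ordered from base needs to higher needs: (need, is_satisfied, behaviour).
NEEDS = [
    ("survival",  lambda s: s["health"] > 0.3,   "seek_health"),
    ("resources", lambda s: s["ammo"] > 0,       "find_ammo"),
    ("goal",      lambda s: s["objective_done"], "pursue_objective"),
]

def pick_behaviour(state):
    """Return the behaviour for the lowest-level need that is not yet satisfied."""
    for _need, is_satisfied, behaviour in NEEDS:
        if not is_satisfied(state):
            return behaviour
    return "idle"  # every need is satisfied

print(pick_behaviour({"health": 0.1, "ammo": 0, "objective_done": False}))  # seek_health
print(pick_behaviour({"health": 0.9, "ammo": 5, "objective_done": False}))  # pursue_objective
```

The prioritisation lives entirely in the order of the list, so no branching logic is needed beyond the single loop.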


Behaviour Trees

See also behaviour-trees for a dedicated page.

A Behaviour Tree (BT) composes AI behaviours hierarchically using a tree structure of nodes. Execution flows from root to leaves; each node returns Success, Failure, or Running.

Node types

Node | Symbol | Behaviour
Selector | ? | Tries children left to right; returns Success on the first success, Failure only if every child fails (OR logic)
Sequence | → | Runs children left to right; returns Failure on the first failure, Success only if every child succeeds (AND logic)
Decorator | Diamond | Wraps a child; modifies its result (invert, repeat, delay, limit)
Leaf | Rectangle | Executes an action or checks a condition

Example structure

Root (Selector)
├── Flee Sequence (Sequence)
│   ├── Is health low? (Condition)
│   └── Move away from enemy (Action)
├── Attack Sequence (Sequence)
│   ├── Can see enemy? (Condition)
│   └── Attack enemy (Action)
└── Patrol (Action)

Priority is expressed by child order. The Selector tries Flee first, then Attack, then Patrol — health preservation takes precedence over combat.
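
The example tree can be expressed in a few small classes. This is a minimal sketch rather than any particular library's API: the class names, the `tick` method, and the dictionary used as a blackboard are all assumptions, and the actions complete instantly instead of returning Running over several frames.

```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

class Leaf:
    """Wraps a function that takes the blackboard and returns a Status."""
    def __init__(self, fn):
        self.fn = fn
    def tick(self, bb):
        return self.fn(bb)

class Sequence:
    """AND logic: stops at the first child that does not succeed."""
    def __init__(self, *children):
        self.children = children
    def tick(self, bb):
        for child in self.children:
            status = child.tick(bb)
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS

class Selector:
    """OR logic: stops at the first child that does not fail."""
    def __init__(self, *children):
        self.children = children
    def tick(self, bb):
        for child in self.children:
            status = child.tick(bb)
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE

def condition(key):
    return Leaf(lambda bb: Status.SUCCESS if bb[key] else Status.FAILURE)

def action(name):
    def run(bb):
        bb["last_action"] = name  # record what the agent did
        return Status.SUCCESS
    return Leaf(run)

root = Selector(
    Sequence(condition("health_low"), action("flee")),
    Sequence(condition("can_see_enemy"), action("attack")),
    action("patrol"),
)

bb = {"health_low": False, "can_see_enemy": True}
root.tick(bb)
print(bb["last_action"])  # attack
```

Reordering the Selector's children is all it takes to change the agent's priorities, which is the structural benefit the tree is meant to provide.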

Why BTs over state machines:

  • Reusable subtrees (share “attack” logic across different enemy types)
  • Priority expressed structurally, not via explicit transition rules
  • Easier to debug (clear execution path is traceable)
  • Better for reactive agents with many competing behaviours

Resources:

  • Chris Simpson’s Gamasutra guide on Behaviour Trees (canonical intro)
  • HTN (Hierarchical Task Networks): combines BT-style composition with planning (aiandgames.com)

Goal-Oriented Action Planning (GOAP)

See also goal-oriented-action-planning for a dedicated page.

GOAP is a deliberative planner architecture for game AI. Rather than a fixed behaviour tree, the agent dynamically constructs a plan — a sequence of actions — to reach a goal from the current world state.

Core components

Component | Description
World State | Current facts about the world (and agent’s inventory/status)
Goal | Desired world state the agent wants to reach
Actions | Available atomic operations; each has preconditions and effects
Planner | Searches for a valid action sequence that transitions from current state to goal state
Plan | Ordered sequence of actions output by the planner

Why GOAP over a fixed BT?

A BT encodes behaviours the designer anticipates. GOAP lets the agent compose novel plans from available actions — the agent can respond to situations the designer did not explicitly script, because any valid chain of actions leading to the goal is acceptable.

Classic example (F.E.A.R.): The enemy AI could kill a player with a gun, kick a table for cover, throw a grenade through a window, or sprint to a flanking position — all emergent from the same GOAP planner deciding the cheapest action sequence toward “enemy dead”.
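
To make the planner idea concrete, the sketch below represents the world state as a set of facts and searches forward for the cheapest action sequence that makes the goal facts true. It uses plain uniform-cost search for brevity, whereas production GOAP planners are usually described as A*-based. The action names, costs, and fact names are invented for illustration and are not taken from F.E.A.R.

```python
import heapq
from itertools import count
from typing import FrozenSet, NamedTuple

class Action(NamedTuple):
    name: str
    cost: float
    pre: FrozenSet[str]                  # facts that must hold beforehand
    add: FrozenSet[str]                  # facts made true by the action
    delete: FrozenSet[str] = frozenset() # facts made false by the action

def plan(start, goal, actions):
    """Return the cheapest list of action names reaching the goal, or None."""
    start, goal = frozenset(start), frozenset(goal)
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(0.0, next(tie), start, [])]
    best = {start: 0.0}
    while frontier:
        cost, _, state, path = heapq.heappop(frontier)
        if goal <= state:
            return path
        if cost > best.get(state, float("inf")):
            continue  # stale queue entry
        for a in actions:
            if a.pre <= state:
                nxt = (state - a.delete) | a.add
                new_cost = cost + a.cost
                if new_cost < best.get(nxt, float("inf")):
                    best[nxt] = new_cost
                    heapq.heappush(frontier, (new_cost, next(tie), nxt, path + [a.name]))
    return None

ACTIONS = [
    Action("pick_up_gun", 2, frozenset(), frozenset({"has_gun"})),
    Action("shoot", 1, frozenset({"has_gun", "enemy_visible"}), frozenset({"enemy_dead"})),
    Action("melee", 5, frozenset({"enemy_visible"}), frozenset({"enemy_dead"})),
]

print(plan({"enemy_visible"}, {"enemy_dead"}, ACTIONS))  # ['pick_up_gun', 'shoot']
```

Nobody scripted "pick up a gun, then shoot": that sequence falls out of the costs, and changing the cost of `melee` to 1 would change the plan without touching the planner.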

Trade-offs

Aspect | Behaviour Tree | GOAP
Expressiveness | Reactive priorities, hierarchical | Deliberative, plan-based
Complexity | Moderate | Higher (planning search cost)
Designer control | High (explicit tree) | Lower (emergent plans)
Surprising behaviours | Limited | Can surprise designers and players
Best for | Reactive, time-critical agents | Complex multi-step goal achievement

Multiple plans can also be sequenced (e.g. “get weapon” then “kill enemy”) for longer-horizon behaviour.


Hierarchical Task Network (HTN) planning

HTN planning is a structured alternative to GOAP. Rather than searching an unconstrained action space for a path to the goal, HTN decomposes tasks top-down through a hierarchy of methods — conditional decomposition rules authored by the designer.

Structure

Term | Definition
Task | An abstract or primitive unit of work (e.g. behave, combat-behaviour, shoot-at-enemy)
Compound task | A task that decomposes into subtasks via methods
Primitive action | A leaf task with no further decomposition; directly executable
Method | A rule: if preconditions hold, decompose this compound task into [list of subtasks]
Domain | The complete set of tasks and methods for a character type

Execution

The planner attempts methods in declared priority order. The first method whose preconditions are satisfied is applied, producing subtasks that are in turn decomposed until all tasks are primitive. Execution proceeds through the resulting flat plan.

Task: behave
  Method 1 [precondition: health < 20%] → flee-to-safety
  Method 2 [precondition: isMedic AND allyNeedsHealing] → heal-ally
  Method 3 [precondition: hasSquadOrder] → execute-squad-order
  Method 4 [precondition: canSeeEnemy] → combat-behaviour
  Method 5 [precondition: always] → idle-patrol

Priority is unambiguous — Method 1 always beats Method 2, regardless of relative utility scores. This is the key advantage of HTN over GOAP or utility AI for hierarchical squad bots: designers can predict and guarantee plan priority (Killzone 3 team, Chapter 4, see source-game-ai-pro-360-tactics-strategy).
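
The ordered-method decomposition can be sketched as a recursive function over a domain table. This is a simplified illustration: it checks preconditions against a fixed state and does no backtracking or simulation of action effects, which a full HTN planner would do. The `behave` methods mirror the listing above; the `combat-behaviour` methods and the `hasAmmo` key are hypothetical additions to show a second level of decomposition.

```python
# Domain: compound task -> ordered list of (precondition, subtasks).
DOMAIN = {
    "behave": [
        (lambda s: s["health"] < 0.20, ["flee-to-safety"]),
        (lambda s: s["isMedic"] and s["allyNeedsHealing"], ["heal-ally"]),
        (lambda s: s["hasSquadOrder"], ["execute-squad-order"]),
        (lambda s: s["canSeeEnemy"], ["combat-behaviour"]),
        (lambda s: True, ["idle-patrol"]),
    ],
    # Hypothetical second level, not taken from the source.
    "combat-behaviour": [
        (lambda s: s["hasAmmo"], ["aim", "shoot-at-enemy"]),
        (lambda s: True, ["reload", "aim", "shoot-at-enemy"]),
    ],
}

def decompose(task, state, domain=DOMAIN):
    """Expand a task into a flat list of primitive actions."""
    if task not in domain:
        return [task]  # primitive: directly executable
    for precondition, subtasks in domain[task]:
        if precondition(state):  # first applicable method wins
            flat = []
            for sub in subtasks:
                flat.extend(decompose(sub, state, domain))
            return flat
    raise ValueError(f"no applicable method for task {task!r}")

state = {"health": 1.0, "isMedic": False, "allyNeedsHealing": False,
         "hasSquadOrder": False, "canSeeEnemy": True, "hasAmmo": False}
print(decompose("behave", state))  # ['reload', 'aim', 'shoot-at-enemy']
```

Because the methods are tried strictly in list order, a designer can read the priority straight off the domain table.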

Replanning

The HTN planner does not replan every frame. Replanning is triggered by:

  • A plan step completing (needs next step).
  • A continuation condition failing — a check run every tick that validates the current plan is still applicable. If false, the agent replans immediately.
  • A new external order arriving (e.g. squad commander sends a new objective).
  • A fixed-rate background check (safety net for edge cases).

Bounding replanning to events rather than running it every frame keeps computational cost predictable.
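
The four triggers can be gathered into one small wrapper around whatever planner is in use. The sketch below is an assumption-laden illustration: `PlanRunner`, `make_plan`, `plan_still_valid`, and `recheck_interval` are hypothetical names, and a completed step is treated as "advance the plan, and replan once it runs out", which is one possible reading of the first trigger.

```python
class PlanRunner:
    """Holds the current plan and replans only when a trigger fires."""

    def __init__(self, make_plan, plan_still_valid, recheck_interval=2.0):
        self.make_plan = make_plan                # state -> list of steps
        self.plan_still_valid = plan_still_valid  # (plan, state) -> bool
        self.recheck_interval = recheck_interval  # seconds between safety-net replans
        self.plan = []
        self.timer = 0.0
        self.replans = 0

    def tick(self, state, dt, step_done=False, new_order=False):
        """Advance one tick and return the step to execute (or None)."""
        self.timer += dt
        if step_done and self.plan:
            self.plan.pop(0)                      # move on to the next step
        needs_replan = (
            not self.plan                                   # plan exhausted
            or new_order                                    # external order arrived
            or not self.plan_still_valid(self.plan, state)  # continuation check failed
            or self.timer >= self.recheck_interval          # fixed-rate safety net
        )
        if needs_replan:
            self.plan = list(self.make_plan(state))
            self.timer = 0.0
            self.replans += 1
        return self.plan[0] if self.plan else None

# Hypothetical planner and continuation condition for a quick trial.
runner = PlanRunner(
    make_plan=lambda s: ["flee"] if s["health"] < 0.2 else ["aim", "shoot"],
    plan_still_valid=lambda plan, s: (plan[0] == "flee") == (s["health"] < 0.2),
)
print(runner.tick({"health": 1.0}, dt=0.1))  # aim (first plan is built)
print(runner.tick({"health": 0.1}, dt=0.1))  # flee (continuation check failed)
```

The continuation check still runs every tick, but it is a cheap predicate; the expensive planning call happens only when one of the conditions is met.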

HTN vs. GOAP

Aspect | HTN | GOAP
Priority control | Explicit: method order | Implicit: action costs
Designer legibility | High: the domain reads like a flowchart | Lower: emergent from cost functions
Surprising behaviours | Rare (predictable decomposition) | Common (novel plans)
Best for | Hierarchical bots with clear priority ordering | Multi-step goal pursuit in open environments

See squad-ai-patterns for a full production example of HTN at multiple hierarchy levels (individual bot, squad, commander).


Intelligent interactive storytelling

One of the core challenges of game AI is making NPCs behave consistently with narrative context in real time. Research and commercial projects are now combining game engines with large language models (LLMs) to produce dynamic narrative agents.

The following architecture (from a CRE341 research/consultancy project) illustrates one approach — a live-action theatre system where both human and AI actors participate in a shared interactive story:

IoT Devices ──────┐
Human Actors ─────┤──→ Interface ──→ Story Elements ──→
Human Director ───┘                                    ├──→ Unity App ──→ AI LLM Avatar
                                                       └──→ Networked Environs Manager

                  ChatGPT / Local LLM ──→ Story Sync

Components:

Component | Role
IoT Devices + Human Actors + Director | Real-world inputs providing scene context
Story Elements (from file) | Authored scaffolding: characters, goals, narrative constraints
Unity App | Manages virtual environment and scene state
AI LLM Avatar | NPC driven by ChatGPT or local LLM; responds dynamically to scene state
Networked Environs Manager | Coordinates multiple Unity instances and actors
Story Sync | Keeps AI dialogue consistent with current narrative state

The LLM generates contextually appropriate dialogue within the authored story constraints — neither fully scripted nor fully emergent. For the narrative design context, see narrative-design and generative-ai-game-dev.


Skills and behaviours split

In complex commercial AI systems, the action selection layer is often subdivided further. The Last of Us (Naughty Dog) uses a two-tier structure described by Botta (Game AI Pro 360, Ch. 1, see source-game-ai-pro-360-character-behavior):

  • Skills: High-level behavioural modes managed by a prioritised FSM. A skill decides what the NPC is trying to do — pursue the player, investigate a sound, patrol a route. Each skill is a discrete, named mode; only one skill is active at a time, and higher-priority skills interrupt lower-priority ones.
  • Behaviours: Modular, reusable implementations of individual capabilities — move to a location, play an animation, play a sound. Behaviours implement how the active skill achieves its goal. Multiple behaviours can run concurrently; they compose freely.

This split means skill code stays clean (decision logic only) and behaviours stay reusable across character types. A “Throw Brick” behaviour is shared by Ellie and enemy Infected without duplication.
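
The split can be sketched as a prioritised skill table in which each skill names the behaviours it uses. This is only an illustration of the pattern, not Naughty Dog's code: the `Skill` fields, the priorities, and the state keys are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

@dataclass
class Skill:
    name: str
    priority: int                     # higher interrupts lower
    is_valid: Callable[[Dict], bool]  # decision logic only
    behaviours: List[str]             # reusable capabilities it activates

def select_skill(skills: List[Skill], state: Dict) -> Optional[Skill]:
    """Return the highest-priority skill whose condition currently holds."""
    valid = [s for s in skills if s.is_valid(state)]
    return max(valid, key=lambda s: s.priority) if valid else None

SKILLS = [
    Skill("patrol", 0, lambda s: True, ["move_to"]),
    Skill("investigate", 1, lambda s: s["heard_sound"], ["move_to", "play_animation"]),
    Skill("pursue", 2, lambda s: s["sees_player"], ["move_to", "play_sound"]),
]

active = select_skill(SKILLS, {"heard_sound": True, "sees_player": False})
print(active.name, active.behaviours)  # investigate ['move_to', 'play_animation']
```

Note that `move_to` appears under every skill: the behaviour is written once and reused, while each skill only decides when it applies.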

Data-driven character types

An extension of this principle: all character types share a single AI class, differentiated entirely by data files. In The Last of Us, every Infected type (Runner, Stalker, Clicker, Bloater) runs the same C++ code. The data file for each type specifies which skills are available, which behaviours those skills activate, and all tuning values. New character types (Stalkers were added months before ship) require no code changes (Botta, Ch. 1, source-game-ai-pro-360-character-behavior).

This is the game AI equivalent of the component pattern: no character-type conditionals in code.
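
A data-driven setup along these lines might look like the sketch below, where one class serves every character type and all differences come from a data table. The JSON layout, skill lists, and speed values are invented for illustration and are not the game's real data.

```python
import json

# Hypothetical per-type data; in production this would live in external files.
CHARACTER_DATA = json.loads("""
{
  "Runner":  {"skills": ["pursue", "patrol"],      "move_speed": 6.0},
  "Clicker": {"skills": ["investigate", "patrol"], "move_speed": 3.5}
}
""")

class Character:
    """Single AI class: no per-type conditionals, only data lookups."""

    def __init__(self, type_name, data=CHARACTER_DATA):
        config = data[type_name]
        self.type_name = type_name
        self.skills = list(config["skills"])
        self.move_speed = config["move_speed"]

    def can_use(self, skill_name):
        return skill_name in self.skills

runner = Character("Runner")
print(runner.can_use("pursue"), runner.move_speed)  # True 6.0

# Adding a new type is a data change only.
CHARACTER_DATA["Stalker"] = {"skills": ["pursue", "investigate"], "move_speed": 5.0}
print(Character("Stalker").can_use("investigate"))  # True
```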

For a more detailed treatment of the commercial utility AI pattern (Dragon Age: Inquisition’s Behaviour Decision System), see utility-ai.


Challenges of digital game AI

  1. Intelligent interactive storytelling — NPCs must behave consistently with narrative context (see above)
  2. Dynamic learning and behaviour — agents that adapt to player tactics (see machine-learning-games)
  3. Representing emotion — credible emotional responses that serve the game’s tone

For CRE341, the most teachable route is not to treat every architecture as equally central. A better default is:

  1. ai-state-machine-pattern as the prerequisite architecture
  2. behaviour-trees as the canonical core
  3. utility-ai as the first major extension
  4. goal-oriented-action-planning as the advanced planning contrast
  5. HTN via squad-ai-patterns as the specialist production layer
  6. player-modelling as an adjacent AI-and-games topic rather than the core NPC-control path

This ordering fits the lecture framing particularly well: behaviour trees for reactivity, GOAP for emergent tactics, utility AI for smooth adaptation, and player modelling as part of the broader character/adaptation discussion rather than the default agent-control architecture. (Prof Charles, CRE341 Wks 4.2 and 5.1, see source-cre341-lectures)

See overview-cre341-agent-ai-route for the full argument and trade-offs.


In practice (Unity)

  • Unity NavMesh + NavMeshAgent handles Locomotion and pathfinding (see Wk 3 practicals)
  • Behaviour Trees: no native Unity support; common libraries include Behavior Designer, NodeCanvas, or a custom implementation
  • GOAP: custom implementation typical; the GOAP tutorial series by Holistic3D (referenced in lectures) is a common starting point
  • The ai-state-machine-pattern page covers the simpler State Pattern approach, appropriate when BTs or GOAP are overkill