Summary

Utility AI is a decision-making architecture in which every action an agent could take is assigned a numerical utility score, and the agent selects the highest-scoring valid action. Unlike finite state machines (FSMs), which transition between predefined states, and behaviour trees (BTs), which traverse a fixed hierarchy, utility AI re-evaluates a dynamic, open set of actions every update. The result is an agent that selects contextually appropriate actions without requiring explicit transition logic between them. Yannakakis and Togelius place utility-based AI alongside FSMs and BTs as one of the main authored-control approaches in game AI. (Yannakakis and Togelius, Artificial Intelligence and Games, see source-ai-and-games)

The approach is greedy: it does not plan ahead. It asks “what is the best thing to do right now?” rather than “what sequence of actions leads to my goal?” (For that, see GOAP.)


Key ideas

  • Action enumeration: At any given moment the agent has a finite set of available actions, although that set can change between updates. Each action is evaluated independently.
  • Utility score: A floating-point value that represents how desirable an action is in the current context. Higher = more desirable.
  • Greedy selection: After scoring all valid actions, the agent selects and executes the one with the highest score.
  • Target selection: Many actions require a target. The utility score is calculated per target; the best (target, action) pair wins.
  • Continuous reevaluation: Scores are recalculated every AI update. If the world changes and a different action now scores higher, the agent switches.
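
As a rough illustration of the ideas listed above, the following engine-agnostic C# sketch scores every (action, target) pair and keeps the best valid one. The names ScoredChoice, GreedySelector and scoreFn are hypothetical and not taken from any source; a null score is assumed to mean "this pair is not valid right now".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical result type: the winning (action, target) pair and its score.
public sealed class ScoredChoice
{
    public string Action;
    public string Target;
    public float Score;
}

public static class GreedySelector
{
    // scoreFn returns null when the (action, target) pair is invalid in the current context.
    public static ScoredChoice SelectBest(
        IEnumerable<string> actions,
        IReadOnlyList<string> targets,
        Func<string, string, float?> scoreFn)
    {
        ScoredChoice best = null;
        foreach (string action in actions)
        {
            foreach (string target in targets)
            {
                float? score = scoreFn(action, target);
                if (score == null) continue; // invalid pair: skip it
                if (best == null || score.Value > best.Score)
                {
                    best = new ScoredChoice { Action = action, Target = target, Score = score.Value };
                }
            }
        }
        return best; // null when no pair is valid
    }
}
```

Calling SelectBest once per AI update gives continuous reevaluation: if the world changes, a different pair simply wins the next call.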

In practice — Dragon Age: Inquisition’s Behaviour Decision System (BDS)

Dragon Age: Inquisition (BioWare Edmonton) used a utility architecture called the Behaviour Decision System. The following is drawn from Hanlon and Watts (Game AI Pro 360, Chapter 7, see source-game-ai-pro-360-character-behavior).

Behaviour snippets

The fundamental unit of the BDS is the behaviour snippet — a data structure that encapsulates one way of using one ability. Snippets are registered to a character at runtime (e.g. when a weapon is equipped, its snippets are registered; when unequipped, they are removed). A complex AI character might have 10–50 snippets registered simultaneously.

Crucially, a snippet is not just an ability — it is a specific intent when using that ability. A “Charging Bull” ability might have two snippets: one for offensive charge, and one for retreat. The snippets encapsulate the knowledge of when and why each usage makes sense.
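
One way to picture runtime registration is a registry keyed by whatever granted the snippets, so that unequipping an item removes everything it added. This is a minimal sketch under that assumption; BehaviourSnippet, SnippetRegistry and the string keys are hypothetical and are not the BDS data structures.

```csharp
using System.Collections.Generic;

// Hypothetical minimal snippet: one ability plus the intent behind using it.
public sealed class BehaviourSnippet
{
    public string Ability { get; }
    public string Intent { get; }

    public BehaviourSnippet(string ability, string intent)
    {
        Ability = ability;
        Intent = intent;
    }
}

public sealed class SnippetRegistry
{
    // Snippets grouped by the thing that granted them (for example an equipped weapon).
    private readonly Dictionary<string, List<BehaviourSnippet>> bySource =
        new Dictionary<string, List<BehaviourSnippet>>();

    public void Register(string source, IEnumerable<BehaviourSnippet> snippets)
    {
        if (!bySource.TryGetValue(source, out List<BehaviourSnippet> list))
        {
            list = new List<BehaviourSnippet>();
            bySource[source] = list;
        }
        list.AddRange(snippets);
    }

    // Removes every snippet granted by the source; returns false if the source was unknown.
    public bool Unregister(string source) => bySource.Remove(source);

    // Total number of snippets currently registered across all sources.
    public int Count
    {
        get
        {
            int total = 0;
            foreach (List<BehaviourSnippet> list in bySource.Values) total += list.Count;
            return total;
        }
    }
}
```

Here the same ability can appear in two snippets with different intents, which mirrors the "Charging Bull" example.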

Evaluation trees

Each snippet contains an evaluation tree — a modified behaviour tree that returns a (score, target) pair instead of success/failure. The tree starts at zero and accumulates score as scoring nodes fire.

Evaluation is contextual: nodes have access to the evaluating character and the current world state, so the same tree produces different scores in different situations.

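Target selection is handled by a dedicated TargetSelector node that iterates over designer-specified candidates, evaluates its subtree for each one, and keeps the highest-scoring target. At the top level, the BDS evaluates every registered snippet and picks the highest-scoring one whose evaluation succeeded: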

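// Pseudocode from Hanlon & Watts (Game AI Pro 360, Ch. 7)
struct SnippetEvaluation {
    BehaviorSnippet snippet;
    bool result;       // false when the snippet is not usable in the current context
    int score;         // accumulated by the evaluation tree
    Character target;  // target chosen during evaluation
};

Optional<SnippetEvaluation> EvaluateSnippets() {
    list<SnippetEvaluation> evaluatedSnippets;
    // Evaluate every registered snippet and keep only the ones that succeeded.
    for (BehaviorSnippet snippet : registeredBehaviors) {
        SnippetEvaluation evaluation = snippet.evaluate();
        if (evaluation.result == true)
            evaluatedSnippets.push(evaluation);
    }
    // Highest score first, so the winner is the head of the list.
    sortByDescendingScore(evaluatedSnippets);
    if (!evaluatedSnippets.empty())
        return evaluatedSnippets.first();
    return None;  // nothing usable this update
}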
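
The evaluation tree itself is not shown in the pseudocode. The following C# sketch is one possible reading of the description: condition nodes can fail the evaluation, scoring nodes add points, and a target selector runs its subtree once per candidate. All type names here (EvalContext, IEvalNode, Condition, AddScore, Sequence) are hypothetical, and the TargetSelector class is an illustration rather than BioWare's implementation.

```csharp
using System;
using System.Collections.Generic;

public sealed class EvalContext
{
    public int Score;      // accumulated by scoring nodes, starts at zero
    public string Target;  // candidate currently under consideration
}

public interface IEvalNode
{
    // Returns false when the snippet is not valid in this context.
    bool Evaluate(EvalContext ctx);
}

// Fails the evaluation unless the predicate holds.
public sealed class Condition : IEvalNode
{
    private readonly Func<EvalContext, bool> predicate;
    public Condition(Func<EvalContext, bool> predicate) { this.predicate = predicate; }
    public bool Evaluate(EvalContext ctx) => predicate(ctx);
}

// Always succeeds and adds a context-dependent number of points.
public sealed class AddScore : IEvalNode
{
    private readonly Func<EvalContext, int> points;
    public AddScore(Func<EvalContext, int> points) { this.points = points; }
    public bool Evaluate(EvalContext ctx) { ctx.Score += points(ctx); return true; }
}

// Runs children in order and fails as soon as one child fails.
public sealed class Sequence : IEvalNode
{
    private readonly IEvalNode[] children;
    public Sequence(params IEvalNode[] children) { this.children = children; }
    public bool Evaluate(EvalContext ctx)
    {
        foreach (IEvalNode child in children)
            if (!child.Evaluate(ctx)) return false;
        return true;
    }
}

// Evaluates the subtree once per candidate and keeps the best-scoring target.
public sealed class TargetSelector : IEvalNode
{
    private readonly Func<IEnumerable<string>> candidates;
    private readonly IEvalNode subtree;

    public TargetSelector(Func<IEnumerable<string>> candidates, IEvalNode subtree)
    {
        this.candidates = candidates;
        this.subtree = subtree;
    }

    public bool Evaluate(EvalContext ctx)
    {
        bool found = false;
        int baseScore = ctx.Score, bestScore = 0;
        string bestTarget = null;
        foreach (string candidate in candidates())
        {
            // Each candidate starts from the score accumulated before this node.
            var trial = new EvalContext { Score = baseScore, Target = candidate };
            if (!subtree.Evaluate(trial)) continue;
            if (!found || trial.Score > bestScore)
            {
                found = true;
                bestScore = trial.Score;
                bestTarget = candidate;
            }
        }
        if (!found) return false; // no valid target: the snippet is not usable
        ctx.Score = bestScore;
        ctx.Target = bestTarget;
        return true;
    }
}
```

Because the chosen target is written back into the context, whatever runs afterwards can read it without knowing how it was selected.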

Scoring convention

DA:I used a designer-facing scoring convention rather than automatic normalisation. Scores are not percentages — they are in consistent ranges that everyone on the team understands:

Action Class   Score Range   Intent
Basic          10            Preferable to nothing; all basic actions are equivalent
Offensive      20–40         Always beat basic; 20 points of urgency dynamic range
Support        25–45         Beat same-urgency offensive actions; used before engaging
Reaction       50–70         Respond to immediate threats; trump everything else

A “Support” snippet starts by granting 25 points and conditionally adds up to 20 more based on context. A “Reaction” snippet only fires when specific threat criteria are met, but when it does, it outscores everything.
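
The arithmetic behind the convention can be sketched as a band base plus up to 20 points of urgency. The band values come from the ranges listed above; the ScoreBands class, the Score method and the 0..1 urgency input are assumptions made for illustration only.

```csharp
using System;

public static class ScoreBands
{
    public const int Basic = 10;
    public const int OffensiveBase = 20;
    public const int SupportBase = 25;
    public const int ReactionBase = 50;
    public const int UrgencyRange = 20; // each non-basic band spans 20 points

    // urgency is an assumed 0..1 estimate of how strongly the context favours the action.
    public static int Score(int bandBase, float urgency)
    {
        float clamped = Math.Max(0f, Math.Min(1f, urgency));
        return bandBase + (int)Math.Round(clamped * UrgencyRange);
    }
}
```

With this layout a Support action at a given urgency outscores an Offensive action at the same urgency by 5 points, and even a maximally urgent Offensive action (40) stays below the lowest Reaction score (50).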

Execution trees

Once the highest-scoring snippet is selected, its execution tree runs. This is a standard behaviour tree that handles movement, positioning, and the actual ability trigger. Because the target is stored in the evaluation context, execution trees can be generic and reused across many snippets.

Execution continues until the “Execute Ability” node fires. The BDS keeps re-evaluating during execution, so if circumstances change and the current snippet is no longer optimal, a better one can interrupt it.

Passive and movement behaviours via utility

The BDS also governs passive behaviours. A “follow the leader” snippet returns a constant score of 0 — lower than any real action — and runs whenever nothing else is valid. Tethering (forcing a character to return to an area) is implemented as a snippet with a very high score that only activates when the character is out of bounds.
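
Both passive behaviours fit the same scoring interface. In this hedged sketch a null score means "not valid right now"; the PassiveSnippets class, the TetherScore value of 100 and the use of a single distance instead of a real position are all assumptions, chosen only so that the tether outranks the 50–70 reaction band.

```csharp
using System;

public static class PassiveSnippets
{
    public const int TetherScore = 100; // assumed value, above every other band

    // Fallback: always valid, constant score of 0, so it only wins when nothing else is valid.
    public static int? FollowLeader() => 0;

    // Tether: invalid (null) while in bounds, very high score once out of bounds.
    public static int? ReturnToArea(float distanceFromCentre, float radius)
        => distanceFromCentre > radius ? TetherScore : (int?)null;
}
```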

Unity implementation sketch

A minimal Utility AI loop in C#:

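using System.Collections.Generic;
using UnityEngine;

public class UtilityAI : MonoBehaviour
{
    [SerializeField] private List<AIAction> actions;
    private AIAction currentAction;

    void Update()
    {
        // Re-score every action each frame and switch if the winner changed.
        AIAction best = SelectBestAction();
        if (best != currentAction)
        {
            // Explicit null checks instead of ?. because the null-conditional
            // operator bypasses Unity's overloaded == for destroyed objects.
            if (currentAction != null) currentAction.OnExit();
            currentAction = best;
            if (currentAction != null) currentAction.OnEnter();
        }
        if (currentAction != null) currentAction.OnTick();
    }

    // Greedy selection: returns the highest-scoring action, or null if none are registered.
    private AIAction SelectBestAction()
    {
        AIAction best = null;
        float bestScore = float.MinValue;
        foreach (var action in actions)
        {
            if (action == null) continue; // skip empty or destroyed list entries
            float score = action.Evaluate(this);
            if (score > bestScore)
            {
                bestScore = score;
                best = action;
            }
        }
        return best;
    }
}

public abstract class AIAction : MonoBehaviour
{
    // Returns how desirable this action is right now; higher is better.
    public abstract float Evaluate(UtilityAI agent);
    public virtual void OnEnter() {}
    public virtual void OnTick() {}
    public virtual void OnExit() {}
}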

Each concrete AIAction subclass overrides Evaluate() with its own utility curve logic.
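
"Utility curve logic" usually means mapping a normalised input (health fraction, distance, ammo) to a desirability value through a response curve. The helpers below are a sketch of three common shapes in plain C#, with no Unity dependency; ResponseCurves, HealUtility and the chosen midpoint and steepness constants are hypothetical, and maxHealth is assumed to be greater than zero.

```csharp
using System;

// Each curve maps a normalised input in [0, 1] to a utility in [0, 1].
public static class ResponseCurves
{
    private static float Clamp01(float x) => Math.Max(0f, Math.Min(1f, x));

    public static float Linear(float x) => Clamp01(x);

    // Exponent > 1 stays low until x is large; exponent < 1 rises quickly.
    public static float Power(float x, float exponent)
        => (float)Math.Pow(Clamp01(x), exponent);

    // S-shaped curve centred on midpoint; steepness controls how sharp the switch is.
    public static float Logistic(float x, float midpoint, float steepness)
        => 1f / (1f + (float)Math.Exp(-steepness * (Clamp01(x) - midpoint)));
}

public static class HealUtility
{
    // Desire to heal grows as health drops, rising sharply once about 60% of health is missing.
    public static float Evaluate(float health, float maxHealth)
    {
        float missing = 1f - health / maxHealth;
        return ResponseCurves.Logistic(missing, 0.6f, 12f);
    }
}
```

An Evaluate() override could return such a value directly, or scale it into a team scoring convention.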


Trade-offs

When to use utility AI

  • The agent has many possible actions with complex, context-sensitive priorities.
  • You want designers to author priorities through data (score ranges) rather than code (transition conditions).
  • Actions need to be added and removed at runtime (e.g. equipment system).
  • You want transparent debugging — a table of scores is easy to inspect.

Limitations

  • No planning: Utility AI is greedy. It cannot plan a sequence of sub-optimal short-term actions to reach a better long-term state. Combine with GOAP if long-range planning is required.
  • Score normalisation is hard: Without a consistent convention (like DA:I’s scoring ranges), scores from different designers drift apart and the system becomes unpredictable.
  • Oscillation: If two actions score nearly equally, the agent may oscillate between them. Add hysteresis (a bias toward the currently executing action) to prevent this.
  • Performance: Evaluating every snippet on every update costs O(n) in the number of snippets per agent, and more when a snippet is scored once per target. For large numbers of agents, evaluate less frequently or batch updates.
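
The hysteresis suggestion in the list above can be sketched as a fixed bonus applied to whichever action is already running. HysteresisSelector and the stickiness parameter are hypothetical names; the size of the bonus is a tuning assumption, not a value from any source.

```csharp
using System.Collections.Generic;

public sealed class HysteresisSelector
{
    private readonly float stickiness;

    // Name of the action chosen by the most recent Select call (null before the first call).
    public string Current { get; private set; }

    public HysteresisSelector(float stickiness) { this.stickiness = stickiness; }

    public string Select(IReadOnlyDictionary<string, float> scores)
    {
        string best = null;
        float bestScore = float.NegativeInfinity;
        foreach (KeyValuePair<string, float> entry in scores)
        {
            // Bias toward the action that is already executing.
            float biased = entry.Key == Current ? entry.Value + stickiness : entry.Value;
            if (biased > bestScore)
            {
                bestScore = biased;
                best = entry.Key;
            }
        }
        Current = best;
        return best;
    }
}
```

A challenger now has to beat the running action by more than the bonus before the agent switches, which damps flip-flopping between near-equal scores.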

Comparison with behaviour trees

Aspect                  Utility AI                      Behaviour Tree
Decision logic          Scored selection                Priority traversal
Adding new behaviours   Register a new snippet          Edit the tree
Debugging               Score table is readable         Tree visualiser needed
Emergent priority       Continuous, context-sensitive   Fixed structural priority
Planning                None (greedy)                   None (reactive)

Both can be combined — see the DA:I pattern where evaluation and execution trees are behaviour trees, but the selection mechanism is utility-based.


Evidence

  • Dragon Age: Inquisition’s Behaviour Decision System is described in detail by Hanlon and Watts (Game AI Pro 360, Ch. 7, see source-game-ai-pro-360-character-behavior). The BDS handled more than 60 abilities across party members and hostile creatures.
  • Shroff (Ch. 4, same source) describes an action-ranking system that uses utility-based ranking to control which on-screen NPCs execute interesting behaviours, a lighter application of the same principle.
  • The stochastic grammar chapter (Lewis, Ch. 12) notes that utility scores can be fed into grammar weights at runtime, blending structured randomness with utility-based decision-making.

Implications

  • Utility AI is a strong alternative to behaviour trees for combat AI where many abilities must be prioritised contextually. It trades structural clarity (BTs) for flexible runtime scoring.
  • A scoring convention is as important as the scoring algorithm — it is the shared language between programmers and designers.
  • The snippet pattern (data asset = ability knowledge) is a useful model for any system where behaviours need to be added/removed at runtime based on game state (inventory, equipment, status effects).

Open questions

  • How should utility scores handle resource costs (mana, cooldowns)? DA:I scores abilities by energy cost as a proxy for quality — does this generalise?
  • What is the best way to implement utility AI in Unity’s ECS/DOTS context for large numbers of agents?
  • Can utility AI be combined with a Maslow-style needs hierarchy (see game-ai-agent-design) to handle both immediate tactical choices and longer-term goal selection in a single system?

game-ai-agent-design · overview-cre341-agent-ai-route · utility-score · ai-state-machine-pattern · steering-behaviours · influence-maps · combat-coordinator-pattern · source-game-ai-pro-360-character-behavior