Summary
NPC perception is the set of systems that determine what an agent can see, hear, and sense at any given moment, and how that information builds into awareness and action. Good perception design is as much about player experience as about technical accuracy — the player must be able to predict what the NPC can and cannot perceive, and the system should produce results that feel fair.
The techniques below are drawn from Botta (The Last of Us Infected AI), McIntosh (The Last of Us Human AI), and Dyckhoff (The Last of Us Buddy AI), all in Game AI Pro 360 (see source-game-ai-pro-360-character-behavior).
Key ideas
- Perception and affordance: What the NPC can perceive directly shapes what behaviours are available to it. Design perception before designing responses.
- Player communication: NPCs must communicate their perceptual state to the player (agitated states, animations, sounds). If a perception feature cannot be communicated clearly, consider removing it — even if technically correct.
- Fairness over realism: Player-favoured raycasting (e.g. checking the chest in stealth rather than the head) makes the system feel fairer without being obviously “wrong.”
- Separate AI senses from game audio: Logical sound events decouple AI hearing from the audio mix, enabling precise designer control over detection ranges.
Vision
Simple frustum (baseline — avoid)
The most naïve implementation: a cone frustum centred on the NPC’s forward vector, with a fixed half-angle and range. Any target within the cone and with no occlusion is detected.
Problem: At close range the cone is too narrow (targets next to the NPC are missed); at long range it is too wide (targets far away are detected unrealistically).
Variable-angle vision cone (recommended)
Used in The Last of Us (McIntosh). The half-angle of the vision cone narrows as distance increases, so the NPC sees widely at close range and narrowly at long range. The sketch below uses a linear falloff:
// Approximate implementation. eyeHeight (float) and obstacleMask (LayerMask)
// are assumed to be fields on the owning component.
bool CanSeeTarget(Vector3 npcPos, Vector3 npcForward, Vector3 targetPos, float maxRange)
{
    Vector3 toTarget = targetPos - npcPos;
    float distance = toTarget.magnitude;
    if (distance > maxRange) return false;
    // Angle shrinks with distance: wider close up, narrower far away
    float maxAngle = Mathf.Lerp(85f, 20f, distance / maxRange);
    float angle = Vector3.Angle(npcForward, toTarget);
    if (angle > maxAngle) return false;
    // Visible only if nothing on obstacleMask blocks the line from the eyes to the target
    return !Physics.Linecast(npcPos + Vector3.up * eyeHeight, targetPos, obstacleMask);
}
This eliminates both failure modes of the simple frustum.
Awareness timers (graduated detection)
Rather than instant detection on first frame of visibility, run an awareness timer:
- Timer increments each frame the target is visible.
- Timer decrements each frame the target is not visible.
- Detection triggers when the timer reaches a threshold (e.g. 1–2 seconds in normal state).
- Threshold is lower when already in combat (faster re-acquisition); higher in stealth (player gets more grace).
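private float awarenessTimer = 0f;
private float detectionThreshold = 1.5f; // seconds
private bool targetDetected = false;
void UpdatePerception()
{
    // Uses the four-argument CanSeeTarget above; maxRange is an assumed field.
    if (CanSeeTarget(transform.position, transform.forward, target.position, maxRange))
        awarenessTimer = Mathf.Min(detectionThreshold, awarenessTimer + Time.deltaTime);
    else
        awarenessTimer = Mathf.Max(0f, awarenessTimer - Time.deltaTime);
    // Fire once when the threshold is first reached, not on every frame after it
    if (!targetDetected && awarenessTimer >= detectionThreshold)
    {
        targetDetected = true;
        OnTargetDetected();
    }
    else if (awarenessTimer <= 0f)
    {
        targetDetected = false; // fully decayed: allow a fresh detection
    }
}
Occlusion — single-point raycasting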
The Last of Us tried multi-joint weighted raycasting (raycast to nearly every joint, threshold at 60% weighted coverage). This produced mixed results: players could not predict which cover positions were safe.
The solution was a single raycast to a context-dependent point on the body:
- Stealth: Raycast to centre of chest. Favours the player — requires more of the body to be exposed.
- Combat: Raycast to top of head. Assumes the NPC is alert and tracking precisely.
Vector3 GetRaycastTarget(PlayerState state, Transform playerTransform)
{
    // Heights are illustrative offsets from the player's root position
    return state == PlayerState.Stealth
        ? playerTransform.position + Vector3.up * 1.0f  // chest
        : playerTransform.position + Vector3.up * 1.8f; // head
}
Hearing — logical sound events
Design principle
AI hearing should not react to the actual audio heard in the game. Instead, use logical sound events: designer-specified data packets broadcast at a radius, independent of the audio system (Botta, Ch. 1, source-game-ai-pro-360-character-behavior).
Benefits:
- Audio team can change sounds without affecting AI.
- Designer controls exactly which sounds trigger which reactions at what range.
- Supports silent logical sounds (e.g. a “breathing” event at melee range with no audio — so NPCs notice a stationary nearby player).
Broadcasting
public struct LogicalSoundEvent
{
    public Vector3 Position;
    public float BaseRadius;
    public SoundType Type;
}
// Broadcast: find all NPCs within radius and notify them.
// npcLayer (LayerMask) and IsOccluded are assumed to be defined elsewhere.
void BroadcastSound(LogicalSoundEvent sound)
{
    float radius = sound.BaseRadius;
    Collider[] nearby = Physics.OverlapSphere(sound.Position, radius, npcLayer);
    foreach (var col in nearby)
    {
        NPC npc = col.GetComponent<NPC>();
        if (npc == null) continue;
        // Check occlusion
        if (!IsOccluded(sound.Position, npc.transform.position))
            npc.OnHearSound(sound);
    }
}
Hearing sensitivity modifiers
In The Last of Us, Infected hearing sensitivity varied by:
- Character type (all Infected hear ~6× better than Hunters)
- Current behaviour state (unaware NPCs hear less acutely — slows stealth pace; gives player time to plan)
- Speed of nearby player movement — movement sound radius scales with player velocity
// Effective radius multiplier example
float GetHearingMultiplier(NPCBehaviourState state)
{
    return state switch
    {
        NPCBehaviourState.Unaware => 0.4f,
        NPCBehaviourState.Agitated => 0.8f,
        NPCBehaviourState.Alerted => 1.0f,
        _ => 1.0f
    };
}
Sound occlusion
Rays are cast from each NPC in range to the sound source. If completely occluded by geometry, the NPC does not hear it. Partial occlusion attenuates without blocking. This naturally encourages players to use cover to mask movement sounds.
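The source does not say how partial occlusion is measured, so the following is only a sketch under a stated assumption: several rays are cast between NPC and sound source, and the share of blocked rays scales the hearing radius. The class name, the linear falloff, and `minAudibleFraction` are all hypothetical; the ray results are passed in as counts so the snippet stays independent of any physics engine.

```csharp
using System;

public static class SoundOcclusion
{
    // Returns the hearing radius after occlusion, given how many of the rays
    // cast between the NPC and the sound source were blocked.
    // Assumed rule: all rays blocked => inaudible (radius 0); otherwise the
    // radius shrinks linearly towards baseRadius * minAudibleFraction.
    public static float EffectiveRadius(float baseRadius, int blockedRays, int totalRays,
                                        float minAudibleFraction = 0.5f)
    {
        if (totalRays <= 0) throw new ArgumentOutOfRangeException(nameof(totalRays));
        blockedRays = Math.Clamp(blockedRays, 0, totalRays);
        if (blockedRays == totalRays) return 0f;              // fully occluded
        float blocked = (float)blockedRays / totalRays;       // 0..1
        float scale = 1f - blocked * (1f - minAudibleFraction);
        return baseRadius * scale;
    }

    // The NPC hears the event only if it lies inside the attenuated radius.
    public static bool CanHear(float distanceToSound, float baseRadius,
                               int blockedRays, int totalRays)
        => distanceToSound <= EffectiveRadius(baseRadius, blockedRays, totalRays);
}
```

With these assumed numbers, a 10 m sound with two of four rays blocked is heard out to 7.5 m, and not at all once every ray is blocked.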
Exposure maps
Used in The Last of Us human enemy AI (McIntosh, Ch. 2, source-game-ai-pro-360-character-behavior).
An exposure map is a 2D bitmap overlaid on the navmesh. Each bit (or float) represents whether that position is currently visible to a specific entity (typically the player).
Construction
On the PS3 SPUs, a simple height-map raycasting pass in a 360° circle from the player produced the bitmap in 2–3 ms. Updated continuously as the player moves.
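The source names the technique but not the algorithm. One plausible reading of a height-map raycasting pass is horizon marching: step outward along each ray and remember the steepest slope passed so far. The sketch below assumes that approach; `HeightMapExposure`, the half-cell step, and the `targetHeight` parameter are hypothetical, and it is plain C# rather than SPU code.

```csharp
using System;

public static class HeightMapExposure
{
    // Marks which grid cells an observer can see by marching rays outward
    // over a height map and tracking the steepest slope seen so far
    // (the "horizon") along each ray.
    public static bool[,] Build(float[,] heights, int ox, int oy,
                                float eyeHeight, float targetHeight, int rays = 64)
    {
        int w = heights.GetLength(0), h = heights.GetLength(1);
        var visible = new bool[w, h];
        visible[ox, oy] = true;
        float eye = heights[ox, oy] + eyeHeight;
        const float step = 0.5f;                    // march in half-cell steps
        int maxSteps = (int)((w + h) / step);

        for (int k = 0; k < rays; k++)
        {
            float angle = k * 2f * MathF.PI / rays;
            float dx = MathF.Cos(angle), dy = MathF.Sin(angle);
            float horizon = float.NegativeInfinity;

            for (int s = 1; s <= maxSteps; s++)
            {
                float dist = s * step;
                int cx = (int)MathF.Floor(ox + 0.5f + dx * dist);
                int cy = (int)MathF.Floor(oy + 0.5f + dy * dist);
                if (cx < 0 || cy < 0 || cx >= w || cy >= h) break;
                if (cx == ox && cy == oy) continue;

                // A target standing on this cell is seen if the line to it
                // clears everything the ray has already passed over.
                float toTarget = (heights[cx, cy] + targetHeight - eye) / dist;
                if (toTarget >= horizon) visible[cx, cy] = true;

                float toGround = (heights[cx, cy] - eye) / dist;
                horizon = MathF.Max(horizon, toGround);
            }
        }
        return visible;
    }
}
```

With a fixed ray count, distant cells can fall between rays and stay unmarked, so larger maps would need more rays or a per-cell pass.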
Usage
The exposure map feeds into pathfinding as an additional cost:
pathCost(A → B) = standard_cost(A,B) + α × exposureIntegral(A→B)
Here α is a tunable weight on exposure, and exposureIntegral sums exposure map values along the path segment. An NPC pathfinding with this cost takes routes that minimise how much of the path is visible to the player, which produces natural flanking and cover-seeking behaviour without explicit “find cover” logic.
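As a rough illustration of the cost formula, the sketch below approximates the exposure integral by sampling along the segment. The sampling scheme, the `ExposureCost` class, and the `exposureAt` callback (standing in for an exposure map lookup) are assumptions, not the shipped implementation.

```csharp
using System;

public static class ExposureCost
{
    // Approximates exposureIntegral(A->B): average the exposure sampled at the
    // midpoints of equal sub-segments, then scale by the segment length.
    public static float ExposureIntegral(float ax, float az, float bx, float bz,
                                         Func<float, float, float> exposureAt,
                                         int samples = 8)
    {
        if (samples < 1) throw new ArgumentOutOfRangeException(nameof(samples));
        float dx = bx - ax, dz = bz - az;
        float length = MathF.Sqrt(dx * dx + dz * dz);
        float sum = 0f;
        for (int i = 0; i < samples; i++)
        {
            float t = (i + 0.5f) / samples;
            sum += exposureAt(ax + dx * t, az + dz * t);
        }
        return (sum / samples) * length;
    }

    // pathCost(A->B) = standard_cost(A,B) + alpha * exposureIntegral(A->B)
    public static float PathCost(float standardCost, float alpha, float exposureIntegral)
        => standardCost + alpha * exposureIntegral;
}
```

A fully exposed 5-unit edge with α = 2 costs 5 + 2 × 5 = 15, while the same edge in full cover costs 5, so a pathfinder will accept a detour up to three times as long to stay hidden.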
A secondary NPC exposure map (what all NPCs can see) enables the search map.
Search maps
When the player’s location is lost, a search map tracks probable player positions. Grid cells start active at the last known position, then bleed into neighbouring cells over time (spreading uncertainty). Any cells currently visible to an NPC are cleared. The result is a spatial probability surface guiding where NPCs should search.
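The spread-and-clear cycle described above can be sketched on a plain grid. This is a simplification under stated assumptions: cells are binary rather than weighted, spreading goes to the four direct neighbours, and walls are ignored. `SearchMap` and the `visibleToAnyNpc` callback are hypothetical names.

```csharp
using System;

public class SearchMap
{
    private bool[,] active;
    private readonly int width, height;

    public SearchMap(int width, int height)
    {
        this.width = width;
        this.height = height;
        active = new bool[width, height];
    }

    // Start a new search from the last known player cell.
    public void Reset(int x, int y)
    {
        active = new bool[width, height];
        active[x, y] = true;
    }

    // One spreading step: every active cell activates its four neighbours.
    // Writes go to a copy so a cell cannot cascade within a single step.
    public void Spread()
    {
        var next = (bool[,])active.Clone();
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
            {
                if (!active[x, y]) continue;
                if (x > 0) next[x - 1, y] = true;
                if (x < width - 1) next[x + 1, y] = true;
                if (y > 0) next[x, y - 1] = true;
                if (y < height - 1) next[x, y + 1] = true;
            }
        active = next;
    }

    // Cells an NPC can currently see cannot contain the player: clear them.
    public void ClearVisible(Func<int, int, bool> visibleToAnyNpc)
    {
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                if (active[x, y] && visibleToAnyNpc(x, y)) active[x, y] = false;
    }

    public bool IsActive(int x, int y) => active[x, y];

    public int ActiveCount
    {
        get
        {
            int count = 0;
            foreach (bool cell in active) if (cell) count++;
            return count;
        }
    }
}
```

A searching NPC would then pick its next destination from the remaining active cells, for example the nearest one.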
In practice — Unity exposure map sketch
A simplified approach without SPU-level raycast parallelism:
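using UnityEngine;
public class ExposureMap : MonoBehaviour
{
    public int resolution = 64;
    public float mapRadius = 30f;
    public LayerMask obstacleMask;   // geometry that blocks line of sight
    public Transform observer;       // assign in the Inspector (typically the player)
    private float[,] map;
    private Vector3 mapOrigin;       // observer position at the last rebuild
    void Awake()
    {
        map = new float[resolution, resolution]; // allocate once, reuse each rebuild
    }
    void Update()
    {
        if (observer != null) RebuildFrom(observer.position);
    }
    void RebuildFrom(Vector3 origin)
    {
        mapOrigin = origin;
        float cellSize = (mapRadius * 2f) / resolution;
        for (int x = 0; x < resolution; x++)
            for (int y = 0; y < resolution; y++)
            {
                Vector3 cellWorld = new Vector3(
                    origin.x - mapRadius + (x + 0.5f) * cellSize,
                    origin.y,
                    origin.z - mapRadius + (y + 0.5f) * cellSize);
                bool visible = !Physics.Linecast(origin + Vector3.up, cellWorld + Vector3.up, obstacleMask);
                map[x, y] = visible ? 1f : 0f;
            }
    }
    public float GetExposure(Vector3 worldPos)
    {
        // Map worldPos to a grid index relative to the last rebuild origin
        float cellSize = (mapRadius * 2f) / resolution;
        int x = Mathf.FloorToInt((worldPos.x - mapOrigin.x + mapRadius) / cellSize);
        int y = Mathf.FloorToInt((worldPos.z - mapOrigin.z + mapRadius) / cellSize);
        // Assumption: positions outside the map are treated as not exposed
        if (x < 0 || y < 0 || x >= resolution || y >= resolution) return 0f;
        return map[x, y];
    }
}
For production use, rebuild asynchronously using jobs or compute shaders; a 64×64 grid means 4,096 linecasts per rebuild, which is expensive to run on the CPU every frame for most Unity targets.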
Splinter Cell: Blacklist perception model
Described by Walsh (Chapter 7, see source-game-ai-pro-360-tactics-strategy). Blacklist’s stealth-oriented design required perception to feel fair and readable at all times.
“It is only important what’s plausible from the player’s point of view — it really doesn’t matter what the NPC should see or hear; it’s what the player thinks the NPC can see and hear.” — Walsh, Chapter 7
Coffin-shaped detection volumes
Rather than a simple cone frustum, Blacklist guards use a coffin-shaped detection volume — a box aligned to the guard’s facing direction with rounded ends. This shape is more predictable for players to reason about than a narrow cone (too easy to exploit the edges) or a pure sphere (ignores facing direction entirely).
The volume is further divided into near and far zones. Detection in the far zone is slower (higher awareness timer threshold) than the near zone, so players close to a guard are detected more quickly than players at the edge of vision — matching player expectation.
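Taking the description above literally (a facing-aligned box with rounded ends, split into near and far zones), the volume can be approximated in 2D as a capsule test. This is a sketch of that reading only: `CoffinVolume`, the dimensions, and the threshold values are hypothetical, and the real shape may differ.

```csharp
using System;
using System.Numerics;

public enum DetectionZone { None, Near, Far }

public static class CoffinVolume
{
    // Treats the volume as a 2D capsule: every point within halfWidth of the
    // segment that runs from the guard forward for `length` units.
    public static DetectionZone Classify(Vector2 guardPos, Vector2 guardForward,
                                         Vector2 targetPos, float length,
                                         float halfWidth, float nearLength)
    {
        Vector2 fwd = Vector2.Normalize(guardForward);
        float along = Vector2.Dot(targetPos - guardPos, fwd);
        // Closest point on the capsule's centre line to the target
        Vector2 closest = guardPos + fwd * Math.Clamp(along, 0f, length);
        if (Vector2.Distance(targetPos, closest) > halfWidth) return DetectionZone.None;
        return along <= nearLength ? DetectionZone.Near : DetectionZone.Far;
    }

    // Far-zone detection uses a higher awareness threshold, so it is slower.
    public static float DetectionThreshold(DetectionZone zone,
                                           float nearSeconds = 0.5f,
                                           float farSeconds = 2f)
    {
        switch (zone)
        {
            case DetectionZone.Near: return nearSeconds;
            case DetectionZone.Far: return farSeconds;
            default: return float.PositiveInfinity; // outside: never detected
        }
    }
}
```

Because the rear end is rounded too, this version also catches a target standing within `halfWidth` directly behind the guard; whether that is wanted is a design decision.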
Bone-raycast visibility
Rather than a single point raycast (chest or head), Blacklist raycasts to multiple bone positions on the player skeleton (head, chest, hips, feet). Visibility is determined by the number of bones that are unoccluded:
- 0 bones visible → not seen
- 1–2 bones visible → partial visibility (slower awareness timer)
- 3+ bones visible → full visibility (full detection speed)
This produces graded visibility — a player peeking over cover is partially seen, not simply seen-or-not. The bone count threshold is tunable per guard type.
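The three tiers above reduce to a small classification step once the per-bone raycasts have run. In this sketch the raycast results arrive as a list of booleans; `BoneVisibility`, `fullThreshold`, and the 0.5 partial rate are illustrative assumptions rather than values from the source.

```csharp
using System.Collections.Generic;

public enum Visibility { None, Partial, Full }

public static class BoneVisibility
{
    // boneUnoccluded holds one raycast result per bone (true = clear line of sight).
    public static Visibility Classify(IReadOnlyList<bool> boneUnoccluded,
                                      int fullThreshold = 3)
    {
        int count = 0;
        foreach (bool clear in boneUnoccluded)
            if (clear) count++;
        if (count == 0) return Visibility.None;
        return count >= fullThreshold ? Visibility.Full : Visibility.Partial;
    }

    // Multiplier applied to the awareness timer's fill rate.
    public static float AwarenessRate(Visibility visibility, float partialRate = 0.5f)
    {
        switch (visibility)
        {
            case Visibility.Full: return 1f;
            case Visibility.Partial: return partialRate;
            default: return 0f;
        }
    }
}
```

Exposing `fullThreshold` as a parameter is what makes the tiers tunable per guard type.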
Tiered bark system
When a guard’s awareness changes, voice barks communicate state to the player. The bark system has three tiers:
| Tier | Bark type | Example |
|---|---|---|
| Tier 1 | Specific, contextual bark | “I heard something near the crates.” |
| Tier 2 | Semi-contextual bark | “Something’s not right here.” |
| Tier 3 | Generic fallback bark | “Hello? Is someone there?” |
Tier 1 is used first. If the specific context is not available (no voice line for this exact situation), Tier 2 is used. Tier 3 is the final fallback. This avoids awkward silence while keeping barks contextually meaningful as often as possible.
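The fallback order can be expressed as a chain of lookups. How Blacklist actually keys its voice lines is not described, so the (trigger, context) keying and the `BarkSelector` class below are assumptions made for illustration.

```csharp
using System.Collections.Generic;

public class BarkSelector
{
    // Tier 1: lines recorded for an exact trigger and context.
    private readonly Dictionary<(string trigger, string context), string> specific =
        new Dictionary<(string trigger, string context), string>();
    // Tier 2: lines recorded for a trigger, regardless of context.
    private readonly Dictionary<string, string> semiContextual =
        new Dictionary<string, string>();
    // Tier 3: the generic line used when nothing else matches.
    private readonly string generic;

    public BarkSelector(string genericLine) { generic = genericLine; }

    public void AddSpecific(string trigger, string context, string line)
        => specific[(trigger, context)] = line;

    public void AddSemiContextual(string trigger, string line)
        => semiContextual[trigger] = line;

    // Returns the most specific line available and the tier it came from.
    public (int tier, string line) Select(string trigger, string context)
    {
        if (trigger != null && context != null &&
            specific.TryGetValue((trigger, context), out string t1))
            return (1, t1);
        if (trigger != null && semiContextual.TryGetValue(trigger, out string t2))
            return (2, t2);
        return (3, generic);
    }
}
```

Returning the tier alongside the line makes it easy to log how often the generic fallback fires, which indicates where more lines need recording.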
TEAS — Tactical Environment Awareness System
TEAS is a pre-computed connectivity graph of the level divided into areas (rooms, corridors, sections of exterior). Edges in the graph represent the acoustic paths between areas — doorways, vents, open windows.
Sound propagation uses this graph rather than point-to-point raycasting:
- A sound event occurs in Area A.
- TEAS finds all areas connected to A within N hops.
- Guards in those connected areas receive attenuated sound events (attenuated by hop count and edge “openness” — a solid wall vs. a cracked window).
This approach is area-path-based, not geometrically accurate — it cannot model exact sound propagation directions, but it handles the common cases (sound through a door, muffled by a wall) with minimal runtime cost. Designers mark area connections explicitly, giving them control over which sounds propagate where.
Benefits over point-to-point raycast hearing:
- No per-guard raycasts to the sound source.
- Naturally handles multiple rooms and indirect propagation.
- Designer-controlled — a sound artist can mark a particular wall as “thick” (low openness) without geometry changes.
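A hop-limited propagation over such a graph might look like the sketch below. The multiplicative attenuation (edge openness times a fixed per-hop falloff) and the names `AreaGraph`, `Connect`, and `Propagate` are assumptions; the source only says that attenuation depends on hop count and openness.

```csharp
using System.Collections.Generic;

public class AreaGraph
{
    private readonly Dictionary<string, List<(string to, float openness)>> edges =
        new Dictionary<string, List<(string to, float openness)>>();

    // openness: 1 = open doorway, values near 0 = thick wall.
    public void Connect(string a, string b, float openness)
    {
        AddEdge(a, b, openness);
        AddEdge(b, a, openness);
    }

    private void AddEdge(string from, string to, float openness)
    {
        if (!edges.TryGetValue(from, out var list))
            edges[from] = list = new List<(string to, float openness)>();
        list.Add((to, openness));
    }

    // Returns the loudest level at which the sound arrives in each area
    // reachable within maxHops of the source.
    public Dictionary<string, float> Propagate(string source, float loudness,
                                               int maxHops, float hopFalloff = 0.7f)
    {
        var best = new Dictionary<string, float> { [source] = loudness };
        var frontier = new Dictionary<string, float> { [source] = loudness };
        for (int hop = 0; hop < maxHops && frontier.Count > 0; hop++)
        {
            var next = new Dictionary<string, float>();
            foreach (var pair in frontier)
            {
                if (!edges.TryGetValue(pair.Key, out var neighbours)) continue;
                foreach (var (to, openness) in neighbours)
                {
                    float heard = pair.Value * openness * hopFalloff;
                    if (heard <= 0f) continue;
                    // Keep only the loudest arrival per area
                    if (best.TryGetValue(to, out float known) && known >= heard) continue;
                    best[to] = heard;
                    next[to] = heard;
                }
            }
            frontier = next;
        }
        return best;
    }
}
```

Guards would then compare the loudness reported for their own area against a hearing threshold, with no raycast to the source.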
Offscreen hearing reduction
Guards offscreen (outside the camera frustum) have their hearing radius reduced. A guard that is not visible to the player is less likely to overhear the player — not because of any in-fiction reason, but because an unfair detection by an offscreen guard feels arbitrary and frustrating.
This is an explicit fairness trade-off: accuracy is sacrificed for player experience. Importantly, it is invisible to the player — they only know what they see on screen.
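In code the reduction is a single multiplier gated by an on-screen test. The sketch below takes viewport coordinates as input (in Unity these could come from `Camera.WorldToViewportPoint`); the `OffscreenHearing` class and the 0.5 scale are assumed values, not figures from the source.

```csharp
public static class OffscreenHearing
{
    // Viewport convention: x and y run from 0 to 1 across the screen,
    // and z is the distance in front of the camera.
    public static bool IsOnScreen(float viewportX, float viewportY, float viewportZ)
        => viewportZ > 0f
           && viewportX >= 0f && viewportX <= 1f
           && viewportY >= 0f && viewportY <= 1f;

    // Guards the player cannot see get a reduced hearing radius.
    public static float HearingRadius(float baseRadius, bool onScreen,
                                      float offscreenScale = 0.5f)
        => onScreen ? baseRadius : baseRadius * offscreenScale;
}
```

A point test like this ignores the guard's body size and any occluding geometry, so a production version would likely test the guard's bounds against the camera frustum instead.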
Social and contextual awareness
Guards communicate awareness states to each other via a group behaviour system. When one guard enters a suspicious state, nearby guards can be flagged with the same state — producing coordinated group responses without each guard independently detecting the player.
Contextual awareness modifies detection behaviour: a guard at a post (waiting, idle) has baseline sensitivity. A guard who was recently alerted has elevated sensitivity for a cooldown period. A guard who has just found a subdued ally has maximum sensitivity and is actively searching.
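A minimal sketch of both ideas follows: suspicion shared by proximity, and sensitivity that stays elevated for a cooldown after a guard calms down. The `Guard` and `GuardGroup` types, the share radius, and the 1.0 and 1.5 sensitivity values are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public enum Awareness { Idle, Suspicious }

public class Guard
{
    public float X, Z;
    public Awareness State = Awareness.Idle;
    private float cooldown; // seconds of elevated sensitivity left after calming down

    public Guard(float x, float z) { X = x; Z = z; }

    // Elevated while suspicious and for a cooldown period afterwards.
    public float Sensitivity
        => (State == Awareness.Suspicious || cooldown > 0f) ? 1.5f : 1f;

    public void BecomeSuspicious() { State = Awareness.Suspicious; }

    public void CalmDown(float cooldownSeconds)
    {
        State = Awareness.Idle;
        cooldown = cooldownSeconds;
    }

    public void Tick(float deltaSeconds)
        => cooldown = MathF.Max(0f, cooldown - deltaSeconds);
}

public static class GuardGroup
{
    // Flags every idle guard within radius of the source; returns how many changed.
    public static int ShareSuspicion(Guard source, IEnumerable<Guard> guards, float radius)
    {
        int flagged = 0;
        foreach (Guard g in guards)
        {
            if (g == source || g.State == Awareness.Suspicious) continue;
            float dx = g.X - source.X, dz = g.Z - source.Z;
            if (dx * dx + dz * dz > radius * radius) continue;
            g.BecomeSuspicious();
            flagged++;
        }
        return flagged;
    }
}
```

The sensitivity value would feed into whichever detection parameters the game exposes, for example as a divisor on the awareness timer threshold.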
Trade-offs
| Feature | Cost | Benefit |
|---|---|---|
| Variable-angle cone | Negligible | Much better match to player expectation |
| Awareness timer | Minimal | Prevents instant detection; tolerates brief exposure |
| Single-point raycast | Minimal | Predictable, tunable per game state |
| Bone-raycast (Blacklist) | Low–moderate (fixed bone count) | Graded visibility; more nuanced cover use |
| Coffin-shaped volume | Designer setup | More predictable player mental model |
| Logical sound events | Moderate (designer setup) | Audio/AI decoupling; precise control |
| TEAS sound propagation | Preprocessing (graph build) | Handles multi-room propagation; designer-controlled |
| Exposure map | High (rebuild cost) | Enables threat-aware pathfinding and search |
Evidence
- Botta (Ch. 1, see source-game-ai-pro-360-character-behavior) describes The Last of Us Infected sensing in detail, including the logical breathing sound with no audio, hearing multipliers by behaviour state, and scaling of the movement sound radius with player speed.
- McIntosh (Ch. 2, same source) describes the variable-angle vision cone, awareness timers, single-point raycasting, and the exposure map construction and use.
- Dyckhoff (Ch. 3, same source) notes that Ellie ultimately became invisible to enemies in stealth — the perception system was too accurate to be playable, requiring the “cheat” for buddy viability.
- Walsh (Ch. 7, see source-game-ai-pro-360-tactics-strategy) describes the Splinter Cell: Blacklist perception model in full, including coffin-shaped detection volumes, bone-raycast visibility, tiered bark system, TEAS connectivity graph, offscreen hearing reduction, and social/contextual awareness.
Implications
- Perception design directly impacts difficulty. Reducing sensitivity in unaware states creates pacing room for stealth; increasing it in combat creates pressure.
- Communicating perceptual states to the player (agitated animations, audio cues) is as important as the mechanics themselves — undiscoverable mechanics frustrate players.
- Exposure maps are high-value for stealth games but expensive. Consider approximating with a field-of-view polygon or per-cover-point precomputation for simpler games.
Open questions
- How can exposure maps be approximated efficiently in Unity for mobile targets?
- Is there a principled way to tune awareness timer thresholds during playtesting — what observable player behaviour signals that thresholds need adjustment?
- How does logical sound event design differ between stealth-focused and action-focused games?
Related
game-ai-agent-design · ai-state-machine-pattern · buddy-ai · combat-coordinator-pattern · influence-maps · tactical-position-selection · source-game-ai-pro-360-character-behavior · source-game-ai-pro-360-tactics-strategy