Summary
Strategic AI for real-time strategy (RTS) games operates above the individual unit level — it decides which objectives to pursue, which armies to commit, and how to respond to the broad spatial state of the map. The techniques described here address the two central problems of RTS strategic AI: spatial reasoning (where are the important locations and how do I reason about control of the map?) and combat outcome prediction (before committing forces, can I predict whether I will win the engagement?).
The spatial reasoning material draws from Kohan II / Total War AI (Chapters 10 and 11, see source-game-ai-pro-360-tactics-strategy). The combat prediction material draws from Dill (Chapter 13). The Hierarchical Portfolio Search approach is from Prismata / Chapter 18.
Key ideas
- Region-based spatial abstraction: Divide the map into regions of homogeneous terrain/ownership. Regions are the vocabulary of strategic reasoning — armies occupy regions, objectives are in regions, influence propagates between regions.
- Chokepoints and cul-de-sacs: Automatically detected from the region graph topology. These structures have clear strategic significance; the AI can exploit or defend them without hardcoded scripting.
- Hierarchical pathfinding: Route planning first at the region level, then at the unit level within the chosen region corridor. Orders-of-magnitude faster than full unit-level global search.
- Influence propagation: Enemy and friendly influence spreads across the region graph, giving the AI a continuous model of strategic pressure — allowing it to anticipate attacks and identify weak points.
- Lanchester models for combat prediction: Algebraic formulae (not simulation) predict battle outcomes in microseconds. Parameters are learned from replays.
Region-based map partitioning
Algorithm
The map is divided into homogeneous regions using a greedy flood-fill:
- Seed the fill from a grid of evenly spaced starting points across the walkable terrain.
- Expand each seed using flood-fill, only crossing into adjacent cells that are “similar” to the seed (same terrain type, no blocking geometry).
- Where two expanding regions meet, their boundary defines a border.
- Small regions below a minimum size are merged into their largest neighbour.
The result: a set of convex-ish regions whose boundaries naturally align with terrain features, walls, and passageways.
Chokepoint and cul-de-sac detection
The region graph has a node per region and an edge per shared border. Graph topology reveals strategic structure:
| Structure | Graph signature | Strategic meaning |
|---|---|---|
| Chokepoint | Edge whose removal disconnects the graph | Narrow passage; defensible bottleneck |
| Cul-de-sac | Node with exactly one edge | Dead end; attackable from one direction only |
| Hub | Node with 4+ edges | Central node; high strategic value |
Detection in Unity C# (simplified):
// Chokepoint detection: an edge (A,B) is a chokepoint if removing it
// disconnects the graph (bridge detection via DFS timestamps)
bool IsBridge(int u, int v, int[] disc, int[] low, bool[] visited)
{
// Tarjan's bridge-finding algorithm
// Returns true if edge (u,v) is a bridge
...
}The AI uses these labels directly: defend chokepoints before enemies reach them; avoid cul-de-sacs unless committing a full assault; contest hubs for strategic mobility.
Hierarchical pathfinding
Two-level pathfinding:
- Region-level A:* Find the sequence of regions from source to destination. Region graph is tiny (tens to hundreds of nodes) — very fast.
- Unit-level A:* Plan the unit route within the corridors defined by the region sequence. Constrained search — no cross-map wandering.
This is the same principle as NavMesh + high-level graph pathfinding (see also squad-ai-patterns area corridors). The cost of the region-level path can incorporate influence — routing around high-enemy-influence regions.
Region-level influence propagation
Overview
Each region has an influence value for each faction. Influence propagates between connected regions using a variant of Dijkstra’s algorithm run from all friendly and enemy regions simultaneously.
Benefits over a grid-based influence map:
- Works at strategic scale — a single region may be 50×50 units of terrain.
- Propagation respects the graph topology, so chokepoints naturally restrict influence flow.
- Much cheaper to compute: tens to hundreds of nodes rather than thousands of cells.
“Because the influence system gives the AI some warning that an attack is approaching — the enemy influence rises — we can avoid [the problem of defenders not being present].” — Dill, Chapter 10
Propagation formulae
Different propagation formulae model different strategic phenomena. Kohan II uses several simultaneously:
| Formula | Effect | Use |
|---|---|---|
Linear decay I(d) = I₀ - k·d | Influence falls steadily with region distance | General territorial presence |
Travel-time decay I(d) = I₀ / travelTime(d) | Influence weighted by actual movement time | Estimates arrival pressure |
Discontinuous (step) I(d) = I₀ if d < n, else 0 | Hard cutoff at n hops | ”Can reach within N turns” binary maps |
Power-law I(d) = I₀ × r^d | Exponential decay with ratio r | Threat modelling (power falls off quickly) |
Multiple maps are maintained simultaneously and queried in combination — e.g. (friendly_power - enemy_travel_time) identifies regions where allied strength exceeds expected incoming pressure, suitable for aggressive expansion.
Border calculation
Territory ownership is determined by comparing faction influence at each region’s border edges. A border region belongs to the faction with higher influence at that point. This naturally produces front lines without any explicit territory management code.
Combat outcome prediction — Lanchester models
Motivation
Before committing forces to an engagement, the AI needs to know: will I win? Simulation is too slow for planning purposes. Lanchester’s laws provide algebraic approximations.
Lanchester’s Linear Law (melee / one-on-one)
When combatants can only fight one-at-a-time (melee, single-target weapons):
dA/dt = -b · B
dB/dt = -a · A
Where a and b are per-unit kill rates. Outcome determined by:
A_survives = √(A₀² - (b/a) · B₀²)
If the expression inside the root is negative, force B wins.
Lanchester’s Square Law (ranged / area fire)
When all units can fire at all enemies simultaneously (ranged combat, AoE weapons):
dA/dt = -b · B
dB/dt = -a · A [same equations but different regime]
Outcome:
A_survives = √(A₀² - (b/a) · B₀²) [same formula]
The key difference is the interpretation of unit strength a and b. Under the square law, doubling unit count quadruples effective combat power — concentration of force has dramatically higher returns than in the linear law.
Mixed-regime generalisation
Real RTS games sit between pure melee and pure ranged. Dill (Chapter 13) models this as:
a_effective = a · A^(n-1)
Where n is a regime parameter:
n = 1: Linear law (pure melee)n = 2: Square law (pure ranged)n ≈ 1.56: Empirical fit for typical RTS combat
Parameter estimation from replays
Unit strength parameters a and b do not need to be hand-tuned. Given a replay dataset, they can be estimated using logistic regression:
- For each recorded battle: record starting army compositions, outcome (A wins / B wins), and surviving units.
- Formulate the Lanchester prediction as a logistic feature:
x = A₀^n - (b/a) · B₀^n. - Fit logistic regression: probability of A winning =
σ(x). - Gradient descent learns the ratio
b/aandnfrom the replay data.
This approach also generalises to heterogeneous armies: each unit type i has its own strength sᵢ, and army strength is A = Σ sᵢ · countᵢ^n. The regression learns all sᵢ simultaneously.
# Pseudocode: logistic regression for unit strengths
# features: [count_infantry_A, count_tank_A, count_infantry_B, count_tank_B]
# labels: [1 if A wins, 0 if B wins]
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(features, labels)
# Coefficients map to unit strength parameters after exponentiationThis approach is robust to game balance patches — re-run the regression on new replays to update the strength estimates.
Hierarchical Portfolio Search (HPS) — Prismata
For card/unit-based strategy games with very large action spaces, Hierarchical Portfolio Search (Roelofs et al., Chapter 18, see source-game-ai-pro-360-tactics-strategy) combines handcrafted partial-player functions with a top-level tree search.
Architecture
- Partial Player (PP) functions: Each PP handles one tactical phase (e.g. “worker management”, “defence assignment”, “attack selection”). A PP takes the game state and returns a complete set of actions for its phase.
- Portfolio: A set of PPs, one per phase. The full move for a turn is the cross-product of all PP outputs — every combination of partial decisions.
- Top-level search: MCTS or Negamax over the portfolio-generated children. The search evaluates full turn sequences, not individual action micro-decisions.
Benefits:
- The portfolio massively reduces the branching factor (from thousands of raw actions to tens of PP combinations).
- Individual PPs can be updated when the game is balance-patched without retraining the search.
- The symmetric-play evaluation heuristic (leaf nodes evaluate by simulating both sides playing the same PP) is fast and requires no explicit value function.
Master Bot results
The Prismata Master Bot built with HPS reached the top 25% of human-rated players in public matchmaking, despite Prismata being a complex simultaneous-decision game with no hidden information.
In practice (Unity RTS example)
A minimal region map with influence propagation:
public class RegionGraph
{
public List<Region> Regions;
public Dictionary<(int, int), Border> Borders;
public void PropagateInfluence(Faction faction, float decayPerHop)
{
// Dijkstra from all regions with friendly presence
var pq = new PriorityQueue<int, float>();
var influence = new float[Regions.Count];
foreach (var r in Regions.Where(r => r.GetForce(faction) > 0))
{
influence[r.Id] = r.GetForce(faction);
pq.Enqueue(r.Id, -influence[r.Id]);
}
while (pq.Count > 0)
{
pq.TryDequeue(out int id, out float neg);
float current = -neg;
if (current < influence[id]) continue; // stale entry
foreach (var neighbour in Regions[id].Neighbours)
{
float newInfluence = current - decayPerHop;
if (newInfluence > influence[neighbour.Id])
{
influence[neighbour.Id] = newInfluence;
pq.Enqueue(neighbour.Id, -newInfluence);
}
}
}
for (int i = 0; i < Regions.Count; i++)
Regions[i].SetInfluence(faction, influence[i]);
}
}Trade-offs
| Technique | Cost | Benefit |
|---|---|---|
| Region-based partitioning | Preprocessing; requires walkability data | Enables all region-level reasoning |
| Grid influence map | Moderate runtime rebuild | Higher spatial resolution |
| Region influence map | Very fast | Coarser; misses sub-region detail |
| Lanchester prediction | Near-zero | Very fast; generalisable to heterogeneous armies |
| Full simulation | High (per-frame unit sim) | Accurate; handles all edge cases |
| HPS + MCTS | Moderate (search + PP evaluation) | Scales to complex action spaces; patchable |
Evidence
- Dill (Chapter 10, see source-game-ai-pro-360-tactics-strategy) describes region partitioning, chokepoint detection, hierarchical pathfinding, and influence propagation as applied in Kohan II.
- Dill (Chapter 13) describes Lanchester models, the mixed-regime generalisation, and parameter estimation from replays.
- Roelofs et al. (Chapter 18) describe Hierarchical Portfolio Search and the Prismata Master Bot.
Open questions
- Lanchester models assume stationary engagement — how do they generalise to highly mobile, kiting-based combat where units are continuously repositioning?
- Is there a clean decomposition of the region-partitioning algorithm that works on NavMesh-generated geometry rather than raw heightmap grids?
- HPS requires handcrafted PP functions per tactical phase — how many phases are typically needed, and how is phase coverage determined for a new game?
Related
influence-maps · squad-ai-patterns · tactical-position-selection · mcts · game-ai-agent-design · source-game-ai-pro-360-tactics-strategy