Strategic AI (RTS)

Summary

Strategic AI for real-time strategy (RTS) games operates above the individual unit level — it decides which objectives to pursue, which armies to commit, and how to respond to the broad spatial state of the map. The techniques described here address the two central problems of RTS strategic AI: spatial reasoning (where are the important locations and how do I reason about control of the map?) and combat outcome prediction (before committing forces, can I predict whether I will win the engagement?).

The spatial reasoning material draws from Kohan II / Total War AI (Chapters 10 and 11, see source-game-ai-pro-360-tactics-strategy). The combat prediction material draws from Dill (Chapter 13). The Hierarchical Portfolio Search approach is from Prismata / Chapter 18.

Key ideas

Region-based spatial abstraction: Divide the map into regions of homogeneous terrain/ownership. Regions are the vocabulary of strategic reasoning — armies occupy regions, objectives are in regions, influence propagates between regions.
Chokepoints and cul-de-sacs: Automatically detected from the region graph topology. These structures have clear strategic significance; the AI can exploit or defend them without hardcoded scripting.
Hierarchical pathfinding: Route planning first at the region level, then at the unit level within the chosen region corridor. Orders-of-magnitude faster than full unit-level global search.
Influence propagation: Enemy and friendly influence spreads across the region graph, giving the AI a continuous model of strategic pressure — allowing it to anticipate attacks and identify weak points.
Lanchester models for combat prediction: Algebraic formulae (not simulation) predict battle outcomes in microseconds. Parameters are learned from replays.

Region-based map partitioning

Algorithm

The map is divided into homogeneous regions using a greedy flood-fill:

Seed the fill from a grid of evenly spaced starting points across the walkable terrain.
Expand each seed using flood-fill, only crossing into adjacent cells that are “similar” to the seed (same terrain type, no blocking geometry).
Where two expanding regions meet, their boundary defines a border.
Small regions below a minimum size are merged into their largest neighbour.

The result: a set of convex-ish regions whose boundaries naturally align with terrain features, walls, and passageways.

Chokepoint and cul-de-sac detection

The region graph has a node per region and an edge per shared border. Graph topology reveals strategic structure:

Structure	Graph signature	Strategic meaning
Chokepoint	Edge whose removal disconnects the graph	Narrow passage; defensible bottleneck
Cul-de-sac	Node with exactly one edge	Dead end; attackable from one direction only
Hub	Node with 4+ edges	Central node; high strategic value

Detection in Unity C# (simplified):

// Chokepoint detection: an edge (A,B) is a chokepoint if removing it
// disconnects the graph (bridge detection via DFS timestamps)
bool IsBridge(int u, int v, int[] disc, int[] low, bool[] visited)
{
    // Tarjan's bridge-finding algorithm
    // Returns true if edge (u,v) is a bridge
    ...
}

The AI uses these labels directly: defend chokepoints before enemies reach them; avoid cul-de-sacs unless committing a full assault; contest hubs for strategic mobility.

Hierarchical pathfinding

Two-level pathfinding:

Region-level A:* Find the sequence of regions from source to destination. Region graph is tiny (tens to hundreds of nodes) — very fast.
Unit-level A:* Plan the unit route within the corridors defined by the region sequence. Constrained search — no cross-map wandering.

This is the same principle as NavMesh + high-level graph pathfinding (see also squad-ai-patterns area corridors). The cost of the region-level path can incorporate influence — routing around high-enemy-influence regions.

Region-level influence propagation

Overview

Each region has an influence value for each faction. Influence propagates between connected regions using a variant of Dijkstra’s algorithm run from all friendly and enemy regions simultaneously.

Benefits over a grid-based influence map:

Works at strategic scale — a single region may be 50×50 units of terrain.
Propagation respects the graph topology, so chokepoints naturally restrict influence flow.
Much cheaper to compute: tens to hundreds of nodes rather than thousands of cells.

“Because the influence system gives the AI some warning that an attack is approaching — the enemy influence rises — we can avoid [the problem of defenders not being present].” — Dill, Chapter 10

Propagation formulae

Different propagation formulae model different strategic phenomena. Kohan II uses several simultaneously:

Formula	Effect	Use
Linear decay `I(d) = I₀ - k·d`	Influence falls steadily with region distance	General territorial presence
Travel-time decay `I(d) = I₀ / travelTime(d)`	Influence weighted by actual movement time	Estimates arrival pressure
Discontinuous (step) `I(d) = I₀ if d < n, else 0`	Hard cutoff at n hops	”Can reach within N turns” binary maps
Power-law `I(d) = I₀ × r^d`	Exponential decay with ratio r	Threat modelling (power falls off quickly)

Multiple maps are maintained simultaneously and queried in combination — e.g. (friendly_power - enemy_travel_time) identifies regions where allied strength exceeds expected incoming pressure, suitable for aggressive expansion.

Border calculation

Territory ownership is determined by comparing faction influence at each region’s border edges. A border region belongs to the faction with higher influence at that point. This naturally produces front lines without any explicit territory management code.

Combat outcome prediction — Lanchester models

Motivation

Before committing forces to an engagement, the AI needs to know: will I win? Simulation is too slow for planning purposes. Lanchester’s laws provide algebraic approximations.

Lanchester’s Linear Law (melee / one-on-one)

When combatants can only fight one-at-a-time (melee, single-target weapons):

dA/dt = -b · B
dB/dt = -a · A

Where a and b are per-unit kill rates. Outcome determined by:

A_survives = √(A₀² - (b/a) · B₀²)

If the expression inside the root is negative, force B wins.

Lanchester’s Square Law (ranged / area fire)

When all units can fire at all enemies simultaneously (ranged combat, AoE weapons):

dA/dt = -b · B
dB/dt = -a · A   [same equations but different regime]

Outcome:

A_survives = √(A₀² - (b/a) · B₀²)   [same formula]

The key difference is the interpretation of unit strength a and b. Under the square law, doubling unit count quadruples effective combat power — concentration of force has dramatically higher returns than in the linear law.

Mixed-regime generalisation

Real RTS games sit between pure melee and pure ranged. Dill (Chapter 13) models this as:

a_effective = a · A^(n-1)

Where n is a regime parameter:

n = 1: Linear law (pure melee)
n = 2: Square law (pure ranged)
n ≈ 1.56: Empirical fit for typical RTS combat

Parameter estimation from replays

Unit strength parameters a and b do not need to be hand-tuned. Given a replay dataset, they can be estimated using logistic regression:

For each recorded battle: record starting army compositions, outcome (A wins / B wins), and surviving units.
Formulate the Lanchester prediction as a logistic feature: x = A₀^n - (b/a) · B₀^n.
Fit logistic regression: probability of A winning = σ(x).
Gradient descent learns the ratio b/a and n from the replay data.

This approach also generalises to heterogeneous armies: each unit type i has its own strength sᵢ, and army strength is A = Σ sᵢ · countᵢ^n. The regression learns all sᵢ simultaneously.

# Pseudocode: logistic regression for unit strengths
# features: [count_infantry_A, count_tank_A, count_infantry_B, count_tank_B]
# labels: [1 if A wins, 0 if B wins]
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(features, labels)
# Coefficients map to unit strength parameters after exponentiation

This approach is robust to game balance patches — re-run the regression on new replays to update the strength estimates.

Hierarchical Portfolio Search (HPS) — Prismata

For card/unit-based strategy games with very large action spaces, Hierarchical Portfolio Search (Roelofs et al., Chapter 18, see source-game-ai-pro-360-tactics-strategy) combines handcrafted partial-player functions with a top-level tree search.

Architecture

Partial Player (PP) functions: Each PP handles one tactical phase (e.g. “worker management”, “defence assignment”, “attack selection”). A PP takes the game state and returns a complete set of actions for its phase.
Portfolio: A set of PPs, one per phase. The full move for a turn is the cross-product of all PP outputs — every combination of partial decisions.
Top-level search: MCTS or Negamax over the portfolio-generated children. The search evaluates full turn sequences, not individual action micro-decisions.

Benefits:

The portfolio massively reduces the branching factor (from thousands of raw actions to tens of PP combinations).
Individual PPs can be updated when the game is balance-patched without retraining the search.
The symmetric-play evaluation heuristic (leaf nodes evaluate by simulating both sides playing the same PP) is fast and requires no explicit value function.

Master Bot results

The Prismata Master Bot built with HPS reached the top 25% of human-rated players in public matchmaking, despite Prismata being a complex simultaneous-decision game with no hidden information.

In practice (Unity RTS example)

A minimal region map with influence propagation:

public class RegionGraph
{
    public List<Region> Regions;
    public Dictionary<(int, int), Border> Borders;
 
    public void PropagateInfluence(Faction faction, float decayPerHop)
    {
        // Dijkstra from all regions with friendly presence
        var pq = new PriorityQueue<int, float>();
        var influence = new float[Regions.Count];
 
        foreach (var r in Regions.Where(r => r.GetForce(faction) > 0))
        {
            influence[r.Id] = r.GetForce(faction);
            pq.Enqueue(r.Id, -influence[r.Id]);
        }
 
        while (pq.Count > 0)
        {
            pq.TryDequeue(out int id, out float neg);
            float current = -neg;
            if (current < influence[id]) continue; // stale entry
 
            foreach (var neighbour in Regions[id].Neighbours)
            {
                float newInfluence = current - decayPerHop;
                if (newInfluence > influence[neighbour.Id])
                {
                    influence[neighbour.Id] = newInfluence;
                    pq.Enqueue(neighbour.Id, -newInfluence);
                }
            }
        }
 
        for (int i = 0; i < Regions.Count; i++)
            Regions[i].SetInfluence(faction, influence[i]);
    }
}

Trade-offs

Technique	Cost	Benefit
Region-based partitioning	Preprocessing; requires walkability data	Enables all region-level reasoning
Grid influence map	Moderate runtime rebuild	Higher spatial resolution
Region influence map	Very fast	Coarser; misses sub-region detail
Lanchester prediction	Near-zero	Very fast; generalisable to heterogeneous armies
Full simulation	High (per-frame unit sim)	Accurate; handles all edge cases
HPS + MCTS	Moderate (search + PP evaluation)	Scales to complex action spaces; patchable

Evidence

Dill (Chapter 10, see source-game-ai-pro-360-tactics-strategy) describes region partitioning, chokepoint detection, hierarchical pathfinding, and influence propagation as applied in Kohan II.
Dill (Chapter 13) describes Lanchester models, the mixed-regime generalisation, and parameter estimation from replays.
Roelofs et al. (Chapter 18) describe Hierarchical Portfolio Search and the Prismata Master Bot.

Open questions

Lanchester models assume stationary engagement — how do they generalise to highly mobile, kiting-based combat where units are continuously repositioning?
Is there a clean decomposition of the region-partitioning algorithm that works on NavMesh-generated geometry rather than raw heightmap grids?
HPS requires handcrafted PP functions per tactical phase — how many phases are typically needed, and how is phase coverage determined for a new game?

influence-maps · squad-ai-patterns · tactical-position-selection · mcts · game-ai-agent-design · source-game-ai-pro-360-tactics-strategy

GDnD Wiki

Explorer

GDnD Wiki

Strategic AI (RTS)

Summary

Key ideas

Region-based map partitioning

Algorithm

Chokepoint and cul-de-sac detection

Hierarchical pathfinding

Region-level influence propagation

Overview

Propagation formulae

Border calculation

Combat outcome prediction — Lanchester models

Motivation

Lanchester’s Linear Law (melee / one-on-one)

Lanchester’s Square Law (ranged / area fire)

Mixed-regime generalisation

Parameter estimation from replays

Hierarchical Portfolio Search (HPS) — Prismata

Architecture

Master Bot results

In practice (Unity RTS example)

Trade-offs

Evidence

Open questions

Graph View

Table of Contents

Backlinks

GDnD Wiki

Explorer

Strategic AI (RTS)

Summary

Key ideas

Region-based map partitioning

Algorithm

Chokepoint and cul-de-sac detection

Hierarchical pathfinding

Region-level influence propagation

Overview

Propagation formulae

Border calculation

Combat outcome prediction — Lanchester models

Motivation

Lanchester’s Linear Law (melee / one-on-one)

Lanchester’s Square Law (ranged / area fire)

Mixed-regime generalisation

Parameter estimation from replays

Hierarchical Portfolio Search (HPS) — Prismata

Architecture

Master Bot results

In practice (Unity RTS example)

Trade-offs

Evidence

Open questions

Related

Graph View

Table of Contents

Backlinks