Historical Scenario Design Methodology¶
Overview¶
This document describes how Napoleonic battle scenarios are designed, encoded as YAML files, and loaded into the simulation engine for historical validation experiments (Epic E5.4).
Goals¶
- Calibrate the simulation engine against documented real-world outcomes.
- Stress-test trained policies by initialising them in historically significant tactical configurations.
- Discover whether agents reproduce historically documented tactics or find novel, superior alternatives.
Scenario YAML Schema¶
Each scenario is a single YAML file stored under
configs/scenarios/historical/. The file is parsed by
envs.scenarios.historical.ScenarioLoader.
Top-Level Keys¶
| Key | Required | Description |
|---|---|---|
| scenario | Yes | Battle metadata (name, date, description) |
| factions | No | Display names for blue and red sides |
| terrain | No | Terrain configuration (defaults to flat) |
| units | No | Initial-condition lists for blue and red (defaults to empty lists if omitted) |
| historical_outcome | No | Documented result used as validation baseline (defaults applied if omitted) |
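The required/optional rules in the table can be illustrated as a small normalisation step over an already-parsed mapping. This is only a sketch, not the actual ScenarioLoader code: the function name `apply_defaults` and the placeholder defaults for `factions` and `historical_outcome` are assumptions, since the document does not list those values.

```python
from copy import deepcopy

# Hypothetical defaults for illustration; the real loader's values may differ.
ASSUMED_DEFAULTS = {
    "factions": {"blue": "Blue", "red": "Red"},
    "terrain": {"type": "flat"},
    "units": {"blue": [], "red": []},
    "historical_outcome": {"winner": None},
}


def apply_defaults(raw: dict) -> dict:
    """Return a copy of a parsed scenario mapping with optional keys filled in."""
    if "scenario" not in raw:
        # 'scenario' is the only key marked as required in the table.
        raise ValueError("missing required top-level key: 'scenario'")
    result = deepcopy(raw)
    for key, default in ASSUMED_DEFAULTS.items():
        result.setdefault(key, deepcopy(default))
    return result


minimal = apply_defaults({"scenario": {"name": "Battle of Waterloo (1815)"}})
print(minimal["terrain"]["type"])  # flat
```

Only missing top-level keys are filled in this sketch; a partially specified block (for example, `units` with only a `blue` list) is left as written.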
scenario Block¶
scenario:
  name: "Battle of Waterloo (1815)"  # Display name
  date: "1815-06-18"                 # ISO-8601
  description: >
    Narrative context (multi-line OK).
factions Block¶
terrain Block¶
terrain:
  type: "generated"  # "flat" or "generated"
  width: 1000.0      # map width in metres
  height: 1000.0
  rows: 20           # grid resolution
  cols: 20
  seed: 1815         # RNG seed for generated terrain
  n_hills: 4         # number of Gaussian elevation blobs
  n_forests: 3       # number of Gaussian cover/forest blobs
When type is "flat", n_hills and n_forests are ignored.
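To make "Gaussian blobs" concrete, here is a minimal sketch of how a seeded grid of blobs could be produced. It is not the engine's terrain generator: the function name `gaussian_blob_grid`, the `sigma_cells` width, the unit blob amplitude, the cap at 1.0, and the uniform placement of centres are all assumptions made for illustration.

```python
import math
import random


def gaussian_blob_grid(rows, cols, n_blobs, seed, sigma_cells=3.0):
    """Sum n_blobs Gaussian bumps on a rows x cols grid, capping values at 1.0."""
    rng = random.Random(seed)  # seeded so the same inputs give the same grid
    centres = [
        (rng.uniform(0, rows - 1), rng.uniform(0, cols - 1))
        for _ in range(n_blobs)
    ]
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            value = sum(
                math.exp(-((r - cr) ** 2 + (c - cc) ** 2) / (2 * sigma_cells ** 2))
                for cr, cc in centres
            )
            row.append(min(value, 1.0))
        grid.append(row)
    return grid


# Values taken from the terrain block above.
elevation = gaussian_blob_grid(rows=20, cols=20, n_blobs=4, seed=1815)
# With zero blobs the grid is all zeros, i.e. flat.
flat = gaussian_blob_grid(rows=20, cols=20, n_blobs=0, seed=1815)
```

The same routine could be called a second time with `n_forests` to build a cover layer, under the same assumptions.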
units Block¶
units:
  blue:
    - id: "1st_division"  # unique identifier (snake_case recommended)
      x: 400.0            # world-space position in metres
      y: 650.0
      theta: "south"      # facing angle (radians or named direction)
      strength: 0.75      # relative strength in [0, 1]
  red:
    - id: "old_guard"
      x: 380.0
      y: 300.0
      theta: "north"
      strength: 0.95
Named directions for theta:
| Name | Radians | Description |
|---|---|---|
| east | 0 | Facing positive-x |
| north | π/2 | Facing positive-y |
| west | π | Facing negative-x |
| south | −π/2 | Facing negative-y |
Numeric radians (e.g. 1.57) are also accepted.
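The mapping in the table can be written as a small resolver that accepts either form of theta. This is a sketch rather than the loader's actual code: the function name `resolve_theta` is hypothetical, and the case-insensitive matching and the error behaviour for unknown names are assumptions.

```python
import math

# Named directions and their angles, as listed in the table above.
NAMED_DIRECTIONS = {
    "east": 0.0,
    "north": math.pi / 2,
    "west": math.pi,
    "south": -math.pi / 2,
}


def resolve_theta(value):
    """Convert a theta value (named direction or numeric radians) to a float."""
    if isinstance(value, str):
        key = value.strip().lower()  # assumption: names are case-insensitive
        if key not in NAMED_DIRECTIONS:
            raise ValueError(f"unknown direction name: {value!r}")
        return NAMED_DIRECTIONS[key]
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"theta must be a name or a number, got {type(value).__name__}")
    return float(value)


south = resolve_theta("south")  # -pi/2
numeric = resolve_theta(1.57)   # numeric radians pass through unchanged
```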
historical_outcome Block¶
historical_outcome:
  winner: 0              # 0=blue, 1=red, null=draw/inconclusive
  blue_casualties: 0.36  # fraction of blue force lost [0, 1]
  red_casualties: 0.68
  duration_steps: 420    # indicative battle length in simulation steps
  description: >
    Free-text summary of the historical outcome.
Simulation Scale Convention¶
Each YAML unit represents a brigade-level formation (~2,000–8,000 men) rather than a single battalion, to allow manageable scenario sizes. The mapping rules are:
| Real-world unit | strength value | Notes |
|---|---|---|
| Fresh full-strength brigade | 1.00 | Maximum effectiveness |
| Tired / partially engaged | 0.80–0.95 | Moderate fatigue |
| Already-engaged / worn | 0.60–0.79 | Significant losses |
| Heavily attrited | 0.40–0.59 | Approaching combat ineffectiveness |
| Depleted remnant | < 0.40 | Near-routing condition |
Positions are placed on a 1,000 × 1,000 m field. A rough scale guide:
- 1 m (simulation) ≈ 1–5 m (real-world), depending on the battle frontage.
- For a 5 km frontage, use 1 m (sim) = 5 m (real).
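The scale guide amounts to dividing real-world distances by the ratio of frontage to field width. A minimal sketch follows, assuming positions are measured in metres from the same corner of the battlefield that serves as the simulation origin; the helper name `real_to_sim` is hypothetical.

```python
def real_to_sim(real_x_m, real_y_m, frontage_m, field_size_m=1000.0):
    """Scale real-world offsets (metres) onto the 1,000 x 1,000 m field."""
    # Real metres represented by one simulation metre, e.g. 5.0 for a 5 km frontage.
    real_per_sim = frontage_m / field_size_m
    return real_x_m / real_per_sim, real_y_m / real_per_sim


# A formation 2,000 m east and 3,250 m north of the origin on a 5 km frontage.
x, y = real_to_sim(2000.0, 3250.0, frontage_m=5000.0)
print(x, y)  # 400.0 650.0
```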
Casualty Calculation¶
Historical casualty figures are sourced from academic battle histories and expressed as the fraction of the engaged force that became casualties (killed, wounded, or captured):
casualty fraction = (killed + wounded + captured) / engaged force
These are used as the validation baseline in OutcomeComparator. The
comparator measures the absolute difference between simulated and historical
casualty rates for each side, combined with winner accuracy and duration
fidelity into a single fidelity score ∈ [0, 1].
Bundled Scenarios¶
1. Battle of Waterloo (1815-06-18)¶
File: configs/scenarios/historical/waterloo.yaml
Represents the climactic evening phase: Wellington's exhausted Anglo-Allied line defends the ridge of Mont-Saint-Jean against Napoleon's final Imperial Guard assault, with Prussian reinforcements arriving on the French right flank.
| Metric | Value |
|---|---|
| Blue (Allied) units | 4 |
| Red (French) units | 5 |
| Historical winner | Blue (Allied) |
| Allied casualties | ~36 % |
| French casualties | ~68 % |
| Terrain | Generated (ridge + orchards) |
2. Battle of Austerlitz (1805-12-02)¶
File: configs/scenarios/historical/austerlitz.yaml
Napoleon's tactical masterpiece: he feigned weakness on his right to draw the Allied left wing off the Pratzen Heights, then struck the weakened centre with Soult's corps.
| Metric | Value |
|---|---|
| Blue (French) units | 4 |
| Red (Allied) units | 5 |
| Historical winner | Blue (French) |
| French casualties | ~12 % |
| Allied casualties | ~47 % |
| Terrain | Generated (Pratzen plateau) |
3. Battle of Borodino (1812-09-07)¶
File: configs/scenarios/historical/borodino.yaml
The bloodiest day of the Napoleonic Wars: Napoleon's frontal assaults on fortified Russian earthworks produced enormous casualties on both sides but no decisive strategic result.
| Metric | Value |
|---|---|
| Blue (French) units | 4 |
| Red (Russian) units | 5 |
| Historical winner | Draw (null) |
| French casualties | ~35 % |
| Russian casualties | ~40 % |
| Terrain | Generated (redoubts + forests) |
Adding New Scenarios¶
- Create a new YAML file in configs/scenarios/historical/.
- Follow the schema above. Use the three bundled scenarios as templates.
- Source casualty and order-of-battle data from peer-reviewed histories or established wargame order-of-battle databases (e.g., ORBAT references).
- Add at least one test in tests/test_historical_scenarios.py that:
  - Loads the new file without error.
  - Verifies the expected number of blue and red units.
  - Verifies the historical winner field.
- Run python -m pytest tests/test_historical_scenarios.py to confirm.
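The unit-count and winner checks from the list can be sketched as a pytest-style test. Because the loader's API is not documented here, this sketch runs against an already-parsed mapping: `check_scenario` and the inline `example` data are hypothetical stand-ins, and a real test would obtain the mapping by loading the new YAML file through the project's loader, which would also cover the "loads without error" check.

```python
def check_scenario(data, n_blue, n_red, winner):
    """Check unit counts and the historical winner on a parsed scenario mapping."""
    units = data.get("units", {})
    assert len(units.get("blue", [])) == n_blue, "unexpected blue unit count"
    assert len(units.get("red", [])) == n_red, "unexpected red unit count"
    assert data["historical_outcome"]["winner"] == winner, "unexpected winner"


def test_example_scenario():
    # Hypothetical stand-in for the mapping loaded from the new YAML file.
    example = {
        "scenario": {"name": "Example battle", "date": "1815-06-18"},
        "units": {
            "blue": [{"id": "1st_division", "x": 400.0, "y": 650.0,
                      "theta": "south", "strength": 0.75}],
            "red": [{"id": "old_guard", "x": 380.0, "y": 300.0,
                     "theta": "north", "strength": 0.95}],
        },
        "historical_outcome": {"winner": 0},
    }
    check_scenario(example, n_blue=1, n_red=1, winner=0)
```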
Outcome Comparison Methodology¶
OutcomeComparator.compare(episode_result) returns a ComparisonResult
with the following sub-scores:
| Sub-score | Weight | Calculation |
|---|---|---|
| Winner accuracy | 0.5 | 1.0 if winner matches, else 0.0 |
| Casualty accuracy | 0.3 | 1 − 0.5 × (\|Δblue\| + \|Δred\|), clamped to [0, 1] |
| Duration accuracy | 0.2 | 1 − \|Δsteps\| / duration_steps, clamped to [0, 1] |
Fidelity score = weighted mean of the three sub-scores ∈ [0, 1].
A fidelity score ≥ 0.7 is considered a good historical match. Scores < 0.5 indicate significant divergence from the historical record.
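The scoring table can be worked through with a short, self-contained sketch. It is not the OutcomeComparator implementation: the function `fidelity_score` and the dictionary layout are assumptions, each Δ is taken as simulated minus historical, and the simulated values below are invented purely for illustration. The historical values are those of the Waterloo example in the schema section.

```python
def clamp01(x):
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def fidelity_score(sim, hist):
    """Combine the three sub-scores using the weights from the table."""
    winner_acc = 1.0 if sim["winner"] == hist["winner"] else 0.0
    casualty_acc = clamp01(
        1.0 - 0.5 * (abs(sim["blue_casualties"] - hist["blue_casualties"])
                     + abs(sim["red_casualties"] - hist["red_casualties"]))
    )
    duration_acc = clamp01(
        1.0 - abs(sim["duration_steps"] - hist["duration_steps"]) / hist["duration_steps"]
    )
    # The weights sum to 1, so the weighted mean is a plain weighted sum.
    return 0.5 * winner_acc + 0.3 * casualty_acc + 0.2 * duration_acc


historical = {"winner": 0, "blue_casualties": 0.36,
              "red_casualties": 0.68, "duration_steps": 420}
# Hypothetical simulated episode, not a real result.
simulated = {"winner": 0, "blue_casualties": 0.30,
             "red_casualties": 0.60, "duration_steps": 378}

score = fidelity_score(simulated, historical)
# 0.5 * 1.0 + 0.3 * 0.93 + 0.2 * 0.9, approximately 0.959
```

Because the winner term carries half the weight, changing only the simulated winner in this example lowers the score to about 0.459, which falls below the 0.5 divergence threshold even though casualties and duration are close.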
References¶
- Chandler, D. G. (1966). The Campaigns of Napoleon. Macmillan.
- Uffindell, A. (2003). The Eagle's Last Triumph: Napoleon's Victory at Ligny, June 1815. Greenhill Books.
- Riehn, R. K. (1990). 1812: Napoleon's Russian Campaign. McGraw-Hill.
- Duffy, C. (1977). Austerlitz 1805. Seeley Service & Co.