M6.1 — Gauntlet in a pipeline

What you'll learn

Where Gauntlet sits in an automated pipeline: after build → cook → stage, before the gate.
How BuildGraph and a CI system (e.g. Horde) orchestrate that sequence.
What "gating" means and the difference between pre-submit and post-submit Gauntlet.

How it applies (QA)

A Gauntlet test only changes outcomes if its result blocks something. Understanding the pipeline tells you what a red Gauntlet node actually stops (a submit, a build promotion, a release) and where to look when the test never ran because an upstream node failed. Most "Gauntlet didn't run" incidents are upstream-node failures, not Gauntlet failures.

Concepts

The pipeline sequence

Gauntlet is one node in a longer chain. A typical ordering:

compile (UBT) ─▶ cook ─▶ stage/package ─▶ [reserve device] ─▶ RunUnreal (Gauntlet) ─▶ gate
   └──────────── upstream: produces the build (M3.4) ─────────┘   └─ consumes it ─┘   └ acts on result

Everything left of RunUnreal is the build production that Gauntlet depends on (M1.1 boundary #1, M3.4). If any upstream node fails, the Gauntlet node typically doesn't run at all, because there is no build to test. "Gauntlet skipped / not run" usually means "look upstream," not "fix the test."
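The skip-on-upstream-failure behavior can be modeled in a few lines. This is a minimal sketch of the idea only, not how BuildGraph or Horde is implemented; the node names and the `run_pipeline` helper are hypothetical, and the chain is simplified to a straight line.

```python
from enum import Enum


class Status(Enum):
    PASSED = "passed"
    FAILED = "failed"
    SKIPPED = "skipped"


# Linear chain mirroring the diagram above; node names are illustrative.
PIPELINE = ["compile", "cook", "stage", "reserve_device", "gauntlet"]


def run_pipeline(failing_nodes):
    """Walk the chain in order; after the first failure, every later node is skipped."""
    results = {}
    upstream_ok = True
    for node in PIPELINE:
        if not upstream_ok:
            results[node] = Status.SKIPPED  # never ran: nothing to consume
        elif node in failing_nodes:
            results[node] = Status.FAILED
            upstream_ok = False
        else:
            results[node] = Status.PASSED
    return results


cook_broken = run_pipeline({"cook"})
test_broken = run_pipeline({"gauntlet"})
print(cook_broken["gauntlet"].value)  # skipped
print(test_broken["gauntlet"].value)  # failed
```

In the first run the Gauntlet node never executes, so there is nothing in the test to fix; in the second the build exists and the test result itself is what needs triage.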

BuildGraph and Horde

BuildGraph — Epic's UAT-based (AutomationTool) system for declaring a graph of build steps (nodes) and their dependencies in XML. The cook node depends on compile, the stage node on cook, and the Gauntlet node on stage. BuildGraph runs them in dependency order and stops a branch when a dependency fails.
Horde — Epic's build/CI system that executes BuildGraph jobs across a farm, manages the device pool (M3.3), and surfaces results (logs, artifacts, pass/fail) in a dashboard. It's the layer that schedules the pipeline and shows you the red X.

You don't have to use Horde — Gauntlet runs from any CI that can invoke RunUAT — but the BuildGraph node model (Gauntlet as a dependency-gated step after staging) is the common shape.
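For a CI system other than Horde, the Gauntlet step reduces to building a RunUAT command line and acting on its exit code. The sketch below only assembles the argument list; the script path, project path, and build path are hypothetical placeholders, and the flag set (`-project`, `-platform`, `-configuration`, `-build`, `-test`) is an assumption that should be checked against your engine version.

```python
import shlex


def build_gauntlet_command(run_uat, project, platform, configuration, build,
                           test="UE.BootTest"):
    """Assemble a RunUAT RunUnreal invocation as an argument list (no shell quoting needed)."""
    return [
        run_uat,
        "RunUnreal",
        f"-project={project}",
        f"-platform={platform}",
        f"-configuration={configuration}",
        f"-build={build}",
        f"-test={test}",
    ]


# All paths below are placeholders for illustration only.
cmd = build_gauntlet_command(
    run_uat="Engine/Build/BatchFiles/RunUAT.sh",
    project="MyGame/MyGame.uproject",
    platform="Win64",
    configuration="Development",
    build="/builds/MyGame/latest",
)
print(shlex.join(cmd))

# A CI step would then run it and hand the exit code to the gate, e.g.:
#   import subprocess
#   exit_code = subprocess.run(cmd).returncode
```

Keeping the command as a list rather than a single string avoids quoting problems when paths contain spaces.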

Gating — what red blocks

The Gauntlet node's exit code (M2.4) feeds a gate: the policy that decides what a failure prevents. Common gates:

Pre-submit / pre-flight — run Gauntlet on a shelved/proposed change before it lands. Red → the change is blocked from submitting. Catches breakage before it hits the mainline. Expensive, so usually a fast subset (boot/smoke).
Post-submit (CIS) — run on every (or batched) mainline change. Red → the build is marked bad / not promotable, and ideally the offending CL is flagged. Catches what slipped past pre-submit.
Promotion / release gate — run a fuller suite before a build is promoted to a wider audience (QA build, cert candidate). Red → no promotion.

The same UE.BootTest can sit in several gates, each running it as part of a narrower or broader suite. A test with no gate is informational only: it changes nothing when it's red, which is rarely what you want.
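A gate is, in effect, a lookup from "which gate, which exit code" to "what gets blocked." The following is a minimal sketch of that policy, assuming the usual convention that exit code 0 means pass and anything else means fail; the `GATE_BLOCKS` table and `evaluate_gate` function are hypothetical names, not part of Gauntlet, BuildGraph, or Horde.

```python
# What a red result prevents at each gate (wording taken from the list above).
GATE_BLOCKS = {
    "pre-submit": "the change from submitting",
    "post-submit": "the build from being marked good / promotable",
    "promotion": "promotion to a wider audience",
}


def evaluate_gate(gate, exit_code):
    """Describe the consequence of a Gauntlet exit code at a given gate.

    Assumes 0 = pass and non-zero = fail. A gate name missing from
    GATE_BLOCKS models a test that has no gate at all.
    """
    if exit_code == 0:
        return "pass: nothing blocked"
    blocked = GATE_BLOCKS.get(gate)
    if blocked is None:
        return "red, but informational only: nothing blocked"
    return f"red: blocks {blocked}"


print(evaluate_gate("pre-submit", 1))
print(evaluate_gate("nightly-dashboard", 1))
```

The second call shows the ungated case: the test went red and nothing downstream changed.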

Pitfall: a non-gating test is a decoration

A Gauntlet job that runs but whose result blocks nothing (no submit gate, no promotion gate, no alert) provides false comfort: it looks like coverage but stops no bad build. Before calling a test "coverage," name what its red result prevents. If the answer is "nothing," it's a dashboard ornament.

Worked example — reading a pipeline failure

Horde shows a branch red. Two shapes, very different actions:

Cook node red, Gauntlet node grey/skipped → the build was never produced; Gauntlet had nothing to run. Action: fix the cook (or the content/code that broke it). The Gauntlet test is fine.
Cook/stage green, Gauntlet node red → the build exists and the test failed. Action: triage the run (M2.4) — is it the build's behavior, a device, or the test? Now Gauntlet is in scope.

The grey-vs-red distinction on the Gauntlet node is the first thing to read: did my test even get a build to run against?
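The reading order above can be written down as a small decision function. This is a sketch under simplifying assumptions: node states are reduced to the strings "green", "red", and "grey", only the cook and stage nodes are checked upstream, and the `triage` helper is hypothetical rather than anything Horde provides.

```python
def triage(cook, stage, gauntlet):
    """Map node colours to a first action. States: 'green', 'red', 'grey'."""
    if gauntlet == "grey":
        # The test never ran, so the cause is upstream of Gauntlet.
        for name, state in (("cook", cook), ("stage", stage)):
            if state == "red":
                return f"Gauntlet never ran: fix the {name} node"
        return "Gauntlet never ran: find the failed upstream node"
    if gauntlet == "red":
        # A build existed and the test failed against it.
        return "Gauntlet ran and failed: triage the run (build, device, or test?)"
    return "Gauntlet passed: nothing to do"


print(triage(cook="red", stage="grey", gauntlet="grey"))
print(triage(cook="green", stage="green", gauntlet="red"))
```

Checking the Gauntlet node's colour first keeps the two cases apart before anyone opens a test log.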

Exercise 1 — Which gate?

Match each goal to a gate (pre-submit / post-submit / promotion):

Stop a change that breaks boot from ever landing on mainline.
Don't let a build become the weekly QA build unless the full smoke suite passes.
Catch a boot break that two changes combined to cause, after they each landed.

Exercise 2 — Skipped or failed?

For each, say whether Gauntlet ran and failed or never ran, and where you'd look:

Gauntlet node is grey; the stage node above it is red.
Gauntlet node is red; everything above it is green.
"No build found" logged by UnrealBuildSource inside the Gauntlet node.

Lab — Place your boot test in the pipeline

For the boot test you've been building since M1.3: (1) name the upstream nodes it depends on, (2) choose which gate(s) it belongs in and justify the breadth at each, (3) state exactly what a red result prevents at each gate, (4) name one upstream failure that would show your node as skipped, not failed.

Self-check — answers

Exercise 1: 1 pre-submit, 2 promotion, 3 post-submit (CIS).

Exercise 2: 1 never ran (the upstream stage failed, so there is no build); look at the stage node. 2 ran and failed (the build exists); triage the Gauntlet run (M2.4). 3 effectively didn't test: the node ran but UnrealBuildSource found no build. The likely cause is an upstream cook/stage problem or a wrong -build/-platform/-configuration argument (M3.4), so look upstream or at the command line, not at the test logic.

Lab: Upstream nodes are compile → cook → stage for the target platform/config. Gates: pre-submit for a fast boot check (block bad changes cheaply) and a promotion gate before the QA build (broader); add post-submit if pre-submit can't cover every change. A red result prevents the submit (pre-submit), marks the build bad / not promotable (post-submit), or stops the promotion (promotion gate). Skipped-not-failed example: the cook node failing for that platform leaves the Gauntlet node with no build, shown grey/skipped.

Done when

[ ] You can place Gauntlet in the build→cook→stage→test→gate sequence.
[ ] You can describe BuildGraph (dependency graph) and Horde (scheduler/dashboard/device pool).
[ ] You can distinguish pre-submit, post-submit, and promotion gates and what each blocks.
[ ] You can read a skipped vs. failed Gauntlet node and act accordingly.

Next: M6.2 — Flake, retries, triage.