M1.2 — Gauntlet vs. other UE automation

What you'll learn

  • The three automation systems a UE QA owner meets: the in-engine Automation System, Low-Level Tests, and Gauntlet — and what each is actually for.
  • Why Gauntlet is not a competitor to the in-engine system but often its launcher.
  • A decision rule for "which tool does this test belong in."

How it applies (QA)

Picking the wrong layer wastes weeks. People try to force a multi-client matchmaking test into the in-engine automation system (single process, no real second client) or try to write a pure math unit test as a full Gauntlet session (minutes of boot for a microsecond of logic). The map below is the thing you reach for when someone asks "where should this test live?"

Concepts

The three layers

The in-engine Automation System lives inside a single engine process (editor or game). Tests are registered with the automation framework as C++ classes (IMPLEMENT_SIMPLE_AUTOMATION_TEST or the BDD-style Automation Spec) or as AFunctionalTest actors placed in a level. You browse and launch them from the editor's Session Frontend → Automation tab, or run them headless from the command line.

Strengths: deep access to engine internals, gameplay, and assets from inside the running game. Limits: one process. It cannot stand up a real second client or a separate dedicated server, and it does not manage installs on remote devices or console devkits.

Low-Level Tests (LLT) are C++ tests built on Catch2 that compile into a small standalone executable and run without booting the full engine (or with a minimal subset). They are built for fast, deterministic unit and integration testing of isolated systems: math, containers, a parser, a subsystem with mocked dependencies.

Strengths: fast, no editor, CI-friendly. Limits: no real game, no rendering, no session — by design.

Gauntlet is an external C#/UAT orchestrator. It launches one or more game processes (roles) on one or more devices from a pre-cooked build, supervises them, and validates results from the outside. It is the only one of the three that handles multi-process, multi-device, real-hardware, and process-lifecycle (PLM) testing.

Strengths: tests the shipping artifact on the target platform, the way a player runs it. Limits: needs a cooked build, slower (full boot), assertions are yours to define.

The relationship people miss

Gauntlet and the in-engine Automation System are complementary, and they are wired together. Two of Gauntlet's built-in tests exist specifically to run the in-engine automation tests through Gauntlet:

  • UE.EditorAutomation — launches the editor and runs in-engine automation tests inside it.
  • UE.TargetAutomation — launches a packaged/target build and runs the in-engine automation tests inside that.

So the question is rarely "automation tests or Gauntlet." It's: write the test logic in the in-engine automation system, then use Gauntlet as the launcher/harness that runs it on a real target build, collects the logs, and reports pass/fail to CI. Gauntlet supplies the orchestration and results plumbing the in-engine system lacks.
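The "validate from the outside" idea can be sketched in a few lines. This is not Gauntlet's implementation (Gauntlet itself is C#), and the `[TEST] PASS` / `[TEST] FAIL` log markers, `RunSummary`, and `SummarizeLog` are all hypothetical names invented for illustration. The sketch only shows the shape of the job: an external harness reads the log a launched process produced and turns it into one pass/fail answer for CI.

```cpp
#include <sstream>
#include <string>

// Hypothetical summary an external harness might hand to CI.
struct RunSummary {
    int passed = 0;
    int failed = 0;

    // A run with zero executed tests is treated as a failure, so a build
    // that crashed before running anything cannot report green.
    bool Succeeded() const { return failed == 0 && passed > 0; }
};

// Scans captured log text line by line and counts result markers.
// The "[TEST] PASS" / "[TEST] FAIL" markers are an assumed format,
// not the real engine log format.
RunSummary SummarizeLog(const std::string& logText) {
    RunSummary summary;
    std::istringstream lines(logText);
    std::string line;
    while (std::getline(lines, line)) {
        if (line.find("[TEST] PASS") != std::string::npos) {
            ++summary.passed;
        } else if (line.find("[TEST] FAIL") != std::string::npos) {
            ++summary.failed;
        }
    }
    return summary;
}
```

The design point is that the harness never links against the game: the test assertions run inside the launched process, and the harness only sees what that process wrote out.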

A decision rule

Ask, in order:

  1. Does it need the engine at all? No → Low-Level Test.
  2. Does it fit in one process on a dev PC? Yes, and it's gameplay/asset logic → in-engine Automation System (and optionally launch it via UE.EditorAutomation in CI).
  3. Does it need multiple processes, a real device/console, install/PLM behavior, or the shipping build? → Gauntlet.
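The three questions above can be written as a small function, which makes the ordering explicit. This is a minimal sketch: `TestTraits`, `Layer`, and `PickLayer` are hypothetical names rather than any Unreal Engine API, and it assumes that any engine-dependent test that does not fit in one process on a dev PC falls through to Gauntlet.

```cpp
// The three layers described in this module.
enum class Layer { LowLevelTest, InEngineAutomation, Gauntlet };

// Hypothetical description of a proposed test (illustrative fields only).
struct TestTraits {
    bool needsEngine;            // Question 1: does it need the engine at all?
    bool fitsOneProcessOnDevPc;  // Question 2: single process on a dev PC?
};

// Applies the questions in order; the first match wins.
constexpr Layer PickLayer(const TestTraits& traits) {
    if (!traits.needsEngine) {
        return Layer::LowLevelTest;        // Question 1 answered "no"
    }
    if (traits.fitsOneProcessOnDevPc) {
        return Layer::InEngineAutomation;  // Question 2 answered "yes"
    }
    // Question 3: multiple processes, real device, install/PLM behavior,
    // or the shipping build all land here.
    return Layer::Gauntlet;
}

// Human-readable name for reports.
constexpr const char* LayerName(Layer layer) {
    switch (layer) {
        case Layer::LowLevelTest:       return "Low-Level Test";
        case Layer::InEngineAutomation: return "In-engine Automation";
        case Layer::Gauntlet:           return "Gauntlet";
    }
    return "Unknown";
}
```

For instance, a pure math check would be `PickLayer(TestTraits{false, true})`, which yields `Layer::LowLevelTest`, while a two-client match against a dedicated server would be `PickLayer(TestTraits{true, false})`, which yields `Layer::Gauntlet`.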

Pitfall: 'it passes in PIE' is not 'it passes on device'

A test that is green in the in-engine system on a dev PC (for example, in Play In Editor) does not prove the cooked build works on a console: cook-only bugs, missing-content bugs, platform PLM bugs, and packaging bugs are invisible to single-process editor automation. That gap is exactly the territory Gauntlet covers, which is why "we have automation tests" doesn't mean "we have Gauntlet coverage."

Worked example

Triage four proposed tests into a layer:

Proposed test | Layer | Why
FVector::Normalize handles a zero vector | Low-Level Test | Pure math, no engine boot needed.
Opening the inventory screen shows the right item count | In-engine Automation | Single process, gameplay/UI logic; run it in CI via UE.EditorAutomation.
Two clients can complete a match against a dedicated server | Gauntlet | Three processes, real networking; impossible in one process.
The shipping PS5 build boots and reaches the menu after a suspend/resume | Gauntlet | Target build + PLM on real hardware.

Exercise 1 — Place the test

Assign each to LLT, In-engine, or Gauntlet, and give a one-line reason:

  1. A regression test that a save file from last patch still loads.
  2. A check that the damage formula returns the expected number for 20 input pairs.
  3. A 4-player free-for-all that must run on Switch devkits nightly.
  4. Confirming a newly placed trigger volume fires its event when the player walks through it (editor).

Exercise 2 — Explain the wiring

A producer says: "If we already have Gauntlet, why are we also writing in-engine automation tests? Isn't that duplicate work?" Answer in two sentences using UE.EditorAutomation.

Self-check — answers

Exercise 1:

  1. Gauntlet if it must prove the cooked target build loads an old save (packaging/cook surface); in-engine if you only need to prove the load code path works in-editor. Name the surface you care about — that picks the layer.
  2. LLT — deterministic numeric logic, no engine needed.
  3. Gauntlet — multi-client on real devkits is multi-process + real-hardware.
  4. In-engine (AFunctionalTest) — single-process gameplay event in a level.

Exercise 2: The test logic lives in the in-engine automation tests; Gauntlet via UE.EditorAutomation (or UE.TargetAutomation) is the launcher that runs those same tests on a real build in CI and reports the result. It's not duplicate work — one writes the assertion, the other runs it where it matters.

Done when

  • [ ] You can name the three layers and the one-line scope of each.
  • [ ] You can explain why Gauntlet is the only layer for multi-process / multi-device / target-build tests.
  • [ ] You can describe how UE.EditorAutomation / UE.TargetAutomation make Gauntlet a launcher for in-engine tests.
  • [ ] You can apply the three-question decision rule cold.

Next: M1.3 — The three-tier architecture — the internal layering, used as a map for locating failures.