Skip to content

M2.4 — Reading results

What you'll learn

  • The four result surfaces Gauntlet produces: exit code, per-role logs, collected /Saved artifacts, and crash/ensure detection.
  • How a pass/fail verdict is actually determined.
  • A triage order for a red run that turns "it failed" into "it failed here, because."

How it applies (QA)

Authoring tests is a fraction of the job; reading their output is most of it. A test owner who knows where each signal lives can triage a nightly failure in minutes instead of re-running blind. This lesson is the one you'll return to most.

Concepts

Four result surfaces

  1. Exit code. The overall run returns 0 for pass, non-zero for fail. This is what CI keys on. It's the verdict, not the explanation.
  2. Per-role logs. Each role (each launched process) produces its own Unreal log. Gauntlet parses these for the conditions it cares about and preserves them for you. A 4-client + server session yields five logs — read the one for the role that failed.
  3. Collected /Saved artifacts. Gauntlet gives you access to the /Saved directory each run produced — logs, crash dumps, screenshots, profiling captures, any files the game wrote. After a remote/devkit run, these are pulled back so you can inspect them locally.
  4. Crash / ensure detection. Gauntlet parses logs and crash output to detect crashes, ensures, and fatal errors automatically, and folds them into the verdict and the report — you don't hand-grep for "Fatal."

How the verdict is decided

A run's pass/fail combines:

  • Did the session complete as the test required? (all roles launched, ran, exited as expected)
  • Did the test node's own success condition hold? (its log marker matched, its exit code was right, its TestController reported success — whatever that node asserts).
  • Were there disqualifying events? A detected crash/fatal/ensure typically fails the run even if the nominal success condition was met.

For the built-in tests the success condition is pre-defined (e.g. UE.BootTest: reached a running state, no crash, within the timeout). For your tests (M4) you define it. Either way the node's verdict becomes the process exit code CI sees.

The artifact layout (orientation)

A run's collected output is organized per-role under the run's artifact directory — conceptually:

<ArtifactRoot>/<TestName>/
   ├─ Client0/  → its Unreal log, its /Saved (crashes, screenshots, ...)
   ├─ Client1/
   ├─ Server/
   └─ <run summary / report>

Verify on a real build: exact folder names, the report file format, and the artifact root location are configured per studio/version. The shape — per-role subtrees plus a summary — is the stable part.

Pitfall: reading the wrong role's log

In a multi-role session the server may be the thing that died while a client log looks clean (it just lost connection). Start from the role the verdict blames, and when a client "disconnected" for no clear reason, check the server log before concluding the client is broken.

Worked example — triage order

A nightly UE.BootTest on PS5 is red. Work the surfaces in order:

  1. Exit code — non-zero confirms failure (CI already knew). No detail yet.
  2. Crash/ensure summary — does the report list a crash? If yes, jump to the crash dump in that role's /Saved. Often this ends triage immediately.
  3. The failing role's log — no crash listed? Read the tail of the role's Unreal log: did it hang before the boot marker (timeout) or log an error? "Reached menu" present but still failed points at the success condition/timeout, not the game.
  4. /Saved artifacts — screenshots/profiling to corroborate (e.g. a black screen at the "booted" moment = booted but broken render).

The discipline: each surface narrows the next. You almost never need all four — the crash summary or the log tail resolves most reds.

Exercise 1 — Match signal to surface

Which result surface answers each question?

  1. "Did CI consider this run a pass?"
  2. "Did the server crash?"
  3. "What did the screen actually look like when it 'booted'?"
  4. "Did client 2 ever print the join-success line?"

Lab — Write a triage note from a log tail

Given this (synthetic) failing-role log tail, write the two-line triage note you'd post: (a) your best hypothesis, (b) the next surface you'd open.

[2026.06.19-03.14.07] LogLoad: Game class is 'ShooterGameMode'
[2026.06.19-03.14.09] LogShooter: Display: Reached MainMenu
[2026.06.19-03.16.09] LogGauntlet: Error: Test timed out after 120s waiting for marker 'PlayLevelLoaded'

Exercise 2 — Verdict logic

A custom test's success condition (log marker Smoke OK) matched, but the run still failed. Name the most likely disqualifying event and where you'd confirm it.

Self-check — answers

Exercise 1: 1 exit code, 2 crash/ensure detection (then the server's /Saved crash dump), 3 /Saved artifacts (screenshot), 4 the per-role (client 2) log.

Lab (example note): (a) Game booted and reached MainMenu fine; failure is a timeout waiting for the PlayLevelLoaded marker — likely the level-load step never completed or never emits that marker, not a boot problem. (b) Open Client0's /Saved (screenshot + full log) to see whether level load stalled or the marker name/emit is wrong. The key read: success came after the boot markers, so suspect the level-load/marker, not startup.

Exercise 2: A detected crash/fatal/ensure during the run — these typically fail the verdict even when the nominal success marker matched. Confirm it in the run's crash/ensure summary, then the offending role's crash dump in /Saved.

Done when

  • [ ] You can name the four result surfaces and what each tells you.
  • [ ] You can explain how exit code, success condition, and crash detection combine into a verdict.
  • [ ] You can state a triage order and justify why each surface narrows the next.
  • [ ] You can write a concise triage note from a log tail.

Next: M3.1 — UnrealSession, roles, instances.