M2.4 — Reading results¶
What you'll learn
- The four result surfaces Gauntlet produces: exit code, per-role logs, collected
/Savedartifacts, and crash/ensure detection. - How a pass/fail verdict is actually determined.
- A triage order for a red run that turns "it failed" into "it failed here, because."
How it applies (QA)
Authoring tests is a fraction of the job; reading their output is most of it. A test owner who knows where each signal lives can triage a nightly failure in minutes instead of re-running blind. This lesson is the one you'll return to most.
Concepts¶
Four result surfaces¶
- Exit code. The overall run returns 0 for pass, non-zero for fail. This is what CI keys on. It's the verdict, not the explanation.
- Per-role logs. Each role (each launched process) produces its own Unreal log. Gauntlet parses these for the conditions it cares about and preserves them for you. A 4-client + server session yields five logs — read the one for the role that failed.
- Collected
/Savedartifacts. Gauntlet gives you access to the/Saveddirectory each run produced — logs, crash dumps, screenshots, profiling captures, any files the game wrote. After a remote/devkit run, these are pulled back so you can inspect them locally. - Crash / ensure detection. Gauntlet parses logs and crash output to detect crashes, ensures, and fatal errors automatically, and folds them into the verdict and the report — you don't hand-grep for "Fatal."
How the verdict is decided¶
A run's pass/fail combines:
- Did the session complete as the test required? (all roles launched, ran, exited as expected)
- Did the test node's own success condition hold? (its log marker matched, its exit code was
right, its
TestControllerreported success — whatever that node asserts). - Were there disqualifying events? A detected crash/fatal/ensure typically fails the run even if the nominal success condition was met.
For the built-in tests the success condition is pre-defined (e.g. UE.BootTest: reached a
running state, no crash, within the timeout). For your tests (M4) you define it. Either way the
node's verdict becomes the process exit code CI sees.
The artifact layout (orientation)¶
A run's collected output is organized per-role under the run's artifact directory — conceptually:
<ArtifactRoot>/<TestName>/
├─ Client0/ → its Unreal log, its /Saved (crashes, screenshots, ...)
├─ Client1/
├─ Server/
└─ <run summary / report>
Verify on a real build: exact folder names, the report file format, and the artifact root location are configured per studio/version. The shape — per-role subtrees plus a summary — is the stable part.
Pitfall: reading the wrong role's log
In a multi-role session the server may be the thing that died while a client log looks clean (it just lost connection). Start from the role the verdict blames, and when a client "disconnected" for no clear reason, check the server log before concluding the client is broken.
Worked example — triage order¶
A nightly UE.BootTest on PS5 is red. Work the surfaces in order:
- Exit code — non-zero confirms failure (CI already knew). No detail yet.
- Crash/ensure summary — does the report list a crash? If yes, jump to the crash dump in that
role's
/Saved. Often this ends triage immediately. - The failing role's log — no crash listed? Read the tail of the role's Unreal log: did it hang before the boot marker (timeout) or log an error? "Reached menu" present but still failed points at the success condition/timeout, not the game.
/Savedartifacts — screenshots/profiling to corroborate (e.g. a black screen at the "booted" moment = booted but broken render).
The discipline: each surface narrows the next. You almost never need all four — the crash summary or the log tail resolves most reds.
Exercise 1 — Match signal to surface
Which result surface answers each question?
- "Did CI consider this run a pass?"
- "Did the server crash?"
- "What did the screen actually look like when it 'booted'?"
- "Did client 2 ever print the join-success line?"
Lab — Write a triage note from a log tail
Given this (synthetic) failing-role log tail, write the two-line triage note you'd post: (a) your best hypothesis, (b) the next surface you'd open.
Exercise 2 — Verdict logic
A custom test's success condition (log marker Smoke OK) matched, but the run still failed. Name
the most likely disqualifying event and where you'd confirm it.
Self-check — answers
Exercise 1: 1 exit code, 2 crash/ensure detection (then the server's /Saved crash dump), 3
/Saved artifacts (screenshot), 4 the per-role (client 2) log.
Lab (example note): (a) Game booted and reached MainMenu fine; failure is a timeout waiting
for the PlayLevelLoaded marker — likely the level-load step never completed or never emits that
marker, not a boot problem. (b) Open Client0's /Saved (screenshot + full log) to see whether
level load stalled or the marker name/emit is wrong. The key read: success came after the
boot markers, so suspect the level-load/marker, not startup.
Exercise 2: A detected crash/fatal/ensure during the run — these typically fail the
verdict even when the nominal success marker matched. Confirm it in the run's crash/ensure
summary, then the offending role's crash dump in /Saved.
Done when
- [ ] You can name the four result surfaces and what each tells you.
- [ ] You can explain how exit code, success condition, and crash detection combine into a verdict.
- [ ] You can state a triage order and justify why each surface narrows the next.
- [ ] You can write a concise triage note from a log tail.