M2.4 — Reading results¶

What you'll learn

The four result surfaces Gauntlet produces: exit code, per-role logs, collected /Saved artifacts, and crash/ensure detection.
How a pass/fail verdict is actually determined.
A triage order for a red run that turns "it failed" into "it failed here, because."

How it applies (QA)

Authoring tests is a fraction of the job; reading their output is most of it. A test owner who knows where each signal lives can triage a nightly failure in minutes instead of re-running blind. This lesson is the one you'll return to most.

Concepts¶

Four result surfaces¶

Exit code. The overall run returns 0 for pass, non-zero for fail. This is what CI keys on. It's the verdict, not the explanation.
Per-role logs. Each role (each launched process) produces its own Unreal log. Gauntlet parses these for the conditions it cares about and preserves them for you. A 4-client + server session yields five logs — read the one for the role that failed.
Collected /Saved artifacts. Gauntlet gives you access to the /Saved directory each run produced — logs, crash dumps, screenshots, profiling captures, any files the game wrote. After a remote/devkit run, these are pulled back so you can inspect them locally.
Crash / ensure detection. Gauntlet parses logs and crash output to detect crashes, ensures, and fatal errors automatically, and folds them into the verdict and the report — you don't hand-grep for "Fatal."

How the verdict is decided¶

A run's pass/fail combines:

Did the session complete as the test required? (all roles launched, ran, exited as expected)
Did the test node's own success condition hold? (its log marker matched, its exit code was right, its TestController reported success — whatever that node asserts).
Were there disqualifying events? A detected crash/fatal/ensure typically fails the run even if the nominal success condition was met.

For the built-in tests the success condition is pre-defined (e.g. UE.BootTest: reached a running state, no crash, within the timeout). For your tests (M4) you define it. Either way the node's verdict becomes the process exit code CI sees.

The artifact layout (orientation)¶

A run's collected output is organized per-role under the run's artifact directory — conceptually:

<ArtifactRoot>/<TestName>/
   ├─ Client0/  → its Unreal log, its /Saved (crashes, screenshots, ...)
   ├─ Client1/
   ├─ Server/
   └─ <run summary / report>

Verify on a real build: exact folder names, the report file format, and the artifact root location are configured per studio/version. The shape — per-role subtrees plus a summary — is the stable part.

Pitfall: reading the wrong role's log

In a multi-role session the server may be the thing that died while a client log looks clean (it just lost connection). Start from the role the verdict blames, and when a client "disconnected" for no clear reason, check the server log before concluding the client is broken.

Worked example — triage order¶

A nightly UE.BootTest on PS5 is red. Work the surfaces in order:

Exit code — non-zero confirms failure (CI already knew). No detail yet.
Crash/ensure summary — does the report list a crash? If yes, jump to the crash dump in that role's /Saved. Often this ends triage immediately.
The failing role's log — no crash listed? Read the tail of the role's Unreal log: did it hang before the boot marker (timeout) or log an error? "Reached menu" present but still failed points at the success condition/timeout, not the game.
/Saved artifacts — screenshots/profiling to corroborate (e.g. a black screen at the "booted" moment = booted but broken render).

The discipline: each surface narrows the next. You almost never need all four — the crash summary or the log tail resolves most reds.

Exercise 1 — Match signal to surface

Which result surface answers each question?

"Did CI consider this run a pass?"
"Did the server crash?"
"What did the screen actually look like when it 'booted'?"
"Did client 2 ever print the join-success line?"

Lab — Write a triage note from a log tail

Given this (synthetic) failing-role log tail, write the two-line triage note you'd post: (a) your best hypothesis, (b) the next surface you'd open.

[2026.06.19-03.14.07] LogLoad: Game class is 'ShooterGameMode'
[2026.06.19-03.14.09] LogShooter: Display: Reached MainMenu
[2026.06.19-03.16.09] LogGauntlet: Error: Test timed out after 120s waiting for marker 'PlayLevelLoaded'

Exercise 2 — Verdict logic

A custom test's success condition (log marker Smoke OK) matched, but the run still failed. Name the most likely disqualifying event and where you'd confirm it.

Self-check — answers

Exercise 1: 1 exit code, 2 crash/ensure detection (then the server's /Saved crash dump), 3 /Saved artifacts (screenshot), 4 the per-role (client 2) log.

Lab (example note): (a) Game booted and reached MainMenu fine; failure is a timeout waiting for the PlayLevelLoaded marker — likely the level-load step never completed or never emits that marker, not a boot problem. (b) Open Client0's /Saved (screenshot + full log) to see whether level load stalled or the marker name/emit is wrong. The key read: success came after the boot markers, so suspect the level-load/marker, not startup.

Exercise 2: A detected crash/fatal/ensure during the run — these typically fail the verdict even when the nominal success marker matched. Confirm it in the run's crash/ensure summary, then the offending role's crash dump in /Saved.

Done when

[ ] You can name the four result surfaces and what each tells you.
[ ] You can explain how exit code, success condition, and crash detection combine into a verdict.
[ ] You can state a triage order and justify why each surface narrows the next.
[ ] You can write a concise triage note from a log tail.

Next: M3.1 — UnrealSession, roles, instances.