M4.4 — TestExecutor

What you'll learn

  • What TestExecutor does: create, queue, and monitor a set of tests.
  • How parallel execution works and what a test must satisfy to run in parallel safely.
  • Why unique port allocation exists and the isolation discipline it implies for your tests.

How it applies (QA)

TestExecutor is why a Gauntlet job can run several tests, and why two parallel tests don't stomp each other's ports. When a test passes alone but fails in a batch, the cause is almost always an isolation violation that running alongside other tests exposed. Knowing the executor's model turns "flaky in parallel" from a mystery into something you can diagnose.

Concepts

What TestExecutor owns

TestExecutor handles "creation, queuing, and monitoring of a set of tests." RunUnreal parses -test= into a list of nodes and hands that list to the executor, which:

  • Creates each test node and gives it its context.
  • Queues them, respecting how many can run at once.
  • Monitors each running node — ticking it, watching for completion, enforcing duration.
  • Aggregates results into the overall run verdict/exit code.

You don't call it; RunUnreal does. But its behavior shapes how your tests must be written.
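The four responsibilities above can be sketched as a small loop. This is a toy model in Python, not Gauntlet's implementation: SketchNode, run_all, max_parallel, and max_ticks are hypothetical names invented for illustration, and a "tick" here is just a counter increment.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class SketchNode:
    """Hypothetical stand-in for a test node; not a Gauntlet type."""
    name: str
    ticks_needed: int        # ticks until the node reports completion
    will_pass: bool = True
    max_ticks: int = 10      # stand-in for a maximum duration
    ticks: int = 0

    def tick(self) -> None:
        self.ticks += 1

    @property
    def done(self) -> bool:
        return self.ticks >= self.ticks_needed


def run_all(nodes, max_parallel=1):
    """Queue nodes, run up to max_parallel at once, return (results, exit_code)."""
    pending = deque(nodes)
    running = []
    results = {}
    while pending or running:
        # Queue: start nodes while there is a free slot.
        while pending and len(running) < max_parallel:
            running.append(pending.popleft())
        # Monitor: tick each running node, then check completion and duration.
        for node in list(running):
            node.tick()
            if node.done:
                results[node.name] = "passed" if node.will_pass else "failed"
                running.remove(node)
            elif node.ticks >= node.max_ticks:
                results[node.name] = "timed out"
                running.remove(node)
    # Aggregate: any node that did not pass fails the whole run.
    exit_code = 0 if all(r == "passed" for r in results.values()) else 1
    return results, exit_code


nodes = [SketchNode("a", 2), SketchNode("b", 3, will_pass=False), SketchNode("c", 50)]
results, exit_code = run_all(nodes, max_parallel=2)
```

In this run, "a" passes, "b" fails, and "c" exceeds its tick budget, so the aggregated exit code is non-zero even though one test passed.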

Parallel execution — opt-in safety

Multiple tests "can be executed in parallel if the TestNode supports such a thing." Parallelism is not free or automatic: a node must be safe to run alongside others. The executor handles the mechanics that make parallelism possible — most notably allocating unique ports so two simultaneous sessions don't both grab the same port and collide.
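To make "unique ports" concrete, here is a minimal sketch of one way an allocator could hand out ports. The source does not say how the executor picks its ports, so PortAllocator, its base_port default, and the count-upward strategy are all assumptions made for illustration.

```python
import threading


class PortAllocator:
    """Hypothetical allocator: every call returns a port no earlier caller received."""

    def __init__(self, base_port: int = 7777):
        self._next = base_port
        self._lock = threading.Lock()

    def allocate(self) -> int:
        # The lock keeps two concurrent callers from reading the same value.
        with self._lock:
            port = self._next
            self._next += 1
            return port


allocator = PortAllocator()
session_a_port = allocator.allocate()
session_b_port = allocator.allocate()
```

The point is the shape of the guarantee: each session asks for a port rather than choosing one, so two simultaneous sessions cannot end up with the same number.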

Parallel execution cuts wall-clock time but costs resources: each parallel test needs its own processes or devices. On a single PC you can parallelize cheap tests; on a finite devkit pool, the number of devices caps how many tests can run at once (M3.3).

The isolation contract

For a test to behave the same alone and in a batch, it must not depend on anything another test might also touch. Practical rules:

  • Never hardcode ports. Let the framework's unique-port allocation assign them. A hardcoded 7777 works alone and collides the instant a second test (or a second instance) wants it.
  • Don't assume a fixed path or file. Two tests that write the same scratch file race each other. Use per-run artifact locations.
  • Don't depend on global device state another test could change (a kit you both reboot, a shared save).
  • Don't rely on run order. The executor may schedule or parallelize tests in a different order from the one in which you listed them.

A test that obeys these is isolatable; one that doesn't is the classic "green solo, red in the nightly batch."
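The "per-run artifact locations" rule can be illustrated with Python's standard tempfile module. This is a generic sketch, not how Gauntlet lays out its artifacts; make_run_dir and write_result are hypothetical helpers.

```python
import tempfile
from pathlib import Path


def make_run_dir(test_name: str) -> Path:
    """Create a fresh scratch directory for one run of one test."""
    # mkdtemp creates a new, uniquely named directory on every call.
    return Path(tempfile.mkdtemp(prefix=f"{test_name}_"))


def write_result(run_dir: Path, text: str) -> Path:
    """Write the result file inside this run's own directory."""
    out = run_dir / "result.txt"
    out.write_text(text, encoding="utf-8")
    return out


# Two runs of the same test write result.txt without touching each other.
first = write_result(make_run_dir("net_test"), "passed")
second = write_result(make_run_dir("net_test"), "failed")
```

Both runs use the same file name, but because each lives in its own directory, neither can overwrite or race the other.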

Pitfall: hardcoded port = the canonical parallel flake

A test with a literal port number is a time bomb: green in isolation, intermittently red once anything else runs concurrently, and non-reproducible when you re-run it alone to investigate. If a test flakes only in batches, grep it for hardcoded ports/paths before suspecting the game.
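A rough version of that grep can be written in a few lines of Python. The pattern is a heuristic chosen for this sketch (an identifier containing "port" assigned a 2 to 5 digit literal), and find_hardcoded_ports and the sample lines are hypothetical; expect to tune the pattern for a real codebase.

```python
import re

# Heuristic: an identifier containing "port" that is assigned a 2-5 digit literal.
PORT_LITERAL = re.compile(r"\b\w*port\w*\s*[:=]\s*(\d{2,5})\b", re.IGNORECASE)


def find_hardcoded_ports(source: str):
    """Return (line_number, literal) pairs for lines that look like hardcoded ports."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = PORT_LITERAL.search(line)
        if match:
            hits.append((lineno, int(match.group(1))))
    return hits


# Made-up sample lines: two literals, one allocated value, one unrelated setting.
sample = """\
int ServerPort = 7777;
int ServerPort = AllocatedPort;
string Map = "Arena";
-port=7778
"""
hits = find_hardcoded_ports(sample)
```

On the sample, the scan flags lines 1 and 4 and leaves the line that takes its port from a variable alone.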

Worked example — solo-green, batch-red

A team's UE.Networking-style custom test passes every time it runs by itself, but fails in roughly 1 of 4 nightly runs, where it is batched with two other networked tests. Trace it:

  1. Solo: only one session exists; its server binds port 7777 (hardcoded) — fine.
  2. Batch (parallel): two of the networked tests run at once; both try to bind 7777; one wins, the other's server fails to start → its clients log "connection refused" → red.
  3. Re-run the failing test alone to investigate → passes (no contention) → looks like flake.

Root cause: a hardcoded port defeating the executor's unique-port allocation. Fix: take the port from the framework instead of a literal. The symptom (intermittent, batch-only, non-reproducible solo) is the fingerprint of an isolation bug, not a game bug.
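The collision in step 2 is easy to reproduce with plain sockets. This sketch uses Python's standard socket module rather than anything from Gauntlet; start_server is a hypothetical helper, and the operating system (via port 0) plays the role of the framework's allocator in the fix.

```python
import socket


def start_server(port: int) -> socket.socket:
    """Bind and listen on localhost; port 0 asks the OS for any free port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", port))
    except OSError:
        server.close()
        raise
    server.listen()
    return server


# Test A starts first. Its port stands in for the literal 7777; letting the OS
# pick it keeps the demo from failing on a machine where 7777 is already busy.
server_a = start_server(0)
shared_port = server_a.getsockname()[1]

# Test B, running at the same time, insists on the same port and cannot bind.
try:
    start_server(shared_port).close()
    collided = False
except OSError:
    collided = True

# The fix: Test B takes a port it was given instead of insisting on a literal.
server_b = start_server(0)
ports_differ = server_b.getsockname()[1] != shared_port

server_a.close()
server_b.close()
```

Run alone, either server starts without trouble, which is exactly why the failure disappears when you re-run the test in isolation.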

Exercise 1 — Isolatable or not?

Mark each test design safe or unsafe for parallel execution, and why:

  1. Binds a server to a port the framework allocates.
  2. Writes results to C:\Temp\result.txt.
  3. Reboots "the devkit" by a fixed name at startup.
  4. Reads only its own role's logs and the build it was given by context.

Exercise 2 — Fingerprint the failure

A test is green when run alone, red ~30% of the time in the batched nightly, and green again every time you re-run it in isolation to debug. Name the bug class and the first thing you'd grep for.

Self-check — answers

Exercise 1: 1 safe (uses allocated port); 2 unsafe (shared fixed path → races with any other test using it); 3 unsafe (mutates shared device state another test may rely on); 4 safe (touches only per-run, context-provided resources).

Exercise 2: An isolation / shared-resource bug (not a product flake) — the batch-only, non-reproducible-solo signature is its fingerprint. First grep: hardcoded ports, then fixed file paths and shared device names. Let the executor's unique-port allocation do its job instead of fighting it with a literal.

Done when

  • [ ] You can state what TestExecutor creates, queues, and monitors.
  • [ ] You can explain that parallelism is opt-in and bounded by resources/devices.
  • [ ] You can list the isolation rules (no hardcoded ports/paths, no shared device state, no order dependence).
  • [ ] You can recognize the solo-green/batch-red fingerprint and name its usual cause.

Next: M5.1 — The Gauntlet plugin & TestController.