Testing

Introduction

The Sandbox package includes FakeSandbox, a test double that implements the same CanExecuteCommand interface as all real drivers. It lets you write fast, deterministic tests for code that depends on sandbox execution -- without spawning any processes, pulling container images, or requiring any system binaries.

FakeSandbox supports canned responses, FIFO queuing for repeated commands, a default fallback response, array-based response definitions, command recording, streaming callback simulation, and custom policies.

Creating a Fake

The fromResponses() Factory

The quickest way to create a fake is with the fromResponses() static factory. Pass an associative array whose keys are the expected command strings and whose values are lists of ExecResult objects to return in order:

use Cognesy\Sandbox\Data\ExecResult;
use Cognesy\Sandbox\Testing\FakeSandbox;

$sandbox = FakeSandbox::fromResponses([
    'php -v' => [
        new ExecResult(
            stdout: 'PHP 8.3.0 (cli)',
            stderr: '',
            exitCode: 0,
            duration: 0.01,
        ),
    ],
]);

$result = $sandbox->execute(['php', '-v']);
echo $result->stdout(); // "PHP 8.3.0 (cli)"

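The command key is formed by joining the argv array with spaces: ['php', '-v'] becomes 'php -v'. Arguments are joined as-is, without quoting, so ['php', '-r', 'echo 1;'] becomes 'php -r echo 1;'. The key in your response map must match this string exactly.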
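The joining rule can be reproduced in a test with a one-line helper. The sketch below is illustrative only: commandKey() is a hypothetical name and is not part of the Sandbox package. It applies the rule exactly as stated, a plain join with single spaces.

```php
<?php

/**
 * Hypothetical helper (not part of the Sandbox package): builds the
 * response-map key for an argv array by joining it with single spaces.
 *
 * @param list<string> $argv
 */
function commandKey(array $argv): string
{
    return implode(' ', $argv);
}

echo commandKey(['php', '-v']), PHP_EOL;            // php -v
echo commandKey(['php', '-r', 'echo 1;']), PHP_EOL; // php -r echo 1;

// A plain join applies no quoting, so these two argv arrays share a key:
var_dump(commandKey(['echo', 'a b']) === commandKey(['echo', 'a', 'b'])); // bool(true)
```

Building keys with a helper like this keeps the response map and the argv passed to execute() in sync, which avoids hard-to-spot mismatches such as a doubled space.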

The Constructor

For more control, use the constructor directly. This lets you specify a custom ExecutionPolicy:

use Cognesy\Sandbox\Config\ExecutionPolicy;

$policy = ExecutionPolicy::in('/tmp')->withTimeout(60);

$sandbox = new FakeSandbox(
    policy: $policy,
    responses: [
        'php script.php' => [
            new ExecResult(stdout: 'done', stderr: '', exitCode: 0, duration: 1.5),
        ],
    ],
);

echo $sandbox->policy()->timeoutSeconds(); // 60

The fromResponses() factory uses ExecutionPolicy::default() when no policy is specified.

Queueing Multiple Responses

When the same command is called multiple times, provide multiple results in the list. They are consumed in FIFO order -- each call to execute() shifts the next response off the queue:

$sandbox = FakeSandbox::fromResponses([
    'php script.php' => [
        new ExecResult(stdout: 'first run',  stderr: '', exitCode: 0, duration: 0.1),
        new ExecResult(stdout: 'second run', stderr: '', exitCode: 0, duration: 0.2),
        new ExecResult(stdout: 'third run',  stderr: '', exitCode: 0, duration: 0.3),
    ],
]);

$sandbox->execute(['php', 'script.php'])->stdout(); // "first run"
$sandbox->execute(['php', 'script.php'])->stdout(); // "second run"
$sandbox->execute(['php', 'script.php'])->stdout(); // "third run"

If the queue is exhausted and no default response is set, the next call throws a RuntimeException.

Default Response

To provide a fallback for any command that does not have a specific canned response, pass a defaultResponse:

$sandbox = FakeSandbox::fromResponses(
    responses: [],
    defaultResponse: new ExecResult(
        stdout: '',
        stderr: 'command not found',
        exitCode: 127,
        duration: 0.0,
    ),
);

$result = $sandbox->execute(['anything', '--help']);
echo $result->exitCode(); // 127

The default response is also used when a command's specific queue has been exhausted.
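The lookup order described in the last two sections (queued response first, then the default, then an exception) can be modeled in a few lines. The class below is a simplified stand-in for illustration, not FakeSandbox's actual implementation: ResponseQueueModel is a hypothetical name, plain strings take the place of ExecResult objects, and the exception message is invented.

```php
<?php

/**
 * Simplified model of the lookup order described above. This is an
 * illustrative stand-in, not FakeSandbox's actual implementation.
 */
final class ResponseQueueModel
{
    /** @param array<string, list<string>> $queues canned values per command key */
    public function __construct(
        private array $queues = [],
        private ?string $default = null,
    ) {}

    public function next(string $key): string
    {
        // 1. A queued response for this key wins and is consumed (FIFO).
        if (!empty($this->queues[$key])) {
            return array_shift($this->queues[$key]);
        }
        // 2. Otherwise fall back to the default, which this model never consumes.
        if ($this->default !== null) {
            return $this->default;
        }
        // 3. With nothing left to return, fail loudly (message is made up).
        throw new RuntimeException("No response queued for: {$key}");
    }
}

$model = new ResponseQueueModel(['php -v' => ['first', 'second']], 'fallback');

echo $model->next('php -v'), PHP_EOL; // first
echo $model->next('php -v'), PHP_EOL; // second
echo $model->next('php -v'), PHP_EOL; // fallback (queue exhausted)
echo $model->next('ls'), PHP_EOL;     // fallback (no queue for this key)
```

In practice this means a default response can hide a test that calls a command more often than expected. If the number of calls matters, leave the default unset so that the extra call throws.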

Enqueuing Responses After Construction

You can add responses to an existing fake at any time using the enqueue() method. This is useful in test setups where you build the fake incrementally:

$sandbox = FakeSandbox::fromResponses([]);

$sandbox->enqueue('php -v', new ExecResult(
    stdout: 'PHP 8.3.0',
    stderr: '',
    exitCode: 0,
    duration: 0.01,
));

$sandbox->enqueue('php -v', new ExecResult(
    stdout: 'PHP 8.3.1',
    stderr: '',
    exitCode: 0,
    duration: 0.01,
));

$sandbox->execute(['php', '-v'])->stdout(); // "PHP 8.3.0"
$sandbox->execute(['php', '-v'])->stdout(); // "PHP 8.3.1"

Array-Based Responses

For convenience, you can define responses as associative arrays instead of ExecResult objects. The fake normalizes them automatically:

$sandbox = FakeSandbox::fromResponses([
    'php -v' => [
        ['stdout' => 'PHP 8.3.0', 'exit_code' => 0, 'duration' => 0.01],
    ],
    'php script.php' => [
        ['stdout' => 'output', 'stderr' => 'warning', 'exit_code' => 0],
    ],
]);

The recognized keys match the ExecResult::toArray() format:

| Key | Type | Default |
| --- | --- | --- |
| stdout | string | '' |
| stderr | string | '' |
| exit_code | int | 0 |
| duration | float | 0.0 |
| timed_out | bool | false |
| truncated_stdout | bool | false |
| truncated_stderr | bool | false |

Any omitted key uses its default value, so you only need to specify the fields relevant to your test.
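One way to picture the default-filling step is a merge of the supplied array over the documented defaults. The function below is a hypothetical sketch of that idea, not the package's own normalization code; withResponseDefaults() is an invented name.

```php
<?php

/**
 * Hypothetical sketch of array normalization: fill in the documented
 * defaults for any key the caller omitted. Not the package's own code.
 *
 * @param array<string, mixed> $response
 * @return array<string, mixed>
 */
function withResponseDefaults(array $response): array
{
    $defaults = [
        'stdout' => '',
        'stderr' => '',
        'exit_code' => 0,
        'duration' => 0.0,
        'timed_out' => false,
        'truncated_stdout' => false,
        'truncated_stderr' => false,
    ];

    // Keys present in $response override the defaults; the rest are kept.
    return array_merge($defaults, $response);
}

$full = withResponseDefaults(['stdout' => 'PHP 8.3.0', 'exit_code' => 0]);

var_dump($full['stderr']);    // string(0) ""
var_dump($full['timed_out']); // bool(false)
```

Under the same key names, a timeout could be written as ['exit_code' => 124, 'timed_out' => true], leaving every other field at its default.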

Inspecting Recorded Commands

The fake records every command it receives. Use the commands() method to retrieve the full history:

$sandbox = FakeSandbox::fromResponses([
    'php -v' => [
        new ExecResult(stdout: 'PHP 8.3.0', stderr: '', exitCode: 0, duration: 0.01),
    ],
    'php -r echo 1;' => [
        new ExecResult(stdout: '1', stderr: '', exitCode: 0, duration: 0.01),
    ],
]);

$sandbox->execute(['php', '-v']);
$sandbox->execute(['php', '-r', 'echo 1;']);

$commands = $sandbox->commands();
// [
//     ['php', '-v'],
//     ['php', '-r', 'echo 1;'],
// ]

This is useful for asserting that your code called the expected commands in the expected order.
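When a test cares about the whole call sequence, it can be easier to compare command strings than nested arrays. The helper below is a hypothetical convenience, not part of the package. It works on sample data in the shape shown above, and in a real test the input would be $sandbox->commands().

```php
<?php

/**
 * Hypothetical helper: flattens recorded argv arrays into command strings
 * so a whole call sequence can be compared in one assertion.
 *
 * @param list<list<string>> $commands shape returned by commands()
 * @return list<string>
 */
function commandLines(array $commands): array
{
    return array_map(
        static fn (array $argv): string => implode(' ', $argv),
        $commands,
    );
}

// Sample data in the shape shown above; in a test this would be
// $sandbox->commands().
$recorded = [
    ['php', '-v'],
    ['php', '-r', 'echo 1;'],
];

// Strict comparison checks both the commands and their order.
assert(commandLines($recorded) === ['php -v', 'php -r echo 1;']);
```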

Standard Input Recording

When stdin is provided, it is appended to the recorded argv as a [stdin=...] entry:

$sandbox = FakeSandbox::fromResponses([
    'php -r echo fgets(STDIN);' => [
        new ExecResult(stdout: 'hello', stderr: '', exitCode: 0, duration: 0.01),
    ],
]);

$sandbox->execute(['php', '-r', 'echo fgets(STDIN);'], 'hello');

$commands = $sandbox->commands();
// [
//     ['php', '-r', 'echo fgets(STDIN);', '[stdin=hello]'],
// ]
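To assert on argv and stdin separately, a test can split the trailing entry back off. The sketch below assumes the recording format shown above, a final "[stdin=...]" element. splitRecordedStdin() is a hypothetical name, and a genuine argument that happens to look like "[stdin=...]" would be indistinguishable from recorded stdin.

```php
<?php

/**
 * Hypothetical helper: splits one recorded command into its argv and the
 * stdin payload, assuming stdin is recorded as a trailing "[stdin=...]" entry.
 *
 * @param list<string> $recorded
 * @return array{argv: list<string>, stdin: ?string}
 */
function splitRecordedStdin(array $recorded): array
{
    $last = end($recorded);

    // The "s" flag lets the payload span multiple lines.
    if (is_string($last) && preg_match('/^\[stdin=(.*)\]\z/s', $last, $m) === 1) {
        return ['argv' => array_slice($recorded, 0, -1), 'stdin' => $m[1]];
    }

    // No stdin marker: the recorded entry is the argv itself.
    return ['argv' => $recorded, 'stdin' => null];
}

$call = splitRecordedStdin(['php', '-r', 'echo fgets(STDIN);', '[stdin=hello]']);

assert($call['argv'] === ['php', '-r', 'echo fgets(STDIN);']);
assert($call['stdin'] === 'hello');
```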

Streaming Callback Support

The FakeSandbox honors the streaming callback, just like real drivers. When a callback is provided, the fake delivers stdout and stderr from the canned response as single chunks:

$sandbox = FakeSandbox::fromResponses([
    'php script.php' => [
        new ExecResult(stdout: 'output', stderr: 'warning', exitCode: 0, duration: 0.1),
    ],
]);

$chunks = [];
$sandbox->execute(
    ['php', 'script.php'],
    null,
    function (string $type, string $chunk) use (&$chunks) {
        $chunks[] = [$type, $chunk];
    }
);

// $chunks === [['out', 'output'], ['err', 'warning']]

Empty stdout or stderr is not delivered to the callback, matching the behavior of real drivers.
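If the code under test accumulates streamed output, a small invokable object can stand in for the closure and keep per-stream buffers. OutputCollector below is a hypothetical class, not part of the package. It assumes the two stream labels shown above ('out' and 'err'), and the example calls it directly to simulate what the fake would deliver.

```php
<?php

/**
 * Hypothetical callback object that buffers streamed chunks per stream.
 * It assumes the two stream labels shown above: 'out' and 'err'.
 */
final class OutputCollector
{
    /** @var array<string, string> */
    private array $buffers = ['out' => '', 'err' => ''];

    public function __invoke(string $type, string $chunk): void
    {
        // Append rather than overwrite, so several chunks per stream also work.
        $this->buffers[$type] = ($this->buffers[$type] ?? '') . $chunk;
    }

    public function stdout(): string
    {
        return $this->buffers['out'];
    }

    public function stderr(): string
    {
        return $this->buffers['err'];
    }
}

$collector = new OutputCollector();

// Simulates the two calls the fake would make for the response above.
$collector('out', 'output');
$collector('err', 'warning');

assert($collector->stdout() === 'output');
assert($collector->stderr() === 'warning');
```

An invokable object satisfies a callable parameter. If the execute() callback parameter turns out to be typed as Closure instead, Closure::fromCallable($collector) converts it.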

Testing Failure Scenarios

Simulating Timeouts

Create an ExecResult with timedOut: true and exit code 124:

$sandbox = FakeSandbox::fromResponses([
    'php long-script.php' => [
        new ExecResult(
            stdout: 'partial output...',
            stderr: '',
            exitCode: 124,
            duration: 30.0,
            timedOut: true,
        ),
    ],
]);

$result = $sandbox->execute(['php', 'long-script.php']);
assert($result->timedOut() === true);
assert($result->exitCode() === 124);

Simulating Truncated Output

Set the truncation flags to test how your code handles oversized output:

$sandbox = FakeSandbox::fromResponses([
    'php noisy-script.php' => [
        new ExecResult(
            stdout: '...last portion of output',
            stderr: '',
            exitCode: 0,
            duration: 5.0,
            truncatedStdout: true,
        ),
    ],
]);

$result = $sandbox->execute(['php', 'noisy-script.php']);
assert($result->truncatedStdout() === true);

Simulating Command Failures

Test error handling by returning non-zero exit codes:

$sandbox = FakeSandbox::fromResponses([
    'php broken.php' => [
        new ExecResult(
            stdout: '',
            stderr: 'PHP Fatal error: ...',
            exitCode: 255,
            duration: 0.5,
        ),
    ],
]);

$result = $sandbox->execute(['php', 'broken.php']);
assert($result->success() === false);
assert($result->exitCode() === 255);
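The three scenarios above often feed the same branch of application code, so it can help to cover them in one table-driven test. The sketch below is hypothetical: describeOutcome() is an invented helper standing in for code under test, and it reads results in the array format listed under "Array-Based Responses" rather than from ExecResult objects.

```php
<?php

/**
 * Hypothetical helper: labels a result given in the array format listed
 * under "Array-Based Responses". Checks run from most to least severe.
 *
 * @param array<string, mixed> $result
 */
function describeOutcome(array $result): string
{
    if ($result['timed_out'] ?? false) {
        return 'timeout';
    }
    if (($result['exit_code'] ?? 0) !== 0) {
        return 'failure';
    }
    if (($result['truncated_stdout'] ?? false) || ($result['truncated_stderr'] ?? false)) {
        return 'truncated';
    }
    return 'ok';
}

// One row per scenario from this section, plus the happy path.
$cases = [
    'timeout'   => ['exit_code' => 124, 'timed_out' => true],
    'failure'   => ['exit_code' => 255, 'stderr' => 'PHP Fatal error: ...'],
    'truncated' => ['truncated_stdout' => true],
    'ok'        => [],
];

foreach ($cases as $expected => $result) {
    assert(describeOutcome($result) === $expected);
}
```

The timeout check comes first because a timed-out result also carries a non-zero exit code (124 in the example above) and would otherwise be reported as a plain failure.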

Using FakeSandbox with Dependency Injection

The FakeSandbox implements CanExecuteCommand, so it can be injected anywhere a real sandbox is expected. This is the recommended approach for testing application services:

use Cognesy\Sandbox\Contracts\CanExecuteCommand;

class CodeExecutor
{
    public function __construct(
        private readonly CanExecuteCommand $sandbox,
    ) {}

    public function run(string $code): string
    {
        $result = $this->sandbox->execute(['php', '-r', $code]);
        if (!$result->success()) {
            throw new \RuntimeException("Execution failed: " . $result->stderr());
        }
        return $result->stdout();
    }
}

// In your test:
$mock = FakeSandbox::fromResponses([
    'php -r echo "hello";' => [
        new ExecResult(stdout: 'hello', stderr: '', exitCode: 0, duration: 0.01),
    ],
]);

$executor = new CodeExecutor($mock);
assert($executor->run('echo "hello";') === 'hello');

// Verify the command was called
assert($mock->commands() === [['php', '-r', 'echo "hello";']]);

Quick Reference

| Method | Purpose |
| --- | --- |
| FakeSandbox::fromResponses($responses, $defaultResponse) | Create fake with canned responses and optional default |
| new FakeSandbox($policy, $responses, $defaultResponse) | Create fake with custom policy |
| $mock->enqueue($key, $result) | Add a response to the queue after construction |
| $mock->execute($argv, $stdin, $onOutput) | Execute and return next canned response |
| $mock->commands() | Get all recorded commands as list<list<string>> |
| $mock->policy() | Access the execution policy |