Instructor Package Cheatsheet¶
Code-verified quick reference for packages/instructor.
Core Flow¶
use Cognesy\Instructor\StructuredOutput;
class User {
public string $name;
public int $age;
}
$user = (new StructuredOutput)
->with(
messages: 'Jason is 28 years old.',
responseModel: User::class,
)
->get();
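On success, `get()` returns a populated instance of the response model, so fields are ordinary typed properties:

```php
// $user is a User instance hydrated from the LLM extraction
echo $user->name; // e.g. "Jason"
echo $user->age;  // e.g. 28
```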
Create StructuredOutput¶
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Polyglot\Inference\Config\LLMConfig;
$so = new StructuredOutput();
$so = StructuredOutput::using('openai');
$so = StructuredOutput::using('openai', '/custom/config/path');
$so = StructuredOutput::fromConfig(LLMConfig::fromArray(['driver' => 'openai']));
Request Configuration (StructuredOutput)¶
$so = (new StructuredOutput)
->withMessages($messages)
->withInput($input)
->withRequest($request)
->withResponseModel(User::class)
->withResponseClass(User::class)
->withResponseObject(new User())
->withResponseJsonSchema($jsonSchema)
->withSystem('You are a precise extractor') // string|\Stringable
->withPrompt('Extract user profile') // string|\Stringable
->withExamples($examples)
->withModel('gpt-4o-mini')
->withOptions(['temperature' => 0])
->withOption('max_tokens', 1200)
->withStreaming(true);
Single-call variant:
$so = (new StructuredOutput)->with(
messages: $messages,
responseModel: User::class,
system: '...', // string|\Stringable|null
prompt: '...', // string|\Stringable|null
examples: $examples,
model: 'gpt-4o-mini',
options: ['temperature' => 0],
);
Runtime / Provider Setup¶
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\Config\LLMConfig;
use Cognesy\Polyglot\Inference\LLMProvider;
use Cognesy\Instructor\Enums\OutputMode;
$so = StructuredOutput::fromConfig(LLMConfig::fromArray(['driver' => 'openai']));
$runtime = StructuredOutputRuntime::fromConfig(LLMConfig::fromDsn('driver=openai,model=gpt-4o-mini'))
->withOutputMode(OutputMode::Json)
->withMaxRetries(2);
// Create from LLMProvider
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new());
// Configure with StructuredOutputConfig
$runtime = $runtime->withConfig(StructuredOutputConfig::fromArray([...]));
$runtime = $runtime->withConfig(StructuredOutputConfig::fromDsn('outputMode=json,maxRetries=2'));
$runtime = $runtime->withRequestMaterializer($customMaterializer);
$so = (new StructuredOutput)->withRuntime($runtime);
OutputMode Enum¶
use Cognesy\Instructor\Enums\OutputMode;
OutputMode::Tools; // 'tool_call' — default, uses tool/function calling
OutputMode::Json; // 'json' — JSON mode
OutputMode::JsonSchema; // 'json_schema' — structured outputs / JSON schema mode
OutputMode::MdJson; // 'md_json' — extract JSON from Markdown code blocks
OutputMode::Text; // 'text' — plain text extraction
OutputMode::Unrestricted; // 'unrestricted' — no constraints on output format
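A minimal sketch of selecting a non-default mode for a run, combining the runtime and facade patterns shown elsewhere in this cheatsheet (`$text` and `User` are placeholder names):

```php
use Cognesy\Instructor\Enums\OutputMode;
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\Config\LLMConfig;

// Prefer JsonSchema mode when the provider supports strict structured outputs;
// Tools remains the default otherwise.
$runtime = StructuredOutputRuntime::fromConfig(
        LLMConfig::fromDsn('driver=openai,model=gpt-4o-mini')
    )
    ->withOutputMode(OutputMode::JsonSchema);

$user = (new StructuredOutput)
    ->withRuntime($runtime)
    ->with(messages: $text, responseModel: User::class)
    ->get();
```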
Pipeline Overrides¶
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\Config\LLMConfig;
$runtime = StructuredOutputRuntime::fromConfig(LLMConfig::fromDsn('driver=openai,model=gpt-4o-mini'))
->withValidator($validator)
->withTransformer($transformer)
->withDeserializer($deserializer)
->withExtractor($extractor);
$so = (new StructuredOutput)->withRuntime($runtime);
Execution¶
$result = $so->get(); // parsed value
$response = $so->response(); // StructuredOutputResponse
$raw = $so->inferenceResponse(); // InferenceResponse
$stream = $so->stream(); // StructuredOutputStream
$pending = $so->create();
$result = $pending->get();
$response = $pending->response();
$raw = $pending->inferenceResponse();
$stream = $pending->stream();
$array = $pending->toArray();
$json = $pending->toJson();
$jsonObject = $pending->toJsonObject();
$execution = $pending->execution();
PendingStructuredOutput is a lazy handle:
- no provider call happens until one of the read methods above is invoked
- get(), response(), inferenceResponse(), and stream() all coordinate a single execution
- mutable lifecycle state lives in the internal execution session, not on the facade-facing handle
- long-lived streaming state is kept in the dedicated stream/state objects
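The deferred-execution lifecycle can be sketched as (assuming the `User` class from the core flow):

```php
use Cognesy\Instructor\StructuredOutput;

// Building the pending handle performs no provider call yet.
$pending = (new StructuredOutput)
    ->with(messages: 'Jason is 28 years old.', responseModel: User::class)
    ->create();

// The first read triggers the single execution...
$user = $pending->get();

// ...later reads reuse that execution instead of calling the provider again.
$response = $pending->response();
$raw = $pending->inferenceResponse();
```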
Type helpers (available on StructuredOutput and PendingStructuredOutput):
$so->getString();
$so->getInt();
$so->getFloat();
$so->getBoolean();
$so->getObject();
$so->getArray();
Additional type helper (only on PendingStructuredOutput):
Streaming (StructuredOutputStream)¶
$stream = $so->withStreaming()->stream();
foreach ($stream->partials() as $partial) {
// every parsed partial update
}
foreach ($stream->sequence() as $sequenceUpdate) {
// one update per completed sequence item (Sequence responses only)
}
foreach ($stream->responses() as $responseUpdate) {
// StructuredOutputResponse, partial or final
}
foreach ($stream->getIterator() as $rawUpdate) {
// raw emitted StructuredOutputResponse snapshots
}
$latestValue = $stream->lastUpdate();
$latestResponse = $stream->lastResponse();
$usage = $stream->usage();
$finalValue = $stream->finalValue();
$finalResponse = $stream->finalResponse();
$finalRaw = $stream->finalInferenceResponse();
lastResponse() / finalResponse() return StructuredOutputResponse.
Use ->inferenceResponse() when you need the nested raw InferenceResponse.
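For example, to reach the raw provider payload once streaming completes (a sketch; `$text` and `User` are placeholders):

```php
$stream = (new StructuredOutput)
    ->with(messages: $text, responseModel: User::class)
    ->stream();

foreach ($stream->partials() as $partial) {
    // render incremental progress
}

$final = $stream->finalResponse();   // StructuredOutputResponse
$raw = $final->inferenceResponse();  // nested raw InferenceResponse
$usage = $stream->usage();           // token usage accumulated over the stream
```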
Response Model Helpers¶
Sequence¶
use Cognesy\Instructor\Extras\Sequence\Sequence;
$people = (new StructuredOutput)
->with(
messages: $text,
responseModel: Sequence::of(Person::class),
)
->get();
$count = $people->count();
$first = $people->first();
$last = $people->last();
$item = $people->get(0);
$all = $people->all();
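Items come back as plain instances of the item class, so they can be iterated directly via `all()` (this sketch assumes `Person` has a public `name` property):

```php
foreach ($people->all() as $person) {
    echo $person->name, PHP_EOL;
}
```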
Scalar¶
use Cognesy\Instructor\Extras\Scalar\Scalar;
$name = (new StructuredOutput)
->with(messages: $text, responseModel: Scalar::string('name'))
->get();
$age = (new StructuredOutput)
->with(messages: $text, responseModel: Scalar::integer('age'))
->get();
$isAdult = (new StructuredOutput)
->with(messages: $text, responseModel: Scalar::boolean('isAdult'))
->get();
$sentiment = (new StructuredOutput)
->with(messages: $text, responseModel: Scalar::enum(Sentiment::class, 'sentiment'))
->get();
Maybe¶
use Cognesy\Instructor\Extras\Maybe\Maybe;
$maybeUser = (new StructuredOutput)
->with(messages: $text, responseModel: Maybe::is(User::class))
->get();
if ($maybeUser->hasValue()) {
$user = $maybeUser->get();
}
$error = $maybeUser->error();
Output Controls¶
use Cognesy\Instructor\Extras\Scalar\Scalar;
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Instructor\Enums\OutputMode;
use Cognesy\Polyglot\Inference\Config\LLMConfig;
$runtime = StructuredOutputRuntime::fromConfig(LLMConfig::fromDsn('driver=openai,model=gpt-4o-mini'))
->withOutputMode(OutputMode::Json)
->withMaxRetries(3)
->withDefaultToStdClass(true);
$so = (new StructuredOutput)->withRuntime($runtime);
$asArray = (new StructuredOutput)->intoArray();
$asClass = (new StructuredOutput)->intoInstanceOf(User::class);
$asObject = (new StructuredOutput)->intoObject(Scalar::integer('rating'));
Cached Context¶
$result = (new StructuredOutput)
->withCachedContext(
messages: $longContext,
system: 'You know the full context',
)
->with(
prompt: 'Extract only contact details',
responseModel: Contact::class,
)
->get();
Examples API¶
use Cognesy\Instructor\Extras\Example\Example;
$result = (new StructuredOutput)
->withExamples([
Example::fromText(
'John Doe, john@example.com',
['name' => 'John Doe', 'email' => 'john@example.com'],
),
])
->with(messages: $text, responseModel: Contact::class)
->get();
Events¶
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Instructor\Events\StructuredOutput\StructuredOutputRequestReceived;
use Cognesy\Polyglot\Inference\LLMProvider;
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
->onEvent(StructuredOutputRequestReceived::class, function (object $event): void {
// handle selected event
})
->wiretap(function (object $event): void {
// handle all events
});
$result = (new StructuredOutput)
->withRuntime($runtime)
->with(messages: $text, responseModel: User::class)
->get();
Testing¶
Deterministic test seams:
- Tests\Support\FakeInferenceDriver - queues sync InferenceResponse fixtures or streaming PartialInferenceDelta batches; best for most unit and regression tests inside packages/instructor
- Tests\MockHttp - builds an HTTP client around MockHttpDriver; use when the provider adapter and HTTP response shape still matter
- Tests\Integration\Support\ProbeStreamDriver - observation helper for streaming immediacy and call-count assertions
- Tests\Support\ProbeIterator - explicit iterator helper for controlled delta emission in integration tests