Introduction

Instructor is a PHP library for extracting structured, validated data from LLM responses. You define the shape of the data you need using plain PHP classes, and Instructor handles the rest: schema generation, prompt construction, response parsing, validation, and automatic retries.

The library is inspired by the Instructor library for Python created by Jason Liu.

```php
use Cognesy\Instructor\StructuredOutput;

final class City {
    public string $name;
    public string $country;
    public int $population;
}

$city = StructuredOutput::using('openai')
    ->with(
        messages: 'What is the capital of France?',
        responseModel: City::class,
    )
    ->get();

echo $city->name;       // Paris
echo $city->country;    // France
echo $city->population; // 2148000
```

The package is distributed as cognesy/instructor-struct and requires PHP 8.3+.

Core Architecture

Instructor's API is built around four types, each with a distinct responsibility:

| Type | Role |
| --- | --- |
| `StructuredOutput` | Builds and executes a single request. Provides the primary developer-facing API. |
| `StructuredOutputRuntime` | Holds provider configuration, retry policy, output mode, event listeners, and pipeline extensions. Reusable across requests. |
| `PendingStructuredOutput` | A lazy handle returned by `create()`. Execution happens only when you call `get()`, `response()`, or `stream()`. |
| `StructuredOutputStream` | Exposes streaming partial updates, sequence items, and the final response. |

How It Works

1. Define a response model. Use a PHP class with typed public properties. Instructor generates a JSON Schema from the class and sends it to the LLM.
2. Build a request. Provide input messages, select a provider, and optionally customize the system prompt, examples, or model options.
3. Read the result. Call `get()` for the deserialized object, `stream()` for partial updates, or `response()` for the full response wrapper including raw LLM output.
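The three steps can be sketched as follows. This is an illustration rather than a reference: it assumes `create()` can be chained after `with()` to obtain the lazy `PendingStructuredOutput` handle, and `readCapital()` is a hypothetical wrapper used only to keep the snippet self-contained.

```php
<?php
use Cognesy\Instructor\StructuredOutput;

// Step 1: the response model, a plain class with typed public properties.
final class City {
    public string $name;
    public string $country;
    public int $population;
}

// Hypothetical helper wrapping steps 2 and 3.
function readCapital(): City
{
    // Step 2: build the request.
    // Assumption: create() chains after with() and returns the lazy
    // PendingStructuredOutput handle; nothing is sent to the provider yet.
    $pending = StructuredOutput::using('openai')
        ->with(
            messages: 'What is the capital of France?',
            responseModel: City::class,
        )
        ->create();

    // Step 3: read the result. Execution is triggered by the read method.
    // $pending->response() would return the full response wrapper instead,
    // and $pending->stream() a streaming handle.
    return $pending->get();
}
```

Calling `readCapital()` requires the package to be installed and the `openai` provider to be configured.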

Under the hood, Instructor translates your response model into a schema the LLM understands, wraps it in the appropriate output mode (tool calls, JSON mode, or JSON Schema mode), and deserializes the response back into your PHP object. If the response fails validation, Instructor feeds the errors back to the LLM and retries automatically.
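The validate-and-retry idea can be illustrated without the library at all. The sketch below is a conceptual model only, not Instructor's internal code: `extractWithRetries()`, `$fakeModel`, and `$validateCity` are hypothetical names, and the "model" is a stub that returns a corrected answer once it receives validation errors.

```php
<?php
declare(strict_types=1);

/**
 * Conceptual validate-and-retry loop (hypothetical, for illustration).
 *
 * @param callable $askModel Stand-in for an LLM call; receives the errors from the previous attempt.
 * @param callable $validate Returns a list of error strings (empty when the data is valid).
 */
function extractWithRetries(callable $askModel, callable $validate, int $maxRetries = 2): array
{
    $errors = [];
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        $data = $askModel($errors);   // previous errors are fed back to the model
        $errors = $validate($data);
        if ($errors === []) {
            return $data;             // valid data: stop retrying
        }
    }
    throw new RuntimeException('Validation failed: ' . implode('; ', $errors));
}

// Stubbed "model": the first answer has a bad population, the retry is valid.
$calls = 0;
$fakeModel = function (array $errors) use (&$calls): array {
    $calls++;
    return $errors === []
        ? ['name' => 'Paris', 'country' => 'France', 'population' => -1]
        : ['name' => 'Paris', 'country' => 'France', 'population' => 2148000];
};

$validateCity = fn (array $city): array =>
    $city['population'] > 0 ? [] : ['population must be positive'];

$city = extractWithRetries($fakeModel, $validateCity);
echo $city['population'], ' after ', $calls, ' calls', PHP_EOL; // 2148000 after 2 calls
```

The retry limit is a parameter here because the number of attempts is a policy choice; once it is exhausted, the sketch raises an exception rather than returning invalid data.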

Feature Highlights

Structured Responses & Validation

- Extract typed objects, arrays, or scalar values from LLM responses
- Automatic validation of returned data using Symfony Validator constraints
- Configurable retry policy when the LLM returns invalid data
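As an illustration of validation constraints on a response model, the sketch below annotates a class with two standard Symfony Validator attributes, `NotBlank` and `Positive`. The class name `ValidatedCity` is hypothetical, and the snippet assumes `symfony/validator` is available when the constraints are actually evaluated.

```php
<?php
use Symfony\Component\Validator\Constraints as Assert;

// Hypothetical response model with validation attributes.
final class ValidatedCity {
    #[Assert\NotBlank]
    public string $name;

    #[Assert\NotBlank]
    public string $country;

    // Rejects zero and negative populations.
    #[Assert\Positive]
    public int $population;
}
```

Such a class would be passed as `responseModel` in the same way `City::class` is passed in the introductory example.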

Flexible Inputs

- Process text, chat message arrays, or images
- Provide examples to improve extraction quality
- Structure-to-structure processing: pass an object or array as input and receive a typed result

Multiple Response Model Formats

- PHP classes with typed properties and optional validation attributes
- JSON Schema arrays for dynamic or runtime-defined shapes
- Scalar types via built-in helpers (`getString()`, `getInt()`, `getBoolean()`, etc.)
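For the JSON Schema format, a runtime-defined shape equivalent to the `City` class might look like the array below. This is plain JSON Schema expressed as a PHP array; the variable name `$citySchema` is hypothetical, and whether Instructor expects additional keys in the array is an open assumption to check against the Data Model documentation.

```php
<?php
// Plain JSON Schema describing the same shape as the City class.
$citySchema = [
    'type' => 'object',
    'properties' => [
        'name' => ['type' => 'string'],
        'country' => ['type' => 'string'],
        'population' => ['type' => 'integer'],
    ],
    'required' => ['name', 'country', 'population'],
];

// Inspect the schema as JSON.
echo json_encode($citySchema, JSON_PRETTY_PRINT), PHP_EOL;
```

Because the schema is ordinary data, it can be assembled at runtime, for example from user-defined field lists.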

Sync & Streaming

- Synchronous extraction with `get()`
- Streaming partial updates with `stream()->partials()`
- Streaming completed sequence items with `stream()->sequence()`
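A minimal streaming sketch, under stated assumptions: `stream()` is chained directly after `with()`, `partials()` is iterable and yields progressively filled objects, and no extra option is needed to enable streaming. `streamCapital()` is a hypothetical wrapper that keeps the snippet self-contained.

```php
<?php
use Cognesy\Instructor\StructuredOutput;

final class City {
    public string $name;
    public string $country;
    public int $population;
}

// Hypothetical helper: prints the city name as partial updates arrive.
function streamCapital(): void
{
    $stream = StructuredOutput::using('openai')
        ->with(
            messages: 'What is the capital of France?',
            responseModel: City::class,
        )
        ->stream();

    // Assumption: each partial is an object whose properties fill in over time.
    foreach ($stream->partials() as $partial) {
        // Early updates may not have the property set yet, hence isset().
        echo isset($partial->name) ? $partial->name : '...', PHP_EOL;
    }
}
```

The `isset()` guard matters because a typed property that has not been assigned yet cannot be read safely.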

Provider Support

- Works with OpenAI, Anthropic, Google Gemini, Cohere, Azure OpenAI, Groq, Mistral, Fireworks AI, Ollama, OpenRouter, Together AI, and more
- Switch providers by changing a single preset name or `LLMConfig`

Observability

- Fine-grained event system for monitoring every stage of the extraction pipeline
- Wiretap support for logging and debugging

Start Here

- Quickstart -- extract your first typed object in minutes
- Setup -- installation and provider configuration
- Usage -- the full request-building API
- Data Model -- defining response models
- Validation -- validation rules and retry behavior
- Partials -- streaming partial updates

Instructor in Other Languages

Instructor has been implemented across multiple technology stacks: