Features
A comprehensive overview of Instructor's capabilities.
Core Features¶
Structured Output Extraction¶
Define a PHP class, get a populated object back:
<?php
use Cognesy\Instructor\StructuredOutput;

class Person {
    public string $name;
    public int $age;
    public string $occupation;
}

$person = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages("Extract: Sarah, 32, software architect")
    ->get();
Key capabilities:
- Works with any PHP class with typed properties
- Supports nested objects and arrays
- Handles nullable fields gracefully
- Preserves type information throughout
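Nested objects and nullable fields, for instance, can be declared with native type hints alone (a minimal sketch; the `Address` class and its fields are illustrative, not part of the library):

```php
<?php
class Address {
    public string $city;
    public ?string $zipCode; // nullable: left null if absent from the text
}

class Person {
    public string $name;
    public Address $address; // nested object, extracted recursively
}
```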
Automatic Validation¶
Built-in support for Symfony Validator:
<?php
use Symfony\Component\Validator\Constraints as Assert;

class User {
    #[Assert\NotBlank]
    #[Assert\Email]
    public string $email;

    #[Assert\Range(min: 18, max: 120)]
    public int $age;

    #[Assert\Choice(['active', 'inactive', 'pending'])]
    public string $status;
}
Validation features:
- All Symfony validation constraints supported
- Custom validators work out of the box
- Validation errors trigger automatic retry
- Error messages sent to LLM for self-correction
Self-Correcting Retries¶
When validation fails, Instructor automatically retries:
<?php
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\LLMProvider;

$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withMaxRetries(3);

$result = (new StructuredOutput($runtime))
    ->withResponseClass(User::class)
    ->withMessages($text)
    ->get();
Retry behavior:
- LLM generates response
- Response validated against constraints
- On failure: errors sent back to LLM with context
- LLM attempts correction
- Repeat until valid or max retries reached
Input Flexibility¶
Text Input¶
Simple string input:
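A plain string is passed directly as the message (same fluent API as the `Person` example above):

```php
<?php
$person = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages("Jason is a 28-year-old data scientist")
    ->get();
```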
Chat Messages¶
OpenAI-style message arrays:
<?php
->withMessages([
    ['role' => 'system', 'content' => 'You are a data extraction expert.'],
    ['role' => 'user', 'content' => 'Extract the person: John, 25, engineer'],
])
Image Input¶
Process images with vision-capable models:
<?php
use Cognesy\Addons\Image\Image;
->with(messages: Image::fromFile('path/to/image.jpg')->toMessage())
->withPrompt("Extract all text from this document")
Supported formats: JPEG, PNG, GIF, WebP
Structured Input¶
Pass objects or arrays as input:
<?php
$inputData = [
    'document' => $documentText,
    'metadata' => ['source' => 'email', 'date' => '2024-01-15'],
];

$result = (new StructuredOutput)
    ->withResponseClass(Analysis::class)
    ->withInput($inputData)
    ->get();
Output Modes¶
Tools Mode (Default)¶
Uses LLM function/tool calling:
<?php
use Cognesy\Instructor\Enums\OutputMode;
use Cognesy\Instructor\StructuredOutputRuntime;
use Cognesy\Polyglot\Inference\LLMProvider;

$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withOutputMode(OutputMode::Tools);
Best for: OpenAI, Anthropic, most modern models
JSON Schema Mode¶
Strict schema enforcement:
<?php
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withOutputMode(OutputMode::JsonSchema);
Best for: GPT-4, models with strict JSON Schema support
JSON Mode¶
Basic JSON response format:
<?php
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withOutputMode(OutputMode::Json);
Best for: Models supporting JSON mode without strict schemas
Markdown JSON Mode¶
Prompting-based extraction:
<?php
$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->withOutputMode(OutputMode::MdJson);
Best for: Models without JSON mode, fallback option
Response Types¶
Single Object¶
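Passing a plain class name (the default shown throughout this page) returns a single populated instance:

```php
<?php
$person = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->get();

echo $person->name; // one Person instance, not an array
```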
Arrays of Objects¶
Use Sequence::of() to extract lists:
<?php
use Cognesy\Instructor\Extras\Sequence\Sequence;

$people = (new StructuredOutput)
    ->withResponseClass(Sequence::of(Person::class))
    ->withMessages($text)
    ->get();

// Iterate over results
foreach ($people as $person) {
    echo $person->name;
}

// Or use array-like access
$first = $people->first();
$count = $people->count();
$all = $people->toArray();
Scalar Values¶
Extract simple types with adapters:
<?php
use Cognesy\Instructor\Extras\Scalars\Scalar;

// Boolean
$isSpam = (new StructuredOutput)
    ->withResponseClass(Scalar::boolean('isSpam'))
    ->withMessages($text)
    ->get();

// Integer
$count = (new StructuredOutput)
    ->withResponseClass(Scalar::integer('count'))
    ->withMessages($text)
    ->get();

// String
$summary = (new StructuredOutput)
    ->withResponseClass(Scalar::string('summary'))
    ->withMessages($text)
    ->get();
Enums¶
<?php
use Cognesy\Instructor\Extras\Scalars\Scalar;

enum Sentiment: string {
    case Positive = 'positive';
    case Negative = 'negative';
    case Neutral = 'neutral';
}

$sentiment = (new StructuredOutput)
    ->withResponseClass(Scalar::enum(Sentiment::class, 'sentiment'))
    ->withMessages($text)
    ->get();
Streaming¶
Partial Updates¶
Get incremental results as they arrive:
<?php
$stream = (new StructuredOutput)
    ->withResponseClass(Article::class)
    ->with(
        messages: $text,
        options: ['stream' => true],
    )
    ->stream();

foreach ($stream->partials() as $partial) {
    // $partial has incrementally populated fields
    updateUI($partial);
}

$final = $stream->finalValue();
Or subscribe to streaming events:
<?php
use Cognesy\Instructor\Events\PartialsGenerator\PartialResponseGenerated;

$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new())
    ->onEvent(
        PartialResponseGenerated::class,
        fn(PartialResponseGenerated $event) => updateUI($event->partialResponse),
    );

$stream = (new StructuredOutput)
    ->withRuntime($runtime)
    ->with(
        responseModel: Article::class,
        messages: $text,
        options: ['stream' => true],
    )
    ->stream();

$article = $stream->finalValue();
Sequence Streaming¶
Stream sequence items as they complete:
<?php
use Cognesy\Instructor\Extras\Sequence\Sequence;

$list = (new StructuredOutput)
    ->withResponseClass(Sequence::of(Person::class))
    ->with(
        messages: $text,
        options: ['stream' => true],
    )
    ->stream();

foreach ($list->sequence() as $seq) {
    // $seq grows by one completed item per iteration
    processComplete($seq->last());
}

$final = $list->finalValue();
LLM Providers¶
Supported Providers¶
| Provider | API Type | Streaming | Vision | Tool Calling |
|---|---|---|---|---|
| OpenAI | Native | ✓ | ✓ | ✓ |
| Anthropic | Native | ✓ | ✓ | ✓ |
| Google Gemini | Native | ✓ | ✓ | ✓ |
| Azure OpenAI | OpenAI-compatible | ✓ | ✓ | ✓ |
| Mistral | Native | ✓ | - | ✓ |
| Cohere | OpenAI-compatible | ✓ | - | ✓ |
| Groq | OpenAI-compatible | ✓ | - | ✓ |
| Fireworks AI | OpenAI-compatible | ✓ | ✓ | ✓ |
| Together AI | OpenAI-compatible | ✓ | ✓ | ✓ |
| Ollama | OpenAI-compatible | ✓ | ✓ | ✓ |
| OpenRouter | OpenAI-compatible | ✓ | ✓ | ✓ |
| Perplexity | OpenAI-compatible | ✓ | - | - |
| DeepSeek | OpenAI-compatible | ✓ | - | ✓ |
| xAI (Grok) | OpenAI-compatible | ✓ | - | ✓ |
| Cerebras | OpenAI-compatible | ✓ | - | ✓ |
| SambaNova | OpenAI-compatible | ✓ | - | ✓ |
Provider Selection¶
<?php
use Cognesy\Instructor\StructuredOutput;
use Cognesy\Instructor\StructuredOutputRuntime;

// Use a preset from configuration
$structuredOutput = StructuredOutput::using('anthropic');

// Or configure the runtime explicitly (advanced)
$structuredOutput = (new StructuredOutput)->withRuntime(
    StructuredOutputRuntime::fromConfig(
        \Cognesy\Polyglot\Inference\Config\LLMConfig::fromDsn(
            'preset=anthropic,model=claude-3-5-sonnet-latest'
        )
    )
);
Schema Definition¶
Type-Hinted Classes¶
<?php
class Order {
    public string $orderId;
    public Customer $customer;
    /** @var LineItem[] */
    public array $items;
    public float $total;
    public ?string $notes;
}
PHP DocBlocks for Instructions¶
<?php
class Product {
    /** The product SKU, e.g., "SKU-12345" */
    public string $sku;

    /** Price in USD, without currency symbol */
    public float $price;

    /** @var string[] List of applicable categories */
    public array $categories;
}
Attributes for Detailed Control¶
<?php
use Cognesy\Instructor\Schema\Attributes\Description;
use Cognesy\Instructor\Schema\Attributes\Instructions;

class Analysis {
    #[Description("Sentiment score from -1.0 (negative) to 1.0 (positive)")]
    public float $sentiment;

    /** @var string[] */
    #[Instructions("Extract the 3 most important points only")]
    public array $keyPoints;
}
Dynamic Schemas with Structure¶
<?php
use Cognesy\Dynamic\StructureBuilder;
use Cognesy\Instructor\StructuredOutput;

$schema = StructureBuilder::define('user')
    ->string('name')
    ->int('age', required: false)
    ->collection('tags', 'string', required: false)
    ->build();

$result = (new StructuredOutput)
    ->with(
        messages: 'Extract user profile from this text...',
        responseModel: $schema,
    )
    ->get();
Advanced Features¶
Context Caching¶
Reduce costs with cached context (Anthropic):
<?php
->withCachedContext([
    'Large document or context here...',
    'This won\'t be re-sent on retries',
])
Custom Prompts¶
Override default extraction prompts:
<?php
use Cognesy\Instructor\Config\StructuredOutputConfig;

->withPrompt("Extract the following fields precisely: ...")
->withConfig(new StructuredOutputConfig(
    retryPrompt: "The previous attempt had errors: {errors}. Please correct."
))
Event System¶
Monitor internal processing:
<?php
use Cognesy\Instructor\Events\StructuredOutput\StructuredOutputRequestReceived;
use Cognesy\Instructor\Events\StructuredOutput\StructuredOutputResponseGenerated;

$runtime = StructuredOutputRuntime::fromProvider(LLMProvider::new());

$runtime->onEvent(StructuredOutputRequestReceived::class, function($event) {
    logger()->info('Request received', $event->toArray());
});

$runtime->onEvent(StructuredOutputResponseGenerated::class, function($event) {
    logger()->info('Response generated', $event->toArray());
});
Debug Mode¶
See all LLM interactions:
Outputs:
- Full request payloads
- Raw LLM responses
- Validation errors
- Retry attempts
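A minimal sketch of enabling debug output, assuming a `withDebugPreset()` helper is available in your version (the method name may differ between releases):

```php
<?php
$result = (new StructuredOutput)
    ->withDebugPreset('on') // assumption: check your version's debug API
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->get();
```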
Framework Integration¶
Laravel¶
<?php
// The service provider is auto-registered

// Use the facade
use Cognesy\Instructor\Facades\Instructor;

$result = Instructor::respond()
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->get();

// Or inject via the container
public function handle(StructuredOutput $instructor)
{
    return $instructor
        ->withResponseClass(Person::class)
        ->withMessages($text)
        ->get();
}
Symfony¶
Register as a service in config/services.yaml:

services:
    Cognesy\Instructor\StructuredOutput:
        autowire: true

Then use it in a controller:

<?php
public function extract(StructuredOutput $instructor): Response
{
    $result = $instructor->withResponseClass(Person::class)->get();
    return $this->json($result);
}
Standalone¶
<?php
// No framework needed
require 'vendor/autoload.php';

$result = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->get();
Observability¶
Token Usage¶
<?php
$response = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->getResponse();

echo $response->usage->inputTokens;
echo $response->usage->outputTokens;
echo $response->usage->totalTokens;
Timing¶
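Basic latency measurement does not need a dedicated API; wall-clock timing around a call is often enough (a plain PHP sketch, using the `logger()` helper shown elsewhere on this page):

```php
<?php
$start = microtime(true);

$result = (new StructuredOutput)
    ->withResponseClass(Person::class)
    ->withMessages($text)
    ->get();

// Elapsed wall-clock time in milliseconds
$elapsedMs = (microtime(true) - $start) * 1000;
logger()->info('Extraction completed', ['ms' => round($elapsedMs, 1)]);
```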
Event-Based Logging¶
<?php
// Subscribe to all events emitted during processing
$runtime->onEvent('*', function($event) {
    logger()->info($event->name(), $event->toArray());
});
What's Next¶
- Getting Started - Quick installation guide
- Why Instructor - Understanding the value proposition
- Use Cases - Industry-specific examples
- Cookbook - 60+ working examples
- API Reference - Complete documentation