Customize Prompts

Instructor builds a structured prompt from several components: system text, user messages, a mode-specific instruction prompt, examples, and retry context. You can customize most of these to tune extraction behavior without changing the underlying extraction flow.

System And Prompt Text

The two most common customization points are the system message and the prompt text:

use Cognesy\Instructor\StructuredOutput;

$result = (new StructuredOutput)
    ->withSystem('You are a precise data extraction assistant. Return only factual data.')
    ->withPrompt('Extract the contact details from the text below.')
    ->with(messages: $text, responseModel: Contact::class)
    ->get();

  • System text sets the model's persona and overall behavior. Use it for stable instructions that apply across many requests.
  • Prompt text provides task-specific instructions for this particular extraction. Instructor appends it to the conversation alongside the mode-specific extraction prompt.

You can also pass both through the with() method:

$result = (new StructuredOutput)
    ->with(
        messages: $text,
        responseModel: Contact::class,
        system: 'Return concise, accurate data.',
        prompt: 'Extract the contact details.',
    )
    ->get();

Examples

Few-shot examples are another prompt component. They are appended to the conversation to demonstrate the expected extraction style:

use Cognesy\Instructor\Extras\Example\Example;

$result = (new StructuredOutput)
    ->withExamples([
        Example::fromText('Jane Doe, 31', ['name' => 'Jane Doe', 'age' => 31]),
    ])
    ->with(messages: $text, responseModel: Person::class)
    ->get();

See the Demonstrations page for details on the Example class.

Cached Context

Some providers (notably Anthropic) support prompt caching, where stable parts of the conversation are cached between requests to reduce latency and cost. Use withCachedContext() to mark content as cacheable:

$result = (new StructuredOutput)
    ->withCachedContext(
        messages: $referenceDocument,
        system: 'You are a document analyst.',
        prompt: 'Extract entities from the document.',
        examples: $examples,
    )
    ->with(messages: 'Now extract from this specific paragraph...', responseModel: Entity::class)
    ->get();

The cached context is placed before the per-request content in the prompt. Content passed through withCachedContext() is marked with cache-control markers where the provider supports them.
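Because the cached context is the stable prefix, repeating it verbatim across requests is what lets a supporting provider serve it from cache. A sketch of that pattern, using only the calls shown above (Entity and the message strings are placeholders):

```php
use Cognesy\Instructor\StructuredOutput;

// Request 1: the reference document is marked as cacheable.
$entities1 = (new StructuredOutput)
    ->withCachedContext(
        messages: $referenceDocument,
        system: 'You are a document analyst.',
        prompt: 'Extract entities from the document.',
    )
    ->with(messages: 'Extract from the first section...', responseModel: Entity::class)
    ->get();

// Request 2: identical cached context, so the provider can reuse
// the cached prefix and only process the new per-request message.
$entities2 = (new StructuredOutput)
    ->withCachedContext(
        messages: $referenceDocument,
        system: 'You are a document analyst.',
        prompt: 'Extract entities from the document.',
    )
    ->with(messages: 'Now extract from the second section...', responseModel: Entity::class)
    ->get();
```

The cached context must match exactly between requests; any change to it invalidates the provider's cache entry.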

Mode-Specific Prompts

Instructor uses a default prompt for each output mode that tells the model how to format its response. These prompts are configured in StructuredOutputConfig and can be customized per mode:

  • Tools: "Extract correct and accurate data from the input using provided tools."
  • Json: includes the JSON Schema and asks for a strict JSON response.
  • JsonSchema: asks for a strict JSON response following the provided schema.
  • MdJson: includes the JSON Schema and asks for JSON inside a Markdown code block.

Overriding Mode Prompts

You can replace the default prompt for any mode through the config:

use Cognesy\Instructor\Config\StructuredOutputConfig;
use Cognesy\Instructor\Enums\OutputMode;

$config = new StructuredOutputConfig(
    modePrompts: [
        OutputMode::Tools->value => 'Use the provided tool to extract data accurately.',
        OutputMode::Json->value => "Respond with a JSON object matching this schema:\n<|json_schema|>\n",
    ],
);

Template Placeholders

Mode prompts support the <|json_schema|> placeholder, which Instructor replaces with the JSON Schema generated from your response model. This is particularly important for Json and MdJson modes, where the schema must be embedded in the prompt:

$config = new StructuredOutputConfig(
    modePrompts: [
        OutputMode::Json->value => "Your task is to respond with a JSON object. "
            . "Response must follow this JSON Schema:\n<|json_schema|>\n",
    ],
);

Tool Name And Description

In OutputMode::Tools, the tool definition sent to the model includes a name and description. These provide semantic context that can improve extraction quality:

use Cognesy\Instructor\Config\StructuredOutputConfig;

$config = new StructuredOutputConfig(
    toolName: 'extract_person',
    toolDescription: 'Extract personal information from the provided text.',
);

The defaults are extracted_data and "Function call based on user instructions.", respectively. Overriding them with task-specific values can help the model understand what the tool represents.

OutputMode::Json and OutputMode::MdJson ignore tool name and description since they do not use tool calling.
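To take effect, the config has to be attached to the request. A minimal sketch of that wiring, assuming a withConfig() method on StructuredOutput (check your installed version for the exact method that accepts a config object):

```php
use Cognesy\Instructor\Config\StructuredOutputConfig;
use Cognesy\Instructor\StructuredOutput;

$config = new StructuredOutputConfig(
    toolName: 'extract_person',
    toolDescription: 'Extract personal information from the provided text.',
);

// withConfig() is an assumption about the API surface;
// in Tools mode the custom tool name and description are sent
// as part of the tool definition.
$person = (new StructuredOutput)
    ->withConfig($config)
    ->with(messages: $text, responseModel: Person::class)
    ->get();
```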

Retry Prompt

When validation fails and retries are enabled, Instructor appends a retry prompt to the conversation. The default is:

JSON generated incorrectly, fix following errors:

You can customize this through the config:

$config = new StructuredOutputConfig(
    retryPrompt: 'The previous response had validation errors. Please correct them:',
);
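The retry prompt is only appended when retries are enabled for the request. A sketch of the full flow, where withConfig() and the maxRetries parameter are assumptions about the API surface (consult your installed version):

```php
use Cognesy\Instructor\Config\StructuredOutputConfig;
use Cognesy\Instructor\StructuredOutput;

$config = new StructuredOutputConfig(
    retryPrompt: 'The previous response had validation errors. Please correct them:',
);

// If validation fails, Instructor appends the retry prompt plus the
// validation errors and re-queries, up to maxRetries attempts.
$person = (new StructuredOutput)
    ->withConfig($config)
    ->with(messages: $text, responseModel: Person::class, maxRetries: 2)
    ->get();
```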

Chat Structure

Instructor assembles the final prompt from named sections in a specific order. The default structure includes sections for system messages, cached context, prompt, examples, messages, and retries. You can reorder or extend this through StructuredOutputConfig:

$config = new StructuredOutputConfig(
    chatStructure: [
        'system',
        'pre-cached', 'cached-prompt', 'cached-examples', 'cached-messages', 'post-cached',
        'pre-prompt', 'prompt', 'post-prompt',
        'pre-examples', 'examples', 'post-examples',
        'pre-messages', 'messages', 'post-messages',
        'pre-retries', 'retries', 'post-retries',
    ],
);

Most applications will never need to modify the chat structure. It is exposed for advanced use cases where you need precise control over prompt ordering.