Request Options

The options array is the escape hatch for provider-specific request fields that fall outside Polyglot's unified API. Anything you place in options is passed through to the underlying provider driver.

Note: Except for max_tokens and stream, all option keys are provider-specific and may not be available or behave identically across providers. Always consult the provider's API documentation for details.

Setting Options

Pass options through withOptions() or the options parameter on with():

<?php
use Cognesy\Polyglot\Inference\Inference;
use Cognesy\Messages\Messages;

$text = Inference::using('openai')
    ->withMessages(Messages::fromString('Write one short sentence about PHP.'))
    ->withOptions([
        'temperature' => 0.2,
        'top_p' => 0.9,
    ])
    ->get();

Options are merged additively -- calling withOptions() multiple times will merge the new keys into the existing set rather than replacing it.
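The additive merge can be modelled in plain PHP. This is a minimal sketch, not Polyglot's implementation: mergeOptions() is a hypothetical helper, and it assumes that when the same key is set twice the later value wins, which the text above does not state explicitly.

```php
<?php
// Hypothetical helper modelling additive merging of option arrays.
// Assumption: for a key that is set twice, the most recent value wins.
function mergeOptions(array $existing, array $new): array
{
    return array_merge($existing, $new);
}

// Three successive "withOptions()"-style calls, modelled as plain merges.
$options = mergeOptions([], ['temperature' => 0.2]);
$options = mergeOptions($options, ['top_p' => 0.9]);
$options = mergeOptions($options, ['temperature' => 0.5]);

// $options now holds 'temperature' => 0.5 and 'top_p' => 0.9.
```

Under that assumption, earlier keys such as top_p survive later calls, and only keys that are passed again change.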

Common Options

These options are supported by most providers, though accepted ranges and defaults vary:

$options = [
    'temperature' => 0.7,         // Sampling randomness (0.0 to 1.0 on most providers; OpenAI accepts up to 2.0)
    'max_tokens' => 1000,         // Maximum tokens to generate
    'top_p' => 0.95,              // Nucleus sampling parameter
    'frequency_penalty' => 0.0,   // Penalize tokens in proportion to how often they have already appeared
    'presence_penalty' => 0.0,    // Penalize any token that has already appeared, encouraging new topics
    'stop' => ["\n\n", "User:"],  // Stop sequences
];

Dedicated Helpers Over Raw Options

For common behaviors, prefer the dedicated fluent helpers over manually placing values in the options array:

Instead of this...                    | Use this...
--------------------------------------|-------------------------
withOptions(['stream' => true])       | withStreaming(true)
withOptions(['max_tokens' => 256])    | withMaxTokens(256)
withOptions(['retryPolicy' => [...]]) | withRetryPolicy($policy)

These helpers set values that the request builder manages separately from the raw options array, ensuring they are applied correctly regardless of the provider.

Provider-Specific Options

Different providers accept different option keys. Here are a few examples:

OpenAI

$response = Inference::using('openai')
    ->withMessages(Messages::fromString('Write a poem about programming.'))
    ->withOptions([
        'temperature' => 0.7,
        'top_p' => 0.9,
        'frequency_penalty' => 0.5,
        'presence_penalty' => 0.3,
    ])
    ->get();

Anthropic

$response = Inference::using('anthropic')
    ->withMessages(Messages::fromString('Write a poem about programming.'))
    ->withOptions([
        'temperature' => 0.7,
        'top_p' => 0.9,
        'top_k' => 40,
    ])
    ->get();

Retry Policy

Retry behavior is configured explicitly through withRetryPolicy() -- never place it inside the options array. Polyglot will throw an InvalidArgumentException if you do.

<?php
use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Config\InferenceRetryPolicy;
use Cognesy\Polyglot\Inference\Inference;

$retryPolicy = new InferenceRetryPolicy(
    maxAttempts: 3,
    baseDelayMs: 500,
    maxDelayMs: 8000,
    jitter: 'full',               // none, full, or equal
    retryOnStatus: [429, 500, 502, 503, 504],
);

$response = Inference::using('openai')
    ->withMessages(Messages::fromString('Summarize this article.'))
    ->withRetryPolicy($retryPolicy)
    ->get();

The retry policy supports exponential backoff with configurable jitter and can also recover from truncated responses:

Parameter            | Default           | Description
---------------------|-------------------|-----------------------------------------------------
maxAttempts          | 1                 | Total attempts (1 = no retry)
baseDelayMs          | 250               | Base delay in milliseconds
maxDelayMs           | 8000              | Maximum delay cap in milliseconds
jitter               | 'full'            | Jitter strategy: none, full, or equal
retryOnStatus        | [408,429,500,...] | HTTP status codes that trigger a retry
retryOnExceptions    | Timeout, Network  | Exception classes that trigger a retry
lengthRecovery       | 'none'            | Recovery mode: none, continue, increase_max_tokens
lengthMaxAttempts    | 1                 | Max attempts for length recovery
lengthContinuePrompt | 'Continue.'       | Prompt used for continue recovery mode
maxTokensIncrement   | 512               | Token increment for increase_max_tokens mode
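To make the interaction between baseDelayMs, maxDelayMs, and jitter concrete, here is a rough sketch of how a capped exponential backoff delay is commonly computed. It is not taken from Polyglot's source: backoffDelayMs() is a hypothetical function, and the doubling schedule and the definitions of "full" and "equal" jitter follow the widely used conventions, which Polyglot may or may not match exactly.

```php
<?php
// Hypothetical helper: delay in milliseconds before retry number $attempt (1-based).
// Assumptions: the delay doubles on each attempt and is capped at $maxDelayMs;
// 'full' jitter picks a random value in [0, delay];
// 'equal' jitter keeps half the delay and randomizes the other half.
function backoffDelayMs(
    int $attempt,
    int $baseDelayMs = 250,
    int $maxDelayMs = 8000,
    string $jitter = 'full',
): int {
    // Bound the exponent so very high attempt counts cannot overflow an integer.
    $exponent = min(max(0, $attempt - 1), 30);
    $delay = (int) min($maxDelayMs, $baseDelayMs * (2 ** $exponent));
    $half = intdiv($delay, 2);

    return match ($jitter) {
        'none'  => $delay,
        'full'  => random_int(0, $delay),
        'equal' => $half + random_int(0, $delay - $half),
        default => throw new InvalidArgumentException("Unknown jitter strategy: {$jitter}"),
    };
}

// Print the deterministic schedule (no jitter) for a 500 ms base and 8000 ms cap.
foreach ([1, 2, 3, 4, 5, 6] as $attempt) {
    echo $attempt, ': ', backoffDelayMs($attempt, 500, 8000, 'none'), " ms\n";
}
```

With jitter set to none, the loop prints 500, 1000, 2000, 4000, 8000, and 8000 ms: the sixth value would be 16000 ms but is clamped by the cap. Jitter spreads retries from many clients over time so they do not all hit the provider at the same instant.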

Response Cache Policy

Control whether responses are cached in memory for reuse:

<?php
use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Enums\ResponseCachePolicy;
use Cognesy\Polyglot\Inference\Inference;

$response = Inference::using('openai')
    ->withMessages(Messages::fromString('What is 2 + 2?'))
    ->withResponseCachePolicy(ResponseCachePolicy::Memory)
    ->get();

Available policies:

  • ResponseCachePolicy::None -- no caching (default)
  • ResponseCachePolicy::Memory -- cache responses in memory for the current process

Cached Context

Use withCachedContext() to attach stable, reusable context that should be prepended to every request. This is useful when you have shared system instructions, tool definitions, or response formats that remain constant across multiple calls:

<?php
use Cognesy\Polyglot\Inference\Inference;
use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Data\ToolChoice;
use Cognesy\Polyglot\Inference\Data\ResponseFormat;

// $sharedToolDefinitions is assumed to hold tool definitions built elsewhere in your application.
$base = Inference::using('openai')->withCachedContext(
    messages: Messages::fromArray([
        ['role' => 'system', 'content' => 'You are an expert PHP developer.'],
    ]),
    tools: $sharedToolDefinitions,
    toolChoice: ToolChoice::auto(),
    responseFormat: ResponseFormat::jsonObject(),
);

// Each call inherits the cached context automatically
$response1 = $base->withMessages(Messages::fromString('Explain SOLID principles.'))->get();
$response2 = $base->withMessages(Messages::fromString('What is the Repository pattern?'))->get();

When the request is executed, cached context is merged with the request-level fields: cached messages are prepended to the request messages, and cached tools, tool choice, and response format are used as defaults when the request does not specify its own.
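The merge rules above can be illustrated with plain arrays. This is a sketch under stated assumptions, not Polyglot's implementation: mergeCachedContext() is a hypothetical function, and simple arrays and strings stand in for the library's message, tool, tool choice, and response format objects.

```php
<?php
// Hypothetical model of merging cached context into a single request.
// Plain arrays and strings stand in for Polyglot's own data objects.
function mergeCachedContext(array $cached, array $request): array
{
    return [
        // Cached messages come first, request messages follow.
        'messages' => array_merge($cached['messages'] ?? [], $request['messages'] ?? []),
        // For the remaining fields, the request value wins when it is present.
        'tools' => !empty($request['tools']) ? $request['tools'] : ($cached['tools'] ?? []),
        'toolChoice' => $request['toolChoice'] ?? $cached['toolChoice'] ?? null,
        'responseFormat' => $request['responseFormat'] ?? $cached['responseFormat'] ?? null,
    ];
}

$cached = [
    'messages' => [['role' => 'system', 'content' => 'You are an expert PHP developer.']],
    'toolChoice' => 'auto',
    'responseFormat' => 'json_object',
];
$request = [
    'messages' => [['role' => 'user', 'content' => 'Explain SOLID principles.']],
    'responseFormat' => 'text', // overrides the cached default
];

$merged = mergeCachedContext($cached, $request);
```

Here $merged contains the system message followed by the user message, keeps the cached tool choice, and uses the request's own response format instead of the cached one.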

Drivers can map cached context to provider-native caching features (such as Anthropic's prompt caching) when available.