Connection Management

One of Polyglot's core design principles is that your request code should remain stable while the provider configuration changes. A preset is a named YAML file that bundles everything the runtime needs -- driver type, API URL, credentials, default model, and token limits -- into a single, swappable unit.

When you call Inference::using('openai'), Polyglot loads the openai.yaml preset from the configuration directory and builds a fully wired runtime behind the scenes. Switching providers is a one-line change.
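As a rough sketch, a preset file bundles those settings like this (the field names below are illustrative, mirroring the list above -- check your installation's configuration directory for the exact schema):

```yaml
# openai.yaml -- hypothetical preset sketch, not a verbatim schema
driver: openai                      # driver type (see table below)
apiUrl: https://api.openai.com/v1   # API base URL
apiKey: ${OPENAI_API_KEY}           # however your setup resolves secrets
defaultModel: gpt-4o-mini           # model used when none is specified
defaultMaxTokens: 1024              # token limit applied by default
```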

Switching Providers

Because presets encapsulate all provider details, the same request code works against any supported backend:

<?php

use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Inference;

$prompt = Messages::fromString('Explain dependency injection in one sentence.');

$openai    = Inference::using('openai')->withMessages($prompt)->get();
$anthropic = Inference::using('anthropic')->withMessages($prompt)->get();
$gemini    = Inference::using('gemini')->withMessages($prompt)->get();

You can also override the model on a per-request basis without creating a new preset:

<?php

use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Inference;

$response = Inference::using('openai')
    ->with(
        messages: Messages::fromString('What is the capital of France?'),
        model: 'gpt-4.1',
    )
    ->get();

Understanding Presets vs. Driver Types

It is important to distinguish between a preset name and a driver type. A preset name (e.g. openai, ollama, custom-local) is an arbitrary label for a YAML configuration file. A driver type (e.g. openai, anthropic, openai-compatible) refers to the underlying protocol implementation that Polyglot uses to communicate with the API.

Multiple presets can share the same driver. For example, you might create a local-llama preset that uses the openai-compatible driver pointed at a local Ollama instance, and a together preset that also uses the openai-compatible driver pointed at the Together AI API.
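Sketched as configuration, those two presets might look like this (field names and the Together model name are illustrative; the endpoints shown are the standard Ollama and Together AI base URLs):

```yaml
# local-llama.yaml -- hypothetical preset using the openai-compatible driver
driver: openai-compatible
apiUrl: http://localhost:11434/v1   # local Ollama instance
defaultModel: llama3
```

```yaml
# together.yaml -- same driver, different endpoint
driver: openai-compatible
apiUrl: https://api.together.xyz/v1
apiKey: ${TOGETHER_API_KEY}
defaultModel: llama3-70b            # placeholder model name
```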

Polyglot ships with the following driver types:

Driver type         Providers
openai              OpenAI
openai-responses    OpenAI (Responses API)
anthropic           Anthropic
gemini              Google Gemini (native API)
gemini-oai          Google Gemini (OpenAI-compatible API)
azure               Azure OpenAI
bedrock-openai      AWS Bedrock (OpenAI-compatible)
a21                 AI21 Labs
cerebras            Cerebras
cohere              Cohere
deepseek            DeepSeek
fireworks           Fireworks AI
glm                 GLM
groq                Groq
huggingface         Hugging Face
inception           Inception
meta                Meta
minimaxi            MiniMaxi
mistral             Mistral AI
openrouter          OpenRouter
openresponses       Open Responses
perplexity          Perplexity
qwen                Qwen
sambanova           SambaNova
xai                 xAI (Grok)
openai-compatible   Any OpenAI-compatible API (Ollama, Together, Moonshot, etc.)

Implementing Fallbacks

Polyglot does not impose a fallback policy. Fallback behavior belongs in application code, where you have the context to decide which providers to try and how to handle failures:

<?php

use Cognesy\Polyglot\Inference\Inference;
use Cognesy\Http\Exceptions\HttpRequestException;
use Cognesy\Messages\Messages;

function withFallback(array $presets, Messages $prompt): string {
    $lastException = null;

    foreach ($presets as $preset) {
        try {
            return Inference::using($preset)
                ->withMessages($prompt)
                ->get();
        } catch (HttpRequestException $e) {
            $lastException = $e;
            // Optionally log the failure before trying the next provider
        }
    }

    throw new \RuntimeException(
        'All providers failed. Last error: ' . $lastException?->getMessage()
    );
}

$response = withFallback(
    presets: ['openai', 'anthropic', 'gemini'],
    prompt: Messages::fromString('What is the capital of France?'),
);

This pattern gives you full control over retry logic, logging, and error handling at each step of the fallback chain.

Cost-Aware Provider Selection

You can route requests to different presets based on the complexity or importance of each task. This pattern lets you reserve expensive models for critical work while using cheaper alternatives for simpler queries:

<?php

use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Inference;

class CostAwareRouter {
    private array $tiers = [
        'low'    => ['preset' => 'ollama',  'model' => 'llama3'],
        'medium' => ['preset' => 'mistral', 'model' => 'mistral-small-latest'],
        'high'   => ['preset' => 'openai',  'model' => 'gpt-4.1'],
    ];

    public function ask(string $question, string $tier = 'medium'): string {
        $provider = $this->tiers[$tier] ?? $this->tiers['medium'];

        return Inference::using($provider['preset'])
            ->with(messages: Messages::fromString($question), model: $provider['model'])
            ->get();
    }
}

$router = new CostAwareRouter();

// Simple question -- use low-cost tier
echo $router->ask('What is 2+2?', 'low');

// Moderate complexity -- use medium tier
echo $router->ask('Explain monads in simple terms.', 'medium');

// High-stakes analysis -- use premium tier
echo $router->ask('Analyze the ethical implications of AI in healthcare.', 'high');

Task-Based Provider Selection

Different providers may excel at different tasks. You can map task types to the most appropriate preset, routing creative writing to one model and code generation to another:

<?php

use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Inference;

class TaskRouter {
    private array $routes = [
        'creative' => 'anthropic',
        'factual'  => 'openai',
        'code'     => 'gemini',
        'default'  => 'openai',
    ];

    public function ask(string $question, string $taskType = 'default'): string {
        $preset = $this->routes[$taskType] ?? $this->routes['default'];

        return Inference::using($preset)
            ->withMessages(Messages::fromString($question))
            ->get();
    }
}

$router = new TaskRouter();

echo $router->ask('Write a short poem about the ocean.', 'creative');
echo $router->ask('What is the capital of France?', 'factual');
echo $router->ask('Write a PHP function to reverse a string.', 'code');

Tip: You can combine cost-aware and task-based routing. For example, use a cheap local model for simple factual lookups but route complex creative tasks to a premium cloud provider.
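A minimal sketch of that combination, reusing the preset names from the earlier examples (the routing rule itself is an assumption you would tune to your own workload): the cost tier is checked first, so anything flagged low-cost goes to the local model regardless of task, and everything else falls through to the task-based map.

```php
<?php

// Hypothetical combined router: cost tier wins, then task type decides.
// Preset names reuse the earlier examples; adjust to your configuration.
function routePreset(string $taskType, string $tier): string {
    // Low-cost work always goes to the cheap local model, whatever the task.
    if ($tier === 'low') {
        return 'ollama';
    }
    // Otherwise route by task type, as in TaskRouter above.
    $routes = [
        'creative' => 'anthropic',
        'factual'  => 'openai',
        'code'     => 'gemini',
    ];
    return $routes[$taskType] ?? 'openai';
}

// Usage with the same API as the earlier examples:
// $answer = Inference::using(routePreset('creative', 'high'))
//     ->withMessages(Messages::fromString($question))
//     ->get();
```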

Reusing an Inference Instance

Each call to Inference::using() loads the preset YAML and builds a new runtime. If you plan to issue many requests against the same provider, create the instance once and reuse it:

<?php

use Cognesy\Messages\Messages;
use Cognesy\Polyglot\Inference\Inference;

$inference = Inference::using('openai');

$answer1 = $inference->withMessages(Messages::fromString('What is PHP?'))->get();
$answer2 = $inference->withMessages(Messages::fromString('What is Laravel?'))->get();

Because Inference uses immutable builder methods (each call returns a new copy), sharing a single instance across concurrent requests is safe.