LLM Provider Expansion Guide

Paladin Multi-Provider Support

This guide compares the LLM providers supported by Paladin and explains how to configure, choose between, and migrate among them.

Overview
Provider Comparison
Configuration Guide
Use Case Recommendations
Migration Guide
Performance Characteristics

Overview

Paladin supports multiple LLM providers out of the box, allowing you to choose the best provider for your specific needs. All providers implement the same LlmPort trait, making it easy to switch between them without changing your application logic.
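The sketch below illustrates why a shared trait makes providers interchangeable. The `TextModel` trait and both stub types are hypothetical stand-ins invented for this example; they are not Paladin's actual `LlmPort` definition, which is likely async and richer than this.

```rust
use std::sync::Arc;

/// Hypothetical, simplified stand-in for a provider trait such as `LlmPort`.
trait TextModel {
    fn name(&self) -> &'static str;
    fn complete(&self, prompt: &str) -> String;
}

struct StubOpenAi;
struct StubDeepSeek;

impl TextModel for StubOpenAi {
    fn name(&self) -> &'static str {
        "openai"
    }
    fn complete(&self, prompt: &str) -> String {
        format!("[openai] {}", prompt)
    }
}

impl TextModel for StubDeepSeek {
    fn name(&self) -> &'static str {
        "deepseek"
    }
    fn complete(&self, prompt: &str) -> String {
        format!("[deepseek] {}", prompt)
    }
}

/// Application logic depends only on the trait, never on a concrete provider.
fn answer(model: &dyn TextModel, question: &str) -> String {
    model.complete(question)
}

/// Picks a provider by name; callers get the same trait-object type either way.
fn provider_by_name(name: &str) -> Option<Arc<dyn TextModel>> {
    let model: Arc<dyn TextModel> = match name {
        "openai" => Arc::new(StubOpenAi),
        "deepseek" => Arc::new(StubDeepSeek),
        _ => return None,
    };
    Some(model)
}
```

Because `answer` only sees `&dyn TextModel`, switching providers means changing the string passed to `provider_by_name` and nothing else.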

Supported Providers

OpenAI (GPT-4, GPT-3.5-turbo, GPT-4 Vision)
DeepSeek (DeepSeek-Chat, DeepSeek-Coder)
Anthropic (Claude 3.5 Sonnet, Claude 3 Opus/Haiku)

Provider Comparison

Feature	OpenAI	DeepSeek	Anthropic
Streaming	✅ Yes	✅ Yes	✅ Yes
Tool Calling	✅ Yes	✅ Yes	✅ Yes
Function Calling	✅ Yes	✅ Yes	✅ Yes
Vision/Images	✅ GPT-4V	❌ No	✅ Claude 3+
Max Context	128K (GPT-4 Turbo)	64K	200K (Claude 3)
Best For	General purpose, production	Cost-effective, reasoning	Safety-critical, analysis
Pricing	$$	$	$$$
Latency	Low	Low	Low-Medium
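The comparison table can also be encoded as data so that provider selection is explicit and testable. This is a minimal sketch: the `ProviderInfo` struct and `cheapest_match` function are hypothetical names, and the numbers are simply transcribed from the table above.

```rust
/// Capability data transcribed from the comparison table.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ProviderInfo {
    name: &'static str,
    supports_vision: bool,
    max_context_tokens: u32,
    /// Relative price tier: 1 = $, 2 = $$, 3 = $$$.
    price_tier: u8,
}

const PROVIDERS: [ProviderInfo; 3] = [
    ProviderInfo { name: "openai", supports_vision: true, max_context_tokens: 128_000, price_tier: 2 },
    ProviderInfo { name: "deepseek", supports_vision: false, max_context_tokens: 64_000, price_tier: 1 },
    ProviderInfo { name: "anthropic", supports_vision: true, max_context_tokens: 200_000, price_tier: 3 },
];

/// Returns the cheapest provider that satisfies both requirements, if any.
fn cheapest_match(needs_vision: bool, min_context_tokens: u32) -> Option<ProviderInfo> {
    PROVIDERS
        .iter()
        .copied()
        .filter(|p| !needs_vision || p.supports_vision)
        .filter(|p| p.max_context_tokens >= min_context_tokens)
        .min_by_key(|p| p.price_tier)
}
```

For example, a text-only workload with a 32K prompt resolves to DeepSeek, while a request that needs more than 128K tokens of context resolves to Anthropic.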

Detailed Feature Matrix

OpenAI

Strengths:
- Most mature ecosystem with extensive tooling
- Wide range of models (GPT-4, GPT-3.5-turbo, GPT-4 Vision)
- Excellent for general-purpose applications
- Strong vision/multimodal capabilities
- Large community and documentation
Limitations:
- Higher cost compared to alternatives
- Smaller context window than Claude (128K vs. 200K tokens)
- Rate limiting on free tier
Ideal Use Cases:
- Production deployments requiring reliability
- Applications needing vision/image analysis
- General-purpose AI assistants
- Well-documented, standard use cases

DeepSeek

Strengths:
- Most cost-effective option
- Strong reasoning and code generation
- High throughput capabilities
- Good for analytical tasks
- Competitive performance at lower cost
Limitations:
- Smaller context window (64K)
- No vision support
- Newer ecosystem, fewer community resources
Ideal Use Cases:
- Cost-sensitive deployments
- Code generation and analysis
- Logical reasoning tasks
- High-volume/batch processing
- Internal tooling and development

Anthropic Claude

Strengths:
- Largest context window (200K tokens)
- Strong safety and ethical guidelines
- Excellent for complex analysis
- Superior long-document processing
- Strong instruction following
Limitations:
- Higher cost
- API differs from the OpenAI format (the system prompt is sent separately from the message list)
- Requires max_tokens parameter
Ideal Use Cases:
- Safety-critical applications
- Complex document analysis
- Long-context reasoning
- Compliance and governance
- Medical/legal/financial applications

Configuration Guide

Environment Variables

All providers can be configured via environment variables:

# OpenAI
export OPENAI_API_KEY="sk-..."

# DeepSeek
export DEEPSEEK_API_KEY="..."
export DEEPSEEK_BASE_URL="https://api.deepseek.com/v1"  # Optional
export DEEPSEEK_MODEL="deepseek-chat"                    # Optional

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
export ANTHROPIC_BASE_URL="https://api.anthropic.com/v1" # Optional
export ANTHROPIC_MODEL="claude-3-5-sonnet-20241022"      # Optional
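As a rough illustration of how a required key plus optional overrides might be resolved, consider the sketch below. It assumes the optional variables fall back to the values shown above; `ProviderSettings` and `deepseek_settings` are hypothetical names, not part of Paladin's `DeepSeekConfig::from_env`. The lookup is passed in as a function so the logic can be tested without touching the process environment.

```rust
use std::env;

/// Settings for one provider. Hypothetical struct, for illustration only.
#[derive(Debug, PartialEq)]
struct ProviderSettings {
    api_key: String,
    base_url: String,
    model: String,
}

/// Resolves DeepSeek settings from a lookup function.
/// The API key is required; base URL and model fall back to assumed defaults.
fn deepseek_settings<F>(lookup: F) -> Result<ProviderSettings, String>
where
    F: Fn(&str) -> Option<String>,
{
    let api_key = lookup("DEEPSEEK_API_KEY")
        .filter(|k| !k.trim().is_empty())
        .ok_or_else(|| "DEEPSEEK_API_KEY is not set".to_string())?;
    Ok(ProviderSettings {
        api_key,
        base_url: lookup("DEEPSEEK_BASE_URL")
            .unwrap_or_else(|| "https://api.deepseek.com/v1".to_string()),
        model: lookup("DEEPSEEK_MODEL").unwrap_or_else(|| "deepseek-chat".to_string()),
    })
}

/// Convenience wrapper that reads the real process environment.
fn deepseek_settings_from_env() -> Result<ProviderSettings, String> {
    deepseek_settings(|name| env::var(name).ok())
}
```

Treating an empty key as missing surfaces a misconfigured deployment at startup instead of at the first request.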

Configuration Files

Add provider configurations to config.yml:

llm:
  # Default provider if multiple are configured
  default_provider: "openai"

  openai:
    api_key: "${OPENAI_API_KEY}"
    base_url: "https://api.openai.com/v1"
    model: "gpt-4"
    timeout_seconds: 30

  deepseek:
    api_key: "${DEEPSEEK_API_KEY}"
    base_url: "https://api.deepseek.com/v1"
    model: "deepseek-chat"
    timeout_seconds: 60

  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    base_url: "https://api.anthropic.com/v1"
    model: "claude-3-5-sonnet-20241022"
    timeout_seconds: 30
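The `${OPENAI_API_KEY}` style placeholders imply that values are substituted from the environment when the file is loaded. How Paladin performs this substitution is not described here, so the following is only a sketch of one way such expansion could work; `expand_placeholders` is a hypothetical helper.

```rust
/// Expands `${NAME}` placeholders using `lookup`.
/// Returns an error naming the first variable that cannot be resolved.
fn expand_placeholders<F>(input: &str, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = &after[..end];
                let value = lookup(name)
                    .ok_or_else(|| format!("unset variable: {}", name))?;
                out.push_str(&value);
                rest = &after[end + 1..];
            }
            None => {
                // No closing brace: keep the remaining text unchanged.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    Ok(out)
}
```

Failing loudly on an unset variable avoids silently sending an empty API key to a provider.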

Programmatic Configuration

OpenAI

use paladin::infrastructure::adapters::llm::openai_adapter::OpenAILlmAdapter;
use std::time::Duration;

// `api_key` holds your OpenAI API key.
let adapter = OpenAILlmAdapter::new(
    api_key,
    None, // Use the default base URL
    Some(Duration::from_secs(30)), // Request timeout
)?;

DeepSeek

use paladin::infrastructure::adapters::llm::deepseek_adapter::{
    DeepSeekAdapter, DeepSeekConfig
};

// From environment
let config = DeepSeekConfig::from_env()?;
let adapter = DeepSeekAdapter::new(config)?;

// Or custom
let config = DeepSeekConfig::new(
    api_key,
    "https://api.deepseek.com/v1".to_string(),
    "deepseek-chat".to_string()
);
let adapter = DeepSeekAdapter::new(config)?;

Anthropic

use paladin::infrastructure::adapters::llm::anthropic_adapter::{
    AnthropicAdapter, AnthropicConfig
};

// From environment
let config = AnthropicConfig::from_env()?;
let adapter = AnthropicAdapter::new(config)?;

// Or custom
let config = AnthropicConfig::new(
    api_key,
    "https://api.anthropic.com/v1".to_string(),
    "claude-3-5-sonnet-20241022".to_string()
);
let adapter = AnthropicAdapter::new(config)?;

Use Case Recommendations

When to Use OpenAI

Best for:

General-purpose AI applications
Production deployments requiring proven reliability
Applications needing vision/image analysis
Multimodal applications
Projects with complex tooling requirements

Example Use Cases:

Customer support chatbots
Content generation systems
Image analysis and description
General AI assistants
Document Q&A systems

When to Use DeepSeek

Best for:

Cost-sensitive deployments
Code generation and analysis
Logical reasoning tasks
High-volume batch processing
Internal development tools

Example Use Cases:

Code review automation
Test generation
Documentation generation
Internal knowledge bases
Analytical pipelines

When to Use Anthropic Claude

Best for:

Safety-critical applications
Long-document analysis
Complex reasoning tasks
Compliance-sensitive domains
High-stakes decision support

Example Use Cases:

Legal document analysis
Medical record processing
Financial compliance checking
Research paper analysis
Complex contract review

Migration Guide

From OpenAI to DeepSeek

DeepSeek uses an OpenAI-compatible API, making migration straightforward:

use std::sync::Arc;
use std::time::Duration;

// Before (OpenAI)
let llm_port = Arc::new(OpenAILlmAdapter::new(
    openai_key,
    None,
    Some(Duration::from_secs(30))
)?);

// After (DeepSeek)
let config = DeepSeekConfig::from_env()?;
let llm_port = Arc::new(DeepSeekAdapter::new(config)?);

// Your Paladin code remains the same
let paladin = PaladinBuilder::new(llm_port)
    .system_prompt("Your prompt")
    .build()?;

Considerations:

DeepSeek has no vision support
Context window is 64K vs 128K for GPT-4 Turbo
Response style may differ slightly

From OpenAI to Anthropic

Anthropic Claude requires some adjustments due to API differences:

use std::sync::Arc;
use std::time::Duration;

// Before (OpenAI)
let llm_port = Arc::new(OpenAILlmAdapter::new(
    openai_key,
    None,
    Some(Duration::from_secs(30))
)?);

// After (Anthropic)
let config = AnthropicConfig::from_env()?;
let llm_port = Arc::new(AnthropicAdapter::new(config)?);

// Your Paladin code remains the same
let paladin = PaladinBuilder::new(llm_port)
    .system_prompt("Your prompt")
    .build()?;

Key Differences:

The Anthropic API requires a max_tokens parameter (the adapter defaults to 4096)
System messages are sent separately
Larger context window (200K tokens)
Different SSE streaming format
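To make the first two differences concrete, the sketch below converts an OpenAI-style message list, where system prompts are ordinary messages, into an Anthropic-style request with a top-level system prompt and an explicit `max_tokens`. All types here are hypothetical and simplified; they are not the structs used inside `AnthropicAdapter`.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Role {
    System,
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
struct ChatMessage {
    role: Role,
    content: String,
}

/// Anthropic-style request shape: the system prompt sits at the top level
/// and the output-token limit is always present.
#[derive(Debug, PartialEq)]
struct AnthropicStyleRequest {
    system: Option<String>,
    messages: Vec<ChatMessage>,
    max_tokens: u32,
}

/// Moves system messages out of the list and applies a default token limit.
fn to_anthropic_style(messages: &[ChatMessage], max_tokens: Option<u32>) -> AnthropicStyleRequest {
    let mut system_parts: Vec<&str> = Vec::new();
    let mut rest = Vec::new();
    for m in messages {
        if m.role == Role::System {
            system_parts.push(m.content.as_str());
        } else {
            rest.push(m.clone());
        }
    }
    AnthropicStyleRequest {
        system: if system_parts.is_empty() {
            None
        } else {
            Some(system_parts.join("\n\n"))
        },
        messages: rest,
        // 4096 mirrors the default mentioned in the list above.
        max_tokens: max_tokens.unwrap_or(4096),
    }
}
```

Joining multiple system messages with a blank line is a design choice of this sketch; an adapter could equally reject more than one.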

Provider Fallback Pattern

Select the first provider that can be configured, so the application still starts when a preferred provider's credentials are missing:

use paladin::infrastructure::adapters::llm::anthropic_adapter::{AnthropicAdapter, AnthropicConfig};
use paladin::infrastructure::adapters::llm::deepseek_adapter::{DeepSeekAdapter, DeepSeekConfig};
use paladin::infrastructure::adapters::llm::openai_adapter::OpenAILlmAdapter;
use paladin::paladin_ports::output::llm_port::LlmPort;
use std::sync::Arc;
use std::time::Duration;

fn create_llm_provider() -> Result<Arc<dyn LlmPort>, Box<dyn std::error::Error>> {
    // Try DeepSeek first (cost-effective)
    if let Ok(config) = DeepSeekConfig::from_env() {
        if let Ok(adapter) = DeepSeekAdapter::new(config) {
            return Ok(Arc::new(adapter));
        }
    }

    // Fallback to Anthropic (powerful)
    if let Ok(config) = AnthropicConfig::from_env() {
        if let Ok(adapter) = AnthropicAdapter::new(config) {
            return Ok(Arc::new(adapter));
        }
    }

    // Final fallback to OpenAI (default); fails if OPENAI_API_KEY is unset
    let api_key = std::env::var("OPENAI_API_KEY")?;
    Ok(Arc::new(OpenAILlmAdapter::new(
        api_key,
        None,
        Some(Duration::from_secs(30))
    )?))
}
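The function above chooses a provider once, at construction time. Falling back per request, when a provider that was configured correctly later fails, is a separate concern. The sketch below shows one possible shape for it using a hypothetical synchronous `Completer` trait and a `Fixed` test double; a real version would wrap `LlmPort` and be async.

```rust
/// Hypothetical synchronous stand-in for a provider.
trait Completer {
    fn name(&self) -> &str;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

/// Test double that either always succeeds or always fails.
struct Fixed {
    name: &'static str,
    healthy: bool,
}

impl Completer for Fixed {
    fn name(&self) -> &str {
        self.name
    }
    fn complete(&self, prompt: &str) -> Result<String, String> {
        if self.healthy {
            Ok(format!("{} says: {}", self.name, prompt))
        } else {
            Err("service unavailable".to_string())
        }
    }
}

/// Tries each provider in order and returns the first success together with
/// the name of the provider that produced it. If every provider fails, the
/// collected error messages are returned instead.
fn complete_with_fallback(
    providers: &[Box<dyn Completer>],
    prompt: &str,
) -> Result<(String, String), Vec<String>> {
    let mut errors = Vec::new();
    for p in providers {
        match p.complete(prompt) {
            Ok(text) => return Ok((p.name().to_string(), text)),
            Err(e) => errors.push(format!("{}: {}", p.name(), e)),
        }
    }
    Err(errors)
}
```

Returning the provider name alongside the text keeps it visible which provider actually served a request, which matters for cost tracking.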

Performance Characteristics

Latency Comparison (Approximate)

Provider	First Token (p50)	First Token (p95)	Throughput
OpenAI GPT-4	500-800ms	1-2s	Medium
OpenAI GPT-3.5	200-400ms	500ms-1s	High
DeepSeek	300-600ms	800ms-1.5s	High
Anthropic Claude	400-700ms	1-2s	Medium

Note: Actual performance varies based on request size, load, and region
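Since these figures vary by workload, it can be worth computing p50 and p95 from your own measurements. The sketch below uses the nearest-rank method on a list of recorded first-token latencies; `percentile` is a hypothetical helper, and how the samples are collected is left out.

```rust
use std::time::Duration;

/// Nearest-rank percentile: the smallest sample such that at least
/// `pct` percent of all samples are less than or equal to it.
/// Returns `None` for an empty slice or a `pct` outside 0..=100.
fn percentile(samples: &[Duration], pct: f64) -> Option<Duration> {
    if samples.is_empty() || !(0.0..=100.0).contains(&pct) {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    // Rank is 1-based; clamp to 1 so that pct = 0 yields the minimum.
    let rank = ((pct / 100.0) * sorted.len() as f64).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

With ten samples spread evenly from 100 ms to 1000 ms, this method reports 500 ms for p50 and 1000 ms for p95.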

Cost Comparison (Approximate)

Per 1M Tokens (Input/Output):

Provider	Model	Input	Output
OpenAI	GPT-4	$10	$30
OpenAI	GPT-3.5-turbo	$0.50	$1.50
DeepSeek	deepseek-chat	$0.10	$0.20
Anthropic	Claude 3.5 Sonnet	$3	$15

Prices are approximate and subject to change
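A per-request cost estimate follows directly from the table: multiply each token count by its per-million rate and divide by one million. The sketch below hard-codes the approximate prices above; `pricing_for` and `estimate_cost_usd` are hypothetical helpers, and the model keys are illustrative labels.

```rust
/// USD price per one million tokens, from the approximate table above.
#[derive(Debug, Clone, Copy)]
struct Pricing {
    input_per_million: f64,
    output_per_million: f64,
}

/// Looks up the approximate price for a model label.
fn pricing_for(model: &str) -> Option<Pricing> {
    let (input, output) = match model {
        "gpt-4" => (10.0, 30.0),
        "gpt-3.5-turbo" => (0.50, 1.50),
        "deepseek-chat" => (0.10, 0.20),
        "claude-3-5-sonnet" => (3.0, 15.0),
        _ => return None,
    };
    Some(Pricing {
        input_per_million: input,
        output_per_million: output,
    })
}

/// Estimated request cost in USD, or `None` for an unknown model.
fn estimate_cost_usd(model: &str, input_tokens: u64, output_tokens: u64) -> Option<f64> {
    let p = pricing_for(model)?;
    let total = input_tokens as f64 * p.input_per_million
        + output_tokens as f64 * p.output_per_million;
    Some(total / 1_000_000.0)
}
```

As a worked example, a GPT-4 request with 2,000 input tokens and 500 output tokens costs about (2,000 × $10 + 500 × $30) / 1,000,000 = $0.035.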

Scaling Considerations

OpenAI:

Rate limits: Tier-based (requests/min, tokens/min)
Horizontal scaling: Good
Burst capacity: Moderate

DeepSeek:

Rate limits: Generous
Horizontal scaling: Excellent (high throughput)
Burst capacity: High

Anthropic:

Rate limits: Tier-based
Horizontal scaling: Good
Burst capacity: Moderate

Best Practices

1. Use Provider Capabilities

Query provider capabilities before attempting operations:

let caps = provider.get_capabilities();

if caps.supports_vision {
    // Send image-based requests
}

if caps.supports_streaming {
    // Use streaming for better UX
}

2. Set Appropriate Timeouts

Different providers may have different response times:

// Claude requests with long contexts can take longer to complete.
// The Anthropic adapter handles its timeout internally.
let claude_config = AnthropicConfig::new(/* ... */);

// Standard timeout for others
let openai = OpenAILlmAdapter::new(
    api_key,
    None,
    Some(Duration::from_secs(30))
)?;

3. Handle Provider-Specific Errors

match provider.generate(&request).await {
    Ok(response) => {
        // Handle the response
    }
    Err(LlmError::RateLimitExceeded { retry_after }) => {
        // Wait for the interval the provider asked for, then retry
        tokio::time::sleep(Duration::from_secs(retry_after)).await;
    }
    Err(LlmError::AuthenticationError(_)) => {
        // Check API keys
    }
    Err(e) => {
        // Handle other errors
    }
}

4. Monitor Usage and Costs

let response = provider.generate(&request).await?;

// Log token usage
println!("Input tokens: {}", response.usage.prompt_tokens);
println!("Output tokens: {}", response.usage.completion_tokens);
// `calculate_cost` is a helper you define from your provider's price list
println!("Total cost: ${}", calculate_cost(&response, provider_name));

Troubleshooting

Authentication Errors

Issue: LlmError::AuthenticationError

Solutions:

Verify API key is set correctly
Check API key has necessary permissions
Ensure API key hasn't expired
Verify base URL is correct for your region

Rate Limiting

Issue: LlmError::RateLimitExceeded

Solutions:

Use exponential backoff (built into the adapters)
Consider upgrading API tier
Implement request queuing
Switch to provider with higher limits
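If you add your own retry layer on top of the adapters, the delay schedule might look like the sketch below: double the wait on each attempt, cap it, and prefer the server's hint when one is available. The function names, the 500 ms base, and the 30 s cap are assumptions chosen for illustration; they do not describe the adapters' built-in policy.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // Saturate instead of overflowing for very large attempt counts.
    let factor = 2u32.saturating_pow(attempt);
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Chooses the wait before the next retry. A server-provided hint in seconds
/// wins; otherwise fall back to capped exponential backoff.
fn next_wait(attempt: u32, retry_after_secs: Option<u64>) -> Duration {
    match retry_after_secs {
        Some(secs) => Duration::from_secs(secs),
        None => backoff_delay(attempt, Duration::from_millis(500), Duration::from_secs(30)),
    }
}
```

Production retry loops usually also add random jitter so that many clients do not retry in lockstep; that is omitted here to keep the arithmetic easy to follow.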

Timeout Errors

Issue: LlmError::Timeout

Solutions:

Increase timeout duration
Reduce request complexity
Check network connectivity
Consider switching to streaming mode

Context Length Errors

Issue: LlmError::InvalidRequest (context too long)

Solutions:

Reduce input size
Switch to provider with larger context (Claude: 200K)
Implement context windowing
Summarize older conversation history
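Context windowing can be as simple as keeping the most recent messages that fit a token budget. The sketch below assumes a crude estimate of about four characters per token, which is only a rough heuristic for English text and not a real tokenizer; both function names are hypothetical.

```rust
/// Rough token estimate: about four characters per token, rounded up.
/// This is a heuristic, not a tokenizer.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Keeps the most recent messages whose combined estimate fits in `budget`
/// tokens. The returned slice preserves the original order.
fn window_messages(messages: &[String], budget: usize) -> &[String] {
    let mut used = 0;
    let mut start = messages.len();
    // Walk backwards from the newest message until the budget is exhausted.
    for (i, m) in messages.iter().enumerate().rev() {
        let cost = estimate_tokens(m);
        if used + cost > budget {
            break;
        }
        used += cost;
        start = i;
    }
    &messages[start..]
}
```

Leave headroom in the budget for the system prompt and the response, since both count against the same context window.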

Additional Resources

Paladin Examples - Working code examples
Contributing Providers Guide - Add new providers
API Documentation - Full API reference
GitHub Issues - Report issues

Last Updated: January 2026
Version: 0.1.0
