Model Configuration#

Configure which LLM models your agents use.

Overview#

Pantheon provides a unified LLM interface with native SDK adapters, giving access to many LLM providers through a consistent API.

Key features:

  • Smart Model Selection: Use quality tags (high, normal, low) instead of hardcoding model names

  • Automatic Provider Detection: Pantheon detects available providers from your API keys

  • Capability Filtering: Select models by capabilities (vision, reasoning, tools, etc.)

  • Fallback Chains: Automatic failover to backup models

Smart Model Selection#

Instead of hardcoding specific model names, you can use Pantheon’s intelligent tag-based selection.

Quality Tags#

Select models by quality level:

Tag

Description

Typical Models (System Defaults)

high

Most capable models for complex tasks

openai/gpt-5.4-pro, anthropic/claude-opus-4-5-20251101, gemini/gemini-3-pro-preview

normal

Balanced performance and cost

openai/gpt-5.4, anthropic/claude-sonnet-4-5-20250929, gemini/gemini-3-flash-preview

low

Fast and cost-effective

openai/gpt-5-mini, anthropic/claude-haiku-4-5-20251001, gemini/gemini-flash-lite-latest

from pantheon.agent import Agent

# Let Pantheon choose the best available model
agent = Agent(model="high")      # Best quality
agent = Agent(model="normal")    # Balanced (default)
agent = Agent(model="low")       # Fast and cheap

Capability Tags#

Filter models by required capabilities:

Tag

Description

vision

Image understanding (GPT-4o, Claude 3, Gemini Pro Vision)

reasoning

Enhanced reasoning (o1, DeepSeek R1)

tools

Function/tool calling support

audio_in

Audio input support

audio_out

Audio output/TTS support

web

Built-in web search

pdf

PDF document input

computer

Computer use capabilities

schema

Structured output/response schema

prefill

Assistant message prefilling

Combine quality and capability tags:

# High quality with vision support
agent = Agent(model="high,vision")

# Cost-effective with tool calling
agent = Agent(model="low,tools")

# Normal quality with reasoning
agent = Agent(model="normal,reasoning")

Automatic Provider Detection#

Pantheon automatically detects available providers from environment variables:

# Set any of these - Pantheon will use the first available
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="..."

Provider priority (configurable in settings.json):

  1. OpenAI

  2. Anthropic

  3. Gemini

  4. Z.ai (Zhipu)

  5. DeepSeek

Supported Providers#

Pantheon supports many LLM providers. Here are the most common ones:

Major Cloud Providers#

Provider

Prefix

Example Models

OpenAI

openai/

gpt-5.4, gpt-5.4-pro, gpt-5.2, gpt-5-mini, gpt-4o

Anthropic

anthropic/

claude-opus-4-5-20251101, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001

Google AI

gemini/

gemini-3-pro-preview, gemini-3-flash-preview, gemini-flash-lite-latest

Azure OpenAI

azure/

Your deployed model names

AWS Bedrock

bedrock/

anthropic.claude-3, amazon.titan

Google Vertex AI

vertex_ai/

gemini-pro, claude-3-sonnet

Specialized AI Platforms#

Provider

Prefix

Example Models

Mistral AI

mistral/

mistral-large, mistral-medium, codestral

Cohere

cohere/

command-r-plus, command-r

Groq

groq/

llama-3.3-70b-versatile, mixtral-8x7b-32768

Together AI

together_ai/

llama-3, qwen, deepseek

Fireworks AI

fireworks_ai/

llama-v3, mixtral

DeepSeek

deepseek/

deepseek-chat, deepseek-reasoner

Perplexity

perplexity/

pplx-70b-online, sonar-medium

OpenRouter

openrouter/

Access to 100+ models via single API

Local & Open Source#

Provider

Prefix

Example Models

Ollama

ollama/

llama3, mistral, codellama, qwen

vLLM

vllm/

Any HuggingFace model

LM Studio

lm_studio/

Local models

HuggingFace

huggingface/

meta-llama/Llama-3

Chinese Providers#

Provider

Prefix

Example Models

Z.ai (Zhipu)

zai/

glm-4.6, glm-4.5, glm-4.5v

Qwen (Alibaba)

qwen/

qwen-turbo, qwen-plus

Baidu ERNIE

ernie/

ernie-bot-4

Kimi (Moonshot)

(none)

kimi-for-coding

Note

Kimi Coding: Use model name kimi-for-coding with LLM_API_BASE=https://api.kimi.com/coding/v1 and your Kimi API key as LLM_API_KEY. See Kimi Code Docs for details.

Note

For additional providers, see the Pantheon documentation or configure custom endpoints.

Model Format#

You can specify models in two ways:

1. Strict Model Name (provider/model-name):

openai/gpt-5.4
anthropic/claude-opus-4-5-20251101
gemini/gemini-3-pro-preview
deepseek/deepseek-chat

2. Tags (quality and/or capability):

high
normal
low
high,vision
normal,reasoning
low,tools

For OpenAI models, the prefix is optional:

# These are equivalent
agent = Agent(model="gpt-4o")
agent = Agent(model="openai/gpt-4o")

Configuration#

Settings File#

Configure model defaults in .pantheon/settings.json:

{
  "models": {
    "provider_priority": ["openai", "anthropic", "gemini", "deepseek"],
    "provider_models": {
      "openai": {
        "high": ["openai/gpt-5.4-pro", "openai/gpt-5.4", "openai/gpt-5.2"],
        "normal": ["openai/gpt-5.4", "openai/gpt-5.2", "openai/gpt-4o"],
        "low": ["openai/gpt-5-mini", "openai/gpt-4o-mini"]
      },
      "anthropic": {
        "high": ["anthropic/claude-opus-4-5-20251101"],
        "normal": ["anthropic/claude-sonnet-4-5-20250929"],
        "low": ["anthropic/claude-haiku-4-5-20251001"]
      }
    }
  }
}

Provider Priority#

Control which provider is used when multiple API keys are available:

{
  "models": {
    "provider_priority": ["anthropic", "openai", "gemini"]
  }
}

Model in Templates#

In agent templates (.pantheon/agents/*.md):

---
name: Smart Agent
model: high,vision
---

You are a helpful assistant with vision capabilities.

In team templates:

---
name: Research Team
model: normal
agents:
  - name: researcher
    model: high
    instructions: Research complex topics.
  - name: writer
    model: low
    instructions: Write summaries.
---

Python API#

from pantheon.agent import Agent

# Using tags (recommended)
agent = Agent(model="normal")
agent = Agent(model="high,vision")

# Using strict model name
agent = Agent(model="anthropic/claude-opus-4-5-20251101")

# Using fallback chain
agent = Agent(model=["openai/gpt-5.4", "openai/gpt-5.2", "openai/gpt-4o"])

API Keys#

Set API keys via environment variables (recommended):

# Major providers
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="..."
export GOOGLE_API_KEY="..."  # Alternative for Gemini

# Other providers
export MISTRAL_API_KEY="..."
export COHERE_API_KEY="..."
export GROQ_API_KEY="..."
export DEEPSEEK_API_KEY="..."
export TOGETHER_API_KEY="..."
export FIREWORKS_API_KEY="..."
export OPENROUTER_API_KEY="..."

# Chinese providers
export ZAI_API_KEY="..."
export QWEN_API_KEY="..."

Or in .pantheon/settings.json (not recommended for security):

{
  "api_keys": {
    "OPENAI_API_KEY": "sk-..."
  }
}

Custom API Endpoint (Third-party Proxy)#

To route all LLM calls through a third-party proxy or custom OpenAI-compatible endpoint, set the universal LLM_API_BASE and LLM_API_KEY:

export LLM_API_BASE="https://your-proxy.com/v1"
export LLM_API_KEY="your-proxy-key"

Or add to your .env / ~/.pantheon/.env file:

LLM_API_BASE=https://your-proxy.com/v1
LLM_API_KEY=your-proxy-key

Interactive setup:

# Launch the setup wizard
pantheon setup
# Select option [0] Custom API Endpoint

# Or use /keys in the REPL
/keys 0 https://your-proxy.com/v1 your-proxy-key

Priority rules:

  • Base URL: OPENAI_API_BASE (provider-specific) > LLM_API_BASE (universal)

  • API Key (unified proxy mode): When LLM_API_BASE is set, LLM_API_KEY takes priority over provider-specific keys (e.g. OPENAI_API_KEY). This ensures all requests to the proxy use the correct credentials.

  • API Key (normal mode): When no LLM_API_BASE is set, provider-specific keys (e.g. OPENAI_API_KEY) take priority over LLM_API_KEY.

Azure OpenAI#

Configure Azure endpoints:

export AZURE_API_KEY="..."
export AZURE_API_BASE="https://your-resource.openai.azure.com"
export AZURE_API_VERSION="2024-02-01"

Use with:

agent = Agent(model="azure/your-deployment-name")

Local Models (Ollama)#

  1. Install Ollama

  2. Pull a model: ollama pull llama3

  3. Use in Pantheon:

agent = Agent(model="ollama/llama3")

Model Parameters#

Set model parameters per-agent:

agent = Agent(
    model="high",
    model_params={
        "temperature": 0.7,
        "max_tokens": 4000,
        "top_p": 0.9
    }
)

In templates:

---
name: Creative Writer
model: high
temperature: 0.9
max_tokens: 4000
---

Temperature Guidelines#

  • 0.0-0.3: Deterministic, factual (code, analysis)

  • 0.4-0.7: Balanced (general tasks)

  • 0.8-1.0: Creative (writing, brainstorming)

Troubleshooting#

“No provider available”

# Check if API keys are set
echo $OPENAI_API_KEY
echo $ANTHROPIC_API_KEY

# Set one
export OPENAI_API_KEY="sk-..."

“Model not found”

  • Check model name spelling

  • Verify the model exists for that provider

  • Check provider prefix is correct

“Rate limit exceeded”

  • Use quality tags - Pantheon will fallback automatically

  • Configure fallback models in settings.json

  • Add delays between requests

Checking Available Models

from pantheon.utils.model_selector import get_model_selector

selector = get_model_selector()
info = selector.list_available_models()

print("Available providers:", info["available_providers"])
print("Current provider:", info["current_provider"])
print("Models:", info["models_by_provider"])