Hermes Agent Tutorial 6: Multi-Model Configuration — Power Configuration for Flexibility

Tutorial Overview

Series index: Hermes Agent Tutorial Series

This tutorial covers Hermes Agent’s Multi-Model Configuration — a power configuration that lets you switch between 200+ LLM providers instantly.

Note: This is labeled “Power Configuration” because it’s an enhancement to your deployed Hermes (CLI or Gateway). Not all users need this immediately, but it’s essential for cost optimization and capability flexibility.

What you will learn

✅ Supported providers and their offerings
✅ Provider configuration steps
✅ Model switching with hermes model
✅ Cost optimization strategies
✅ Custom endpoint setup

Why Model Flexibility Matters

Single-provider limitation

Most AI tools lock you into one provider:

Claude Code → Anthropic only
ChatGPT → OpenAI only
Gemini → Google only

Problems:

Rate limits hit? You’re stuck
Cost spike? No alternatives
Model down? Can’t switch

Hermes’s multi-provider approach

flowchart TD
    A[Hermes Agent] --> B[Nous Portal]
    A --> C[OpenRouter 200+]
    A --> D[z.ai/GLM]
    A --> E[Kimi/Moonshot]
    A --> F[MiniMax]
    A --> G[OpenAI]
    A --> H[Custom Endpoint]

    style A fill:#e1f5ff

Switch instantly:

/model openrouter:auto        # Auto-select cheapest
/model anthropic:claude-4     # Premium for complex tasks
/model openai:gpt-4o-mini     # Fast, cheap for simple tasks

Supported Providers

Provider comparison

Provider	Models	Free Tier	Rate Limits	Best For
Nous Portal	Hermes family	Yes (limited)	Moderate	Native Hermes
OpenRouter	200+ models	Yes	Per-model	Flexibility
z.ai/GLM	GLM series	Yes	Moderate	Chinese users
Kimi/Moonshot	Kimi models	Yes	Moderate	Long context
MiniMax	MiniMax series	Yes	Moderate	Multimodal
OpenAI	GPT-4, GPT-4o	No	Strict	Enterprise
Anthropic	Claude family	No	Moderate	Quality

Model recommendations

Use Case	Recommended Model	Provider
Quick queries	`openrouter:auto`	OpenRouter
Complex reasoning	`anthropic:claude-4`	Anthropic
Code generation	`openai:gpt-4o`	OpenAI
Long documents	`moonshot:kimi`	Kimi
Chinese tasks	`z.ai:glm-4`	z.ai
Budget mode	`openrouter:claude-3-haiku`	OpenRouter

Provider Configuration

Nous Portal

Nous Research’s native provider:

hermes config set providers.nous.api_key "YOUR_NOUS_KEY"
hermes model nous:hermes-3

OpenRouter

Access to 200+ models via single API:

# Get key from openrouter.ai
hermes config set providers.openrouter.api_key "YOUR_OPENROUTER_KEY"

# Use auto-selection
hermes model openrouter:auto

# Or specific model
hermes model openrouter:anthropic/claude-3.5-sonnet

Model format: provider/model-name

z.ai/GLM

Chinese LLM provider:

hermes config set providers.zai.api_key "YOUR_ZAI_KEY"
hermes model zai:glm-4

Kimi/Moonshot

Long context specialist:

hermes config set providers.moonshot.api_key "YOUR_MOONSHOT_KEY"
hermes model moonshot:kimi

MiniMax

Multimodal capabilities:

hermes config set providers.minimax.api_key "YOUR_MINIMAX_KEY"
hermes model minimax:abab-6

OpenAI

hermes config set providers.openai.api_key "YOUR_OPENAI_KEY"
hermes model openai:gpt-4o

Anthropic

hermes config set providers.anthropic.api_key "YOUR_ANTHROPIC_KEY"
hermes model anthropic:claude-4

Model Switching

Interactive switching

sequenceDiagram
    participant U as User
    participant H as Hermes
    participant P as Provider

    U->>H: /model openrouter:auto
    H->>P: Validate provider
    P->>H: Available
    H->>U: Switched to openrouter:auto

    U->>H: Query
    H->>P: Send to auto-selected model
    P->>H: Response
    H->>U: Display

    style H fill:#e1f5ff

Switch during conversation

You: This is a complex task
Hermes: Using claude-4 for complex reasoning...

/model openai:gpt-4o-mini
Hermes: Switched to gpt-4o-mini

You: Now a simple question
Hermes: Using gpt-4o-mini (cheaper)

Default model configuration

# Set default
hermes config set model.default "openrouter:auto"

# Set fallback (when default fails)
hermes config set model.fallback "openrouter:claude-3-haiku"

Cost Optimization

Strategy 1: Auto-selection

openrouter:auto picks the best value model:

hermes model openrouter:auto

It considers:

Current request complexity
Available model quotas
Historical success rates

Strategy 2: Tiered approach

flowchart TD
    A[Request received] --> B{Complexity?}
    B -->|Simple| C[gpt-4o-mini: $0.15/1M]
    B -->|Medium| D[claude-3-haiku: $0.25/1M]
    B -->|Complex| E[claude-4: $3/1M]

    style C fill:#e8f5e9
    style E fill:#fff3e0

Configure tiered routing:

routing:
  simple: openai:gpt-4o-mini
  medium: openrouter:claude-3-haiku
  complex: anthropic:claude-4
  thresholds:
    simple: 100  # tokens
    medium: 1000

Strategy 3: Quota management

# Set daily limits
hermes config set quota.daily 100000  # tokens

# Set per-model limits
hermes config set quota.models.claude-4 50000

Cost tracking

/usage                          # Current session
/insights --days 7              # Weekly breakdown

Output:

Model               Tokens    Cost
────────────────────────────────────
openrouter:auto     15,420    $0.23
claude-3-haiku      8,200     $0.02
claude-4            3,100     $0.93
────────────────────────────────────
Total               26,720    $1.18

Custom Endpoint Setup

Add custom endpoint

For self-hosted or enterprise models:

hermes config set providers.custom.mycompany.url "https://api.mycompany.com/v1"
hermes config set providers.custom.mycompany.api_key "YOUR_KEY"
hermes model custom:mycompany:model-name

OpenAI-compatible endpoints

Most self-hosted models use OpenAI format:

providers:
  custom:
    local_llama:
      url: "http://localhost:8080/v1"
      api_key: "none"
      format: openai

Switch:

/model custom:local_llama:llama-3

Enterprise gateway

For corporate API gateways:

providers:
  enterprise:
      url: "https://gateway.company.com/ai"
      api_key: "${ENTERPRISE_TOKEN}"
      headers:
        X-Department: engineering

Troubleshooting

Model not available

Cause: Provider quota exhausted or model down.

Fix:

hermes model --fallback    # Use fallback
hermes model openrouter:auto  # Let auto-selection handle

Rate limit hit

Cause: Too many requests to single provider.

Fix:

# Switch provider
hermes model openai:gpt-4o-mini   # Different quota pool

# Or use distributed routing
hermes config set routing.distributed true

API key invalid

Cause: Key expired or wrong.

Fix:

hermes config set providers.PROVIDER.api_key "NEW_KEY"
hermes doctor --provider PROVIDER

Summary

Multi-Model Configuration is Hermes’s flexibility advantage:

200+ models — OpenRouter alone offers massive selection
Instant switching — /model command changes provider live
Cost optimization — Auto-selection, tiered routing, quotas
Custom endpoints — Self-hosted and enterprise integration

Key takeaways

✅ OpenRouter provides 200+ model access
✅ /model provider:model switches instantly
✅ openrouter:auto optimizes cost automatically
✅ Tiered routing balances cost and quality
✅ Custom endpoints for enterprise deployment

Series navigation:

← Previous: Tutorial 5: CLI and TUI — Terminal Interface
→ Next: Tutorial 7: Terminal Backends — Local and Cloud Deployment
Back: Series Index

Graduation Milestone G2: After completing this tutorial (plus Tutorials 4-5), you’ve achieved Operator level — your Hermes agent is deployed with multi-model support and can run on messaging platforms or terminal.

Tutorial Overview

What you will learn

Why Model Flexibility Matters

Single-provider limitation

Hermes’s multi-provider approach

Supported Providers

Provider comparison

Model recommendations

Provider Configuration

Nous Portal

OpenRouter

z.ai/GLM

Kimi/Moonshot

MiniMax

OpenAI

Anthropic

Model Switching

Interactive switching

Switch during conversation

Default model configuration

Cost Optimization

Strategy 1: Auto-selection

Strategy 2: Tiered approach

Strategy 3: Quota management

Cost tracking

Custom Endpoint Setup

Add custom endpoint

OpenAI-compatible endpoints

Enterprise gateway

Troubleshooting

Model not available

Rate limit hit

API key invalid

Summary

Key takeaways

🍪 Cookie 使用通知

Cookie 偏好设置

必要 Cookie

分析 Cookie

广告 Cookie