Back

Hermes Agent Tutorial 6: Multi-Model Configuration — Power Configuration for Flexibility

Configure multiple LLM providers for Hermes Agent: Nous Portal, OpenRouter, z.ai, Kimi, MiniMax, OpenAI. Learn model switching, cost optimization, and custom endpoint setup.

Tutorial Overview

Series index: Hermes Agent Tutorial Series

This tutorial covers Hermes Agent’s Multi-Model Configuration — a power configuration that lets you switch between 200+ LLM providers instantly.

Note: This is labeled “Power Configuration” because it’s an enhancement to your deployed Hermes (CLI or Gateway). Not all users need this immediately, but it’s essential for cost optimization and capability flexibility.

What you will learn

  • ✅ Supported providers and their offerings
  • ✅ Provider configuration steps
  • ✅ Model switching with hermes model
  • ✅ Cost optimization strategies
  • ✅ Custom endpoint setup

Why Model Flexibility Matters

Single-provider limitation

Most AI tools lock you into one provider:

  • Claude Code → Anthropic only
  • ChatGPT → OpenAI only
  • Gemini → Google only

Problems:

  • Rate limits hit? You’re stuck
  • Cost spike? No alternatives
  • Model down? Can’t switch

Hermes’s multi-provider approach

flowchart TD
    A[Hermes Agent] --> B[Nous Portal]
    A --> C[OpenRouter 200+]
    A --> D[z.ai/GLM]
    A --> E[Kimi/Moonshot]
    A --> F[MiniMax]
    A --> G[OpenAI]
    A --> H[Custom Endpoint]

    style A fill:#e1f5ff

Switch instantly:

/model openrouter:auto        # Auto-select cheapest
/model anthropic:claude-4     # Premium for complex tasks
/model openai:gpt-4o-mini     # Fast, cheap for simple tasks

Supported Providers

Provider comparison

Provider Models Free Tier Rate Limits Best For
Nous Portal Hermes family Yes (limited) Moderate Native Hermes
OpenRouter 200+ models Yes Per-model Flexibility
z.ai/GLM GLM series Yes Moderate Chinese users
Kimi/Moonshot Kimi models Yes Moderate Long context
MiniMax MiniMax series Yes Moderate Multimodal
OpenAI GPT-4, GPT-4o No Strict Enterprise
Anthropic Claude family No Moderate Quality

Model recommendations

Use Case Recommended Model Provider
Quick queries openrouter:auto OpenRouter
Complex reasoning anthropic:claude-4 Anthropic
Code generation openai:gpt-4o OpenAI
Long documents moonshot:kimi Kimi
Chinese tasks z.ai:glm-4 z.ai
Budget mode openrouter:claude-3-haiku OpenRouter

Provider Configuration

Nous Portal

Nous Research’s native provider:

hermes config set providers.nous.api_key "YOUR_NOUS_KEY"
hermes model nous:hermes-3

OpenRouter

Access to 200+ models via single API:

# Get key from openrouter.ai
hermes config set providers.openrouter.api_key "YOUR_OPENROUTER_KEY"

# Use auto-selection
hermes model openrouter:auto

# Or specific model
hermes model openrouter:anthropic/claude-3.5-sonnet

Model format: provider/model-name

z.ai/GLM

Chinese LLM provider:

hermes config set providers.zai.api_key "YOUR_ZAI_KEY"
hermes model zai:glm-4

Kimi/Moonshot

Long context specialist:

hermes config set providers.moonshot.api_key "YOUR_MOONSHOT_KEY"
hermes model moonshot:kimi

MiniMax

Multimodal capabilities:

hermes config set providers.minimax.api_key "YOUR_MINIMAX_KEY"
hermes model minimax:abab-6

OpenAI

hermes config set providers.openai.api_key "YOUR_OPENAI_KEY"
hermes model openai:gpt-4o

Anthropic

hermes config set providers.anthropic.api_key "YOUR_ANTHROPIC_KEY"
hermes model anthropic:claude-4

Model Switching

Interactive switching

sequenceDiagram
    participant U as User
    participant H as Hermes
    participant P as Provider

    U->>H: /model openrouter:auto
    H->>P: Validate provider
    P->>H: Available
    H->>U: Switched to openrouter:auto

    U->>H: Query
    H->>P: Send to auto-selected model
    P->>H: Response
    H->>U: Display

    style H fill:#e1f5ff

Switch during conversation

You: This is a complex task
Hermes: Using claude-4 for complex reasoning...

/model openai:gpt-4o-mini
Hermes: Switched to gpt-4o-mini

You: Now a simple question
Hermes: Using gpt-4o-mini (cheaper)

Default model configuration

# Set default
hermes config set model.default "openrouter:auto"

# Set fallback (when default fails)
hermes config set model.fallback "openrouter:claude-3-haiku"

Cost Optimization

Strategy 1: Auto-selection

openrouter:auto picks the best value model:

hermes model openrouter:auto

It considers:

  • Current request complexity
  • Available model quotas
  • Historical success rates

Strategy 2: Tiered approach

flowchart TD
    A[Request received] --> B{Complexity?}
    B -->|Simple| C[gpt-4o-mini: $0.15/1M]
    B -->|Medium| D[claude-3-haiku: $0.25/1M]
    B -->|Complex| E[claude-4: $3/1M]

    style C fill:#e8f5e9
    style E fill:#fff3e0

Configure tiered routing:

routing:
  simple: openai:gpt-4o-mini
  medium: openrouter:claude-3-haiku
  complex: anthropic:claude-4
  thresholds:
    simple: 100  # tokens
    medium: 1000

Strategy 3: Quota management

# Set daily limits
hermes config set quota.daily 100000  # tokens

# Set per-model limits
hermes config set quota.models.claude-4 50000

Cost tracking

/usage                          # Current session
/insights --days 7              # Weekly breakdown

Output:

Model               Tokens    Cost
────────────────────────────────────
openrouter:auto     15,420    $0.23
claude-3-haiku      8,200     $0.02
claude-4            3,100     $0.93
────────────────────────────────────
Total               26,720    $1.18

Custom Endpoint Setup

Add custom endpoint

For self-hosted or enterprise models:

hermes config set providers.custom.mycompany.url "https://api.mycompany.com/v1"
hermes config set providers.custom.mycompany.api_key "YOUR_KEY"
hermes model custom:mycompany:model-name

OpenAI-compatible endpoints

Most self-hosted models use OpenAI format:

providers:
  custom:
    local_llama:
      url: "http://localhost:8080/v1"
      api_key: "none"
      format: openai

Switch:

/model custom:local_llama:llama-3

Enterprise gateway

For corporate API gateways:

providers:
  enterprise:
      url: "https://gateway.company.com/ai"
      api_key: "${ENTERPRISE_TOKEN}"
      headers:
        X-Department: engineering

Troubleshooting

Model not available

Cause: Provider quota exhausted or model down.

Fix:

hermes model --fallback    # Use fallback
hermes model openrouter:auto  # Let auto-selection handle

Rate limit hit

Cause: Too many requests to single provider.

Fix:

# Switch provider
hermes model openai:gpt-4o-mini   # Different quota pool

# Or use distributed routing
hermes config set routing.distributed true

API key invalid

Cause: Key expired or wrong.

Fix:

hermes config set providers.PROVIDER.api_key "NEW_KEY"
hermes doctor --provider PROVIDER

Summary

Multi-Model Configuration is Hermes’s flexibility advantage:

  1. 200+ models — OpenRouter alone offers massive selection
  2. Instant switching/model command changes provider live
  3. Cost optimization — Auto-selection, tiered routing, quotas
  4. Custom endpoints — Self-hosted and enterprise integration

Key takeaways

  • ✅ OpenRouter provides 200+ model access
  • /model provider:model switches instantly
  • openrouter:auto optimizes cost automatically
  • ✅ Tiered routing balances cost and quality
  • ✅ Custom endpoints for enterprise deployment

Series navigation:


Graduation Milestone G2: After completing this tutorial (plus Tutorials 4-5), you’ve achieved Operator level — your Hermes agent is deployed with multi-model support and can run on messaging platforms or terminal.