Tutorial Overview
Series index: Hermes Agent Tutorial Series
This tutorial covers Hermes Agent’s Multi-Model Configuration — a power configuration that lets you switch between 200+ LLM providers instantly.
Note: This is labeled “Power Configuration” because it’s an enhancement to your deployed Hermes (CLI or Gateway). Not all users need this immediately, but it’s essential for cost optimization and capability flexibility.
What you will learn
- ✅ Supported providers and their offerings
- ✅ Provider configuration steps
- ✅ Model switching with
hermes model - ✅ Cost optimization strategies
- ✅ Custom endpoint setup
Why Model Flexibility Matters
Single-provider limitation
Most AI tools lock you into one provider:
- Claude Code → Anthropic only
- ChatGPT → OpenAI only
- Gemini → Google only
Problems:
- Rate limits hit? You’re stuck
- Cost spike? No alternatives
- Model down? Can’t switch
Hermes’s multi-provider approach
flowchart TD
A[Hermes Agent] --> B[Nous Portal]
A --> C[OpenRouter 200+]
A --> D[z.ai/GLM]
A --> E[Kimi/Moonshot]
A --> F[MiniMax]
A --> G[OpenAI]
A --> H[Custom Endpoint]
style A fill:#e1f5ff
Switch instantly:
/model openrouter:auto # Auto-select cheapest
/model anthropic:claude-4 # Premium for complex tasks
/model openai:gpt-4o-mini # Fast, cheap for simple tasks
Supported Providers
Provider comparison
| Provider | Models | Free Tier | Rate Limits | Best For |
|---|---|---|---|---|
| Nous Portal | Hermes family | Yes (limited) | Moderate | Native Hermes |
| OpenRouter | 200+ models | Yes | Per-model | Flexibility |
| z.ai/GLM | GLM series | Yes | Moderate | Chinese users |
| Kimi/Moonshot | Kimi models | Yes | Moderate | Long context |
| MiniMax | MiniMax series | Yes | Moderate | Multimodal |
| OpenAI | GPT-4, GPT-4o | No | Strict | Enterprise |
| Anthropic | Claude family | No | Moderate | Quality |
Model recommendations
| Use Case | Recommended Model | Provider |
|---|---|---|
| Quick queries | openrouter:auto |
OpenRouter |
| Complex reasoning | anthropic:claude-4 |
Anthropic |
| Code generation | openai:gpt-4o |
OpenAI |
| Long documents | moonshot:kimi |
Kimi |
| Chinese tasks | z.ai:glm-4 |
z.ai |
| Budget mode | openrouter:claude-3-haiku |
OpenRouter |
Provider Configuration
Nous Portal
Nous Research’s native provider:
hermes config set providers.nous.api_key "YOUR_NOUS_KEY"
hermes model nous:hermes-3
OpenRouter
Access to 200+ models via single API:
# Get key from openrouter.ai
hermes config set providers.openrouter.api_key "YOUR_OPENROUTER_KEY"
# Use auto-selection
hermes model openrouter:auto
# Or specific model
hermes model openrouter:anthropic/claude-3.5-sonnet
Model format: provider/model-name
z.ai/GLM
Chinese LLM provider:
hermes config set providers.zai.api_key "YOUR_ZAI_KEY"
hermes model zai:glm-4
Kimi/Moonshot
Long context specialist:
hermes config set providers.moonshot.api_key "YOUR_MOONSHOT_KEY"
hermes model moonshot:kimi
MiniMax
Multimodal capabilities:
hermes config set providers.minimax.api_key "YOUR_MINIMAX_KEY"
hermes model minimax:abab-6
OpenAI
hermes config set providers.openai.api_key "YOUR_OPENAI_KEY"
hermes model openai:gpt-4o
Anthropic
hermes config set providers.anthropic.api_key "YOUR_ANTHROPIC_KEY"
hermes model anthropic:claude-4
Model Switching
Interactive switching
sequenceDiagram
participant U as User
participant H as Hermes
participant P as Provider
U->>H: /model openrouter:auto
H->>P: Validate provider
P->>H: Available
H->>U: Switched to openrouter:auto
U->>H: Query
H->>P: Send to auto-selected model
P->>H: Response
H->>U: Display
style H fill:#e1f5ff
Switch during conversation
You: This is a complex task
Hermes: Using claude-4 for complex reasoning...
/model openai:gpt-4o-mini
Hermes: Switched to gpt-4o-mini
You: Now a simple question
Hermes: Using gpt-4o-mini (cheaper)
Default model configuration
# Set default
hermes config set model.default "openrouter:auto"
# Set fallback (when default fails)
hermes config set model.fallback "openrouter:claude-3-haiku"
Cost Optimization
Strategy 1: Auto-selection
openrouter:auto picks the best value model:
hermes model openrouter:auto
It considers:
- Current request complexity
- Available model quotas
- Historical success rates
Strategy 2: Tiered approach
flowchart TD
A[Request received] --> B{Complexity?}
B -->|Simple| C[gpt-4o-mini: $0.15/1M]
B -->|Medium| D[claude-3-haiku: $0.25/1M]
B -->|Complex| E[claude-4: $3/1M]
style C fill:#e8f5e9
style E fill:#fff3e0
Configure tiered routing:
routing:
simple: openai:gpt-4o-mini
medium: openrouter:claude-3-haiku
complex: anthropic:claude-4
thresholds:
simple: 100 # tokens
medium: 1000
Strategy 3: Quota management
# Set daily limits
hermes config set quota.daily 100000 # tokens
# Set per-model limits
hermes config set quota.models.claude-4 50000
Cost tracking
/usage # Current session
/insights --days 7 # Weekly breakdown
Output:
Model Tokens Cost
────────────────────────────────────
openrouter:auto 15,420 $0.23
claude-3-haiku 8,200 $0.02
claude-4 3,100 $0.93
────────────────────────────────────
Total 26,720 $1.18
Custom Endpoint Setup
Add custom endpoint
For self-hosted or enterprise models:
hermes config set providers.custom.mycompany.url "https://api.mycompany.com/v1"
hermes config set providers.custom.mycompany.api_key "YOUR_KEY"
hermes model custom:mycompany:model-name
OpenAI-compatible endpoints
Most self-hosted models use OpenAI format:
providers:
custom:
local_llama:
url: "http://localhost:8080/v1"
api_key: "none"
format: openai
Switch:
/model custom:local_llama:llama-3
Enterprise gateway
For corporate API gateways:
providers:
enterprise:
url: "https://gateway.company.com/ai"
api_key: "${ENTERPRISE_TOKEN}"
headers:
X-Department: engineering
Troubleshooting
Model not available
Cause: Provider quota exhausted or model down.
Fix:
hermes model --fallback # Use fallback
hermes model openrouter:auto # Let auto-selection handle
Rate limit hit
Cause: Too many requests to single provider.
Fix:
# Switch provider
hermes model openai:gpt-4o-mini # Different quota pool
# Or use distributed routing
hermes config set routing.distributed true
API key invalid
Cause: Key expired or wrong.
Fix:
hermes config set providers.PROVIDER.api_key "NEW_KEY"
hermes doctor --provider PROVIDER
Summary
Multi-Model Configuration is Hermes’s flexibility advantage:
- 200+ models — OpenRouter alone offers massive selection
- Instant switching —
/modelcommand changes provider live - Cost optimization — Auto-selection, tiered routing, quotas
- Custom endpoints — Self-hosted and enterprise integration
Key takeaways
- ✅ OpenRouter provides 200+ model access
- ✅
/model provider:modelswitches instantly - ✅
openrouter:autooptimizes cost automatically - ✅ Tiered routing balances cost and quality
- ✅ Custom endpoints for enterprise deployment
Series navigation:
- ← Previous: Tutorial 5: CLI and TUI — Terminal Interface
- → Next: Tutorial 7: Terminal Backends — Local and Cloud Deployment
- Back: Series Index
Graduation Milestone G2: After completing this tutorial (plus Tutorials 4-5), you’ve achieved Operator level — your Hermes agent is deployed with multi-model support and can run on messaging platforms or terminal.