Tutorial Overview
Series index: Hermes Agent Tutorial Series
When You Need This: Read this article if you’re managing multiple Hermes instances, need GPU acceleration, or want serverless deployment that costs nearly nothing when idle.
This tutorial covers Hermes Agent’s Terminal Backends — deployment options from local execution to cloud serverless.
What you will learn
- ✅ Six backend options and their tradeoffs
- ✅ Local backend configuration
- ✅ Docker backend setup
- ✅ SSH backend for remote servers
- ✅ Daytona serverless persistence
- ✅ Modal GPU cluster deployment
Backend Overview
Why multiple backends?
Hermes Agent can run in different environments:
flowchart TD
A[Hermes Agent] --> B[Local]
A --> C[Docker]
A --> D[SSH Remote]
A --> E[Daytona]
A --> F[Singularity]
A --> G[Modal Serverless]
B --> H[Your laptop]
C --> I[Container isolation]
D --> J[Remote server]
E --> K[Serverless VPS]
F --> L[HPC cluster]
G --> M[GPU serverless]
style G fill:#e8f5e9
style M fill:#fff3e0
Backend comparison
| Backend | Cost | Persistence | GPU | Ideal For |
|---|---|---|---|---|
| Local | Free | Laptop-only | No | Development |
| Docker | Free | Container | Optional | Isolation |
| SSH | VPS cost | Full | Optional | Remote control |
| Daytona | $5/mo VPS | Serverless | No | Always-available |
| Singularity | HPC cost | Full | Yes | Research |
| Modal | Pay-per-use | Serverless | Yes | GPU tasks |
Local Backend
Default backend
Local backend runs Hermes directly on your machine:
# Default behavior
hermes
# Explicitly select local
hermes --backend local
Configuration
backends:
  local:
    shell: bash
    working_dir: /home/user/projects
    timeout: 300  # seconds
Limitations
- Runs only while your machine is on and you are logged in
- No GPU acceleration
- Agent memory is stored only on the local machine
Docker Backend
Why Docker?
Docker provides:
- Isolation from host system
- Reproducible environment
- Easy cleanup
Setup
Step 1: Pull Hermes Docker image
docker pull nousresearch/hermes-agent:latest
Step 2: Configure Hermes
hermes config set backends.docker.image "nousresearch/hermes-agent:latest"
Step 3: Run with Docker
hermes --backend docker
Docker configuration
backends:
  docker:
    image: nousresearch/hermes-agent:latest
    volumes:
      - ~/.hermes:/root/.hermes  # Persist memory
      - ~/projects:/workspace    # Access files
    gpu: false
    network: host
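To make the configuration easier to reason about, here is a rough sketch of the `docker run` command such a config plausibly corresponds to. The helper name `docker_run_cmd` is hypothetical, and the mapping from Hermes config keys to Docker flags is an assumption; only the Docker flags themselves (`--rm`, `-it`, `--network host`, `-v`) are standard.

```bash
# Hypothetical helper: prints a `docker run` command roughly matching the config.
# Arguments: image name, then any number of host:container volume mappings.
docker_run_cmd() {
  local image="$1" cmd="docker run --rm -it --network host" v
  shift
  for v in "$@"; do
    cmd="$cmd -v $v"   # one -v flag per volume mapping
  done
  printf '%s %s\n' "$cmd" "$image"
}

# Print the command for the two volumes used in this tutorial.
docker_run_cmd nousresearch/hermes-agent:latest \
  "$HOME/.hermes:/root/.hermes" "$HOME/projects:/workspace"
```

Note that `--network host` shares the host's network stack with the container, which trades away some of the isolation that motivates using Docker in the first place.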
GPU support
Enable GPU for Docker:
backends:
  docker:
    gpu: true
    gpu_memory: 8GB
Requires the NVIDIA Container Toolkit (formerly nvidia-docker) installed on the Docker host.
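Before setting `gpu: true`, it can help to confirm that Docker actually sees an NVIDIA runtime. The sketch below is one possible check; the function name `has_nvidia_runtime` is hypothetical, and it assumes `docker info` lists registered runtimes on a `Runtimes:` line, which may not hold for every setup.

```bash
# Hypothetical helper: reads `docker info` output on stdin and succeeds
# if an "nvidia" runtime appears on the Runtimes line.
has_nvidia_runtime() {
  grep -Eiq '^[[:space:]]*Runtimes:.*nvidia'
}

# Usage on a machine with Docker installed:
#   docker info 2>/dev/null | has_nvidia_runtime && echo "NVIDIA runtime found"
```

A failed check does not prove that GPUs are unusable, only that no runtime named `nvidia` is registered with the Docker daemon.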
SSH Backend
Why SSH?
SSH backend lets Hermes run on a remote server:
- Always-available (not tied to laptop)
- More resources (RAM, storage)
- Potential GPU access
Setup
Step 1: Configure SSH
hermes config set backends.ssh.host "your-server.com"
hermes config set backends.ssh.user "hermes"
hermes config set backends.ssh.key "~/.ssh/hermes_key"
Step 2: Deploy Hermes to server
# Install Hermes on remote server first
ssh hermes@your-server.com "curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash"
# Then run via SSH backend
hermes --backend ssh
SSH configuration
backends:
  ssh:
    host: your-server.com
    user: hermes
    key: ~/.ssh/hermes_key
    working_dir: /home/hermes/workspace
    timeout: 600
Daytona Backend
What is Daytona?
Daytona provides serverless development environments:
- Environment hibernates when idle
- Wakes on demand
- Costs nearly nothing between sessions
Setup
Step 1: Install Daytona CLI
curl -fsSL https://raw.githubusercontent.com/daytonaio/daytona/main/install.sh | bash
Step 2: Configure Hermes
hermes config set backends.daytona.workspace "hermes-main"
Step 3: Run with Daytona
hermes --backend daytona
Daytona workflow
sequenceDiagram
participant U as User
participant H as Hermes CLI
participant D as Daytona
participant W as Workspace
U->>H: hermes --backend daytona
H->>D: Wake workspace
D->>W: Restore environment
W->>H: Ready
H->>U: Hermes active
Note over D,W: Idle = costs ~$0
U->>H: /exit
H->>D: Hibernate workspace
D->>W: Suspend
W->>D: State saved
Cost model
| State | Cost |
|---|---|
| Active | VPS rate ($5/mo typical) |
| Idle | Near zero (pennies) |
| Hibernate | Zero |
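The cost table can be turned into a quick back-of-the-envelope estimate. The sketch below assumes, purely for illustration, that the $5/mo rate is pro-rated hourly over a 730-hour month and that hibernated time costs nothing; neither assumption comes from Daytona's actual pricing, and `estimate_monthly_cost` is a hypothetical name.

```bash
# Hypothetical helper: estimates a monthly bill from hours spent active.
# Assumptions (illustrative only): the monthly rate is pro-rated over
# 730 hours, and hibernated hours are free.
estimate_monthly_cost() {
  local active_hours="$1" monthly_rate="${2:-5}"
  awk -v h="$active_hours" -v r="$monthly_rate" \
    'BEGIN { printf "%.2f\n", h * r / 730 }'
}

estimate_monthly_cost 40    # 40 active hours  -> 0.27
estimate_monthly_cost 730   # never hibernates -> 5.00
```

Under these assumptions, a workspace used about ten hours a week costs well under a dollar a month, which is the case the hibernation model is designed for.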
Singularity Backend
What is Singularity?
Singularity is a container platform designed for HPC clusters, where it is common in research institutions.
Setup
Step 1: Build Singularity image
singularity build hermes.sif hermes.def
Definition file:
Bootstrap: docker
From: nousresearch/hermes-agent:latest

%files
    ~/.hermes /root/.hermes

%environment
    export HERMES_HOME=/root/.hermes
Step 2: Configure
hermes config set backends.singularity.image "hermes.sif"
Step 3: Run
hermes --backend singularity
GPU cluster usage
backends:
  singularity:
    image: hermes.sif
    gpu: true
    gpu_count: 4
    bind:
      - ~/.hermes:/root/.hermes
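For orientation, the `gpu` and `bind` settings have direct counterparts on the Singularity command line: `--nv` exposes the host's NVIDIA driver and devices, and `--bind` maps a host path into the container. The sketch below prints such a command; the helper `singularity_gpu_cmd` is hypothetical, and whether Hermes builds exactly this invocation is an assumption. How `gpu_count` translates into a scheduler request is not covered here.

```bash
# Hypothetical helper: prints a `singularity exec` command with GPU access
# and one bind mount. Arguments: image, bind mapping, then the command to run.
singularity_gpu_cmd() {
  local image="$1" bind="$2"
  shift 2
  printf 'singularity exec --nv --bind %s %s %s\n' "$bind" "$image" "$*"
}

# Print a command that would run nvidia-smi inside the image
# to confirm the GPUs are visible.
singularity_gpu_cmd hermes.sif "$HOME/.hermes:/root/.hermes" nvidia-smi
```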
Modal Backend
What is Modal?
Modal provides serverless GPU computing:
- No server management
- GPU available instantly
- Pay only for compute time
Setup
Step 1: Install Modal CLI
pip install modal
modal setup
Step 2: Deploy Hermes to Modal
hermes modal deploy
Step 3: Run with Modal
hermes --backend modal
Modal architecture
flowchart LR
A[Local CLI] --> B[Modal API]
B --> C[GPU Container]
C --> D[Hermes Process]
D --> E[Response]
E --> A
style C fill:#fff3e0
Modal configuration
backends:
  modal:
    gpu: A100
    memory: 16GB
    timeout: 600
    image: nousresearch/hermes-modal:latest
Cost optimization
Modal bills GPU time per second, so attach a GPU only when a task needs one:
# Quick mode (no GPU)
hermes --backend modal --no-gpu
# GPU only when needed
hermes --backend modal --gpu-demand # Spawns GPU on complex tasks
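The effect of per-second billing is easy to see with a little arithmetic. The sketch below uses a placeholder rate of $3.00 per GPU-hour, which is not a real Modal price, and a hypothetical helper named `gpu_cost`.

```bash
# Hypothetical helper: cost of GPU time billed per second.
# cost = seconds * hourly_rate / 3600
gpu_cost() {
  local seconds="$1" hourly_rate="$2"
  awk -v s="$seconds" -v r="$hourly_rate" \
    'BEGIN { printf "%.4f\n", s * r / 3600 }'
}

gpu_cost 3600 3.00   # GPU attached for a full one-hour session -> 3.0000
gpu_cost 300 3.00    # GPU attached only for 5 minutes of work  -> 0.2500
```

With these placeholder numbers, holding the GPU for the whole session costs twelve times as much as attaching it only for the five minutes that need it.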
Backend Selection Guide
| Use Case | Recommended Backend | Reason |
|---|---|---|
| Personal development | Local | Free, simple |
| Isolated testing | Docker | Clean environment |
| Team shared agent | SSH + VPS | Always-available |
| Cost-optimized always-on | Daytona | Hibernates when idle |
| Research GPU tasks | Singularity | HPC cluster access |
| Burst GPU needs | Modal | Serverless, pay-per-use |
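If you switch backends often, the table can be encoded as a small lookup. This is only a sketch: the function `recommend_backend` and the use-case keywords are hypothetical labels invented for illustration, not part of Hermes.

```bash
# Hypothetical helper: maps a use-case keyword to the backend
# recommended in the selection table.
recommend_backend() {
  case "$1" in
    personal-dev)     echo "local" ;;
    isolated-testing) echo "docker" ;;
    team-shared)      echo "ssh" ;;
    always-on-cheap)  echo "daytona" ;;
    research-gpu)     echo "singularity" ;;
    burst-gpu)        echo "modal" ;;
    *) echo "unknown use case: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   hermes --backend "$(recommend_backend burst-gpu)"
```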
Troubleshooting
Backend connection failed
Cause: The SSH key is wrong or the server is unreachable.
Fix:
hermes doctor --backend ssh
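The two causes can also be told apart with plain OpenSSH, independent of Hermes. The sketch below classifies the error text that `ssh` prints; the function name `classify_ssh_error` and its output labels are hypothetical, and the matched strings are common OpenSSH messages rather than an exhaustive list.

```bash
# Hypothetical helper: classifies an ssh error message as a key problem
# or a reachability problem, based on common OpenSSH error strings.
classify_ssh_error() {
  case "$1" in
    *"Permission denied"*)
      echo "key-rejected" ;;
    *"Connection timed out"*|*"Connection refused"*|*"Could not resolve hostname"*|*"No route to host"*)
      echo "unreachable" ;;
    *)
      echo "unknown" ;;
  esac
}

# Usage (BatchMode=yes makes ssh fail instead of prompting for a password):
#   err="$(ssh -i ~/.ssh/hermes_key -o BatchMode=yes -o ConnectTimeout=5 \
#          hermes@your-server.com true 2>&1)" || classify_ssh_error "$err"
```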
Docker volume not mounting
Cause: The host or container path in the volume mapping is wrong.
Fix:
hermes config set backends.docker.volumes "[\"~/.hermes:/root/.hermes\"]"
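One way to narrow down a path mismatch is to check that the host side of each mapping exists before starting the container. The sketch below does this for a single `host:container` string. The helper `volume_host_exists` is hypothetical, and it expands a leading `~` by hand on the assumption that the path may be passed along unexpanded.

```bash
# Hypothetical helper: succeeds if the host side of a "host:container"
# volume mapping exists on this machine.
volume_host_exists() {
  local host_path="${1%%:*}"   # keep everything before the first colon
  case "$host_path" in
    "~")   host_path="$HOME" ;;
    "~/"*) host_path="$HOME/${host_path#??}" ;;   # drop the leading "~/"
  esac
  [ -e "$host_path" ]
}

# Usage:
#   volume_host_exists "~/.hermes:/root/.hermes" || echo "host path missing"
```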
Modal GPU unavailable
Cause: The GPU quota for the requested GPU type is exhausted.
Fix:
hermes --backend modal --gpu-type T4 # Use smaller GPU
Summary
Terminal Backends provide deployment flexibility:
- Local — Free, simple, laptop-tied
- Docker — Isolated, reproducible
- SSH — Remote, always-available
- Daytona — Serverless, hibernates when idle
- Singularity — HPC cluster, GPU
- Modal — Serverless GPU, pay-per-use
Key takeaways
- ✅ Choose backend based on use case
- ✅ Daytona minimizes idle costs
- ✅ Modal provides burst GPU capacity
- ✅ SSH backend for VPS deployment