Hermes Agent Tutorial 7: Terminal Backends — Local and Cloud Deployment

Deploy Hermes Agent on local, Docker, SSH, Daytona, Singularity, or Modal backends. Learn serverless persistence, GPU cluster options, and hybrid deployment strategies.

Tutorial Overview

Series index: Hermes Agent Tutorial Series

When You Need This: Read this article if you’re managing multiple Hermes instances, need GPU acceleration, or want serverless deployment that costs nearly nothing when idle.

This tutorial covers Hermes Agent’s Terminal Backends — deployment options from local execution to cloud serverless.

What you will learn

  • ✅ Six backend options and their tradeoffs
  • ✅ Local backend configuration
  • ✅ Docker backend setup
  • ✅ SSH backend for remote servers
  • ✅ Daytona serverless persistence
  • ✅ Modal GPU cluster deployment

Backend Overview

Why multiple backends?

Hermes Agent can run in different environments:

flowchart TD
    A[Hermes Agent] --> B[Local]
    A --> C[Docker]
    A --> D[SSH Remote]
    A --> E[Daytona]
    A --> F[Singularity]
    A --> G[Modal Serverless]

    B --> H[Your laptop]
    C --> I[Container isolation]
    D --> J[Remote server]
    E --> K[Serverless VPS]
    F --> L[HPC cluster]
    G --> M[GPU serverless]

    style G fill:#e8f5e9
    style M fill:#fff3e0

Backend comparison

| Backend | Cost | Persistence | GPU | Ideal for |
|---|---|---|---|---|
| Local | Free | Laptop-only | No | Development |
| Docker | Free | Container | Optional | Isolation |
| SSH | VPS cost | Full | Optional | Remote control |
| Daytona | $5/mo VPS | Serverless | No | Always-available |
| Singularity | HPC cost | Full | Yes | Research |
| Modal | Pay-per-use | Serverless | Yes | GPU tasks |

Local Backend

Default backend

The local backend runs Hermes directly on your machine:

# Default behavior
hermes

# Explicitly select local
hermes --backend local

Configuration

backends:
  local:
    shell: bash
    working_dir: /home/user/projects
    timeout: 300  # seconds
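
Before launching, it can help to confirm that the shell and working directory named in this config actually exist. The following is a minimal sketch for a POSIX shell; `check_local_backend` is a hypothetical helper written for this tutorial, not part of the Hermes CLI.

```shell
#!/bin/sh
# Hypothetical preflight helper; not part of the Hermes CLI.
# Usage: check_local_backend <shell-name> <working-dir>
check_local_backend() {
    shell_name=$1
    work_dir=$2

    # The configured shell must be resolvable on PATH.
    if ! command -v "$shell_name" >/dev/null 2>&1; then
        echo "shell not found: $shell_name" >&2
        return 1
    fi

    # The working directory must already exist.
    if [ ! -d "$work_dir" ]; then
        echo "working_dir does not exist: $work_dir" >&2
        return 1
    fi

    echo "ok"
}
```

One way to use it is as a guard: `check_local_backend bash /home/user/projects && hermes --backend local`.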

Limitations

  • Only runs when you’re logged in
  • No GPU acceleration
  • Memory persists locally only

Docker Backend

Why Docker?

Docker provides:

  • Isolation from host system
  • Reproducible environment
  • Easy cleanup

Setup

Step 1: Pull Hermes Docker image

docker pull nousresearch/hermes-agent:latest

Step 2: Configure Hermes

hermes config set backends.docker.image "nousresearch/hermes-agent:latest"

Step 3: Run with Docker

hermes --backend docker

Docker configuration

backends:
  docker:
    image: nousresearch/hermes-agent:latest
    volumes:
      - ~/.hermes:/root/.hermes  # Persist memory
      - ~/projects:/workspace    # Access files
    gpu: false
    network: host

GPU support

Enable GPU for Docker:

backends:
  docker:
    gpu: true
    gpu_memory: 8GB

Requires the NVIDIA Docker runtime (installed with the NVIDIA Container Toolkit) on the host.
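
To check whether the host already has that runtime registered, you can inspect the output of `docker info`. The sketch below assumes the runtime is registered under the name `nvidia`, which is the usual default; `has_nvidia_runtime` is a hypothetical helper, not a Hermes or Docker command.

```shell
#!/bin/sh
# Hypothetical helper: reads the runtime JSON from `docker info` on stdin
# and succeeds if a runtime named "nvidia" is registered.
has_nvidia_runtime() {
    grep -q '"nvidia"'
}

# Typical use on a host with Docker installed (run manually):
#   docker info --format '{{json .Runtimes}}' | has_nvidia_runtime \
#       && echo "NVIDIA runtime available"
```

Reading from stdin keeps the check separate from the Docker call, so the same function works on saved `docker info` output from another machine.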

SSH Backend

Why SSH?

The SSH backend lets Hermes run on a remote server:

  • Always-available (not tied to laptop)
  • More resources (RAM, storage)
  • Potential GPU access

Setup

Step 1: Configure SSH

hermes config set backends.ssh.host "your-server.com"
hermes config set backends.ssh.user "hermes"
hermes config set backends.ssh.key "~/.ssh/hermes_key"

Step 2: Deploy Hermes to server

# Install Hermes on remote server first
ssh hermes@your-server.com "curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash"

# Then run via SSH backend
hermes --backend ssh

SSH configuration

backends:
  ssh:
    host: your-server.com
    user: hermes
    key: ~/.ssh/hermes_key
    working_dir: /home/hermes/workspace
    timeout: 600
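
Most SSH backend failures come down to the key file or basic connectivity, so it is worth checking both before starting Hermes. This is a minimal sketch under the assumption that the key is a regular OpenSSH private key; `check_ssh_key` is a hypothetical helper, and the host and user are the placeholder values from the config above.

```shell
#!/bin/sh
# Hypothetical preflight helper; not part of the Hermes CLI.
# Succeeds only if the key file exists and uses a conventional
# private-key mode (600 or 400). OpenSSH rejects keys that group
# or other users can read.
check_ssh_key() {
    key=$1
    if [ ! -f "$key" ]; then
        echo "key not found: $key" >&2
        return 1
    fi
    # GNU stat uses -c, BSD/macOS stat uses -f.
    mode=$(stat -c '%a' "$key" 2>/dev/null || stat -f '%Lp' "$key")
    case $mode in
        600|400) return 0 ;;
        *) echo "key permissions too open: $mode" >&2; return 1 ;;
    esac
}

# Connectivity check using standard OpenSSH options (run manually):
#   ssh -i ~/.ssh/hermes_key -o BatchMode=yes -o ConnectTimeout=5 \
#       hermes@your-server.com true
```

`BatchMode=yes` makes `ssh` fail instead of prompting for a password, which is the behavior you want when a non-interactive backend will be using the same key.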

Daytona Backend

What is Daytona?

Daytona provides serverless development environments:

  • Environment hibernates when idle
  • Wakes on demand
  • Costs nearly nothing between sessions

Setup

Step 1: Install Daytona CLI

curl -fsSL https://raw.githubusercontent.com/daytonaio/daytona/main/install.sh | bash

Step 2: Configure Hermes

hermes config set backends.daytona.workspace "hermes-main"

Step 3: Run with Daytona

hermes --backend daytona

Daytona workflow

sequenceDiagram
    participant U as User
    participant H as Hermes CLI
    participant D as Daytona
    participant W as Workspace

    U->>H: hermes --backend daytona
    H->>D: Wake workspace
    D->>W: Restore environment
    W->>H: Ready
    H->>U: Hermes active

    Note over D,W: Idle = costs ~$0

    U->>H: /exit
    H->>D: Hibernate workspace
    D->>W: Suspend
    W->>D: State saved

Cost model

| State | Cost |
|---|---|
| Active | VPS rate ($5/mo typical) |
| Idle | Near zero (pennies) |
| Hibernate | Zero |
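
To make these numbers concrete, the sketch below estimates a monthly bill. It rests on two assumptions that this tutorial does not confirm: that the active rate is prorated by the hour, and that hibernated time is free. `estimate_monthly_cost` is a hypothetical helper, and 730 is the average number of hours in a month.

```shell
#!/bin/sh
# Hypothetical estimate; assumes hourly proration of the monthly rate
# and zero cost while hibernated. Check the provider's actual pricing.
# Usage: estimate_monthly_cost <monthly-rate-usd> <active-hours>
estimate_monthly_cost() {
    LC_ALL=C awk -v rate="$1" -v hours="$2" \
        'BEGIN { printf "%.2f\n", rate * hours / 730 }'
}
```

Under those assumptions, `estimate_monthly_cost 5 40` prints `0.27`: a workspace that is active for 40 hours a month at the typical $5/mo rate would cost about 27 cents.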

Singularity Backend

What is Singularity?

Singularity is a container format for HPC clusters (common in research institutions).

Setup

Step 1: Build Singularity image

singularity build hermes.sif hermes.def

Definition file:

Bootstrap: docker
From: nousresearch/hermes-agent:latest

%files
    ~/.hermes /root/.hermes

%environment
    export HERMES_HOME=/root/.hermes

Step 2: Configure

hermes config set backends.singularity.image "hermes.sif"

Step 3: Run

hermes --backend singularity

GPU cluster usage

backends:
  singularity:
    image: hermes.sif
    gpu: true
    gpu_count: 4
    bind:
      - ~/.hermes:/root/.hermes
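
Clusters differ in which runtime binary they provide: some ship `singularity`, others ship `apptainer`, the Linux Foundation continuation of the project. The following sketch detects whichever is available; `find_container_runtime` is a hypothetical helper, and the image name is the one built in Step 1.

```shell
#!/bin/sh
# Hypothetical helper: prints the first container runtime found on PATH.
find_container_runtime() {
    for candidate in apptainer singularity; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    echo "neither apptainer nor singularity found on PATH" >&2
    return 1
}

# GPU visibility check (run manually on a GPU node). The --nv flag
# exposes the host's NVIDIA driver and devices inside the container:
#   "$(find_container_runtime)" exec --nv hermes.sif nvidia-smi
```

If `nvidia-smi` lists the expected GPUs inside the container, the image and the node are ready for a GPU-enabled configuration.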

Modal Backend

What is Modal?

Modal provides serverless GPU computing:

  • No server management
  • GPU available instantly
  • Pay only for compute time

Setup

Step 1: Install Modal CLI

pip install modal
modal setup

Step 2: Deploy Hermes to Modal

hermes modal deploy

Step 3: Run with Modal

hermes --backend modal

Modal architecture

flowchart LR
    A[Local CLI] --> B[Modal API]
    B --> C[GPU Container]
    C --> D[Hermes Process]
    D --> E[Response]
    E --> A

    style C fill:#fff3e0

Modal configuration

backends:
  modal:
    gpu: A100
    memory: 16GB
    timeout: 600
    image: nousresearch/hermes-modal:latest

Cost optimization

Modal bills GPU time per second, so request a GPU only when a task needs one:

# Quick mode (no GPU)
hermes --backend modal --no-gpu

# GPU only when needed
hermes --backend modal --gpu-demand  # Spawns GPU on complex tasks

Backend Selection Guide

| Use case | Recommended backend | Reason |
|---|---|---|
| Personal development | Local | Free, simple |
| Isolated testing | Docker | Clean environment |
| Team shared agent | SSH + VPS | Always-available |
| Cost-optimized always-on | Daytona | Hibernates when idle |
| Research GPU tasks | Singularity | HPC cluster access |
| Burst GPU needs | Modal | Serverless, pay-per-use |
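
If you switch backends often, the table above can be encoded in a small shell function. This is only a sketch: `recommend_backend` and its use-case keywords are made up for illustration, while the backend names it prints are the ones passed to `--backend` throughout this tutorial.

```shell
#!/bin/sh
# Hypothetical helper that encodes the selection table.
# The use-case keywords are invented for this sketch.
recommend_backend() {
    case $1 in
        dev)       echo "local" ;;
        testing)   echo "docker" ;;
        team)      echo "ssh" ;;
        always-on) echo "daytona" ;;
        hpc)       echo "singularity" ;;
        burst-gpu) echo "modal" ;;
        *)         echo "unknown use case: $1" >&2; return 1 ;;
    esac
}
```

For example, `hermes --backend "$(recommend_backend burst-gpu)"` would start Hermes on the Modal backend.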

Troubleshooting

Backend connection failed

Cause: The SSH key is wrong or the server is unreachable.

Fix:

hermes doctor --backend ssh

Docker volume not mounting

Cause: The host or container path in the volume mapping is wrong.

Fix:

hermes config set backends.docker.volumes "[\"~/.hermes:/root/.hermes\"]"
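
One possible source of a mismatch is the `~` in the host path: a shell does not expand a tilde inside quotes, and this tutorial does not say whether Hermes expands it. The sketch below checks the host side of a `host:container` mapping after expanding a leading `~/` by hand; `check_volume_host_path` is a hypothetical helper, not a Hermes command.

```shell
#!/bin/sh
# Hypothetical helper: verifies that the host side of a
# "host:container" volume spec exists, expanding a leading "~/"
# manually because quoted tildes are not expanded by the shell.
check_volume_host_path() {
    host_path=${1%%:*}
    case $host_path in
        "~/"*) host_path="$HOME/${host_path#??}" ;;
    esac
    if [ -e "$host_path" ]; then
        echo "$host_path"
    else
        echo "host path missing: $host_path" >&2
        return 1
    fi
}
```

Running `check_volume_host_path '~/.hermes:/root/.hermes'` prints the resolved host path when it exists. If the check passes but the mount still fails, try the printed absolute path in the config instead of the `~` form.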

Modal GPU quota exhausted

Cause: Quota exhausted.

Fix:

hermes --backend modal --gpu-type T4  # Use smaller GPU

Summary

Terminal Backends provide deployment flexibility:

  1. Local — Free, simple, laptop-tied
  2. Docker — Isolated, reproducible
  3. SSH — Remote, always-available
  4. Daytona — Serverless, hibernates when idle
  5. Singularity — HPC cluster, GPU
  6. Modal — Serverless GPU, pay-per-use

Key takeaways

  • ✅ Choose backend based on use case
  • ✅ Daytona minimizes idle costs
  • ✅ Modal provides burst GPU capacity
  • ✅ SSH backend for VPS deployment
