
Hermes Agent: The AI Assistant That Runs Without You


Hook

Most AI agents require constant connection to your machine. Hermes Agent hibernates on Modal or Daytona, wakes on a Telegram message, executes scheduled tasks unattended, and scales cost to near-zero during idle periods.

Context

Traditional AI agent frameworks typically expect to run on your local machine, in your terminal session. This works fine for development, but creates challenges when you want an agent that persists across days, responds from multiple platforms simultaneously, or runs scheduled tasks while you sleep. Containerization helps, but often means paying for compute even when the agent sits idle.

Hermes Agent, built by Nous Research, addresses this with three architectural decisions. First, it treats execution environments as swappable backends—your agent’s “terminal” can be a local shell, a Docker container, an SSH session, or a serverless function that spins down to zero between conversations. Second, it implements a unified messaging gateway that lets a single agent instance respond simultaneously from CLI, Telegram, Discord, Slack, WhatsApp, and Signal with full conversation continuity. Third, it builds a closed learning loop where the agent autonomously creates reusable skills from complex tasks, improves them during subsequent use, and nudges itself to persist knowledge across sessions. The result is an agent that behaves less like a script you run and more like a persistent assistant.

Technical Insight

The execution backend abstraction is the most immediately useful architectural choice. When you run hermes, you’re starting a persistent agent process that communicates with a terminal environment through a well-defined interface. The hermes config set command lets you configure six different terminal backends without touching code:

hermes config set terminal.backend modal  # Serverless, hibernates when idle
hermes config set terminal.backend daytona  # Cloud dev environment with persistence
hermes config set terminal.backend ssh  # Remote VM you already have running
hermes config set terminal.backend docker  # Local container isolation
hermes config set terminal.backend local  # Direct shell access (default)
hermes config set terminal.backend singularity  # HPC environments

The Modal and Daytona backends solve the always-on agent problem by treating your agent’s execution environment as an on-demand resource. According to the README, these backends offer “serverless persistence—your agent’s environment hibernates when idle and wakes on demand, costing nearly nothing between sessions.” When you message the agent from Telegram, the gateway process (which runs as a lightweight Node.js service) wakes the hibernating container, passes your message, streams the response back, then lets the container sleep again. The agent’s filesystem, installed packages, and working directory persist across hibernation cycles.
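The wake-on-message cycle described above can be sketched as follows. This is a hypothetical illustration, assuming a simple state machine; the class and method names are invented here and are not the real Hermes internals:

```python
# Hypothetical sketch of the wake-on-message cycle for a serverless backend.
# Class and method names are illustrative, not actual Hermes internals.

class ServerlessBackend:
    def __init__(self):
        self.state = "hibernating"
        self.filesystem = {}          # persists across hibernation cycles

    def wake(self):
        if self.state == "hibernating":
            self.state = "running"    # container resumes with prior state

    def execute(self, command):
        self.wake()
        # run the command inside the container; here we just record it
        self.filesystem.setdefault("history", []).append(command)
        return f"ran: {command}"

    def hibernate(self):
        self.state = "hibernating"    # billing drops to near-zero


def handle_message(backend, message):
    """Gateway-side handler: wake, execute, respond, let the container sleep."""
    reply = backend.execute(message)
    backend.hibernate()
    return reply
```

The key property the sketch preserves is that `filesystem` survives hibernation, which is what makes the backend feel like a persistent machine rather than a fresh container per message.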

The messaging gateway architecture keeps the agent core entirely platform-agnostic. The gateway runs as a separate process (hermes gateway start) that maintains connections to messaging platforms and forwards messages to the agent via an internal protocol. This means the agent’s code remains platform-independent—it receives normalized message objects and returns text, which the gateway renders appropriately for each platform. Voice messages get transcribed automatically before reaching the agent. Conversation state lives in SQLite with FTS5 full-text search, enabling the agent to query its own past interactions:

# The agent can use its search_conversations tool
# to find relevant context from previous sessions
{
  "tool": "search_conversations",
  "query": "kubernetes deployment we discussed",
  "max_results": 5
}
# Returns: timestamped excerpts with conversation IDs,
# which can be loaded in full with load_conversation
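The FTS5 pattern behind that tool is easy to demonstrate. The sketch below assumes your Python's sqlite3 is compiled with FTS5 (true of most builds); the table name and columns are hypothetical, not Hermes' actual schema:

```python
import sqlite3

# Minimal FTS5 demo of the conversation-search pattern.
# The table name and columns are hypothetical, not Hermes' schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE conversations USING fts5(conv_id, ts, text)")
db.executemany(
    "INSERT INTO conversations VALUES (?, ?, ?)",
    [
        ("c1", "2024-05-01T10:00", "debugging the kubernetes deployment rollout"),
        ("c2", "2024-05-02T14:30", "drafting the quarterly report outline"),
    ],
)

# A multi-word MATCH query is an implicit AND over the terms,
# so only rows containing both "kubernetes" and "deployment" match.
rows = db.execute(
    "SELECT conv_id, ts FROM conversations WHERE conversations MATCH ? LIMIT 5",
    ("kubernetes deployment",),
).fetchall()
```

Because FTS5 ranks and filters inside the database, the agent can search months of conversation history without loading it into context.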

The learning loop operates through three interacting subsystems. After completing a multi-step task, the agent can distill it into a reusable skill and add it to its skill library. Skills are YAML files with metadata and natural-language procedures, compatible with the agentskills.io open standard. When a later conversation presents a similar task, the agent searches the skill library and can invoke skills by name. The README notes that "Skills self-improve during use": when a skill fails or produces suboptimal results, the agent can generate an improved version. The "nudge" system periodically prompts the agent to identify what it should remember from the current session, feeding into Honcho's dialectic user modeling to build a representation of your preferences and work patterns.
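The retrieval half of that loop can be sketched in a few lines. The skill records and lookup below are illustrative only; the exact field names of the agentskills.io standard and Hermes' actual matching strategy are not reproduced here:

```python
# Illustrative skill-library lookup. The skill fields mirror the idea of
# YAML metadata plus a natural-language procedure; field names are invented.

skills = [
    {
        "name": "deploy-static-site",
        "description": "Build and deploy a static site to object storage",
        "procedure": "1. Run the build. 2. Sync the output dir. 3. Invalidate cache.",
    },
    {
        "name": "rotate-logs",
        "description": "Compress and archive logs older than seven days",
        "procedure": "1. Find old logs. 2. gzip them. 3. Move to archive/.",
    },
]

def find_skill(query):
    """Naive keyword match over name + description; lowest-tech retrieval."""
    q = query.lower()
    for skill in skills:
        haystack = (skill["name"] + " " + skill["description"]).lower()
        if all(word in haystack for word in q.split()):
            return skill
    return None
```

A real implementation would likely use embedding or full-text search rather than substring matching, but the shape is the same: query the library first, fall back to reasoning from scratch only on a miss.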

Scheduled tasks lean into the persistent agent model. You can define cron-style automations in natural language:

# In a conversation with the agent:
"Run a backup of my project directories every night at 2am,
then send me a Telegram message with the summary."

# The agent creates a scheduled task that:
# 1. Executes in its terminal backend (even if serverless)
# 2. Uses available tools like file_operations
# 3. Delivers results via the messaging gateway to your chosen platform
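The scheduling arithmetic behind "every night at 2am" is straightforward; a stdlib sketch of the next-fire computation (the function name is hypothetical, and Hermes presumably stores the schedule cron-style rather than like this):

```python
from datetime import datetime, timedelta

# Sketch of next-fire logic for a daily schedule like "every night at 2am".
def next_daily_run(now, hour=2, minute=0):
    """Next occurrence of a daily HH:MM time strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)   # today's slot already passed
    return candidate
```

A scheduler loop then sleeps until the returned timestamp, wakes the execution backend, runs the task, and routes the result through the gateway.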

The agent spawns subagents for parallel work using delegation capabilities, creating isolated agent instances with their own context and tool access. The README notes you can “Write Python scripts that call tools via RPC, collapsing multi-step pipelines into zero-context-cost turns.”
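The payoff of scripting tools via RPC is that a multi-step pipeline costs one turn instead of one turn per call. A hedged sketch, where `call_tool` stands in for Hermes' RPC bridge and the tool names are hypothetical:

```python
# Sketch of scripting tools instead of spending one model turn per call.
# `call_tool` stands in for Hermes' RPC bridge; tool names are invented.

TOOLS = {
    "list_files": lambda path: ["report.md", "data.csv"],
    "read_file": lambda name: f"contents of {name}",
}

def call_tool(name, *args):
    """Dispatch a tool call; in Hermes this would travel over RPC."""
    return TOOLS[name](*args)

def summarize_directory(path):
    """A list-then-read pipeline collapsed into a single scripted turn."""
    names = call_tool("list_files", path)
    return {name: call_tool("read_file", name) for name in names}
```

Each `call_tool` here would otherwise be a full model round-trip consuming context; the script executes them all and returns only the final result to the agent.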

Model selection happens entirely through configuration. The system supports Nous Portal, OpenRouter (200+ models), z.ai/GLM, Kimi/Moonshot, MiniMax, OpenAI, and custom endpoints. You can switch mid-conversation with /model commands—for example, /model anthropic:claude-3-5-sonnet—and the next turn uses the new model with zero downtime. This provider-agnostic design means you can optimize for cost, latency, or capability without rewriting code or losing conversation state.
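The /model syntax quoted above splits cleanly into a provider and a model id. A minimal parsing sketch (the command syntax is from the article; the parsing code is illustrative, not Hermes' implementation):

```python
# Sketch of parsing a "/model provider:model-id" command.
def parse_model_command(command):
    """Split '/model provider:model-id' into a (provider, model) pair."""
    if not command.startswith("/model "):
        raise ValueError("not a /model command")
    spec = command[len("/model "):].strip()
    provider, _, model = spec.partition(":")
    return provider, model
```

Routing on the provider prefix is what lets the conversation state stay put while only the inference endpoint changes underneath it.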

Gotcha

The Windows situation is explicit: native support doesn’t exist. The README states: “Native Windows is not supported. Please install WSL2.” This works but introduces friction—tool execution happens inside the Linux subsystem, filesystem paths can be confusing when bridging Windows and WSL, and tools that interact with Windows-native applications may not work properly.

The learning loop features—autonomous skill creation, self-improvement, and memory nudges—are ambitious but come with caveats. The README describes these capabilities but doesn’t claim they’re fully mature production features. In practice, skill quality will likely vary depending on task complexity and the underlying model’s reasoning ability. The agent might create useful skills for straightforward tasks but struggle with complex ones. The self-improvement mechanism may need human oversight to prevent problematic iterations. You’ll likely need to curate the skill library manually—deleting ineffective skills and editing others. This isn’t necessarily a dealbreaker, as the skill system can be disabled entirely, but expectations should be calibrated accordingly.

The complexity overhead is real. Setting up the messaging gateway requires configuring bot tokens for each platform and running a separate gateway process. The terminal backends each have their own setup requirements. The full feature set—memory systems, skill library, user modeling, scheduled tasks, multi-platform delivery—creates a learning curve. The documentation is comprehensive but assumes you’ll engage with the full architecture. For simple automation tasks or embedding tool-calling in a larger application, lighter-weight approaches may get you to working code faster. The README is transparent about the system’s scope, describing it as designed for users who want a persistent, learning agent rather than a simple scripting tool.

Verdict

Use Hermes Agent if you need an AI assistant that lives independently of your development machine—running scheduled tasks, responding from messaging apps, executing workflows on cloud infrastructure while you sleep. The serverless execution backends (Modal, Daytona) are well-suited for personal agents that need high availability at low idle cost, and the unified messaging gateway makes it straightforward to interact from multiple platforms. Researchers training tool-use models will appreciate the trajectory generation features mentioned in the README, and users building complex automations can leverage the skill library and parallel delegation capabilities.

Skip it if you want a lightweight library to embed in an application, need a minimal setup for simple automation tasks, or prefer explicit control over agent behavior rather than autonomous learning features. The learning loop capabilities, while innovative, are presented in the README as part of an evolving system rather than battle-tested production features. If your primary use case is running agents locally during development sessions without the need for cross-platform messaging or serverless execution, simpler approaches may serve you better with less setup complexity. The README’s description of the system as “The agent that grows with you” signals its target audience: users who want a persistent, evolving assistant rather than a one-off automation tool.
