releaseMarch 30, 2026

Hermes Agent ships v0.6.0 with multi-agent profiles and tool-call streaming

Nous Research shipped Hermes Agent v0.6.0 with multi-agent profiles, a published changelog, and new OpenWebUI tool-call streaming support. Upgrade if you use Hermes as a local agent, since the release turns it into a multi-profile workspace with a growing plugin and UI ecosystem.

5 min read

Hermes Agent ships v0.6.0 with multi-agent profiles and tool-call streaming

TL;DR

Hermes Agent v0.6.0 is a big platform release, not a cosmetic bump. The headline feature is profile-based multi-instance support, with each agent getting its own config, memory, sessions, skills, and gateway state full changelog profile launch.
Nous also added MCP server mode, so Hermes sessions can be exposed to MCP clients like Claude Desktop, Cursor, and VS Code through hermes mcp serve full changelog.
The same release added an official Docker container, ordered fallback provider chains, and new gateway adapters for Feishu/Lark and WeCom, plus Slack multi-workspace OAuth and Telegram webhook mode full changelog.
Hermes now exposes an OpenAI-compatible API that works with frontends like OpenWebUI, and Teknium showed tool-call streaming running against that endpoint a day after the release OpenWebUI tool-call streaming.

You can read the full GitHub release notes, the official profiles guide, and the API server docs. The release also quietly adds Exa as a search backend, and the docs spell out that OpenAI-compatible frontends can surface inline tool progress while Hermes runs terminal, file, search, memory, and skill actions.

Profiles

Nous finally gave Hermes a clean answer to the multi-agent problem. Profiles turn one install into a workspace for separate agents, each isolated down to its own config.yaml, .env, SOUL.md, memories, sessions, skills, cron jobs, and state database.

The official docs and release notes add a few concrete mechanics:

hermes profile create creates a new isolated environment.
Profiles can be switched with the -p flag or set as the default target.
New profiles can start empty, copy selected config files, or clone a full profile.
Hermes enforces token-lock isolation so two profiles do not grab the same bot credential at once.
Each profile gets its own gateway process, which matters if you run separate bots across Telegram, Discord, or Slack.

That closes one of the most obvious gaps between Hermes and the more orchestration-heavy agent tools. Official profile docs make clear this is not a lightweight preset system, it is process and state isolation.

MCP server mode and deployment primitives

The release notes bundle several operator-facing features together, and they are more interesting in combination than alone. Hermes can now expose conversations and attachments to MCP clients, run in an official Docker container, and fail over across multiple inference providers.

According to the release notes, hermes mcp serve supports both stdio and Streamable HTTP transports, with tools for browsing conversations, reading messages, searching sessions, and managing attachments. The same release adds fallback_providers in config.yaml, so a provider outage can roll traffic to the next configured backend instead of killing the agent loop. Docker support is official now, with volume-mounted config for both CLI and gateway modes in the release.

OpenAI endpoint and tool-call streaming

Hermes' API server already matters because it lets the agent sit behind any frontend that speaks OpenAI's chat format. The docs explicitly name OpenWebUI, LobeChat, LibreChat, NextChat, and ChatBox as compatible clients, with Hermes keeping access to terminal, file ops, web search, memory, and skills through that interface.

Teknium's demo is the missing piece for UI credibility. The clip shows tool-call streaming working through Hermes' OpenAI-compatible endpoint in OpenWebUI, which is exactly the kind of glue code that turns a terminal-first agent into something teams can wire into existing chat surfaces. The API server docs say streaming responses can include inline tool progress indicators so frontends can show what the agent is doing before the final answer lands.

Gateway expansion

The gateway side of Hermes also got much broader in one shot. Nous added Feishu/Lark and WeCom adapters, Slack multi-workspace OAuth, and Telegram webhook mode with mention gating and regex triggers.

From the release notes, the concrete platform additions are:

Feishu/Lark: event subscriptions, message cards, group chat, file and image attachments, interactive card callbacks.
WeCom: text, image, and voice messages, group chats, callback verification.
Slack: one Hermes gateway can connect to multiple workspaces, resolving the right bot token per event.
Telegram: webhook mode for production deployments, plus controls for always reply, mention-only reply, or regex-triggered reply behavior in groups.

That is a lot of plumbing for a single point release, and it pushes Hermes deeper into the "agent as messaging runtime" camp.

Usage jump

The rollout landed into visible usage growth. Teknium posted an OpenRouter analytics screenshot showing Hermes Agent at 302B total tokens, 192 models used, and top-five category rankings across personal agents, coding agents, and CLI agents.

The chart in that post shows a steep rise through late March, peaking near 28B tokens on the 30-day view. That does not prove v0.6.0 caused the whole spike, but it does show Hermes shipping this release with real usage behind it, not as a niche repo sprint.