Release · June 23, 2026

Latitude launches MIT-licensed agent monitoring with Signals clustering and MCP access

Latitude released an open-source platform for monitoring AI agents in production, with plain-English trace search, repeated-failure clustering, and MCP access from coding agents. That gives teams a self-hostable way to inspect token burn, surface recurring failures, and turn production traces into evals and fixes.

TL;DR

Latitude shipped an open-source, MIT-licensed platform for monitoring AI agents in production, according to testingcatalog's launch post, with self-hostable deployment highlighted again by kimmonismus's note.
The core product loop is trace search plus failure clustering: testingcatalog says teams can cluster thousands of live conversations and search traces in plain English, while testingcatalog's follow-up says repeated failures become named Signals with frequency, reason, and auto-generated evals.
MCP is part of the pitch, not an afterthought. According to testingcatalog's follow-up, an MCP server exposes signals, traces, and searches inside the coding agent so production failures can become datasets before a fix ships.
Early reactions kept coming back to token visibility. omarsar0's hands-on note said Latitude immediately showed which Claude Code tasks were eating budget, and ai_for_success's walkthrough described per-run cost, token counts, and raw conversation traces.

You can browse the docs from the launch thread and inspect the repo link that testingcatalog pointed to. The dashboard screenshots in itsPaulAi's post and ai_for_success's walkthrough show Latitude tracking sessions, tools, spans, and cost in one place.

Signals

Latitude's main differentiator is that it does not stop at raw traces. testingcatalog's Signals post says repeated failures are collapsed into a single signal that names the issue, estimates how often it fires, explains the likely reason, and generates an eval for it.

That framing shows up again in commentary. kimmonismus's reaction singled out the failure-collapsing behavior as the part agent observability tools usually miss.
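To make the idea of collapsing repeated failures concrete, here is a minimal sketch. It is not Latitude's implementation or API: the trace dictionaries, the `Signal` class, and the `normalize` and `cluster_failures` helpers are all hypothetical, and the sketch groups failures by normalized error text as a simple stand-in for whatever clustering the product actually uses. It also leaves out the reason and auto-generated eval that the posts describe.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Signal:
    name: str         # normalized failure text used as the cluster key
    count: int        # number of traces that hit this failure
    frequency: float  # share of all traces that hit this failure
    example: str      # one raw error message kept for context


def normalize(message: str) -> str:
    """Replace volatile details (quoted values, numbers) so repeats match."""
    message = re.sub(r"'[^']*'", "<value>", message)
    message = re.sub(r"\d+", "<n>", message)
    return message.strip().lower()


def cluster_failures(traces: list[dict]) -> list[Signal]:
    """Group failed traces by normalized error text, most frequent first."""
    groups: dict[str, list[str]] = defaultdict(list)
    for trace in traces:
        error = trace.get("error")
        if error:
            groups[normalize(error)].append(error)
    total = len(traces)
    signals = [
        Signal(name=key, count=len(msgs), frequency=len(msgs) / total, example=msgs[0])
        for key, msgs in groups.items()
    ]
    return sorted(signals, key=lambda s: s.count, reverse=True)


# Hypothetical trace records; a real export would carry far more fields.
traces = [
    {"id": 1, "error": "Tool 'search' timed out after 30 s"},
    {"id": 2, "error": None},
    {"id": 3, "error": "Tool 'fetch' timed out after 45 s"},
    {"id": 4, "error": "Invalid JSON in tool arguments"},
]
signals = cluster_failures(traces)
for s in signals:
    print(f"{s.name}: {s.count} traces ({s.frequency:.0%})")
```

In this toy data the two timeout errors differ only in tool name and duration, so they fall into one cluster. Text normalization will miss failures that are worded differently but share a cause, which is where semantic clustering would earn its keep.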

Traces

The trace layer is broader than model-call logging. rohanpaul_ai's breakdown describes a see, catch, fix loop that includes sessions, users, tools, cost, latency, and behavior, and testingcatalog's launch post adds conversation clustering plus plain-English search across every trace.

The screenshots make the scope concrete: navigation for Traces, Behaviours, Users, Tools, Issues, Monitors, and Datasets appears in two separate dashboard screenshots.

Token telemetry

For early users, cost visibility landed first. omarsar0's hands-on note said pointing Latitude at Claude Code immediately exposed which tasks were consuming the token budget, while ai_for_success's walkthrough described seeing exact cost per run, token counts, and traces for tool-heavy sessions.

That matches the sharper community framing in kimmonismus's reaction, which called out Claude Code token telemetry as the first feature worth trying.
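The per-task budget view described above comes down to simple aggregation over run records. The sketch below illustrates that arithmetic only. The run fields, the `run_cost` and `cost_by_task` helpers, and the prices are assumptions made for the example, not Latitude's data model or any vendor's actual pricing.

```python
from collections import defaultdict

# Hypothetical prices in USD per million tokens; real prices vary by model.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}


def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single run in USD under the assumed prices."""
    return (
        input_tokens * PRICE_PER_MTOK["input"]
        + output_tokens * PRICE_PER_MTOK["output"]
    ) / 1_000_000


def cost_by_task(runs: list[dict]) -> list[tuple[str, dict]]:
    """Sum tokens and cost per task, most expensive task first."""
    totals: dict[str, dict] = defaultdict(lambda: {"tokens": 0, "cost": 0.0})
    for run in runs:
        entry = totals[run["task"]]
        entry["tokens"] += run["input_tokens"] + run["output_tokens"]
        entry["cost"] += run_cost(run["input_tokens"], run["output_tokens"])
    return sorted(totals.items(), key=lambda kv: kv[1]["cost"], reverse=True)


# Hypothetical run records for a coding-agent session.
runs = [
    {"task": "refactor", "input_tokens": 200_000, "output_tokens": 20_000},
    {"task": "write-tests", "input_tokens": 50_000, "output_tokens": 10_000},
    {"task": "refactor", "input_tokens": 100_000, "output_tokens": 10_000},
]
ranking = cost_by_task(runs)
for task, totals in ranking:
    print(f"{task}: {totals['tokens']} tokens, ${totals['cost']:.2f}")
```

Sorting by cost rather than token count matters because output tokens are usually priced higher than input tokens, so the task with the most tokens is not always the one eating the budget.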

MCP server

The MCP server closes the loop back into development. According to testingcatalog's MCP description, it can pull signals, traces, and saved searches into a coding agent, then turn real failures into datasets to verify a fix before release.

That matters because Latitude is positioning itself around agents in production rather than offline eval dashboards. rohanpaul_ai's breakdown explicitly framed the product around tool use, retries, latency, and the gap between user intent and system behavior.

Self-hosted stack

Several reactions fixated on the licensing. kimmonismus's open-source note called open source, MIT-licensed, self-hostable observability rare in this category, and ai_for_success's walkthrough described the product as running on a team's own infrastructure.

One more useful detail surfaced in the same wave of posts: rohanpaul_ai's reply said the semantic search alone looked install-worthy. That is a narrower, and arguably more telling, clue about what teams may adopt first than the broader launch copy.