AI Primer
TOPIC 50 stories

Agent Infrastructure

Backend primitives and platform services designed for autonomous agents as the primary consumer — agent-native storage, sandboxes, queues, and runtime infra.

RELEASE 5th June
TanStack AI adds MCP support with pooled servers and typegen CLI

TanStack AI added MCP support for single or multiple servers, standalone clients or pooled servers, and a CLI for type generation. The release gives app builders a typed integration path for MCP-managed tools inside chat and agent workflows.

RELEASE 5th June
Vercel opens Skills API with 600,000 skills for agents and platforms

Vercel made the skills.sh API generally available, exposing more than 600,000 skills as a registry-style service for agents and platforms. The launch gives teams a discoverable capability layer for reuse across agent surfaces.

NEWS 4th June
Browser Use adds cloud profiles and geo proxies, with 484 browsers in <2s

Browser Use launched synced cloud profiles for logged-in sessions, added geo-targeted proxies, and showed a 484-browser startup demo that finished in under two seconds. The update matters because hosted browser agents can now keep authenticated state and regional routing without custom session-management work.

NEWS 4th June
Weaviate launches Engram memory service with async writes

Weaviate introduced Engram, a dedicated agent memory service with async writes, semantic topic grouping, tenant scopes, and composable pipelines. It matters because teams can add a hosted memory layer for agent stacks without stitching custom memory workflows into each application.

NEWS 3rd June
LangSmith launches Sandbox, LLM Gateway, and Engine for agent execution, spend tracking, and eval triage

LangSmith added sandboxed execution, spend-aware gateway routing, and Engine to surface recurring agent failures from traces. The bundle gives teams one place to run agents, control token spend, and turn production issues into debugging and eval loops.

RELEASE 3rd June
CopilotKit releases v1.59.2 with threads, Vue packages, and React Native SDK

CopilotKit shipped v1.59.2 with threads, Vue packages, a React Native SDK, and updated AG-UI building blocks for fullstack agent apps. The release makes it easier to ship Cursor- and Claude-like interfaces, with new work extending generative UI into Slack, Teams, and other chat surfaces.

NEWS 2nd June
Turbopuffer, Archil, TigerFS, and LangSmith add branching, snapshots, and rollback for agent runs

Multiple agent-infra vendors shipped copy-on-write branches, checkpoints, snapshots, forks, or rollback primitives on the same day. That matters because long-running agents can now explore, retry, and recover state without relying only on Git or full sandbox rebuilds.
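As a rough illustration of the general idea behind copy-on-write branches and checkpoints, here is a minimal in-memory sketch. It does not reflect any of the named vendors' APIs; the `Workspace` class and its method names are hypothetical, and it assumes state is a simple key-value map.

```python
class Workspace:
    """Hypothetical copy-on-write key-value state for an agent run."""

    def __init__(self, sealed=()):
        self._sealed = sealed  # tuple of shared layers, never mutated once sealed
        self._top = {}         # private layer that receives new writes

    def read(self, key, default=None):
        # Newest data wins: check the private layer, then sealed layers newest-first.
        if key in self._top:
            return self._top[key]
        for layer in reversed(self._sealed):
            if key in layer:
                return layer[key]
        return default

    def write(self, key, value):
        self._top[key] = value

    def checkpoint(self):
        # Seal pending writes into a shared layer and return a snapshot handle.
        if self._top:
            self._sealed = self._sealed + (self._top,)
            self._top = {}
        return self._sealed

    def restore(self, snapshot):
        # Roll back: drop everything written after the snapshot was taken.
        self._sealed = snapshot
        self._top = {}

    def fork(self):
        # A branch shares all sealed layers; no data is copied.
        return Workspace(self.checkpoint())


main = Workspace()
main.write("plan", "v1")
trial = main.fork()        # the agent explores an alternative on a branch
trial.write("plan", "v2")  # visible only on the branch
snap = main.checkpoint()
main.write("plan", "bad")  # a failed step on the main line
main.restore(snap)         # recover without rebuilding the whole workspace
```

Forking and checkpointing here only share references to sealed layers, which is why branches stay cheap compared with copying a full sandbox.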

RELEASE 1st June
NVIDIA launches Cosmos 3 open 16B and 64B omnimodels with datasets and SGLang support

NVIDIA released Cosmos 3 as an open omnimodel family with 16B and 64B variants, plus code, datasets, and a coalition around physical AI. The release matters because it ships with serving support and top open-weight image and video rankings, so teams can use it beyond a research teaser.

NEWS 1st June
OpenAI releases GPT-5.4, GPT-5.5, and Codex on Amazon Bedrock

OpenAI made GPT-5.4, GPT-5.5, and Codex generally available through Amazon Bedrock. AWS shops can now use OpenAI models inside existing IAM, compliance, and procurement workflows instead of adopting a separate vendor stack.

RELEASE 1st June
Microsoft and NVIDIA launch RTX Spark PCs with 128GB unified memory and 1 PFLOP FP4

Microsoft and NVIDIA unveiled RTX Spark systems, including Surface Laptop Ultra and DGX-class Windows hardware, with 128GB of unified memory and 1 PFLOP of FP4 compute for local AI. Day-one support from Hermes Agent, vLLM, Ollama, and Unsloth makes the launch useful for local inference and fine-tuning, not just a PC refresh.

RELEASE 1st June
Perplexity launches Search as Code in Agent API with WANDR 0.386 and Python search pipelines

Perplexity replaced one-shot search calls with Search as Code, a Python-based search runtime in its Agent API that is also now the default in Computer. The change matters because agents can batch, rank, filter, and aggregate search steps inside code, and Perplexity says the system scored 0.386 on WANDR versus 0.152 for the next system.
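To make the "search steps inside code" idea concrete, here is a minimal sketch of a pipeline that batches queries, filters hits, aggregates scores, and ranks results. It is not Perplexity's Agent API: the `search` function is a hypothetical stand-in backed by a tiny in-memory corpus, and the scoring rule (term counts) is an assumption chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory corpus standing in for a real search backend.
CORPUS = [
    {"url": "https://example.com/a", "text": "agent sandboxes with fast cold starts"},
    {"url": "https://example.com/b", "text": "vector memory for agents"},
    {"url": "https://example.com/c", "text": "sandbox snapshots and rollback"},
]


def search(query):
    """Hypothetical search call: score each document by exact term matches."""
    terms = query.lower().split()
    hits = []
    for doc in CORPUS:
        words = doc["text"].lower().split()
        score = sum(words.count(term) for term in terms)
        if score:
            hits.append({"url": doc["url"], "score": score})
    return hits


def pipeline(queries, min_score=1, top_k=2):
    """Batch several queries, filter weak hits, aggregate per URL, then rank."""
    totals = defaultdict(int)
    for query in queries:                       # batch
        for hit in search(query):
            if hit["score"] >= min_score:       # filter
                totals[hit["url"]] += hit["score"]  # aggregate
    # Rank by total score (descending), breaking ties by URL for stable output.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


results = pipeline(["agent sandboxes", "sandbox rollback"])
```

The point of the pattern is that the agent composes these steps in one program rather than issuing a single search call and reasoning over whatever comes back.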

RELEASE 1st June
Codex releases Python SDK with thread control, session resume, and sandbox access

OpenAI shipped a Python SDK and app-server support for Codex with thread creation, streamed turns, session resume, image inputs, and sandbox controls. That gives teams a supported way to embed Codex inside internal tools and automation instead of driving it only through the CLI or desktop app.

RELEASE 1st June
Browser Use launches browser infrastructure at $0.02/hour with subsecond cold starts

Browser Use rebuilt its runtime around a custom Chromium fork, Firecracker fork, and custom Linux kernel, claiming $0.02 per hour pricing with subsecond cold starts. The shift targets the infrastructure bottlenecks behind browser agents rather than model quality alone.

NEWS 31st May
Coding-agent builders add shared memory, provider routing, and app launchers

Independent developers shipped sidecars that let Claude Code, Cursor, and Codex share memory, hot-swap model providers, package local projects as apps, and automate browser QA. Try these reusable tools if you want memory, routing, QA automation, and app packaging outside editor-specific features.

RELEASE 31st May
CopilotKit integrates Claude Agent SDK with AG-UI for React and mobile frontends

CopilotKit shipped an AG-UI integration that streams Claude Agent SDK agents into web and mobile frontends with generative UI and approval checkpoints. The adapter lets teams embed terminal-first Claude agents in React, Vue, Angular, and React Native without rewriting transport or state plumbing.

NEWS 30th May
Hermes ecosystem ships Web UI, Control Room, and 14% lower read_file tokens

Builders released a chat-first Web UI and a multi-agent Control Room template around Hermes Agent, while core updates cut read_file input tokens by 14% and fixed TUI startup hangs. Use the new controls to manage local multi-agent setups while reducing routine token burn.

RELEASE 30th May
Prime Intellect launches Hosted Evaluations with harnesses, sandboxes, and rollouts viewer

Prime Intellect launched Hosted Evaluations to manage harnesses, sandboxes, and rollout inspection for model testing. The service packages eval infrastructure while still supporting local runs against arbitrary engines, so teams can centralize testing without losing flexibility.

RELEASE 1w ago
Gemini API adds Managed Agents with sandboxed Linux, web access, and file I/O

Gemini Managed Agents can spin up a sandboxed Linux environment with code execution, web access, and file I/O from one API call, and early examples now include W&B and LlamaIndex workflows. That gives builders a higher-level runtime for long tasks while third-party templates start to define the first production use cases.

RELEASE 1w ago
Vercel Sandbox adds Docker support with persistent images and isolated container runs

Vercel Sandbox can now build and run Docker containers, persist images and installs across sessions, and host databases or full apps inside the sandbox. That broadens what coding agents and preview environments can validate without leaving Vercel.

RELEASE 1w ago
Hermes Agent v0.15.0 adds skill bundles and makes session search 750x faster

Nous Research released Hermes Agent v0.15.0 with skill bundles, MCP Catalog, new model support, and major performance and security work. The update cuts load times by 50%, speeds up session search 750x, and adds Bitwarden plus prompt-injection defenses.

NEWS 1w ago
Hermes Agent integrates MCP Catalog, Qwen3.7 Max, Venice, and Krea 2 in one window

Hermes Agent added a built-in MCP Catalog while separate builders shipped Qwen3.7 Max support, Venice private-model workflows, and Krea 2 image generation. The cluster shows Hermes moving beyond a single-model assistant toward a broader agent shell with tool, model, and media providers.

RELEASE 1w ago
Trajectory launches continual-learning platform with off-policy SDPO

Trajectory launched a platform that turns agent traces and user corrections into post-deployment model updates instead of prompt-only fixes. Baseten and Tinker described live A/B post-training, 397B-model deployment work, and an off-policy recipe for stabilizing the loop.

RELEASE 1w ago
Cua Driver supports Windows background computer use over MCP and CLI

Cua Driver said its Windows backend is now stable, letting Claude Code, Codex, Hermes, or custom agents drive real Windows apps through MCP or CLI. The release targets Windows-only line-of-business software while keeping the desktop usable with multi-pointer support.

NEWS 1w ago
Firecrawl integrates into Vercel Marketplace with scraping, search, and dynamic-site access

Firecrawl is now available through Vercel Marketplace and Agent Marketplace for apps and agents that need live web data. The integration reduces setup friction for teams adding scraping, search, and structured retrieval to deployed AI workflows.

WORKFLOW 1w ago
Researchers and builders ship external memory layers with recipe stores and 33% cheaper updates

A new MeMo paper and several community memory systems converged on keeping knowledge outside the base model through recipe files, semantic and autobiographical stores, and background reconsolidation. The pattern matters because engineers are treating context loss as a systems problem instead of only asking for larger context windows.

RELEASE 1w ago
Datasette 1.0a30 adds a slash Jump To menu and plugin hook

Datasette 1.0a30 introduced a slash-triggered Jump To menu plus a hook for plugin-supplied search items. Simon Willison used it in datasette-agent 0.1a4 to start agent chats from the same menu, so plugin authors can wire in their own actions.

RELEASE 2w ago
LangChain opens Managed Deep Agents private beta with deepagents deploy and auth proxy

LangChain opened a private beta for Managed Deep Agents, a model-agnostic deployment layer built on deepagents with durable execution, sandboxes, and a context hub. The release turns deep-agent rollout into a single config-and-deploy flow and adds an auth proxy boundary for agent actions.

RELEASE 2w ago
Claude Managed Agents adds self-hosted sandboxes and MCP tunnels for private networks

Anthropic added self-hosted sandboxes in public beta and MCP tunnels in research preview to Claude Managed Agents. Use the new options to keep agent execution inside your perimeter or private cloud and reach internal MCP servers without public exposure.

RELEASE 2w ago
Google launches Antigravity 2.0 with CLI, SDK, and single-call Managed Agents

Google launched Antigravity 2.0 as a desktop app plus CLI/SDK stack for multi-agent workflows, and added Managed Agents to the Gemini API with persistent Linux sandboxes. Try it for agent orchestration and API-based sandboxing, but verify harness costs and runtime fit.

RELEASE 2w ago
Warp Oz launches /orchestrate for Claude Code, Codex, and local-to-cloud handoff

Warp launched Oz orchestration across Claude Code, Codex, and Warp Agent, with subagent delegation, isolated worktrees or containers, and beta multi-harness control. Try the new '&' handoff and Agent Memory if you run long sessions that need cloud continuation.

RELEASE 2w ago
Gemini Spark launches with dedicated VMs and MCP support for 24/7 background agents

A day after leaks previewed Spark, Google officially launched Gemini Spark as a persistent personal agent that runs on dedicated cloud VMs and will connect to MCP tools. It matters because Google is moving Gemini from chat responses toward long-running delegated work across consumer and enterprise surfaces.

NEWS 2w ago
Anthropic reports Stainless deal for SDKs, CLIs, and MCP servers across TypeScript, Python, Go, Java, and Kotlin

Anthropic said it is acquiring Stainless, the SDK and MCP server platform behind Anthropic’s own official SDKs across major languages. The deal matters because Anthropic is bringing a key part of its API and agent-connectivity toolchain in-house while developers reassess alternative codegen stacks.

RELEASE 2w ago
Files SDK 1.4 adds 9 storage adapters, an agent CLI, and optional peer deps

Files SDK 1.4 shipped nine new storage adapters, a CLI for agents, an installable skill, and optional peer dependencies. The update broadens storage coverage while sharply shrinking install weight, though adapter dependencies now need explicit installation.

NEWS 3w ago
OpenClaw users report Hermes Agent migrations with clearer approvals, cron jobs, and Telegram UX

Practitioners said skills and workflows were porting from OpenClaw to Hermes Agent with fewer surprises around approvals, job control, and mobile use. That matters because teams choosing a self-hosted agent stack are now comparing operational clarity and migration friction, not just model support.

NEWS 3w ago
OpenClaw ships 3.5x RTT tests and Clawpatch guardrails for coding agents

OpenClaw added end-to-end RTT tests and new auditable guardrails while community builders shipped Clawpatch, credential brokers, and ARC harnesses. The stack now has clearer safety and benchmarking primitives for long-lived coding agents.

RELEASE 3w ago
Files SDK 1.3 adds 12 storage adapters and exists() checks

Files SDK 1.3 shipped 12 new storage adapters, an exists() helper, and a Files.file(key) handler. It expands the number of storage backends agents and sandboxed jobs can address through one file abstraction.

NEWS 3w ago
Anthropic adds $20-$200 monthly Claude Agent SDK credits starting June 15

Anthropic will move Claude Agent SDK, claude -p, GitHub Actions, and third-party agent apps onto separate monthly credits on June 15. Watch the new bucket closely, since it changes the cost model for autonomous runs and subscription-backed harnesses.

NEWS 3w ago
LangChain launches SmithDB, LangSmith Engine, and Sandboxes at Interrupt

LangChain unveiled SmithDB, LangSmith Engine, Managed Deep Agents, and GA sandboxes at Interrupt. The stack gives agent teams a purpose-built trace database, autonomous failure triage, and managed execution environments for production workflows.

RELEASE 3w ago
Notion launches Developer Platform with External Agents API and Workers

Notion opened a developer platform with an External Agents API plus Workers, webhooks, and a headless CLI. The release lets external agents query Notion, extend workflows, and stay in sync with other systems.

RELEASE 3w ago
holaOS launches Beta 0.1 with Multi Workspaces, Sub Agents, and Dashboard

holaOS shipped Beta 0.1, adding Multi Workspaces, Sub Agents, a dashboard, and a kickoff flow on top of its agent-computer base. The release targets long-running workstreams that need persistent context instead of one-chat sessions.

RELEASE 3w ago
Cursor launches cloud development environments with rollback and scoped secrets

Cursor added reusable cloud development environments for agents with multi-repo setup, rollback, and scoped secrets. The update moves cloud agents closer to laptop-style setups while keeping long-running work isolated and auditable.

NEWS 3w ago
OpenAI launches Deployment Company with $4B backing and 150 forward-deployed engineers

OpenAI launched the OpenAI Deployment Company and tied it to Tomoro’s acquisition, giving the unit 150 forward-deployed engineers and $4 billion in initial backing from 19 partners. It matters because OpenAI is packaging services, deployment help, and organizational integration as part of the product stack instead of leaving enterprise rollout to outside consultancies.

RELEASE 3w ago
Anthropic launches Claude Platform on AWS with native billing, IAM, and Managed Agents

Anthropic made Claude Platform on AWS generally available, exposing the native Claude API with AWS authentication, billing, CloudTrail, and commitment retirement. It lets teams use Managed Agents and related Claude features inside existing AWS governance workflows.

WORKFLOW 3w ago
Developers launch Agent FM, Mate, and ntm for multi-session Claude Code and Codex control

Independent developers shipped new control-plane tools for long-running coding agents, including Agent FM audio monitoring, Mate phone-first remote control, and ntm for provider-agnostic multi-agent workflows. It matters because teams running many Claude Code and Codex sessions still need better visibility, handoff, and checkpointing than a single built-in session list provides.

RELEASE 3w ago
Files SDK launches unified storage API with 18 providers and OpenAI, Vercel AI, and Claude tools

Files SDK launched a unified storage API across 18 backends including S3, R2, Vercel Blob, and Google Drive. It also ships tool bindings for OpenAI, Vercel AI, and Claude agent SDKs across Node, Bun, Deno, edge runtimes, and browsers.

NEWS 3w ago
Pi community ships pi-treebase, Miko voice mode, and OpenCode Go guides

Builders shipped pi-treebase, a Miko voice mode for pi-listens, devrage support, and a Japanese OpenCode Go guide after the first Pi extension burst. The releases arrive as Pi’s provider abstraction gets stress-tested by OpenClaw-scale multi-provider use.

RELEASE 4w ago
Hyperbrowser launches CLI with under-50ms sandboxes and hx web commands

Hyperbrowser shipped a CLI that exposes sandbox lifecycle, web fetch/search/crawl, and snapshotting from the terminal. The tool matters because it turns browser automation and forkable state into shell primitives for agent workflows.

RELEASE 4w ago
Google updates Gemini Interactions API with steps schema and Api-Revision 2026-05-26

Google is replacing the Gemini Interactions API’s older outputs-and-roles structure with a steps schema for multi-step agent workflows. The change matters because existing tooling that assumes the old schema may break, and teams face SDK upgrades and migration work before the new interface reaches GA.

RELEASE 4w ago
OpenAI Agents SDK adds TypeScript support and sandbox agents

OpenAI updated its Agents SDK with TypeScript support, sandbox agents, and an open-source harness. The release broadens support for JS workflows and gives teams a standard way to run isolated agents.

NEWS 4w ago
Raindrop launches Triage for Slack digests and trace search

Raindrop launched Triage, a Slack-based agent that finds traces, summarizes recurring failures, runs recurring briefs, and opens experiments from production conversations. Teams using Claude Code, Cursor, or Devin can plug it into agent ops to shorten debugging loops.

AI Primer

Your daily guide to AI tools, workflows, and creative inspiration.

© 2026 AI Primer. All rights reserved.