AI Primer
TOPIC 42 stories

Sandboxing

Isolated execution environments for code-running agents.

RELEASE 25th June
Rivet releases agentOS v0.2.0 with WebAssembly sandboxing and claims 1738x lower cost

Rivet released agentOS v0.2.0, a Rust rewrite of its WebAssembly-based sandbox and orchestration stack with multiplayer workflows and one-prompt deployment. The release targets self-hosted and cloud agent runtimes, and Rivet claims 1738x lower cost than SaaS sandboxes.

RELEASE 1w ago
Secure Exec v0.3 rewrites in Rust and adds Bun SDK, process trees, and Node-less mode

Secure Exec v0.3 shipped a full Rust rewrite, Bun and Rust SDKs, process-tree support for spawn and exec inside the VM, and a configurable Node-less mode. It matters because agent sandboxes can tighten performance and isolation without depending on a full Node runtime.

RELEASE 1w ago
Vercel previews eve with durable execution and sandboxed compute

Vercel introduced eve in public preview with durable workflows, sandboxed compute, subagents, and evals. It also added Connect and Passport for scoped tokens and identity-gated deployments, giving teams one path for runtime, auth, and enterprise access control.

RELEASE 2w ago
Anthropic adds scheduled deployments and vaulted env vars to Claude Managed Agents

Anthropic opened scheduled deployments and environment-variable vaults in Claude Managed Agents public beta, and Dynamic Workflows is now generally available in Claude Code. The update adds cron-style jobs, secret injection, and deeper parallel orchestration for long-running agents.

RELEASE 3w ago
Microsoft launches OpenClaw Companion for Windows with Microsoft Execution Containers

Microsoft and OpenClaw unveiled a Windows companion app and enterprise integration built on Microsoft Execution Containers. The launch gives OpenClaw a native, sandboxed Windows surface instead of relying on unofficial desktop wrappers.

RELEASE 4w ago
Vercel Sandbox adds Docker support with persistent images and isolated container runs

Vercel Sandbox can now build and run Docker containers, persist images and installs across sessions, and host databases or full apps inside the sandbox. That broadens what coding agents and preview environments can validate without leaving Vercel.

RELEASE 4w ago
Cursor adds auto-review mode with classifier subagent and fewer approval prompts

Cursor shipped auto-review mode, letting agents run more tool calls with fewer approval prompts and sending unsafe or unsandboxed actions to a classifier subagent. The change lowers review friction while keeping a separate path for higher-risk calls.

RELEASE 4w ago
OpenClaw 2026.5.27 fixes runtime boundaries and cuts cold turns 2.9x

OpenClaw 2026.5.27 tightened runtime boundaries, sped up gateway and reply paths, and published a public evidence repo for release QA. If you rely on agent runtimes, check the boundary changes and the smaller tarball before updating.

RELEASE 4w ago
Claude Code ships security-guidance plugin with repo-level claude-security-guidance.md rules

Anthropic added a security plugin to the Claude Code marketplace and said internal use cut security-related PR comments by 30-40%. Teams can use it to enforce repo or MDM-distributed policies before human review.

RELEASE 1mo ago
LangChain opens Managed Deep Agents private beta with deepagents deploy and auth proxy

LangChain opened a private beta for Managed Deep Agents, a model-agnostic deployment layer built on deepagents with durable execution, sandboxes, and a context hub. The release turns deep-agent rollout into a single config-and-deploy flow and adds an auth proxy boundary for agent actions.

RELEASE 1mo ago
Claude Code 2.1.147 adds Workflow tool and `/code-review` effort levels

Claude Code 2.1.147 added a deterministic Workflow tool, renamed `/simplify` to `/code-review`, and tightened sandboxing; 2.1.148 followed with a fix for the Bash 127 regression. The release matters because it changes multi-agent orchestration and review behavior while restoring automation reliability for existing Claude Code setups.

RELEASE 1mo ago
Datasette Agent releases 0.1a3 with SQL chat, charts, and Fly sandbox plugins

Simon Willison shipped the first Datasette Agent release and companion chart and Fly sandbox plugins for conversational SQLite workflows. The stack combines live SQL inspection, chart rendering, and optional command execution inside an extensible local data assistant.

RELEASE 1mo ago
Claude Managed Agents adds self-hosted sandboxes and MCP tunnels for private networks

Anthropic added self-hosted sandboxes in public beta and MCP tunnels in research preview to Claude Managed Agents. Use the new options to keep agent execution inside your perimeter or private cloud and reach internal MCP servers without public exposure.

NEWS 1mo ago
Codex introduces Windows sandbox with firewall rules and write-restricted tokens

OpenAI detailed the Windows sandbox behind Codex, using local user accounts, ACLs, firewall rules, and DPAPI-protected secrets instead of a generic VM wrapper. The design gives Windows developers safer file and network controls without making coding-agent workflows unusable.

RELEASE 1mo ago
Cursor launches cloud development environments with rollback and scoped secrets

Cursor added reusable cloud development environments for agents with multi-repo setup, rollback, and scoped secrets. The update moves cloud agents closer to laptop-style setups while keeping long-running work isolated and auditable.

RELEASE 1mo ago
Crabbox 0.11.0 adds Google Cloud provider and repo-local job workflows

Crabbox 0.11.0 shipped a Google Cloud provider, repo-local job workflows, AWS Windows WSL2 hydration, and a Blacksmith sync-stall guard. Recent Codex and OpenClaw posts show Crabbox already being used to reproduce bugs reliably and to record before-and-after QA runs.

RELEASE 1mo ago
Hyperbrowser launches CLI with under-50ms sandboxes and hx web commands

Hyperbrowser shipped a CLI that exposes sandbox lifecycle, web fetch/search/crawl, and snapshotting from the terminal. The tool matters because it turns browser automation and forkable state into shell primitives for agent workflows.

RELEASE 1mo ago
OpenAI Agents SDK adds TypeScript support and sandbox agents

OpenAI updated its Agents SDK with TypeScript support, sandbox agents, and an open-source harness. The release broadens support for JS workflows and gives teams a standard way to run isolated agents.

RELEASE 1mo ago
Flue introduces `$ flue add url` for remote sandbox connectors

Flue previewed a command that points at docs or SDK URLs and has the agent write a sandbox connector directly into your codebase without extra packages. Follow-on tweaks, a Python port, and Unkey deploy support show the harness is becoming a testbed for self-authored integrations.

RELEASE 1mo ago
Crabbox 0.4.0 launches ephemeral agent machines on Spot instances

Crabbox 0.4.0 adds throwaway machines for agent runs and cross-platform reproduction on macOS, Linux, and Windows. Use it to reproduce bugs and validate fixes without keeping long-lived cloud sessions around.

RELEASE 1mo ago
Agent Harness Framework launches with Daytona default sandbox

The Agent Harness Framework started rolling out with Daytona as the default sandbox, and Fred Schott reported 35 pull requests on day one. The launch matters because it gives builders a packaged sandbox baseline instead of wiring execution isolation and agent environment management from scratch.

RELEASE 2mo ago
Claude Code 2.1.116 adds 67% faster /resume and safer sandbox rm checks

Claude Code 2.1.116 shipped 24 CLI changes, including faster resume on large sessions, stricter guardrails around rm and rmdir, and automatic plugin dependency installs. It also updates terminal input behavior and model surface area for agent workflows, so teams should upgrade if they rely on the CLI.

RELEASE 2mo ago
OpenAI Agents SDK adds sandbox execution and memory controls with Vercel, Modal, E2B and Daytona

OpenAI updated the Agents SDK with sandbox execution, memory controls and run snapshotting, and launch partners Vercel, Modal, E2B and Daytona shipped integrations. Long-running agents can now keep files, credentials and execution state in isolated runtimes instead of wiring harness, compute and storage layers together manually.

RELEASE 2mo ago
Open Agents launches a browser-based cloud coding platform with parallel sessions

Open Agents open-sources a browser-based cloud coding platform that keeps sessions running in parallel after a laptop closes. Use the reference stack if you want sandboxed VMs, model routing, and durable execution for internal coding-agent systems.

NEWS 2mo ago
Vercel Sandbox benchmarks sub-500 ms node -v cold starts

Vercel said Sandbox is now the fastest microVM-based runtime, with fresh node -v cold starts largely under 500 ms after a month of tuning. The update also puts persistent sandboxes into beta and expands plans for a programmable firewall, so teams should re-check runtime and security settings.

NEWS 2mo ago
ClawShop launches OpenClaw resources with SecretRef and PinchBench

Kilo Code’s ClawShop recap bundled a 30-minute KiloClaw setup workshop, SecretRef credential handling, searchable ClawBytes guides, and PinchBench for agentic performance. The event, OpenClaw 2026.4.10, and PetClaw together added new security, memory, budgeting, and desktop layers around the OpenClaw stack.

RELEASE 2mo ago
Anthropic launches Claude Managed Agents public beta with hosted sandboxes and outcome-based runs

Anthropic put Claude Managed Agents into public beta with hosted sandboxes, vaults, memory filesystems, and long-running sessions. Use the managed setup if you want explicit controls for tools, credentials, and completion criteria instead of custom harness code.

RELEASE 2mo ago
Rivet launches agentOS beta with 6.1 ms cold starts

Rivet introduced agentOS, an embedded agent runtime built on WASM and V8 isolates with backend embedding, mounted filesystems, and built-in orchestration. If you run agents in production, compare it against separate sandbox infrastructure.

WORKFLOW 3mo ago
Jai launches casual, strict, and bare sandbox modes for AI agents

Stanford's `jai` package launches casual, strict, and bare Linux containment modes for AI agents, and users pair the idea with Claude Code and OpenClaw hardening tips. The workflow narrows write scope and reduces persistent exploit paths such as hooks, `.venv` files, and startup artifacts.

RELEASE 3mo ago
Claude Code releases 2.1.84: PowerShell preview, task hooks, idle-return clearing

Claude Code 2.1.84 adds an opt-in PowerShell tool, new task and worktree hooks, safer MCP limits, and better startup and prompt-cache behavior. Anthropic also documented auto mode’s action classifier and added iMessage as a channel, so teams should review permissions and remote-control workflows.

RELEASE 3mo ago
OpenCode adds remote sandboxes and syncs agent state across devices

OpenCode is adding remote sandboxes, synced state across laptop, server, and cloud, and more product surface inside its plugin system. That makes long-running off-laptop workflows more practical, but operators should still review telemetry, sandbox, and exposure defaults.

RELEASE 3mo ago
OpenClaw ships 2026.3.22 with ClawHub marketplace and OpenShell SSH sandboxes

OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.

RELEASE 3mo ago
Agent Computer launches cloud computers in under 0.5s with SSH access

Agent Computer launched cloud desktops that boot in under half a second and expose persistent disks, shared credentials, SSH access, and ACP control for agents. It gives coding agents a faster place to run tools and reuse auth, but teams still need to design safe session and credential boundaries.

RELEASE 3mo ago
Vercel Labs releases Emulate for stateful GitHub, Vercel, and Google API tests

Vercel Labs published a stateful service emulator for GitHub, Vercel, and Google integrations instead of relying on brittle mocks. It is useful when agents or CI need deterministic auth and third-party API flows in local or sandboxed runs.

RELEASE 3mo ago
Keycard launches task-scoped credentials for coding agents

Keycard released an execution-time identity layer for coding agents, issuing short-lived credentials tied to user, agent, runtime, and task. It targets the gap between noisy permission prompts and unsafe skip-permissions workflows.
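
To make the idea of a short-lived credential tied to user, agent, runtime, and task concrete, here is a minimal Python sketch. It is not Keycard's API: the function names, the claim layout, and the demo signing key are all hypothetical, and a real issuer would use a managed key and a standard token format.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key for illustration only; a real issuer would keep it in a vault.
SECRET = b"demo-signing-key"


def issue_credential(user, agent, runtime, task, ttl_seconds=300, now=None):
    """Return a signed token bound to one user, agent, runtime, and task."""
    issued = time.time() if now is None else now
    claims = {
        "user": user,
        "agent": agent,
        "runtime": runtime,
        "task": task,
        "exp": issued + ttl_seconds,  # expiry keeps the credential short-lived
    }
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_credential(token, task, now=None):
    """Accept the token only if it is authentic, unexpired, and issued for this task."""
    current = time.time() if now is None else now
    payload, _, signature = token.partition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    # Check the signature before trusting anything inside the payload.
    if not hmac.compare_digest(signature, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload.encode()))
    return claims["task"] == task and current < claims["exp"]
```

In this sketch a token issued for one task is rejected when presented for another, and it stops working once the TTL passes, which is the property that separates task-scoped credentials from a long-lived shared secret.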

RELEASE 3mo ago
Rivet releases Secure Exec SDK with 17.9 ms cold start and 56x cheaper Node.js runs

Rivet released Secure Exec, a V8-isolate runtime for Node.js, Bun, and browsers with deny-by-default permissions and low memory overhead. Agent builders can test it against heavier sandboxes for tool execution, but should verify the isolation model before replacing container or VM controls.

NEWS 3mo ago
Research reports OpenClaw prompt-injection flaws and weak defaults

Security coverage around OpenClaw intensified with a report on indirect prompt injection and data exfiltration risks, while KiloClaw published an independent assessment of its hosted isolation layers. Review your default configs and sandbox boundaries before exposing agents to untrusted web or tenant data.

RELEASE 3mo ago
NVIDIA launches NemoClaw for OpenClaw: single-command install with OpenShell guardrails

NVIDIA introduced NemoClaw, a reference stack that installs OpenShell and adds sandbox, privacy, and policy controls around OpenClaw. Use it if you want always-on agents on RTX PCs, DGX Spark, or cloud without building the security layer yourself.

RELEASE 3mo ago
Claude Code 2.1.77 adds 64K Opus output defaults and allowRead sandboxes

Anthropic shipped Claude Code 2.1.77 with higher default Opus 4.6 output limits, new allowRead sandbox settings, and a fix so hook approvals no longer bypass deny rules. Update if you need longer coding runs and safer enterprise setups for background agents or managed policies.

RELEASE 3mo ago
CopilotKit releases Open Generative UI repo for sandboxed charts, diagrams, and 3D widgets

CopilotKit open-sourced a generative UI template that renders agent-created HTML and SVG in a sandboxed iframe, with examples for charts, diagrams, algorithms, and 3D components. Use it to build interactive chat outputs without waiting for vendor-specific platform support.
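
As a rough illustration of rendering agent-created markup in a sandboxed iframe, the Python sketch below builds the iframe tag on the server side. It is not taken from the CopilotKit repo: the helper name `sandboxed_iframe` is hypothetical, and only the standard HTML `sandbox` and `srcdoc` attributes are assumed.

```python
import html


def sandboxed_iframe(untrusted_html, allow_scripts=False):
    """Wrap untrusted markup in an iframe that runs with restricted privileges."""
    # An empty sandbox attribute applies every restriction. allow-scripts is
    # opt-in, and allow-same-origin is never granted, so the content cannot
    # reach the host page's cookies or DOM.
    tokens = "allow-scripts" if allow_scripts else ""
    # Escape the markup so it stays inside the srcdoc attribute value; the
    # browser decodes the entities when it loads the frame.
    escaped = html.escape(untrusted_html, quote=True)
    return f'<iframe sandbox="{tokens}" srcdoc="{escaped}"></iframe>'
```

Calling `sandboxed_iframe('<svg onload="x()"></svg>')` returns a tag whose `srcdoc` contains only escaped markup, so the agent output cannot break out of the attribute into the surrounding page.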

NEWS 3mo ago
OpenAI reports Responses API runtime uses compaction, proxy egress, and reusable skills

OpenAI published runtime details for the Responses API computer environment, including shell loops, capped output, automatic compaction, proxied outbound traffic, and reusable skills folders. Use it as a reference architecture for hosted agents that need state, safety controls, and tool execution patterns.
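
Capped output is the easiest of these patterns to sketch. The Python function below shows one plausible way to bound what a shell tool returns to the model. It is an assumption for illustration, not OpenAI's implementation: the name `cap_output`, the default limit, and the head-plus-tail strategy are all hypothetical.

```python
def cap_output(text, max_chars=2000):
    """Keep the start and end of long tool output and note how much was dropped."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    if len(text) <= max_chars:
        return text
    head = max_chars // 2          # characters kept from the start
    tail = max_chars - head        # characters kept from the end (always >= 1)
    omitted = len(text) - max_chars
    marker = f"\n[... {omitted} characters omitted ...]\n"
    return text[:head] + marker + text[-tail:]
```

Keeping both ends reflects a common design choice: the start of a command's output usually shows what ran, and the end usually shows the error or final status.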

NEWS 3mo ago
Perplexity opens Computer to Pro users with 20+ models and Slack app

Perplexity rolled Computer out to Pro subscribers and added Slack workflows, app connectors, custom skills, and credit-based usage for enterprise teams. Try multi-model agent workflows on real apps, but watch credit usage and local execution tradeoffs.

AI Primer

Your daily guide to AI tools, workflows, and creative inspiration.

© 2026 AI Primer. All rights reserved.