AI Primer
TOPIC 29 stories

Sandboxing

Isolated execution environments for code-running agents.

NEWS 13th May
Codex introduces Windows sandbox with firewall rules and write-restricted tokens

OpenAI detailed the Windows sandbox behind Codex, using local user accounts, ACLs, firewall rules, and DPAPI-protected secrets instead of a generic VM wrapper. The design gives Windows developers safer file and network controls without making coding-agent workflows unusable.

RELEASE 13th May
Cursor launches cloud development environments with rollback and scoped secrets

Cursor added reusable cloud development environments for agents with multi-repo setup, rollback, and scoped secrets. The update moves cloud agents closer to laptop-style setups while keeping long-running work isolated and auditable.

RELEASE 10th May
Crabbox 0.11.0 adds Google Cloud provider and repo-local job workflows

Crabbox 0.11.0 shipped a Google Cloud provider, repo-local job workflows, AWS Windows WSL2 hydration, and a Blacksmith sync-stall guard. Recent Codex and OpenClaw posts show Crabbox already in use for reproducing bugs and recording before-and-after QA runs.

RELEASE 8th May
Hyperbrowser launches CLI with under-50ms sandboxes and hx web commands

Hyperbrowser shipped a CLI that exposes sandbox lifecycle, web fetch/search/crawl, and snapshotting from the terminal. The tool matters because it turns browser automation and forkable state into shell primitives for agent workflows.

RELEASE 1w ago
OpenAI Agents SDK adds TypeScript support and sandbox agents

OpenAI updated its Agents SDK with TypeScript support, sandbox agents, and an open-source harness. The release broadens support for JS workflows and gives teams a standard way to run isolated agents.

RELEASE 1w ago
Crabbox 0.4.0 launches ephemeral agent machines on Spot instances

Crabbox 0.4.0 adds throwaway machines for agent runs and cross-platform reproduction on macOS, Linux, and Windows. Use it to reproduce bugs and validate fixes without keeping long-lived cloud sessions around.

RELEASE 1w ago
Flue introduces `$ flue add url` for remote sandbox connectors

Flue previewed a command that points at docs or SDK URLs and has the agent write a sandbox connector directly into your codebase without extra packages. Follow-on tweaks, a Python port, and Unkey deploy support show the harness is becoming a testbed for self-authored integrations.

RELEASE 1w ago
Agent Harness Framework launches with Daytona default sandbox

The Agent Harness Framework started rolling out with Daytona as the default sandbox, and Fred Schott reported 35 pull requests on day one. The launch matters because it gives builders a packaged sandbox baseline instead of wiring execution isolation and agent environment management from scratch.

RELEASE 3w ago
Claude Code 2.1.116 adds 67% faster /resume and safer sandbox rm checks

Claude Code 2.1.116 shipped 24 CLI changes, including faster resume on large sessions, stricter guardrails around rm and rmdir, and automatic plugin dependency installs. It also updates terminal input behavior and model surface area for agent workflows, so teams should upgrade if they rely on the CLI.

RELEASE 4w ago
OpenAI Agents SDK adds sandbox execution and memory controls with Vercel, Modal, E2B and Daytona

OpenAI updated the Agents SDK with sandbox execution, memory controls and run snapshotting, and launch partners Vercel, Modal, E2B and Daytona shipped integrations. Long-running agents can now keep files, credentials and execution state in isolated runtimes instead of wiring harness, compute and storage layers together manually.

RELEASE 4w ago
Open Agents launches a browser-based cloud coding platform with parallel sessions

Open Agents open-sources a browser-based cloud coding platform that keeps sessions running in parallel after a laptop closes. Use the reference stack if you want sandboxed VMs, model routing, and durable execution for internal coding-agent systems.

NEWS 4w ago
Vercel Sandbox benchmarks sub-500 ms node -v cold starts

Vercel said Sandbox is now the fastest microVM-based runtime, with fresh `node -v` cold starts largely under 500 ms after a month of tuning. The update also puts persistent sandboxes into beta and expands plans for a programmable firewall, so teams should re-check runtime and security settings.

NEWS 4w ago
ClawShop launches OpenClaw resources with SecretRef and PinchBench

Kilo Code’s ClawShop recap bundled a 30-minute KiloClaw setup workshop, SecretRef credential handling, searchable ClawBytes guides, and PinchBench for agentic performance. The event, OpenClaw 2026.4.10, and PetClaw together added new security, memory, budgeting, and desktop layers around the OpenClaw stack.

RELEASE 1mo ago
Anthropic launches Claude Managed Agents public beta with hosted sandboxes and outcome-based runs

Anthropic put Claude Managed Agents into public beta with hosted sandboxes, vaults, memory filesystems, and long-running sessions. Use the managed setup if you want explicit controls for tools, credentials, and completion criteria instead of custom harness code.

RELEASE 1mo ago
Rivet launches agentOS beta with 6.1 ms cold starts

Rivet introduced agentOS, an embedded agent runtime built on WASM and V8 isolates with backend embedding, mounted filesystems, and built-in orchestration. If you run agents in production, compare it against separate sandbox infrastructure.

WORKFLOW 1mo ago
Jai launches casual, strict, and bare sandbox modes for AI agents

Stanford's `jai` package launches casual, strict, and bare Linux containment modes for AI agents, and users pair the idea with Claude Code and OpenClaw hardening tips. The workflow narrows write scope and reduces persistent exploit paths such as hooks, `.venv` files, and startup artifacts.

RELEASE 1mo ago
Claude Code releases 2.1.84: PowerShell preview, task hooks, idle-return clearing

Claude Code 2.1.84 adds an opt-in PowerShell tool, new task and worktree hooks, safer MCP limits, and better startup and prompt-cache behavior. Anthropic also documented auto mode’s action classifier and added iMessage as a channel, so teams should review permissions and remote-control workflows.

RELEASE 1mo ago
OpenCode adds remote sandboxes and syncs agent state across devices

OpenCode is adding remote sandboxes, synced state across laptop, server, and cloud, and more product surface inside its plugin system. That makes long-running off-laptop workflows more practical, but operators should still review telemetry, sandbox, and exposure defaults.

RELEASE 1mo ago
OpenClaw ships 2026.3.22 with ClawHub marketplace and OpenShell SSH sandboxes

OpenClaw shipped version 2026.3.22 with ClawHub, OpenShell plus SSH sandboxes, side-question flows, and more search and model options, then followed with a 2026.3.23 patch. Teams get a broader plugin surface, but should patch quickly and review plugin trust boundaries as the ecosystem grows.

RELEASE 1mo ago
Agent Computer launches cloud computers in under 0.5s with SSH access

Agent Computer launched cloud desktops that boot in under half a second and expose persistent disks, shared credentials, SSH access, and ACP control for agents. It gives coding agents a faster place to run tools and reuse auth, but teams still need to design safe session and credential boundaries.

RELEASE 1mo ago
Vercel Labs releases Emulate for stateful GitHub, Vercel, and Google API tests

Vercel Labs published a stateful service emulator for GitHub, Vercel, and Google integrations instead of relying on brittle mocks. It is useful when agents or CI need deterministic auth and third-party API flows in local or sandboxed runs.

RELEASE 1mo ago
Keycard launches task-scoped credentials for coding agents

Keycard released an execution-time identity layer for coding agents, issuing short-lived credentials tied to user, agent, runtime, and task. It targets the gap between noisy permission prompts and unsafe skip-permissions workflows.

RELEASE 1mo ago
Rivet releases Secure Exec SDK with 17.9 ms cold start and 56x cheaper Node.js runs

Rivet released Secure Exec, a V8-isolate runtime for Node.js, Bun, and browsers with deny-by-default permissions and low memory overhead. Agent builders can test it against heavier sandboxes for tool execution, but should verify the isolation model before replacing container or VM controls.

NEWS 1mo ago
Research reports OpenClaw prompt-injection flaws and weak defaults

Security coverage around OpenClaw intensified with a report on indirect prompt injection and data exfiltration risks, while KiloClaw published an independent assessment of its hosted isolation layers. Review your default configs and sandbox boundaries before exposing agents to untrusted web or tenant data.

RELEASE 1mo ago
NVIDIA launches NemoClaw for OpenClaw: single-command install with OpenShell guardrails

NVIDIA introduced NemoClaw, a reference stack that installs OpenShell and adds sandbox, privacy, and policy controls around OpenClaw. Use it if you want always-on agents on RTX PCs, DGX Spark, or cloud without building the security layer yourself.

RELEASE 1mo ago
Claude Code 2.1.77 adds 64K Opus output defaults and allowRead sandboxes

Anthropic shipped Claude Code 2.1.77 with higher default Opus 4.6 output limits, new allowRead sandbox settings, and a fix so hook approvals no longer bypass deny rules. Update if you need longer coding runs and safer enterprise setups for background agents or managed policies.

RELEASE 2mo ago
CopilotKit releases Open Generative UI repo for sandboxed charts, diagrams, and 3D widgets

CopilotKit open-sourced a generative UI template that renders agent-created HTML and SVG in a sandboxed iframe, with examples for charts, diagrams, algorithms, and 3D components. Use it to build interactive chat outputs without waiting for vendor-specific platform support.

NEWS 2mo ago
OpenAI reports Responses API runtime uses compaction, proxy egress, and reusable skills

OpenAI published runtime details for the Responses API computer environment, including shell loops, capped output, automatic compaction, proxied outbound traffic, and reusable skills folders. Use it as a reference architecture for hosted agents that need state, safety controls, and tool execution patterns.

NEWS 2mo ago
Perplexity opens Computer to Pro users with 20+ models and Slack app

Perplexity rolled Computer out to Pro subscribers and added Slack workflows, app connectors, custom skills, and credit-based usage for enterprise teams. Try multi-model agent workflows on real apps, but watch credit usage and local execution tradeoffs.

© 2026 AI Primer. All rights reserved.