AI Primer
release

OpenAI Agents SDK adds sandbox execution and memory controls with Vercel, Modal, E2B and Daytona

OpenAI updated the Agents SDK with sandbox execution, memory controls and run snapshotting, and launch partners Vercel, Modal, E2B and Daytona shipped integrations. Long-running agents can now keep files, credentials and execution state in isolated runtimes instead of wiring harness, compute and storage layers together manually.


TL;DR

  • OpenAI's launch thread says the Agents SDK now supports controlled sandboxes, harness inspection, and explicit control over when memories are created and where they live.
  • The new OpenAI announcement and sandbox guide split the system into a harness layer for orchestration and a compute layer for file access, shell commands, mounted storage, exposed ports, and snapshots.
  • OpenAI's provider list named Cloudflare, Vercel, Modal, E2B, Daytona, blaxel, Runloop, and Temporal as supported sandbox backends on day one.
  • Modal's launch post positioned sandboxes as the missing computer layer for agents, while E2B's launch thread and Vercel's guide pushed concrete patterns like parallel workers, live previews, and isolated microVM execution.
  • E2B's follow-up says sandboxing is available in the Python Agents SDK now, with TypeScript support still to come.

You can read OpenAI's announcement, skim the new Sandbox Agents guide, and then compare how Modal, E2B, Daytona, and Vercel each package the same split. OpenAI's architecture screenshot is the cleanest tell: the harness keeps control logic and secrets in a trusted environment, while the sandbox becomes the disposable machine the model actually touches.

Harness and compute

The official docs turn a fuzzy agent pattern into a concrete boundary. The sandbox guide defines the harness as the control plane for approvals, tracing, recovery, and run state, while the sandbox is the execution plane where the model reads and writes files, runs commands, installs packages, exposes ports, mounts storage, and snapshots state.

That boundary is the real feature. According to OpenAI's thread, developers can keep files, credentials, and execution state in their own environment and pass only approved context to the model, which is much closer to how infra teams already separate orchestration from untrusted compute.

OpenAI's docs also make the state model more explicit than most agent demos do. A sandbox is not just a shell: it can carry a filesystem, mounted data, ports for previews, saved RunState and session_state, and snapshots that let later runs reconnect to existing work instead of starting cold.
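The harness/sandbox split can be sketched in plain Python. This is a toy stand-in, not the Agents SDK API: `LocalSandbox` uses a throwaway temp directory where a real deployment would use a provider-managed isolated runtime, and the harness side only hands over explicitly approved files.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


class LocalSandbox:
    """Toy stand-in for a provider-managed sandbox: a throwaway
    working directory where untrusted commands run. Real backends
    (Modal, E2B, Daytona, Vercel) replace this with an isolated VM."""

    def __init__(self):
        self.root = Path(tempfile.mkdtemp(prefix="sbx-"))

    def write_file(self, name: str, text: str) -> None:
        # The harness decides exactly which context crosses the boundary.
        (self.root / name).write_text(text)

    def run(self, cmd: list[str]):
        # Only the command executes inside the workspace; the harness
        # never exposes its own secrets or filesystem to it.
        return subprocess.run(cmd, cwd=self.root, capture_output=True, text=True)

    def snapshot(self) -> Path:
        # Copy the workspace so a later run can reconnect to existing work.
        dest = Path(tempfile.mkdtemp(prefix="snap-"))
        shutil.copytree(self.root, dest, dirs_exist_ok=True)
        return dest


# Harness side: holds control logic, passes only approved context in.
sandbox = LocalSandbox()
sandbox.write_file("notes.txt", "approved context only")
result = sandbox.run(["cat", "notes.txt"])
```

The point of the shape, not the implementation: everything the model touches lives under `sandbox.root`, and `snapshot()` is the hook that makes a later run resumable.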

Memory and workspace control

The shortest but most consequential line in OpenAI's main launch tweet is the memory one: developers can now control when memories are created and where they are stored. OpenAI did not spell out the full policy surface in the tweets, but it paired that claim with an open-source harness and provider-managed sandboxes, which suggests memory is being treated as part of application state, not as an opaque side effect inside the model runtime.

The docs frame the same idea through manifests and saved state. In the sandbox guide, the workspace can be declared up front, mounted storage can be attached narrowly, and snapshots can preserve intermediate state between runs. That makes long-running agents look less like stateless chat loops and more like resumable jobs with explicit storage boundaries.
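A manifest-style declaration might look like the sketch below. The field names are assumptions for illustration, not the SDK's actual schema; the idea it shows is the one in the guide: declare the workspace up front, mount storage narrowly, and make snapshotting an explicit policy rather than a side effect.

```python
from dataclasses import dataclass, field


# Illustrative only: these field names are assumptions about the shape
# of a workspace manifest, not the Agents SDK's real schema.
@dataclass
class WorkspaceManifest:
    image: str = "python:3.11"
    mounts: dict = field(default_factory=dict)        # narrow, labeled mounts
    exposed_ports: list = field(default_factory=list) # for live previews
    snapshot_between_runs: bool = True                # resumable-job behavior


manifest = WorkspaceManifest(
    mounts={"/data/docs": "ro"},  # attach only what this agent needs, read-only
    exposed_ports=[8080],
)
```

Declared this way, the storage boundary is reviewable before the run starts, which is the contrast the docs draw against stateless chat loops.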

Provider layer

OpenAI shipped the platform hook, but the launch partners filled in the operational shapes.

  • Modal framed itself as the "computers" layer for agents and tied the SDK to scalable sandboxes, including GPU-backed sessions in its companion integration post.
  • Daytona paired sandboxed execution with a manifest-driven workspace. Its guide walks from a basic shell agent to multi-agent handoffs, memory, structured outputs, and human-in-the-loop flows.
  • E2B emphasized persistence, parallel sandboxes, artifact review, and live preview URLs, then showed one-agent-per-sandbox web generation in parallel.
  • Vercel packaged the same model inside isolated microVMs. Its knowledge base guide says only the shell commands run in the microVM, while orchestration stays on your machine or server.

That makes the SDK update look less like a single feature and more like a new contract: OpenAI owns the harness abstraction, and providers compete on the execution substrate underneath it.

Parallel agents and previews

The partner demos converged on the same pattern: one agent, one sandbox, many copies at once. E2B's example spins up multiple landing-page builders in parallel, gives each its own HTML and CSS workspace, serves a separate preview URL for each result, then has them iterate with diffs instead of full rewrites.

Modal's walkthrough pushes the idea further into background compute. Its example harness uses subagents plus sandbox sessions to explore many coding strategies concurrently, and Modal explicitly calls out attaching GPUs to those sandbox sessions when the work needs heavier compute.

The preview angle is new enough to stand out. The OpenAI docs mention exposed ports and resumable work, and E2B turned that into an end-user visible pattern: each sandbox can host a frontend artifact you inspect side by side while the harness keeps coordinating the run.
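The one-agent-per-sandbox fan-out reduces to a familiar concurrency shape. In this hypothetical sketch, `build_page` stands in for a full agent run inside its own sandbox; the harness fans out one task per variant and collects each sandbox's artifact independently.

```python
import asyncio


async def build_page(variant: str) -> str:
    """Stand-in for one agent running in its own sandbox.
    A real run would execute commands and serve a preview URL."""
    await asyncio.sleep(0)        # placeholder for sandboxed agent work
    return f"<h1>{variant}</h1>"  # each sandbox yields its own artifact


async def main() -> dict[str, str]:
    variants = ["minimal", "bold", "playful"]
    # One task per variant: isolation means edits and failures in one
    # workspace never leak into another, so all copies run concurrently.
    pages = await asyncio.gather(*(build_page(v) for v in variants))
    return dict(zip(variants, pages))


results = asyncio.run(main())
```

In E2B's version of this pattern, each entry would also carry its own preview URL, and iteration happens as diffs against that sandbox's workspace rather than full rewrites.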

Python first

The launch is not language-complete yet. E2B's thread says sandboxing works in the Python Agents SDK today and that TypeScript support is still pending.

The provider guides match that status. Vercel's guide starts with Python 3.11+, openai-agents[vercel], and a local .env.local flow, while Daytona's examples import SandboxAgent, Manifest, SandboxRunConfig, and Shell from the Python SDK. For now, the new execution model is real, but it is real first for Python teams.
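Based on the prerequisites Vercel's guide names, the minimal setup reduces to something like the following; the API key variable name is an assumption, and the placeholder value is not a real key.

```shell
# Vercel's guide assumes Python 3.11+ and the extra-enabled package.
pip install "openai-agents[vercel]"

# Secrets stay in .env.local on the harness side, never in the sandbox.
# (Variable name and value here are illustrative placeholders.)
echo "OPENAI_API_KEY=replace-me" >> .env.local
```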

🧾 More sources

TL;DR · 1 tweet
Top-line changes and launch status from OpenAI and partner threads.
Provider layer · 2 tweets
Launch partner integrations that show how the execution layer is being packaged.