AI Primer
TOPIC 19 stories

Durable Execution

Checkpointing, resumability, long-running agent workflows.

NEWS 13th May
LangChain launches SmithDB, LangSmith Engine, and Sandboxes at Interrupt

LangChain unveiled SmithDB, LangSmith Engine, Managed Deep Agents, and GA sandboxes at Interrupt. The stack gives agent teams a purpose-built trace database, autonomous failure triage, and managed execution environments for production workflows.

RELEASE 13th May
holaOS launches Beta 0.1 with Multi Workspaces, Sub Agents, and Dashboard

holaOS shipped Beta 0.1, adding Multi Workspaces, Sub Agents, a dashboard, and a kickoff flow on top of its agent-computer base. The release targets long-running workstreams that need persistent context instead of one-chat sessions.

WORKFLOW 10th May
Codex app adds /goal for long-running React Doctor and iOS runs

OpenAI staff said /goal is now available in the Codex app, and users posted long-running runs that fixed React Doctor scores, built iOS features, and queued weekend tasks. The update moves Codex from CLI-only planning to persistent, steerable work sessions.

RELEASE 10th May
Crabbox 0.11.0 adds Google Cloud provider and repo-local job workflows

Crabbox 0.11.0 shipped a Google Cloud provider, repo-local job workflows, AWS Windows WSL2 hydration, and a Blacksmith sync-stall guard. Recent Codex and OpenClaw posts show Crabbox already being used to reproduce bugs reliably and to record before-and-after QA runs.

NEWS 8th May
Codex adds /goal mode for long-running tasks with remote control preview

OpenAI reports that Codex can now keep pursuing a goal until it reaches an end state, and that remote control and a usage tab are being added. The update matters because Codex sessions can span longer tasks and be managed across devices with less manual babysitting.

RELEASE 2w ago
Manus launches Cloud Computer for 24/7 bots

Manus introduced Cloud Computer, an always-on cloud machine available on web and mobile for paid personal plans. It lets agents keep running Slack, Discord, and Telegram bots, databases, and scheduled jobs after the user's laptop is offline.

RELEASE 2w ago
Electric Agents introduces sync-based multi-agent platform with shared sessions and forking

ElectricSQL launched Electric Agents, treating agents as long-lived data entities that sync across shared coding sessions, swarms, and branches. The release matters for teams building collaborative agent systems that need durable state and coordination primitives, not just one-shot task runners.

RELEASE 2w ago
Mistral launches Workflows public preview with durable execution and human approvals

Mistral Studio added a Workflows orchestration layer that tracks state, retries, branches, and human approvals in public preview. That lets long-running agent flows resume after failures instead of restarting from scratch.

RELEASE 3w ago
OpenCode 1.4.11 adds workspace support for git worktrees and remote environments

OpenCode 1.4.11 beta lets sessions run inside git worktrees or remote environments, with a remote server that keeps sessions alive and resyncs locally after reconnects. Use it if you run multi-session agent work across machines or plugin-defined runtimes.

RELEASE 4w ago
OpenAI Agents SDK adds sandbox execution and memory controls with Vercel, Modal, E2B and Daytona

OpenAI updated the Agents SDK with sandbox execution, memory controls, and run snapshotting, and launch partners Vercel, Modal, E2B, and Daytona shipped integrations. Long-running agents can now keep files, credentials, and execution state in isolated runtimes, so teams no longer have to wire harness, compute, and storage layers together manually.

RELEASE 4w ago
Windsurf 2.0 integrates Devin for cloud agents that keep running after the IDE closes

Windsurf 2.0 launched with Devin embedded into the product, combining local agents with cloud agents that can continue across codebases after you close the laptop. The IDE now acts as a handoff layer between interactive edits and long-running remote execution.

NEWS 4w ago
Claude Code ships Routines in research preview with API and webhook triggers

Anthropic introduced Claude Code Routines, a cloud-run automation layer that can execute on schedules, API calls, and GitHub events. The rollout moves scheduling from local runs to hosted, persistent automation and adds new trigger surfaces for plan-wide use.

RELEASE 4w ago
Open Agents launches a browser-based cloud coding platform with parallel sessions

Open Agents open-sources a browser-based cloud coding platform that keeps sessions running in parallel after a laptop closes. Use the reference stack if you want sandboxed VMs, model routing, and durable execution for internal coding-agent systems.

RELEASE 1mo ago
LangChain launches Deep Agents Deploy beta with AGENTS.md and mcp.json

LangChain launched Deep Agents Deploy in beta as a production path for open, model-agnostic agent harnesses configured with AGENTS.md, skills, and mcp.json. Deployments run on LangSmith and can expose MCP, A2A, and agent protocol while teams choose models and sandbox providers.

RELEASE 1mo ago
Rivet launches agentOS beta with 6.1 ms cold starts

Rivet introduced agentOS, an embedded agent runtime built on WASM and V8 isolates with backend embedding, mounted filesystems, and built-in orchestration. If you run agents in production, compare it against separate sandbox infrastructure.

NEWS 1mo ago
Claude Code adds scheduled cloud tasks for PR reviews and `/schedule` runs

Claude Code can now run recurring prompts and background pull-request work on Anthropic-managed cloud environments from the web, desktop, or `/schedule`. That makes long-running repo tasks less dependent on a local machine, but users report task caps and restricted egress.

RELEASE 1mo ago
OpenAI adds container pools to Responses API for 10x faster agent spin-up

OpenAI says Responses API requests can reuse warm containers for skills, shell, and code interpreter, cutting startup times by about 10x. Faster execution matters more now that Codex is spreading to free users, students, and subagent-heavy workflows.

RELEASE 1mo ago
Devin adds recurring scheduled tasks for release notes, QA, and cleanup jobs

Cognition now lets Devin turn a one-off task into a recurring workflow on a schedule. The feature pushes Devin further from ad hoc sessions toward unattended maintenance jobs, which is useful for teams that already trust it with repetitive repo work.

RELEASE 1mo ago
Hankweave adds runtime budgets for dollars, tokens, and wall-clock limits

Hankweave shipped budget controls that cap spend, tokens, and elapsed time globally or per step, including loop budgets and shared pools. Use them to prototype or productionize long agent runs without hand-managing model switches and failure states.


© 2026 AI Primer. All rights reserved.