AI Primer
TOPIC · 36 stories

Durable Execution

Checkpointing, resumability, long-running agent workflows.

WORKFLOW · 1w ago
Developers publish loop libraries and control-loop guides for long-running agents

Builders released reusable loop artifacts this week, including a Loop Library Skill, repo templates, and published control-loop definitions for docs sweeps, onboarding checks, and error triage. The shift matters because teams are turning one-shot prompting into persistent agent runs with explicit stop conditions and shared repo state.

RELEASE · 1w ago
Codex adds local-to-remote thread handoff with Git worktree transfer

Codex can now hand off an in-progress thread between local and remote machines and bring it back later. It matters because the handoff carries Git history, branches, and uncommitted changes while leaving the destination checkout untouched.

RELEASE · 1w ago
Vercel previews eve with durable execution and sandboxed compute

Vercel introduced eve in public preview with durable workflows, sandboxed compute, subagents, and evals. It also added Connect and Passport for scoped tokens and identity-gated deployments, giving teams one path for runtime, auth, and enterprise access control.

RELEASE · 1w ago
Flue releases 1.0 Beta with agents, workflows, and channel connectors

Flue 1.0 Beta reorganizes the framework around workflows, autonomous agents, and channel connectors while keeping model-agnostic deployment. The release gives TypeScript teams a more opinionated base for durable, long-running agents.

RELEASE · 1w ago
Vercel supports 30-minute Functions in Fluid compute preview

Vercel opened a preview that lets Functions run for up to 30 minutes on its Fluid microVM compute platform. Use it for longer-running server tasks without moving to a separate runtime product.

NEWS · 2w ago
OpenAI adds Ona-backed persistent runtimes to Codex

OpenAI said it will acquire Ona and fold its secure cloud execution and orchestration stack into Codex. The change targets agent jobs that need to keep running for hours or days after the original laptop session ends.

RELEASE · 2w ago
Anthropic adds scheduled deployments and vaulted env vars to Claude Managed Agents

Anthropic opened scheduled deployments and environment-variable vaults in Claude Managed Agents public beta, and Dynamic Workflows is now generally available in Claude Code. The update adds cron-style jobs, secret injection, and deeper parallel orchestration for long-running agents.

NEWS · 3w ago
LangSmith launches Sandbox, LLM Gateway, and Engine for agent execution, spend tracking, and eval triage

LangSmith added sandboxed execution, spend-aware gateway routing, and Engine to surface recurring agent failures from traces. The bundle gives teams one place to run agents, control token spend, and turn production issues into debugging and eval loops.

NEWS · 3w ago
Turbopuffer, Archil, TigerFS, and LangSmith add branching, snapshots, and rollback for agent runs

Multiple agent-infra vendors shipped copy-on-write branches, checkpoints, snapshots, forks, or rollback primitives on the same day. That matters because long-running agents can now explore, retry, and recover state without relying only on Git or full sandbox rebuilds.

NEWS · 3w ago
Conductor integrates Vercel Sandboxes for remote parallel coding agents

Conductor moved its parallel coding agents from local-only execution onto Vercel Sandboxes. That matters because teams can run isolated remote agent workspaces with near-local startup and feedback instead of depending on a developer laptop.

RELEASE · 3w ago
Cognition launches Devin Desktop with ACP support for local and cloud agents

Cognition added a desktop control surface that can run Devin, Codex, Claude, and other ACP-compatible agents across local and cloud contexts. The app turns Devin from a single hosted agent into a broader orchestration surface.

WORKFLOW · 1mo ago
Codex users ship durable-memory workspaces and auto-triage flows

Independent Codex users published Obsidian memory setups, reusable skill prompts, auto-triage flows, and Cloudflare-backed runners for longer jobs. That matters because Codex is being wrapped into persistent workspaces and operator-defined subagents instead of one-shot chats.

RELEASE · 1mo ago
LangChain opens Managed Deep Agents private beta with deepagents deploy and auth proxy

LangChain opened a private beta for Managed Deep Agents, a model-agnostic deployment layer built on deepagents with durable execution, sandboxes, and a context hub. The release turns deep-agent rollout into a single config-and-deploy flow and adds an auth proxy boundary for agent actions.

RELEASE · 1mo ago
Manus introduces Scheduled Tasks 2.0 with task continuation and self-updating web apps

Manus upgraded scheduled work so recurring jobs can continue inside the same task and drive background updates in Manus-built web apps. That matters because long-lived automations can retain context between runs instead of rebuilding state each time.

WORKFLOW · 1mo ago
Kilo Code introduces Cloud Agent CVE and smoke-test workflows with webhook triggers

Kilo Code posted two cloud-agent automations: a webhook-driven CVE patch flow that opens PRs in parallel and a post-deploy smoke test that checks health, 2xx responses, and latency under 2 seconds. This matters because the examples show coding agents moving into CI-style remediation and production verification loops.
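The smoke-test criteria described above (2xx responses and latency under 2 seconds) can be illustrated with a minimal sketch. This is not Kilo Code's implementation; `ProbeResult` and `smoke_failures` are hypothetical names, and the probe results are assumed to have been collected already by whatever HTTP client the deployment uses.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """One already-measured request (hypothetical shape, for illustration)."""
    url: str
    status: int
    latency_s: float


def smoke_failures(results, max_latency_s=2.0):
    """Return one message per probe that fails the status or latency check."""
    failures = []
    for r in results:
        if not 200 <= r.status < 300:
            # Any non-2xx response fails the smoke test outright.
            failures.append(f"{r.url}: status {r.status}")
        elif r.latency_s >= max_latency_s:
            # A 2xx response still fails if it is not under the latency limit.
            failures.append(f"{r.url}: {r.latency_s:.2f}s latency")
    return failures
```

An empty list means the deploy passed; a CI-style job could fail the run or open an issue when the list is non-empty.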

WORKFLOW · 1mo ago
Codex adds remote connections for Mac mini devboxes in the ChatGPT app

OpenAI documented Codex remote connections, letting the ChatGPT app point at a separate Codex host such as a Mac mini or rented VPS. Try it for long runs that need to stay alive off-device or for phone-first coding sessions.

WORKFLOW · 1mo ago
Codex users report 2-hour mech-interp runs and 150-hour tasks with `/goal`

Days after `/goal` workflows first surfaced, users showed the command also works in the Codex app and shared runs for SSH setup, mech-interp scripts, and recurring work that lasted hours or days. The evidence points to Codex being used as a long-running research and ops agent, though the app still lacks an explicit `/goal` UI.

NEWS · 1mo ago
LangChain launches SmithDB, LangSmith Engine, and Sandboxes at Interrupt

LangChain unveiled SmithDB, LangSmith Engine, Managed Deep Agents, and GA sandboxes at Interrupt. The stack gives agent teams a purpose-built trace database, autonomous failure triage, and managed execution environments for production workflows.

RELEASE · 1mo ago
holaOS launches Beta 0.1 with Multi Workspaces, Sub Agents, and Dashboard

holaOS shipped Beta 0.1, adding Multi Workspaces, Sub Agents, a dashboard, and a kickoff flow on top of its agent-computer base. The release targets long-running workstreams that need persistent context instead of one-chat sessions.

WORKFLOW · 1mo ago
Codex app adds /goal for long-running React Doctor and iOS runs

OpenAI staff said /goal is now available in the Codex app, and users posted long-running sessions that fixed React Doctor scores, built iOS features, and queued weekend tasks. The update moves Codex from CLI-only planning to persistent, steerable work sessions.

RELEASE · 1mo ago
Crabbox 0.11.0 adds Google Cloud provider and repo-local job workflows

Crabbox 0.11.0 shipped a Google Cloud provider, repo-local job workflows, AWS Windows WSL2 hydration, and a Blacksmith sync-stall guard. Recent Codex and OpenClaw posts show Crabbox already being used to reproduce bugs and to record before-and-after QA runs.

NEWS · 1mo ago
Codex adds /goal mode for long-running tasks with remote control preview

OpenAI reports Codex can now keep pursuing a goal until an end state and is adding remote control plus a usage tab. The update matters because Codex sessions can span longer tasks and be managed across devices with less manual babysitting.

RELEASE · 1mo ago
Manus launches Cloud Computer for 24/7 bots

Manus introduced Cloud Computer, an always-on cloud machine available on web and mobile for paid personal plans. It lets agents keep running Slack, Discord, and Telegram bots, databases, and scheduled jobs after the user's laptop is offline.

RELEASE · 1mo ago
Electric Agents introduces sync-based multi-agent platform with shared sessions and forking

ElectricSQL launched Electric Agents, treating agents as long-lived data entities that sync across shared coding sessions, swarms, and branches. The release matters for teams building collaborative agent systems that need durable state and coordination primitives, not just one-shot task runners.

RELEASE · 2mo ago
Mistral launches Workflows public preview with durable execution and human approvals

Mistral Studio added a Workflows orchestration layer that tracks state, retries, branches, and human approvals in public preview. That lets long-running agent flows resume after failures instead of restarting from scratch.
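Resuming after a failure, rather than restarting from scratch, usually comes down to persisting each completed step. The sketch below illustrates that general checkpointing idea in plain Python; it does not use or represent the Mistral Workflows API, and `run_workflow`, the step names, and the JSON checkpoint file are all assumptions made for illustration.

```python
import json
from pathlib import Path


def run_workflow(steps, checkpoint_path):
    """Run (name, fn) steps in order, skipping steps already checkpointed.

    Each fn receives the dict of results so far and returns a
    JSON-serializable value. If a step raises, earlier results stay on
    disk, so the next call resumes at the failed step.
    """
    path = Path(checkpoint_path)
    state = json.loads(path.read_text()) if path.exists() else {}
    for name, fn in steps:
        if name in state:
            continue  # completed in an earlier run
        state[name] = fn(state)
        # Persist after every step. A production system would write
        # atomically (temp file plus rename); this sketch keeps it simple.
        path.write_text(json.dumps(state))
    return state
```

Calling `run_workflow` a second time with the same checkpoint path re-executes only the steps that have no recorded result.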

RELEASE · 2mo ago
OpenCode 1.4.11 adds workspace support for git worktrees and remote environments

OpenCode 1.4.11 beta lets sessions run inside git worktrees or remote environments, with a remote server that keeps sessions alive and resyncs locally after reconnects. Use it if you run multi-session agent work across machines or plugin-defined runtimes.

RELEASE · 2mo ago
OpenAI Agents SDK adds sandbox execution and memory controls with Vercel, Modal, E2B and Daytona

OpenAI updated the Agents SDK with sandbox execution, memory controls and run snapshotting, and launch partners Vercel, Modal, E2B and Daytona shipped integrations. Long-running agents can now keep files, credentials and execution state in isolated runtimes instead of wiring harness, compute and storage layers together manually.

RELEASE · 2mo ago
Windsurf 2.0 integrates Devin for cloud agents that keep running after the IDE closes

Windsurf 2.0 launched with Devin embedded into the product, combining local agents with cloud agents that can continue across codebases after you close the laptop. The IDE now acts as a handoff layer between interactive edits and long-running remote execution.

NEWS · 2mo ago
Claude Code ships Routines in research preview with API and webhook triggers

Anthropic introduced Claude Code Routines, a cloud-run automation layer that can execute on schedules, API calls, and GitHub events. The rollout moves scheduling from local runs to hosted, persistent automation and adds new trigger surfaces for plan-wide use.

RELEASE · 2mo ago
Open Agents launches a browser-based cloud coding platform with parallel sessions

Open Agents open-sourced a browser-based cloud coding platform that keeps sessions running in parallel after a laptop closes. Use the reference stack if you want sandboxed VMs, model routing, and durable execution for internal coding-agent systems.

RELEASE · 2mo ago
LangChain launches Deep Agents Deploy beta with AGENTS.md and mcp.json

LangChain launched Deep Agents Deploy in beta as a production path for open, model-agnostic agent harnesses configured with AGENTS.md, skills, and mcp.json. Deployments run on LangSmith and can expose MCP, A2A, and agent protocol while teams choose models and sandbox providers.

RELEASE · 2mo ago
Rivet launches agentOS beta with 6.1 ms cold starts

Rivet introduced agentOS, an embedded agent runtime built on WASM and V8 isolates with backend embedding, mounted filesystems, and built-in orchestration. If you run agents in production, compare it against separate sandbox infrastructure.

NEWS · 3mo ago
Claude Code adds scheduled cloud tasks for PR reviews and `/schedule` runs

Claude Code can now run recurring prompts and background pull-request work on Anthropic-managed cloud environments from the web, desktop, or `/schedule`. That makes long-running repo tasks less dependent on a local machine, but users report task caps and restricted egress.

RELEASE · 3mo ago
OpenAI adds container pools to Responses API for 10x faster agent spin-up

OpenAI says Responses API requests can reuse warm containers for skills, shell, and code interpreter, cutting startup times by about 10x. Faster execution matters more now that Codex is spreading to free users, students, and subagent-heavy workflows.

RELEASE · 3mo ago
Devin adds recurring scheduled tasks for release notes, QA, and cleanup jobs

Cognition now lets Devin turn a one-off task into a recurring workflow on a schedule. It pushes Devin further from ad hoc sessions toward unattended maintenance jobs, which is useful for teams already trusting it with repetitive repo work.

RELEASE · 3mo ago
Hankweave adds runtime budgets for dollars, tokens, and wall-clock limits

Hankweave shipped budget controls that cap spend, tokens, and elapsed time globally or per step, including loop budgets and shared pools. Use them to prototype or productionize long agent runs without hand-managing model switches and failure states.
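A budget that caps dollars, tokens, and wall-clock time can be approximated with a small guard object. The sketch below is a generic illustration, not Hankweave's API: `Budget`, `BudgetExceeded`, and the injectable `clock` parameter are hypothetical names, and per-step or shared-pool budgets are left out.

```python
import time


class BudgetExceeded(Exception):
    """Raised when any one of the three limits is crossed."""


class Budget:
    def __init__(self, max_dollars, max_tokens, max_seconds, clock=time.monotonic):
        self.max_dollars = max_dollars
        self.max_tokens = max_tokens
        self.max_seconds = max_seconds
        self._clock = clock  # injectable so tests can fake elapsed time
        self._start = clock()
        self.dollars = 0.0
        self.tokens = 0

    def charge(self, dollars, tokens):
        """Record the cost of one model call, then enforce every limit."""
        self.dollars += dollars
        self.tokens += tokens
        self.check()

    def check(self):
        """Raise BudgetExceeded if spend, tokens, or elapsed time is over."""
        if self.dollars > self.max_dollars:
            raise BudgetExceeded("dollar budget exceeded")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded("token budget exceeded")
        if self._clock() - self._start > self.max_seconds:
            raise BudgetExceeded("wall-clock budget exceeded")
```

An agent loop would call `charge` after each model call and `check` before starting a new step, treating `BudgetExceeded` as its stop condition.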

AI Primer

Your daily guide to AI tools, workflows, and creative inspiration.

© 2026 AI Primer. All rights reserved.