AI Primer
release

AI SDK adds HarnessAgent for Pi, Claude, Codex, and OpenCode

AI SDK added HarnessAgent as a common interface for Pi, Claude, Codex, OpenCode, and other harnesses. Use it to run local or cloud software-factory jobs through official SDKs while subscriptions cover token usage.


TL;DR

Christmas came early for coding-agent harness nerds. Vercel's June changelog describes HarnessAgent as a single API for established agent harnesses. The adapter docs already list Claude Code, Codex, Deep Agents, OpenCode, and Pi, and the UI docs say harness sessions own their conversation state instead of replaying chat history into a model. The buried gotcha is in the sandbox reference: passing experimental_sandbox gives tools an execution environment, but tool code still runs in the app process unless it explicitly delegates work into the sandbox.

HarnessAgent

Vercel's HarnessAgent changelog says AI SDK 7 adds a single API for running established harnesses, with switching framed as the harness-level version of swapping model providers.

The AI SDK Harnesses overview defines a harness as a complete agent runtime, not a model call. The runtime owns workspace access, built-in coding tools, native session state, compaction, permission flows, and runtime-specific configuration.

The docs split the system into four pieces:

  • HarnessAgent: the application-facing AI SDK agent implementation.
  • Harness adapter: the connector for Claude Code, Codex, Pi, OpenCode, or another runtime.
  • Sandbox provider: the isolated filesystem and process environment.
  • Session: the live state for conversation history, workspace, and approvals.
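As a rough illustration of how those four pieces relate, here is a minimal TypeScript sketch. Every name in it (SandboxProvider, Session, HarnessAdapter, SketchAgent, createEchoAdapter) is hypothetical and defined locally; none of it is the AI SDK's real API, and the adapter is a fake that only echoes prompts. The point it tries to show is that the session, not the caller, holds the conversation history.

```typescript
// Hypothetical shapes for illustration only; these are NOT the AI SDK's types.

// Sandbox provider: the isolated filesystem and process environment.
interface SandboxProvider {
  name: string;
}

// Session: live state for conversation history, workspace, and approvals.
interface Session {
  history: string[];
  workspaceDir: string;
  pendingApprovals: string[];
}

// Harness adapter: the connector to one runtime (Claude Code, Codex, Pi, ...).
interface HarnessAdapter {
  runtime: string;
  startSession(sandbox: SandboxProvider): Session;
  send(session: Session, prompt: string): string;
}

// Application-facing agent: owns one adapter and the session it started.
class SketchAgent {
  private adapter: HarnessAdapter;
  private session: Session;

  constructor(adapter: HarnessAdapter, sandbox: SandboxProvider) {
    this.adapter = adapter;
    this.session = adapter.startSession(sandbox);
  }

  // The caller passes only the new prompt; earlier turns live in the session.
  run(prompt: string): string {
    return this.adapter.send(this.session, prompt);
  }

  turns(): number {
    return this.session.history.length;
  }
}

// Fake adapter that records each prompt in the session and echoes it back.
function createEchoAdapter(runtime: string): HarnessAdapter {
  return {
    runtime,
    startSession: (sandbox) => ({
      history: [],
      workspaceDir: `/${sandbox.name}/workspace`,
      pendingApprovals: [],
    }),
    send: (session, prompt) => {
      session.history.push(prompt);
      return `[${runtime}] turn ${session.history.length}: ${prompt}`;
    },
  };
}

const agent = new SketchAgent(createEchoAdapter("codex"), { name: "local" });
const first = agent.run("list files");
const second = agent.run("fix the failing test");
```

Swapping createEchoAdapter("codex") for another adapter leaves the calling code unchanged, which is the property the changelog compares to swapping model providers.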

Adapter map

The current adapter docs list five shipped adapters and three coming soon:

  • Claude Code: @ai-sdk/harness-claude-code (sandbox bridge).
  • Codex: @ai-sdk/harness-codex (sandbox bridge).
  • Deep Agents: @ai-sdk/harness-deepagents (sandbox bridge).
  • OpenCode: @ai-sdk/harness-opencode (sandbox bridge).
  • Pi: @ai-sdk/harness-pi (host process).
  • Coming soon: Amp, Goose, Mastra.
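The adapter map can be kept as plain data, for example when an application lets users pick a harness by name. The sketch below copies the package names and execution modes from the list above; the AdapterEntry type and the helper functions packageFor and harnessesOn are hypothetical names introduced only for this illustration.

```typescript
type Execution = "sandbox bridge" | "host process";

interface AdapterEntry {
  harness: string;
  pkg: string;
  execution: Execution;
}

// Shipped adapters as listed in the adapter docs at the time of writing.
const shippedAdapters: AdapterEntry[] = [
  { harness: "Claude Code", pkg: "@ai-sdk/harness-claude-code", execution: "sandbox bridge" },
  { harness: "Codex", pkg: "@ai-sdk/harness-codex", execution: "sandbox bridge" },
  { harness: "Deep Agents", pkg: "@ai-sdk/harness-deepagents", execution: "sandbox bridge" },
  { harness: "OpenCode", pkg: "@ai-sdk/harness-opencode", execution: "sandbox bridge" },
  { harness: "Pi", pkg: "@ai-sdk/harness-pi", execution: "host process" },
];

// Announced but not shipped, so there is no package name to record yet.
const comingSoon: string[] = ["Amp", "Goose", "Mastra"];

// Hypothetical helper: look up the package for a harness name.
function packageFor(harness: string): string | undefined {
  const match = shippedAdapters.filter((entry) => entry.harness === harness)[0];
  return match ? match.pkg : undefined;
}

// Hypothetical helper: list harness names that use a given execution mode.
function harnessesOn(execution: Execution): string[] {
  return shippedAdapters
    .filter((entry) => entry.execution === execution)
    .map((entry) => entry.harness);
}
```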

The Claude Code adapter docs say it connects through @anthropic-ai/claude-agent-sdk; the Codex adapter docs say it connects through @openai/codex-sdk; the OpenCode adapter docs say it starts an OpenCode server inside the sandbox and streams session events back over WebSocket.

Subscription-backed factories

The token-billing detail is unusually concrete for a harness abstraction. cramforce's Codex reply says a Codex subscription can pay for the tokens a software factory uses.

That matches the shape of the implementation in the docs: the Codex adapter runs through the Codex SDK, and the Claude Code adapter runs through Anthropic's Claude Agent SDK. cramforce's launch note says the official SDK path is why subscriptions "just work."

Sandbox layer

cramforce's deployment reply says HarnessAgent is not Vercel-only and can run local or cloud, though he adds that software factories run better in the cloud.

The local path is already visible. lgrammel's local sandbox example uses createAppleContainerSandbox, experimental_sandbox, and a shell tool to create and run a Node.js file inside a local, Docker-style sandbox on a Mac.

The extension seam is public. lgrammel's sandbox interface reply names Experimental_SandboxSession and HarnessSandboxV1 as public interfaces and calls out Modal, E2B, and Daytona as possible community sandbox packages.
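The sandbox caveat from the TL;DR is easy to miss, so here is a minimal sketch of it. The FakeSandboxSession interface is a simplified, synchronous stand-in invented for this example; it is not the real Experimental_SandboxSession or HarnessSandboxV1 interface, whose shapes the sources above do not show. Both tools receive the sandbox, but only the one that calls into it actually runs work there.

```typescript
// Hypothetical, simplified stand-in for a sandbox session (an assumption,
// not the AI SDK interface). It records every command it is asked to run.
interface FakeSandboxSession {
  log: string[];
  exec(command: string): { ranIn: "sandbox"; command: string };
}

function createFakeSandbox(): FakeSandboxSession {
  const log: string[] = [];
  return {
    log,
    exec(command) {
      log.push(command);
      return { ranIn: "sandbox", command };
    },
  };
}

interface ToolContext {
  sandbox: FakeSandboxSession;
}

interface ToolResult {
  ranIn: "app process" | "sandbox";
  detail: string;
}

// Tool A: is handed the sandbox but never calls it, so the work below
// executes in the application process.
function wordCountTool(input: string, _ctx: ToolContext): ToolResult {
  const count = input.trim().split(/\s+/).length;
  return { ranIn: "app process", detail: `${count} words` };
}

// Tool B: explicitly delegates the command to the sandbox session.
function shellTool(command: string, ctx: ToolContext): ToolResult {
  const result = ctx.sandbox.exec(command);
  return { ranIn: result.ranIn, detail: result.command };
}

const sandbox = createFakeSandbox();
const ctx: ToolContext = { sandbox };

const hostResult = wordCountTool("create and run a file", ctx);
const commandsAfterHostTool = sandbox.log.length;
const sandboxResult = shellTool("node index.js", ctx);
```

In this sketch the sandbox log stays empty after the first tool runs, which mirrors the warning: configuring a sandbox does not by itself move tool code out of the app process.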

Harnesses as tools

One composition pattern is already possible: cramforce's tool-wrapper reply says a developer can make a tool that uses HarnessAgent directly when the harnesses have isolated work.
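A minimal sketch of that wrapper pattern follows, assuming a much smaller agent shape than the real one. SketchHarness, SketchTool, createFakeHarness, and harnessAsTool are hypothetical names defined here for illustration; the fake harness only records tasks in its own workspace array so the isolation is visible.

```typescript
// Hypothetical minimal harness: a name, a private workspace, and a run method.
interface SketchHarness {
  name: string;
  workspace: string[];
  run(task: string): string;
}

// Hypothetical minimal tool shape that an outer agent could call.
interface SketchTool {
  description: string;
  execute(input: { task: string }): string;
}

// Fake harness that appends each task to its own workspace and reports back.
function createFakeHarness(name: string): SketchHarness {
  const workspace: string[] = [];
  return {
    name,
    workspace,
    run(task) {
      workspace.push(task);
      return `${name} finished: ${task}`;
    },
  };
}

// Wrap a harness so an outer agent can call it like any other tool.
function harnessAsTool(harness: SketchHarness, description: string): SketchTool {
  return {
    description,
    execute: ({ task }) => harness.run(task),
  };
}

const reviewer = createFakeHarness("reviewer");
const migrator = createFakeHarness("migrator");

const tools = {
  review: harnessAsTool(reviewer, "Review a change in an isolated workspace"),
  migrate: harnessAsTool(migrator, "Run a migration in an isolated workspace"),
};

const reviewOutput = tools.review.execute({ task: "review PR" });
```

Because each wrapped harness keeps its own workspace, calling one tool does not touch the other's state, which matches the condition in the reply that the harnesses have isolated work.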

A separate cramforce progress reply says related work is in progress, with no promise attached.

Context layer

pauliusztin_ placed the portability boundary above the harness. His context-layer post argues that models are commoditizing fast, that harnesses already have, and that the thing to own is the context layer.

His proposed context layer contains:

  • unified memory, business logic, serving layer, and a disposable harness;
  • MCP servers or skills for tools, resources, prompts, and domain knowledge;
  • a starting point as simple as markdown files;
  • a later path to knowledge graphs, filesystem storage, and semantic plus BM25 retrieval;
  • one store, with MongoDB given as the example.
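To make the "markdown files first, BM25 later" path concrete, here is a small sketch that ranks in-memory markdown notes with the standard BM25 formula. It is not taken from the post: the Note type, the file names, the note contents, and the bm25Rank function are all assumptions for illustration, and a real context layer would read files from disk or a store and likely combine this with semantic retrieval.

```typescript
interface Note {
  path: string;
  text: string;
}

// Lowercase and split into alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Rank notes against a query with BM25 (k1 and b use common default values).
function bm25Rank(
  notes: Note[],
  query: string,
  k1: number = 1.2,
  b: number = 0.75
): { path: string; score: number }[] {
  const docs = notes.map((note) => ({ path: note.path, tokens: tokenize(note.text) }));
  const totalLength = docs.reduce((sum, doc) => sum + doc.tokens.length, 0);
  const avgLength = totalLength / Math.max(docs.length, 1);
  // De-duplicate query terms so a repeated word is not counted twice.
  const terms = tokenize(query).filter((term, i, all) => all.indexOf(term) === i);

  return docs
    .map((doc) => {
      let score = 0;
      for (const term of terms) {
        const tf = doc.tokens.filter((token) => token === term).length;
        if (tf === 0) continue;
        // Document frequency: how many notes contain the term at all.
        const df = docs.filter((d) => d.tokens.indexOf(term) !== -1).length;
        const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
        const lengthNorm = 1 - b + b * (doc.tokens.length / avgLength);
        score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
      }
      return { path: doc.path, score };
    })
    .sort((left, right) => right.score - left.score);
}

// Hypothetical notes standing in for markdown files in a context layer.
const notes: Note[] = [
  { path: "memory.md", text: "Unified memory stores customer preferences for the agent." },
  { path: "billing.md", text: "Billing rules: refunds need approval within 30 days." },
  { path: "tools.md", text: "MCP servers expose tools, resources, and prompts." },
];

const ranked = bm25Rank(notes, "refunds approval");
```

Since only billing.md contains either query term, it ranks first and the other notes score zero; the harness that consumes the result stays disposable because the ranking lives outside it.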

Scenario tests

zeeg's counterweight is that harnesses are not automatically better than native providers for every agent job. zeeg's harness caveat says the important part is why a model performs differently in a specific scenario, not a table of scores.

zeeg's table reply adds that the result was hard to compress into a table. pauliusztin_'s testing quip gives the same failure mode a shorter name: the AI version of "works on my machine" is testing a few prompts and deciding it looked good.
