releaseMay 28, 2026

Anthropic releases Claude Opus 4.8 with Claude Code dynamic workflows

Anthropic rolled out Claude Opus 4.8 across Claude Code, the API, and partner surfaces, plus a research-preview workflow mode that coordinates large subagent fleets. It keeps 4.7 pricing, but early tests suggest workflow runs can burn very large token budgets, so teams should watch usage closely.

7 min read

Anthropic releases Claude Opus 4.8 with Claude Code dynamic workflows

TL;DR

Anthropic shipped Claude Opus 4.8 into Claude Code, the API, and partner surfaces at the same list price as 4.7, while Boris Cherny's launch post called it the company's strongest coding model yet and ClaudeDevs' availability post said the rollout includes Bedrock, Vertex AI, and Foundry.
The headline product claim is not just higher scores. According to Boris Cherny's launch post, Opus 4.8 is more willing to say when it is unsure, and aakashgupta's read of Anthropic's chart focused on Anthropic openly highlighting a benchmark row where GPT-5.5 still wins.
Claude Code also got a research-preview mode called dynamic workflows, where ClaudeDevs' announcement says Claude writes an orchestration script and fans work out to large fleets of subagents, while Cat Wu's summary says those agents double-check results before returning them.
Early hands-on reports split into two clear themes: better coding and writing results, per Dan Shipper's early vibe check, and very high token burn for workflow runs, per Boris Cherny's warning and Noah Zweben's repost of a 253k-token test.

Anthropic paired the dynamic workflows blog post with workflows docs, linked a refreshed prompting guide, and pushed a one-command claude-api migration flow for people updating model strings. The oddest detail in the launch-day chatter was stevibe's cutoff test, which pegged Opus 4.8's knowledge cutoff in mid-October 2025, later than 4.7's late-July cutoff.

What shipped

Anthropic split the release into a model upgrade and a new orchestration mode.

Opus 4.8 is live in Claude Code, per ClaudeDevs' launch thread.
The built-in claude-api skill now supports /claude-api migrate, which updates model strings and suggests prompt changes tuned for 4.8, according to ClaudeDevs' migration note.
Dynamic workflows shipped in research preview, and ClaudeDevs' feature post says users can trigger it by using the word "workflow" in a prompt.
Availability spans Max, Team, Enterprise, and API access, including Bedrock, Vertex AI, and Foundry, per ClaudeDevs' availability note.
For Claude Code accounts, ClaudeDevs' rollout note says the feature is on by default for Max and Team, while Enterprise admins must opt in.

Benchmarks and honesty

Anthropic's own launch message centered on two deltas: coding went up, and self-reporting got less slippery.

SWE-bench Pro: 64.3 to 69.2, a +4.9 point gain, per Boris Cherny's launch post.
Cursor said its internal CursorBench showed Opus 4.8 working "much more efficiently" than 4.7 and described it as more persistent on harder tasks, according to Cursor's day-one post.
Dan Shipper's early test thread claimed Opus 4.8 beat GPT-5.5 by 1 point on Every's Senior Engineer benchmark, 63 to 62, and outscored GPT-5.5 by 6 points on Every's writing benchmark.

The more interesting launch-day tell was the framing. aakashgupta's post highlighted that Anthropic's own chart bolded GPT-5.5's 78.2% over Opus 4.8's 74.6% on a terminal-coding row, then argued that the product pitch was trust, not scoreboard maximalism. That lines up with Cat Wu's summary, which called 4.8 the recommended daily model in Claude Code because it flags what it does not know and surfaces problems in its own code.

Dynamic workflows

The new workflow mode moves Claude Code from single-run coding toward multi-stage orchestration.

Claude writes an orchestration script on the fly, per ClaudeDevs' announcement.
It then spins up large fleets of coordinated subagents in parallel, according to the same post.
Cat Wu's launch thread says the plan is followed strictly so stages happen in order even across hundreds of agents.
Cat Wu's feature summary says the agents check results before handing them back.
Greg Isenberg's hands-on reaction added two mechanics not emphasized in the short launch tweets: independent attempts at the same problem and adversarial agents trying to break the answer before convergence.

The official examples skew toward work that used to be too broad for a single agent pass. Cat Wu's Bun example says Jarred Sumner used dynamic workflows to port Bun from Zig to Rust across roughly 750,000 lines, reaching 99.8% test-suite pass rate in 11 days, while Cat Wu's internal example says the same system processed hundreds of A/B flags in parallel in under 10 minutes.

Vibe Check

The day-one hands-on reports were unusually specific about where 4.8 feels different.

Writing: Dan Shipper's test thread said Opus 4.8 produced fewer "AI-isms" and handled voice matching better than prior Claude runs in Every's usage.
Reasoning controls: the same Every test thread said coding and writing quality moved around noticeably by effort level, with xhigh looking best for code and high looking best for writing.
UX tone: trq212's first impression described the model as warm and collaborative, not just high-scoring.
Persistence: Cursor's launch note emphasized harder-task persistence, which is one of the failure modes people actually notice in daily agentic use.
Cutoff recency: stevibe's modelclock test estimated a mid-October 2025 knowledge cutoff, later than Opus 4.7's late-July cutoff.

One negative showed up inside the praise. Dan Shipper's thread said Codex still had the stronger harness versus Claude Desktop, even while he rated the model itself much higher.

Token burn and effort levels

Dynamic workflows look expensive by design.

Boris Cherny's workflow note explicitly called the feature token-intensive and framed it for migrations, refactors, performance optimization, and batch bug-fix runs.
Noah Zweben's repost of Pawel Huryn's test cited a workflow run that used 253,000 tokens in 24.5 seconds across 8 agents.
minchoi's demo post said users can surface the mode by switching to Opus 4.8, setting effort to "ultracode," and using "workflow" in the prompt.
LLMJunky's complaint points to adjacent pricing friction around fast mode credits, which became part of the same launch-day cost conversation.

That makes the new effort controls part of the story, not a side setting. Dan Shipper's benchmark notes and Dan Shipper's repost about writing effort levels both reported that output quality shifted materially with reasoning level, so the practical unit is no longer just model choice. It is model plus effort plus whether a task is large enough to justify orchestration.

Where it shows up

Opus 4.8 landed across Anthropic's own surfaces and partner products on day one.

Claude Code and Anthropic's API, per ClaudeDevs' launch thread and the availability post.
Bedrock, Vertex AI, and Foundry, according to ClaudeDevs' rollout note.
Cursor, where Cursor's announcement says Opus 4.8 is already live.
FLORA, via Dan Shipper's repost of FLORA's note, which described the model as live in its canvas.

The interesting split is that the model rollout was broad immediately, while dynamic workflows stayed narrower. ClaudeDevs' docs link places workflows inside Claude Code rather than partner IDEs, so the day-one ecosystem story was mostly "new model everywhere, new orchestration mode mostly at home."

Claude Code's harness got tuned the day before

One day before the 4.8 launch, Anthropic posted a separate Claude Code update about the harness itself.

ClaudeDevs' reliability thread said the team had been working on responsiveness and reliability improvements.
ClaudeDevs' feedback update added a simpler /feedback flow that can send the last day or week of sessions instead of making users hunt for one bad session.
ClaudeDevs' follow-up said more bug-fix work was still in progress.

That timing helps explain why some launch-day reactions, like Dan Shipper's harness comment, separated the model from the product wrapper around it. Anthropic did not just ship a smarter Opus. It also spent the preceding day tightening the tool people were about to run it in.