breakingMay 28, 2026

Agent tools add Claude Opus 4.8 to Cursor, Warp, OpenRouter, and Perplexity on day one

Independent IDEs, gateways, and agent runtimes rolled out Claude Opus 4.8 within hours of launch, including Cursor, Warp, OpenRouter, and Perplexity. That matters because teams can benchmark or swap the model into existing workflows without waiting for connector lag.

4 min read

Agent tools add Claude Opus 4.8 to Cursor, Warp, OpenRouter, and Perplexity on day one

TL;DR

According to cursor_ai's launch post, Claude Opus 4.8 hit Cursor on launch day, and theo's follow-up on updated CursorBench said the early bench update showed better efficiency but slightly worse aggregate performance than 4.7 within margin of error.
OpenRouter's rollout post and vercel_dev's AI Gateway announcement both shipped Opus 4.8 within minutes of the model going live, with OpenRouter also exposing a separate fast variant.
The integration wave was broad: perplexity_ai's launch note added it to Perplexity and Computer, cognition's post put it in Windsurf and Devin CLI, and warpdotdev's post added it to Warp.
The clearest product claim across the rollout was about long-horizon agent runs: AskVenice's announcement stressed better tool use and judgment, while Letta_AI's note said 4.8 kept 4.7-level context management with better token efficiency.

You can jump straight to Vercel's AI Gateway changelog, compare the hosted variants on OpenRouter's Opus 4.8 page and Fast page, and read Simon Willison's launch note, which highlighted Anthropic describing the release itself as a "modest but tangible improvement."

Day-one rollout

The unusual part was not one marquee integration. It was how many agent surfaces updated in the same afternoon.

Within hours, Opus 4.8 was live in Cursor, OpenRouter, Vercel AI Gateway, Perplexity, Windsurf, Devin CLI, Warp, OpenCode, Crush, Letta Code, and Venice, according to vercel_dev's AI Gateway announcement, perplexity_ai's launch note, cognition's post, opencode's post, AskVenice's announcement, warpdotdev's post, charmcli's post, and Letta_AI's note.

That turned launch day into a live compatibility test. Teams could try the same model slug across a gateway, an IDE, a terminal agent, and a browser-computer product without waiting for the usual connector lag.

CursorBench split

Cursor's own launch framing was bullish. cursor_ai's launch post said 4.8 worked more efficiently than 4.7 on CursorBench and looked more persistent on hard tasks.

Then the first visible caveat landed. theo's follow-up on updated CursorBench said the refreshed benchmark showed better efficiency, but slightly worse overall performance than 4.7 within margin of error.

That is a more interesting launch pattern than a clean win. The early story around 4.8 was less "higher score everywhere" and more "same class of model, tuned for longer runs, bug catching, and fewer false passes," which matches claims from OpenRouter's rollout post and warpdotdev's post.

Computer-use surfaces

Some of the fastest adopters were products that already wrap models in tool use, browser control, or multi-step coding flows.

The day-one positioning clustered around a few behaviors:

Longer agentic coding runs, per vercel_dev's AI Gateway announcement.
Planning first, then following through across multi-step tasks, per warpdotdev's post.
Better review judgment, including catching bugs instead of stopping after one passing test, per warpdotdev's post and OpenRouter's rollout post.
Stronger computer use, with Perplexity surfacing it inside Computer and Venice calling it Anthropic's best computer-use model so far, according to AravSrinivas's note and AskVenice's announcement.
Similar context management to 4.7, but better token efficiency, per Letta_AI's note.

The overlap suggests partners were selling the same thing from different angles: fewer brittle agent runs, not a brand-new workflow.

Fast mode and API knobs

The rollout also exposed a few implementation details that would otherwise get buried in launch chatter.

llm-anthropic 0.25.1

Release: llm-anthropic 0.25.1 New model: Claude Opus 4.8 (claude-opus-4.8). New -o fast 1 option for fast mode, for organizations with that feature enabled on their account. Default max_tokens for each model now defaults to that model's maximum output rather than 8,192. #72 See also my notes on Opus 4.8 - I used this new release of llm-anthropic to generate the pelicans.

OpenRouter published both the standard and fast variants on day one, with OpenRouter's rollout post claiming 2.5x speed at 2x cost for Fast Mode and OpenRouter's model page link post noting support for system messages mid-conversation. Warp described the trade differently, with warpdotdev's fast-mode post calling it Opus-class performance at 3x lower cost than Opus 4.7 Fast.

A separate tooling update came from Simon Willison's llm-anthropic release note, which added the claude-opus-4.8 model ID to llm-anthropic 0.25.1, introduced a -o fast 1 switch for accounts with fast mode enabled, and changed default max_tokens behavior to each model's maximum output. That made the ecosystem story slightly bigger than "yet another model slug": the wrappers and gateways were already adapting to 4.8-specific controls.