AI Primer

An unverified release name for a multimodal Claude Opus model associated with Anthropic. First-party sources could not be confirmed because Exa access was unavailable.

Pricing

Model profile · Current snapshot
Input / 1M: $6.25
Output / 1M: $25.00
Blended / 1M: $10.94
Output TPS: 46.99
TTFT (s): 1.35
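The blended price shown here matches a 3:1 input-to-output token weighting. That ratio is inferred from the numbers and is not stated on the page. The sketch below works through that arithmetic under that assumption, and also estimates end-to-end response time from the TTFT and output TPS figures. The function names are hypothetical, and the 500-token response length is an arbitrary example.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens.

    The default 3:1 input:output weighting is an assumption inferred from
    the listed figures, not something the page documents.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_per_m + output_weight * output_per_m) / total_weight


def estimated_response_seconds(output_tokens: int, ttft_s: float,
                               output_tps: float) -> float:
    """Rough latency estimate: time to first token plus steady-state generation."""
    return ttft_s + output_tokens / output_tps


# Figures taken from the pricing panel.
print(f"blended / 1M: ${blended_price(6.25, 25.00):.2f}")
# 500 output tokens is a hypothetical response length.
print(f"500-token reply: ~{estimated_response_seconds(500, 1.35, 46.99):.1f} s")
```

A workload with a different input-to-output mix can pass its own weights, which moves the blended figure toward whichever side dominates.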

Model Intelligence

Arena ranking: 52
Benchmarkable: Yes
Model level: release
Intelligence Index: 51.8
Coding Index: 53.1
GPQA: 0.89
HLE: 0.31
SciCode: 0.5
IFBench: 0.44
LCR: 0.67
TerminalBench Hard: 0.55
TAU2: 0.74

Recent stories

18 linked stories
release · PRIMARY · 2026-05-28
Claude Opus 4.8 ships with 69.2% SWE-Bench Pro and 2.5x Fast mode

Anthropic released Claude Opus 4.8 across Claude, the API, and major clouds with higher coding scores and a cheaper 2.5x-speed Fast mode. It suits coding workloads that need better benchmark performance without a price increase over 4.7.

release · SECONDARY · 2026-05-28
Claude Code 2.1.154 adds Dynamic Workflows for hundreds of parallel subagents

Claude Code 2.1.154 added Dynamic Workflows, a research-preview mode that writes orchestration scripts and runs hundreds of subagents in one session. Anthropic also shipped 2.1.156 to fix Opus 4.8 thinking-block API errors, so teams should watch for workflow and API stability.

release · SECONDARY · 2026-05-27
DeepSWE benchmarks GPT-5.5 at 70% on 113 tasks across 91 repos

DeepSWE launched a coding benchmark built from 113 original tasks across 91 repos and five languages, with GPT-5.5 leading at 70%. The setup is meant to better reflect repo search, multi-file edits, and verification in real agent workflows.

news · SECONDARY · 2026-05-23
Claude Code users report hidden Agent access, empty-string MCP failures, and slower Opus 4.7 runs

Practitioners shared a transcript showing Claude Code invoking Agent despite project allow-lists, a reproducible MCP bug that drops all params when one value is an empty string, and reports of much slower Opus 4.7 runs than in Cursor. That matters because teams are spending real quota debugging harness behavior, retries, and cache invalidation instead of model output.

release · SECONDARY · 2026-05-13
Cline launches SDK and hits 74.2% on Terminal-Bench 2.0

Cline open-sourced the runtime behind its extension and CLI as the Cline SDK, then rebuilt the CLI on top with agent teams, cron jobs, connectors, and example apps. The harness score gives teams a new reference point if they want to compare agent tooling on Terminal-Bench 2.0.

news · PRIMARY · 2026-05-12
Claude Opus 4.7 opens fast mode with ~2.5x speed as Cursor, v0, Droid, and OpenRouter add support

Anthropic rolled fast mode for Opus 4.7 into Claude Code and tools including Cursor, v0, Droid, Conductor, and OpenRouter. Use it where latency matters, but watch pricing: Cursor disclosed a 6x multiplier and others treat it as premium.

news · PRIMARY · 2026-05-09
GPT-5.5 vs Opus 4.7: users compare plan mode, frontend output, and 120K-context use

User posts and HN threads compared GPT-5.5 and Opus 4.7 across plan mode, frontend work, and 120K-context sessions. The split results mean token burn and instruction discipline matter as much as raw benchmark scores.

news · SECONDARY · 2026-05-02
Claude Code users report HERMES.md extra billing and ban appeals

Users on Hacker News and Reddit reported a reproduced HERMES.md extra-usage billing bug, plus new ban appeals and repeated blame-shifting complaints. Anthropic says affected users will get refunds and credits, so teams should keep an eye on quota routing and support escalation.

news · SECONDARY · 2026-05-01
Claude Code users report keyword-trigger billing after Opus 4.7 rollout

Days after Opus 4.7 launched, users reported commit-message triggers tied to OpenClaw or HERMES markers that could route requests into extra billing or refusals, alongside continued throttling complaints. Anthropic says affected users will get refunds, but repo-scanning heuristics may still affect cost and reliability in multi-harness workflows.

news · SECONDARY · 2026-05-01
ARC Prize reports GPT-5.5 at 0.43% and Opus 4.7 at 0.18% on ARC-AGI-3

ARC Prize published frontier-model results on ARC-AGI-3 and said GPT-5.5 and Opus 4.7 both stayed below 1%, with failures in world modeling, abstraction, and reward reinforcement. That shows strong coding and benchmark models still break on novel interactive reasoning tasks, and follow-up comparisons even had Opus 4.6 slightly ahead of 4.7.

release · SECONDARY · 2026-04-30
Claude Security opens public beta with Opus 4.7 repo scans

Anthropic opened Claude Security to Claude Enterprise customers, letting teams scan repositories, validate findings, and review suggested patches inside Claude. The beta also adds scheduled scans, directory targeting, exports, and webhook alerts for recurring codebase reviews.

news · PRIMARY · 2026-04-29
Opus 4.7 users report OpenClaw refusals, cache TTL spikes, and billing lockouts after launch

A day after Opus 4.7 launched, users reported OpenClaw-linked refusals, cache TTL cost spikes, and billing failures in Claude Code. Anthropic appears to have eased some limits, but behavior and spend still vary sharply across agent-heavy sessions.

news · PRIMARY · 2026-04-28
Opus 4.7 users report verbose output, weaker 1M context, and 12–27% higher costs

Users reported more verbosity, weaker 1M-context behavior, and little coding gain after Opus 4.7 rolled out. OpenRouter measured 12–27% higher costs, and some teams reverted their default model.

news · SECONDARY · 2026-04-23
Anthropic reports Claude Code regressions after March 26 thinking bug and xhigh default shift

Anthropic said three harness-side changes degraded Claude Code quality, then reset subscriber limits and rolled out fixes in 2.1.119. The update matters because recent failures came from tool defaults and prompt handling rather than the base model alone.

news · PRIMARY · 2026-04-19
Opus 4.7 users report 1.46x tokenization and faster limit burn

Four days after the Opus 4.7 launch, independent tests measured about 1.35-1.46x more text tokens than 4.6 while users kept reporting faster limit burn and weaker coding. That can change effective cost and session economics in Claude Code even if list prices stay flat.
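The effect of a higher token count at a flat list price can be sketched as a simple multiplier. This is an illustration only: it assumes spend scales linearly with text tokens and ignores caching, output length, and non-text tokens, and the function name is hypothetical.

```python
def effective_cost_multiplier(token_ratio: float, price_ratio: float = 1.0) -> float:
    """Spend for the same text relative to the older model.

    token_ratio: tokens used by the new tokenizer divided by the old count.
    price_ratio: new list price divided by old list price (1.0 means flat).
    """
    return token_ratio * price_ratio


# The two ends of the range reported in the story, at unchanged list prices.
for ratio in (1.35, 1.46):
    extra = effective_cost_multiplier(ratio) - 1.0
    print(f"{ratio:.2f}x tokens -> about {extra:.0%} more spend for the same text")
```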

news · PRIMARY · 2026-04-18
Opus 4.7 users report 1.47x token overhead and web-search refusals two days after launch

Users and analysts say Opus 4.7 is using more tokens, refusing web search, and missing orchestration steps in Claude Code-style workflows. Watch token costs and regression reports closely if you rely on xhigh defaults or tokenizer-sensitive prompts.

news · SECONDARY · 2026-04-17
Anthropic launches Claude Design research preview with codebase-derived design systems

Anthropic launched Claude Design in research preview, turning prompts, files, and codebase context into prototypes, slides, and one-pagers. It can infer a team design system and export to Canva, PDF, or PPTX, or hand off to Claude Code.

news · PRIMARY · 2026-04-17
Opus 4.7 users report instruction-following misses, refusals, and ~1.3x token burn a day after launch

A day after Opus 4.7 launched, users reported adaptive-thinking misses, surprise refusals, and higher token use. Engineers should recheck prompts, costs, and 4.6 fallbacks while Anthropic patches bugs and lifts limits.

Your daily guide to AI tools, workflows, and creative inspiration.

© 2026 AI Primer. All rights reserved.