Anthropic launches 1M-token context for Opus 4.6 and Sonnet 4.6 at flat pricing
Anthropic made 1M-token context generally available for Opus 4.6 and Sonnet 4.6, removed the long-context premium, and raised media limits to 600 images or PDF pages. Use it for retrieval-heavy and codebase-scale workflows that previously needed beta headers or special long-context pricing.

TL;DR
- Anthropic made 1M-token context generally available for Claude Opus 4.6 and Sonnet 4.6, with the launch thread framing it as production-ready for “entire codebases, large document sets, and long-running agents.”
- The API change is as important as the window size: Anthropic’s feature roundup says there is “no more long context price increase,” and requests over 200K tokens no longer need a beta header.
- Anthropic also raised Claude's multimodal limits sharply. The launch thread and feature roundup both say a single request can now include up to 600 images or PDF pages.
- For Claude Code, 1M context is moving from opt-in to default on paid team plans: Bcherny’s update says Opus 4.6 1M is now the default Opus model on Max, Team, and Enterprise, while trq212’s note says it now counts against normal plan limits.
What actually shipped in the API?
The headline change is that 1M context is now GA for both Opus 4.6 and Sonnet 4.6. In the launch announcement, Anthropic says the window supports "entire codebases, large document sets, and long-running agents," and the linked blog post adds that standard pricing now applies "across the full 1M window" with no long-context premium.
This also simplifies integration. Anthropic's API notes say there is "no beta header required in the API," while the blog summary says requests above 200K tokens are now supported automatically and rate limits stay consistent across the full window. For teams that had built special-case paths for long prompts, that removes both a pricing branch and a header gate.
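Concretely, that collapses what used to be a two-path client into one. A minimal sketch of the simplification (the model name, the old beta-flag behavior, and the 4-characters-per-token heuristic are all assumptions for illustration, not details from the announcement):

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text and code.
    return len(text) // 4

def build_request(prompt: str, model: str = "claude-opus-4-6") -> dict:
    # Hypothetical request payload. Before GA, a request estimated above
    # 200K tokens needed a separate branch that attached a beta header and
    # hit long-context pricing; per the GA notes, one code path now covers
    # the full window.
    return {
        "model": model,  # assumed model identifier
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": prompt}],
        # No "anthropic-beta" header and no >200K pricing branch required.
    }

big_prompt = "x" * 3_200_000  # ~800K tokens by the heuristic above
request = build_request(big_prompt)
```

The point of the sketch is the absence of the old `if estimate_tokens(prompt) > 200_000:` branch, not the payload shape itself.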
How does Claude Code behavior change?
For Claude Code users, Anthropic is treating long context less like a premium mode and more like the default operating environment. The Claude Code update says "Opus 4.6 1M is now the default Opus model" for Max, Team, and Enterprise plans; Pro and Sonnet users can still opt in with /extra-usage.
That plan change matters because usage accounting changed with it: Anthropic says 1M context now "counts against your normal plan limits, no extra usage required." Practitioner notes on session behavior point to the immediate workflow effect: "fewer compactions," longer-running sessions, and less need to start over. The Claude Code docs also expose model-selection and extended-window aliases, and a config note adds that CLAUDE_CODE_AUTO_COMPACT_WINDOW can tune when auto-compaction kicks in.
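Assuming CLAUDE_CODE_AUTO_COMPACT_WINDOW takes a token-count threshold (the notes name the variable but not its units or exact semantics), tuning it might look like:

```shell
# Assumption: the variable sets the context size at which Claude Code's
# auto-compaction triggers; units and exact behavior are not spelled out
# in the notes cited above, so treat this value as illustrative.
export CLAUDE_CODE_AUTO_COMPACT_WINDOW=900000

# Launching Claude Code afterward would pick up the setting, e.g.:
#   claude
echo "auto-compact window set to $CLAUDE_CODE_AUTO_COMPACT_WINDOW tokens"
```

With a 1M window as the default, a higher threshold would mean longer stretches between compactions, which matches the "fewer compactions" reports.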
How credible is the long-context performance jump?
Anthropic is pairing the rollout with a specific retrieval benchmark claim: Opus 4.6 scores 78.3% on MRCR v2 at 1M tokens, "highest among frontier models." The accompanying MRCR chart shows Sonnet 4.6 at 65.1% at 1M, versus 36.6% for GPT-5.4 and 25.9% for Gemini 3.1 Pro on the same graphic.
The more consequential engineering detail is that Anthropic is no longer asking developers to pay extra to test whether those retrieval numbers translate into real workloads. The pricing notes call out "flat pricing across the full context window to 1M tokens," and early price reaction highlights that the 1M tier now uses the same base Opus pricing rather than a separate long-context premium. That makes retrieval-heavy code search, large-spec analysis, and long agent traces easier to evaluate in production settings instead of behind a special cost model.
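The evaluation math behind that point is easy to see by comparing a flat rate with an old-style tiered scheme. The dollar figures below are hypothetical placeholders chosen for illustration, not Anthropic's actual rates:

```python
# Hypothetical rates for illustration only; not Anthropic's published prices.
RATE_PER_MTOK = 15.00          # assumed flat input rate, $ per million tokens
OLD_PREMIUM_MULTIPLIER = 2.0   # assumed old surcharge on tokens beyond 200K

def flat_cost(input_tokens: int) -> float:
    # Flat pricing: one rate across the full 1M window.
    return input_tokens / 1_000_000 * RATE_PER_MTOK

def tiered_cost(input_tokens: int) -> float:
    # Old-style pricing: tokens beyond 200K billed at a premium multiplier.
    base = min(input_tokens, 200_000)
    extra = max(input_tokens - 200_000, 0)
    return (base * RATE_PER_MTOK
            + extra * RATE_PER_MTOK * OLD_PREMIUM_MULTIPLIER) / 1_000_000
```

Under these placeholder numbers, a full 1M-token prompt costs $15.00 flat versus $27.00 tiered; below 200K tokens the two schemes are identical, which is why the premium previously discouraged exactly the long-context experiments the GA change now invites.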