Hybrid reasoning model with superior intelligence for agents, featuring a 1M context window
Claude Sonnet 4.6 is Anthropic's current Claude Sonnet release, presented as a hybrid reasoning model for coding, agents, and professional workflows with a 1M-token context window.
Anthropic's official pricing page lists Claude Sonnet 4.6 at $3 per million input tokens and $15 per million output tokens. The Sonnet product page likewise states that pricing starts at $3/$15 per million tokens.
The same pricing page covers prompt caching: 5-minute cache writes cost $3.75 per million tokens, 1-hour cache writes cost $6 per million tokens, and cache hits and refreshes cost $0.30 per million tokens.
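To make the per-token rates concrete, here is a minimal Python sketch that turns token counts into an estimated dollar cost. The rates are the ones quoted above. The function name, the category keys, and the assumption that each category is billed independently at its own rate are illustrative, not taken from Anthropic's documentation.

```python
# USD per million tokens, copied from the figures quoted above.
# The key names are hypothetical labels chosen for this sketch.
PRICES_PER_MTOK = {
    "input": 3.00,
    "output": 15.00,
    "cache_write_5m": 3.75,
    "cache_write_1h": 6.00,
    "cache_read": 0.30,  # cache hits and refreshes
}


def estimate_cost_usd(**token_counts: int) -> float:
    """Estimate cost as the sum of tokens * rate over each billing category.

    Assumes each category is billed independently at its listed rate.
    """
    unknown = set(token_counts) - set(PRICES_PER_MTOK)
    if unknown:
        raise ValueError(f"unknown token categories: {sorted(unknown)}")
    total = sum(PRICES_PER_MTOK[name] * count for name, count in token_counts.items())
    return total / 1_000_000


if __name__ == "__main__":
    # A 200,000-token prompt: sent uncached, written to the 5-minute cache,
    # and then read back from the cache on a later call.
    print(estimate_cost_usd(input=200_000))
    print(estimate_cost_usd(cache_write_5m=200_000))
    print(estimate_cost_usd(cache_read=200_000))
```

Under these assumptions, a 200,000-token prompt costs about $0.60 uncached, about $0.75 to write to the 5-minute cache, and about $0.06 each time it is served from the cache, so a cached prompt pays for itself after the first reuse.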
Claude Code users reported tighter usage caps and week-long waits, and shared ways to cut usage, including /context audits, /clear, smaller models, and RTK log compression. The posts attribute heavy token consumption to mounted MCP servers, long chat histories, raw logs, and concurrent multi-agent runs, so teams may need to trim what they load at runtime.
Anthropic made 1M-token context generally available for Opus 4.6 and Sonnet 4.6, removed the long-context premium, and raised media limits to 600 images or PDF pages. Use it for retrieval-heavy and codebase-scale workflows that previously needed beta headers or special long-context pricing.
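As a rough pre-flight check for such workloads, the Python sketch below tests a planned request against the two limits mentioned above. It assumes the caller already has token and media counts from some tokenizer or counting step, that reserved output tokens count toward the same 1M-token window, and that the 600-item media limit applies per request. The function and constant names are hypothetical.

```python
MAX_CONTEXT_TOKENS = 1_000_000  # 1M-token context window, per the text above
MAX_MEDIA_ITEMS = 600           # images or PDF pages; assumed to be a per-request limit


def check_request(prompt_tokens: int, reserved_output_tokens: int, media_items: int) -> list[str]:
    """Return a list of human-readable problems; an empty list means the request fits.

    Assumes prompt and reserved output tokens share one context window.
    """
    problems = []
    total_tokens = prompt_tokens + reserved_output_tokens
    if total_tokens > MAX_CONTEXT_TOKENS:
        problems.append(
            f"{total_tokens} tokens exceeds the {MAX_CONTEXT_TOKENS}-token window "
            f"by {total_tokens - MAX_CONTEXT_TOKENS}"
        )
    if media_items > MAX_MEDIA_ITEMS:
        problems.append(
            f"{media_items} media items exceeds the limit of {MAX_MEDIA_ITEMS}"
        )
    return problems


if __name__ == "__main__":
    # A codebase-scale prompt with room left for the reply and a large PDF attached.
    for problem in check_request(prompt_tokens=950_000, reserved_output_tokens=64_000, media_items=610):
        print(problem)
```

Returning a list of problems rather than a boolean lets a caller report every violated limit at once before deciding what to trim.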