Claude Code 2.1.88 added fixes for prompt-cache misses, repeated CLAUDE.md reinjection, and a multi-schema StructuredOutput bug after widespread reports of unexpectedly fast quota consumption. Update if you rely on long sessions, because uncached runs can burn through paid limits much faster than intended.

According to the GitHub release summary and the changelog thread, the release fixes CLAUDE.md reinjection, a StructuredOutput schema cache bug, and a --resume crash. It also stops showing a "Rate limit reached" error when the API actually returned an entitlement error, and tweaks /usage so Pro and Enterprise users no longer see a redundant Sonnet-only weekly bar (usage screenshot). You can read the release tag, inspect the full changelog entry, and dig into the cache-invalidation bug report that described conversation history dropping out mid-session. One of the stranger details is that the fix list spans both prompt caching and UI accounting, including /stats undercounting subagent usage and /usage hiding a plan-specific bar that was confusing people in the middle of the blowup.
The outage pattern was messy, not universal. Some users said they were hitting plan limits after a few prompts, while others running similar workflows were fine.
That matches the shape of the GitHub reports. In issue #40884, one user described active chats dying with "API Error: Rate limit reached" while the UI showed only 2 percent session usage and 41 percent weekly usage. Opening a new chat worked, but the old one stayed broken.
A separate screenshot circulating the same day showed how model choice changed the burn rate. One Max user reported 38 percent session usage after hours on Sonnet, with only 4 percent of the Sonnet weekly limit used, and contrasted that with Opus sessions that were exhausting the session bar much faster (usage screenshot).
The clearest technical diagnosis came from issue #40524, titled "Conversation history invalidated on subsequent turns." The reporter said later turns stopped caching prior history and reverted to reprocessing the system prompt plus large cache writes.
That lines up with the community theory in the main thread: once a session stopped hitting cache, token use jumped hard enough to chew through paid quotas (reverse-engineering thread). The thread also linked a Reddit reverse-engineering writeup and claimed two compounding bugs, one involving the packaged Bun binary and another involving --resume, but those specific root-cause claims stayed community analysis rather than release-note language.
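The cost amplification is easy to see with a back-of-envelope sketch. The prices below are placeholders, not Anthropic's actual rates, and the 10x cached-read discount is an assumption borrowed from the community's "10x-20x more expensive" estimate:

```python
# Back-of-envelope: why a single cache miss inflates per-turn cost.
# Rates are illustrative placeholders, not Anthropic's real pricing.
BASE_INPUT = 1.0   # cost per token of uncached input (arbitrary unit)
CACHE_READ = 0.1   # cached reads assumed ~1/10 of base price

def turn_cost(history_tokens, new_tokens, cache_hit):
    """Input cost of one turn: the history is either read from cache
    or, on a miss, reprocessed at the full uncached rate."""
    history_rate = CACHE_READ if cache_hit else BASE_INPUT
    return history_tokens * history_rate + new_tokens * BASE_INPUT

history, new = 150_000, 2_000   # long session, small new prompt
hit = turn_cost(history, new, cache_hit=True)
miss = turn_cost(history, new, cache_hit=False)
print(f"hit={hit:.0f} miss={miss:.0f} ratio={miss / hit:.1f}x")
```

On a 150K-token history, a single miss makes the turn roughly 9x more expensive under these assumed rates, which is how a session that silently stops caching can drain a quota in a handful of prompts.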
Anthropic did at least acknowledge the broader class of problem in public. A follow-up post shared an official response saying the team was investigating this issue and other possible causes (official-response tweet).
The official release is unusually blunt about cache-related fixes. The release notes and changelog list four changes that sit directly on the path people were complaining about:
- CLAUDE.md files being re-injected dozens of times in long sessions that read many files.
- A StructuredOutput schema cache bug causing roughly a 50 percent failure rate when multiple schemas were used.
- A --resume crash when transcripts contained a tool result from an older CLI version or an interrupted write.

There is a fifth fix nearby that matters for expensive failures: an autocompact thrash loop now stops after three consecutive immediate refills instead of continuing to burn API calls. That is a small line item with a very large blast radius for long-running sessions.
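The thrash-loop guard can be sketched as a simple bail-out counter. This is an illustrative reconstruction from the changelog's one-line description, not Claude Code's actual internals; the function names and threshold constant are assumptions:

```python
# Sketch of an autocompact guard: stop compacting after three
# consecutive "compact, then immediately full again" cycles,
# instead of looping forever and burning API calls.
MAX_IMMEDIATE_REFILLS = 3  # threshold from the changelog description

def autocompact_loop(context_full, compact):
    """Compact until the context stays below the limit, bailing out
    after MAX_IMMEDIATE_REFILLS consecutive immediate refills."""
    consecutive_refills = 0
    while context_full():
        compact()                      # each compaction costs API calls
        if context_full():             # refilled immediately after compacting
            consecutive_refills += 1
            if consecutive_refills >= MAX_IMMEDIATE_REFILLS:
                return "gave_up"       # stop thrashing
        else:
            consecutive_refills = 0
    return "ok"
```

The interesting design point is that the guard only counts *consecutive* immediate refills, so occasional tight-but-recoverable compactions reset the counter and normal sessions are unaffected.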
Part of the confusion on March 30 was that the product looked inconsistent even before you got to the underlying cache bug. The changelog says 2.1.88 now shows the actual error, with actionable hints, when the API returns an entitlement error instead of labeling it "Rate limit reached".
It also fixes /stats undercounting tokens (it had been excluding subagent usage), and preserves more than 30 days of historical data when the stats cache format changes. Those are not the same bug as prompt-cache invalidation, but they affect how users reconstruct what actually happened after a bad session.
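As a minimal sketch of the accounting fix, assuming a hypothetical session record (this is not Claude Code's actual stats format): a session total has to sum main-agent and subagent usage, or it undercounts exactly the way the changelog describes.

```python
# Hypothetical data shape, not Claude Code's real internals.
def total_tokens(session):
    """Total token count including subagent usage; a main-only sum
    reproduces the undercounting bug fixed in 2.1.88."""
    main = session["main_tokens"]
    subagents = sum(a["tokens"] for a in session.get("subagents", []))
    return main + subagents

session = {"main_tokens": 40_000,
           "subagents": [{"tokens": 25_000}, {"tokens": 15_000}]}
print(total_tokens(session))   # a main-only sum would report just 40000
```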
The release also changes /usage for Pro and Enterprise plans by hiding the redundant "Current week (Sonnet only)" bar usage screenshot. In the middle of a quota scare, even small UI cleanup like that matters because people were comparing screenshots line by line.
By early March 31, at least one previously affected user reported that rate-limit behavior seemed fixed again, though they could not tell whether the change came from a harness bug fix, a policy change, or account-level intervention.
That timing leaves an awkward but useful record. The public complaints, the GitHub cache issue, Anthropic's acknowledgement, and the 2.1.88 patch notes all landed inside roughly a day. Even if the exact root cause was more than one bug, the changelog makes clear that Claude Code's prompt path had several ways to waste tokens in long sessions, and multiple of them were patched at once.
PSA: If you've been running out of Claude session quotas on Max tier, you're not alone. Read this. Some insane Redditor reverse engineered the Claude binaries with MITM to find 2 bugs that could have caused cache-invalidation. Tokens that aren't cached are 10x-20x more expensive
My feed is showing me a bunch of folks who tapped out their whole usage limits on Mon/Tue. Is this your experience? Please comment, I want to understand how widespread this is
Full report on Github with details: github.com/anthropics/cla…
Claude Code 2.1.88 has been released. 41 CLI changes, 3 system prompt changes Highlights: • Agent guidance adds 'never delegate understanding': agents must verify comprehension to avoid misdelegation • Fixed StructuredOutput schema cache bug causing ~50% failures in …
Claude Code CLI 2.1.88 changelog: New features: • Added CLAUDE_CODE_NO_FLICKER=1 environment variable to opt into flicker-free alt-screen rendering with virtualized scrollback • Added PermissionDenied hook that fires after auto mode classifier denials — return {retry: true} to …
Claude Code rate limit issues seem to be fixed for me. Not sure if they fixed a bug in the harness, made a policy change decision for everyone, or just tweaked it for me (someone was in touch with me from Anthropic about it). Either way, I’ll take it! Feels good to be back.