Users tied Claude Code quota spikes to cache misses and schema failures, and 2.1.88 shipped fixes for prompt caching plus multi-schema StructuredOutput bugs. Broken cache reuse can multiply token usage and throttle long coding sessions, so the patch matters for reliability.

Claude Code 2.1.88 fixed prompt cache misses in long sessions, a --resume crash, and a StructuredOutput schema cache bug that reportedly caused about 50 percent failures in multi-schema workflows (2.1.88 release, changelog thread).
The interesting part is how fast this story snapped into focus. Users first complained about quotas exploding (user report), then a reverse-engineered theory tied the spike to broken cache reuse (cache bug thread), then Anthropic shipped a release with explicit prompt-cache fixes in the changelog (see the official release, the changelog entry, and bug threads like "conversation history invalidated on subsequent turns"). For engineers using long sessions, subagents, and resume flows, the real story here is simple: cache correctness is product reliability.
The most credible explanation for the quota spike was not that Anthropic suddenly made Claude Code dramatically stingier. It was that cached context stopped being reused consistently.
That theory lines up with the public bug report in issue #40524, where a user described conversation history getting invalidated on later turns, leaving only the system prompt cached and forcing large cache rewrites. It also lines up with a separate issue #40851, where a Max $100 user said a mostly text-only session with about 15 prompts consumed 93 percent of quota.
In practice, that is the kind of bug power users notice immediately. If your workflow depends on long-running sessions, background agents, or --resume, cache misses do not just make the tool slower. They make the quota meter look insane.
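To see why losing the cached history hits the quota so hard, here is a rough back-of-the-envelope sketch. Everything numeric in it is an assumption for illustration: the cache read and write multipliers, the system prompt size, and the tokens added per turn are not taken from the bug reports or from Anthropic's changelog. Only the 15-turn session length echoes the report above. The "broken" case models the behavior described in issue #40524, where only the system prompt stays cached and the history is rewritten each turn.

```python
# Illustrative multipliers relative to the base input-token price (assumptions).
CACHE_READ = 0.10   # cost of reading one token from cache
CACHE_WRITE = 1.25  # cost of writing one token into cache


def session_cost(turns, system_tokens, tokens_per_turn, history_cached):
    """Return total input cost, in base-price token units, for one session.

    history_cached=True:  each turn reads the whole prior prefix from cache
                          and writes only the newly added tokens.
    history_cached=False: only the system prompt is read from cache; the
                          entire history is rewritten to cache every turn.
    """
    total = 0.0
    history = 0
    for _ in range(turns):
        new = tokens_per_turn
        if history_cached:
            total += (system_tokens + history) * CACHE_READ + new * CACHE_WRITE
        else:
            total += system_tokens * CACHE_READ + (history + new) * CACHE_WRITE
        history += new
    return total


# Hypothetical session: 15 turns, 20k-token system prompt, 10k tokens per turn.
healthy = session_cost(15, 20_000, 10_000, history_cached=True)
broken = session_cost(15, 20_000, 10_000, history_cached=False)
print(f"healthy: {healthy:,.0f}  broken: {broken:,.0f}  ratio: {broken / healthy:.1f}x")
```

Under these made-up numbers the broken session costs a bit under five times as much as the healthy one, and the gap widens as the session gets longer because the rewritten history grows every turn.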
Anthropic's 2.1.88 release reads like a direct response to exactly that failure mode. The headline fix is explicit in the release notes: prompt cache misses in long sessions, caused by tool schema bytes changing mid-session, are fixed.
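Why would changing bytes matter if the schema means the same thing? A minimal sketch, assuming a cache that is keyed on the exact bytes of the prompt prefix. The tool definition, the `prefix_key` helper, and the choice of key ordering as the source of drift are all hypothetical; the release notes do not say what actually changed the bytes inside Claude Code.

```python
import hashlib
import json


def prefix_key(system_prompt: str, tool_schema_json: str) -> str:
    """Hash the exact bytes of the prompt prefix, as a byte-keyed cache might."""
    data = (system_prompt + "\n" + tool_schema_json).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def canonical(obj) -> str:
    """Serialize with sorted keys and fixed separators so equal objects give equal bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))


# Hypothetical tool definition, built twice with different key orders.
tool_a = {
    "name": "read_file",
    "input_schema": {"type": "object", "properties": {"path": {"type": "string"}}},
}
tool_b = {
    "input_schema": {"properties": {"path": {"type": "string"}}, "type": "object"},
    "name": "read_file",
}
assert tool_a == tool_b  # same schema as far as Python dict equality is concerned

system = "You are a coding assistant."

# Naive serialization keeps insertion order, so the bytes (and the key) differ.
naive_a = prefix_key(system, json.dumps(tool_a))
naive_b = prefix_key(system, json.dumps(tool_b))

# Canonical serialization produces identical bytes for identical schemas.
stable_a = prefix_key(system, canonical(tool_a))
stable_b = prefix_key(system, canonical(tool_b))

print("naive keys match:    ", naive_a == naive_b)
print("canonical keys match:", stable_a == stable_b)
```

The point of the sketch is only that a byte-level difference early in the prompt changes the key for everything after it, so a schema that is re-serialized differently mid-session can look like a brand-new prefix.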
The rest of the changelog matters too:
- Fixed a --resume crash when transcripts contained tool results from an older CLI version or an interrupted write (changelog thread).
- Fixed /stats undercounting tokens by excluding subagent and fork usage (changelog thread).
That is a bigger deal than it sounds. Broken caching inflates cost. Broken error labels send people chasing the wrong root cause. Broken stats make it harder to prove what happened. 2.1.88 hits all three.
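The /stats undercount is easy to picture with a toy model. This is a sketch under stated assumptions, not how Claude Code tracks usage: the `SessionUsage` type, the session names, and the token counts are all hypothetical. It only contrasts counting a session's own tokens with counting the whole tree of subagents and forks.

```python
from dataclasses import dataclass, field


@dataclass
class SessionUsage:
    """Token usage for one session plus any subagents or forks it spawned."""
    name: str
    tokens: int
    children: list["SessionUsage"] = field(default_factory=list)


def own_tokens(session: SessionUsage) -> int:
    """Count only the session's own tokens, ignoring subagents and forks."""
    return session.tokens


def total_tokens(session: SessionUsage) -> int:
    """Count the session's tokens plus everything spawned beneath it."""
    return session.tokens + sum(total_tokens(child) for child in session.children)


# Hypothetical session tree with made-up token counts.
main = SessionUsage("main", 40_000, [
    SessionUsage("subagent: explore", 120_000),
    SessionUsage("fork: refactor", 60_000, [
        SessionUsage("subagent: tests", 30_000),
    ]),
])

print("own only:", own_tokens(main))
print("with subagents and forks:", total_tokens(main))
```

In this made-up tree the main session accounts for a small share of the total, which is why a counter that skips subagents and forks makes it hard to reconcile reported usage with a drained quota.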
The people who felt this earliest were exactly the users pushing Claude Code hardest. One engineer described Claude Code as nearly unusable under new request-per-minute limits for concurrent agents, and said the features still missing from Codex that kept Claude Code in the mix were session search, looping, and hooks (power-user workflow complaints).
That context matters because these are the workflows most sensitive to cache correctness.
When those flows break, quota pain compounds fast. A lighter workaround also emerged from the community: switch from Opus to Sonnet, where one user claimed far lower session usage for similar work (Sonnet workaround). That is useful tactically, but it is not the fix. The fix is making cached history stable again.
By the morning of March 31, one of the most vocal affected users said the problem seemed resolved, though he was unsure whether Anthropic had fixed a harness bug, changed policy, or adjusted his account specifically (user says fixed). That uncertainty is annoying, but the timing is hard to miss.
Users spent March 30 reporting blown quotas (user report), the cache invalidation theory gained traction (cache bug thread), and the 2.1.88 release landed with cache and schema fixes (2.1.88 release). For engineers deciding whether to trust long Claude Code sessions again, that sequence matters more than the drama did.
PSA: If you've been running out of Claude session quotas on Max tier, you're not alone. Read this. Some insane Redditor reverse engineered the Claude binaries with MITM to find 2 bugs that could have caused cache-invalidation. Tokens that aren't cached are 10x-20x more expensive …
My feed is showing me a bunch of folks who tapped out their whole usage limits on Mon/Tue. Is this your experience? Please comment, I want to understand how widespread this is
Claude Code 2.1.88 has been released. 41 CLI changes, 3 system prompt changes. Highlights: • Agent guidance adds 'never delegate understanding': agents must verify comprehension to avoid misdelegation • Fixed StructuredOutput schema cache bug causing ~50% failures in …
Claude Code rate limit issues seem to be fixed for me. Not sure if they fixed a bug in the harness, made a policy change decision for everyone, or just tweaked it for me (someone was in touch with me from Anthropic about it). Either way, I’ll take it! Feels good to be back.