updateMay 23, 2026

OpenAI fixes Codex cache-hit bug and resets usage limits

OpenAI said a recent Codex optimization lowered cache-hit rates in long-running sessions, drained limits faster, rolled it back, and reset all accounts. That matters because compaction and cache behavior directly determine quota burn and session reliability.

3 min read

OpenAI fixes Codex cache-hit bug and resets usage limits

TL;DR

OpenAI said a Codex optimization hurt cache hit rates during compaction in long-running sessions, which made usage limits drain faster, according to thsottiaux's bug explanation.
The team said it rolled that optimization back and reset usage limits for all accounts, per thsottiaux's reset notice.
User reports had already zeroed in on the symptom, and mattlam_'s reply framed cache hit rates as a system-design problem that reaches beyond Codex.
HCSolakoglu's checklist turned the incident into a concrete regression-testing complaint: cache hit ratio, context rot, token counts, runtime, tool behavior, and SWE-bench Pro subset scores.

OpenAI's explanation was narrow but revealing. thsottiaux's post tied quota burn to a backend optimization around compaction, while mattlam_'s response highlighted how small changes in cache behavior can show up to users as a rate-limit problem, not a performance bug.

Compaction

The key detail in OpenAI's public note is the phrase "cache hit rates when compacting across long running sessions." That pins the bug on session maintenance, not on a visible product change or a broad pricing update.

In practice, the company described a chain of events with three links:

An optimization shipped inside Codex.
That optimization reduced cache hit rates during compaction on long sessions.
Lower cache reuse made limits drain faster for users.

Because the explanation names compaction specifically, the incident looks less like simple overcounting and more like degraded reuse of prior work. mattlam_'s reply explicitly called cache hit rates a fundamental system-design problem, which matches why users noticed it through quota burn first.

Reset

OpenAI said it had already rolled the optimization back by the time of the public post, then reset usage limits for all accounts. reach_vb's PSA confirmed that users saw the reset land immediately before the weekend.

The sequence matters. The company did not just say the bug was fixed, it also restored lost capacity. dkundel's reply added a short thank-you to users for flagging the issue, which suggests user reports were part of the detection loop.

Regression checks

The most concrete follow-on critique came from HCSolakoglu's post, which argued that a product used at Codex's scale should run release checks on the metrics that would have exposed this kind of failure earlier.

That list was specific:

cache hit ratio
context rot
input and output tokens
average runtime
tool-call stats and behavior
SWE-bench Pro subset score

The useful part of that complaint is its shape. It treats the bug as a regression across system metrics that users actually feel, not as a one-off support issue.

Weekend timing

The reset landed just before a long weekend, and the user reaction turned that into a mini release-valve moment. reach_vb's post celebrated the restored limits, while dkundel's reply closed the loop with "Happy weekend codexing."

That timing is new information because it captures the practical effect of the fix: the incident ended not with a postmortem, but with replenished accounts and users immediately going back to work.

TL;DR

Compaction

Reset

Regression checks

Weekend timing

Discussion across the web