Rate Limits
Provider limits, throttling, and capacity constraints.
Stories
OpenAI said Codex accounts were draining usage faster than intended because abuse and fraud checks were overflagging some sessions, then issued a usage reset for all users. It matters because paid Codex workflows were losing quota unexpectedly mid-run, directly affecting reliability and cost.
Z.ai made GLM-5.2 available to GLM Coding Plan users with High and Max thinking modes and 1M context, and promised API access plus an MIT-licensed open-source release next week. Early testers reported higher plan pricing, heavy rate limits, and mixed build quality versus Opus and Fable.
Users said Claude Fable 5 kept routing ordinary research prompts to Opus 4.8 after Anthropic’s labeled fallback path appeared. Watch for mid-session model swaps if you rely on Fable for research work.
OpenAI started rolling out bankable Codex resets to Go, Plus, Pro, and Business users, plus a two-week referral program that can add more resets. That lets users save capacity for heavier Browser use and longer Codex sessions instead of losing resets on a fixed clock.
One day after Fable 5 launched, users reported burning through Max quotas in about 90 minutes while Anthropic told subscribers the model will leave Claude plans on June 23 until capacity improves. If you depend on Fable, plan for quota pressure and route critical jobs elsewhere.
Anthropic reset Fable's 5-hour and weekly quotas after launch-day reports of Max users exhausting access in minutes. Access also depended on the latest Claude Code build, and plan messaging said included use ends June 22 before usage credits take over.
A day after Codex users reported outages and caps, OpenAI said the service had three separate incidents and later disclosed a bug that undercounted tokens for some Plus and Pro accounts, while users reported that paid-plan quotas had been reset. The update matters because Codex operators saw both service instability and account-limit changes in the same 24-hour window.
Users reported outages, tighter 5-hour caps, and token availability problems a day after OpenAI launched Codex Sites and plugins. OpenAI reset Codex usage limits after three incidents, so teams should watch quotas and backend reliability as agent workflows ramp up.
Cursor raised usage limits for all Teams users and introduced a Premium seat tier with 5x usage for 3x the price. Teams can now budget coding-agent access around seat quotas instead of raw token meters.
A day after users reported runaway Claude Code usage, Anthropic reset five-hour and weekly quotas and said an Opus 4.8 handling issue was spawning more parallel tool calls than intended. The fix matters because it turns a token-burn complaint into an acknowledged product bug with restored quotas for affected Pro and Max users.
OpenAI restored Codex weekly and hourly quotas across paid ChatGPT plans after Tibo Sottiaux said the product hit 5 million users. Watch for long-running QA loops, migration PRs, and remote desktop sessions that can still burn through quotas fast.
Fresh posts added 600K-to-200K context rollbacks, auto mode breaking human checkpoints, and default session-file deletion to the recent Claude Code complaint stack. Watch long sessions and review loops closely, since recovery got harder when session files disappeared.
Antigravity added a lower-cost Gemini 3.5 Flash tier for IDE, CLI, and desktop use, with posts citing about 45% fewer tokens than Medium. Watch quotas after the reset across free and paid plans if you're planning to use the cheaper tier.
OpenAI said a recent Codex optimization lowered cache-hit rates in long-running sessions and drained limits faster; it rolled the change back and reset limits for all accounts. That matters because compaction and cache behavior directly determine quota burn and session reliability.
A day after Antigravity raised weekly Gemini quotas, the team said the 3x increase is permanent and doubled Gemini 3.5 Flash max context in AGY. The same update batch also clarified the IDE split and shipped Windows fixes, changing day-to-day limits and workflow behavior for developers.
Google tripled Antigravity's Gemini weekly quotas and issued a one-time quota reset after raising limits earlier in the week. The change lets teams run more Gemini 3.5 Flash work inside Google's CLI and managed-agent workflows.
OpenAI said a metering bug put many Codex subscribers at the wrong usage level for about two hours, then restored balances and waived usage from that window. This matters because the incident interrupted active sessions and showed how subscription sync failures can halt agent runs mid-task.
OpenAI shipped shortcut customization, restored Git controls, cleaned up panels, and sped up large-repo operations in Codex. Paid-plan usage caps were also reset, though some accounts saw delayed propagation.
Developers published two Claude Code workarounds after users flagged metered -p mode: a tmux-backed claude-p wrapper and a setting to stop attribution headers from breaking prompt caching. Both reduce repeated-token spend in agent-heavy runs.
OpenAI said Codex’s GPT-5.5 degradation over the prior 48 hours came from two issues and it will reset usage limits after the fix. Users had reported looping runs, higher cache burn, and unstable sessions in active coding workflows.
A day after developers flagged Anthropic’s SDK credit split, Claude Code users said -p work had become metered, slower, and harder to run headlessly. Anthropic reset 5-hour and weekly limits, and Claude Code 2.1.143 added projected context-cost estimates.
OpenRouter updated BYOK workspaces so teams can attach multiple provider keys, scope them to specific models or users, and choose prioritized versus fallback use. It changes how rate-limit isolation, dev and prod separation, and failover routing are handled inside one workspace.
Anthropic will move Claude Agent SDK, claude -p, GitHub Actions, and third-party agent apps onto separate monthly credits on June 15. Watch the new bucket closely, since it changes the cost model for autonomous runs and subscription-backed harnesses.
Anthropic increased Claude Code weekly limits 50% for Pro, Max, Team, and seat-based Enterprise users through July 13. The higher cap stacks on last week's 2x five-hour increase and applies across CLI, IDE extensions, desktop, and web.
Anthropic said a SpaceX compute deal will add 300+ MW and 220,000+ NVIDIA GPUs, and it doubled Claude Code 5-hour limits across paid plans. It also raised Opus API ceilings; users should still watch the unchanged weekly caps.
Users on Hacker News and Reddit reported a reproduced HERMES.md extra-usage billing bug, plus new ban appeals and repeated blame-shifting complaints. Anthropic says affected users will get refunds and credits, so teams should keep an eye on quota routing and support escalation.
Days after Opus 4.7 launched, users reported commit-message triggers tied to OpenClaw or HERMES markers that could route requests into extra billing or refusals, alongside continued throttling complaints. Anthropic says affected users will get refunds, but repo-scanning heuristics may still affect cost and reliability in multi-harness workflows.
A day after Opus 4.7 launched, users reported OpenClaw-linked refusals, cache TTL cost spikes, and billing failures in Claude Code. Anthropic appears to have eased some limits, but behavior and spend still vary sharply across agent-heavy sessions.
OpenAI reset Codex rate limits across all paid plans a week after GPT-5.5 shipped. The reset gives active teams more capacity right away, but it was announced as a celebratory one-off rather than a permanent quota change.
GitHub says Copilot will shift from flat-rate plans to usage-based billing starting June 1 as agentic features expand. The change makes token budgeting a first-order engineering constraint and adds more pressure on teams comparing Copilot with other coding agents.
Anthropic said three harness-side changes degraded Claude Code quality, then reset subscriber limits and rolled out fixes in 2.1.119. The update matters because recent failures came from tool defaults and prompt handling rather than the base model alone.
OpenAI said Codex passed 4 million weekly users less than two weeks after clearing 3 million, and then reset usage limits again. The scale jump matters because it points to rapid coding-agent adoption and likely plan and capacity changes.
A day after Opus 4.7 launched, users are surfacing adaptive-thinking misses, surprise refusals, and higher token use. Engineers should recheck prompts, costs, and 4.6 fallbacks while Anthropic patches bugs and lifts limits.
Anthropic raised Claude subscriber limits and shipped Claude Code 2.1.112 after Opus 4.7's adaptive thinking and tokenizer changes increased token use. Users still report fast quota depletion and inconsistent cache or effort behavior across web and CLI sessions.
GitHub issues and Hacker News threads added fresh evidence that Claude Code sessions still burn quota unexpectedly after the cache TTL change, with some users seeing usage before a prompt is sent and others recovering capacity by rolling back to 2.1.34. Watch cache reuse and metering behavior closely if you rely on long-running sessions.
Anthropic acknowledged a March 6 cache optimization change, and Pro and Max users report that the shorter TTL plus hidden session context now burns through Claude Code quota much faster. Watch for 500 errors and stalled streams, and apply the 2.1.105 patch if your UI hangs.
Fresh retests and issue threads point to worse Claude Code behavior, with Opus 4.6 falling to 68.3% on BridgeBench and users surfacing buried reasoning-effort controls. Track quota burn, hidden effort settings, and rollback reports before assigning more coding-agent work.
OpenAI added a $100 ChatGPT Pro tier with 5x more Codex usage than Plus and kept the $200 tier as the highest-capacity option. The new tier resets Codex limits again and temporarily doubles Pro usage through May 31.
OpenAI said Codex reached 3 million weekly users and reset usage limits, with another reset planned for each additional million users up to 10 million. ChatGPT-sign-in Codex will also retire the gpt-5.2 and gpt-5.1-era lineup on April 14, so teams should watch for model-default changes.
Anthropic’s Apr. 4 cutoff for using Claude subscriptions through OpenClaw-class harnesses went live. Users report API-billing fallbacks, ACP workarounds, and restored Claude Code quota, while edge cases around claude -p and Agent SDK use remain unsettled. The change pushes heavy agent loops toward metered access.
Anthropic said Claude subscriptions will stop covering third-party harnesses such as OpenClaw on Apr. 4, with discounted extra-usage bundles, refunds, and one-time plan credits. Heavy Claude-based agent workflows may need to move to API billing or extra-usage bundles because Anthropic cites subscription capacity constraints.
OpenAI reset Codex usage limits across all plans after dashboards showed more users hitting caps and the team said it still did not fully understand the trigger. Use the reset to recheck capacity assumptions, since OpenAI also said it banned abuse accounts and March’s repeated resets point to a broader capacity issue.
Claude Code 2.1.88 added fixes for prompt-cache misses, repeated CLAUDE.md reinjection, and a multi-schema StructuredOutput bug after widespread reports of unexpectedly fast quota consumption. Update if you rely on long sessions, because uncached runs can burn through paid limits much faster than intended.
Users report stricter Claude Code request caps, weeklong cooldowns, and desktop threads disappearing after restarts. Watch quotas closely and shift to lighter models or token-cutting workflows around /context and /clear if the limits hit your workflow.
Users report new request-per-minute caps that trigger after three to four concurrent agents, and Boris Cherny says efficiency work is underway. The issue hits the multi-agent workflows Anthropic has been promoting, separate from five-hour usage buckets.
Anthropic confirmed new peak-time metering that burns through 5-hour Claude sessions faster, and multiple power users posted 529 overloaded errors and early exhaustion. If you rely on Max plans for coding, watch for session limits and consider moving daily work to Codex.
Anthropic said free, Pro, and Max users will hit 5-hour Claude session limits faster on weekdays from 5am to 11am PT, while weekly caps stay the same. Shift long Claude Code jobs off-peak and watch prompt-cache misses.
Z.ai released GLM-5-Turbo as a faster GLM-5 variant for OpenClaw-style tool use, with 202K context, OpenRouter access, and higher off-peak limits. Try it as a cheaper speed tier for agent workflows, but benchmark completion quality on your own tasks before wider use.
Anthropic is doubling Claude usage outside peak hours from Mar. 13 to Mar. 27, with the bonus applied automatically across Free, Pro, Max, Team, and Claude Code. Shift long runs and bulk jobs to off-peak windows to stretch limits without changing plans.
Google AI Studio now lets developers set experimental per-project spend caps for Gemini API usage. Use it as a native billing guardrail, but account for roughly 10-minute enforcement lag and possible batch-job overshoot.