AI Primer
TOPIC · 50 stories

Rate Limits

Provider limits, throttling, and capacity constraints.

NEWS · 26th June
Codex fixes quota drain tied to fraud overflagging with an account-wide usage reset

OpenAI said Codex accounts were draining usage faster than intended because abuse and fraud checks were overflagging some sessions, then issued a usage reset for all users. It matters because paid Codex workflows were losing quota unexpectedly mid-run, directly affecting reliability and cost.

RELEASE · 2w ago
Z.ai releases GLM-5.2 for Coding Plan users with 1M context and Max mode

Z.ai made GLM-5.2 available to GLM Coding Plan users with High and Max thinking modes, 1M context, and promised API plus MIT open source next week. Early testers reported higher plan pricing, heavy rate limits, and mixed build quality versus Opus and Fable.

NEWS · 2w ago
Fable 5 users report Opus 4.8 fallbacks during research prompts

Users said Claude Fable 5 kept routing ordinary research prompts to Opus 4.8 after Anthropic’s labeled fallback path appeared. Watch for mid-session model swaps if you rely on Fable for research work.

RELEASE · 2w ago
Codex adds banked rate-limit resets for Go, Plus, Pro, and Business

OpenAI started rolling out bankable Codex resets to Go, Plus, Pro, and Business users, plus a two-week referral program that can add more resets. That lets users save capacity for heavier Browser use and longer Codex sessions instead of losing resets on a fixed clock.

NEWS · 2w ago
Fable 5 users report 90-minute Max caps and June 23 plan cutoff

One day after Fable 5 launched, users reported burning through Max quotas in about 90 minutes while Anthropic told subscribers the model will leave Claude plans on June 23 until capacity improves. If you depend on Fable, plan for quota pressure and route critical jobs elsewhere.

NEWS · 2w ago
Anthropic updates Claude Fable 5 limits with 5-hour and weekly resets

Anthropic reset Fable's 5-hour and weekly quotas after launch-day reports of Max users exhausting access in minutes. Access also depended on the latest Claude Code build, and plan messaging said included use ends June 22 before usage credits take over.

NEWS · 3w ago
Codex fixes token undercounting after three reliability incidents and quota resets

A day after Codex users reported outages and caps, OpenAI said the service had three separate incidents and later disclosed a bug that undercounted tokens for some Plus and Pro accounts, while users reported that paid-plan quotas had been reset. The update matters because Codex operators saw both service instability and account-limit changes in the same 24-hour window.

NEWS · 3w ago
Codex users report outages, 5-hour caps, and token shortages after Sites launch

Users reported outages, tighter 5-hour caps, and token availability problems a day after OpenAI launched Codex Sites and plugins. OpenAI reset Codex usage limits after three incidents, so teams should watch quotas and backend reliability as agent workflows ramp up.

NEWS · 3w ago
Cursor raises Teams usage limits and adds Premium seats with 5x usage

Cursor raised usage limits for all Teams users and introduced a Premium seat tier with 5x usage for 3x the price. Teams can now budget coding-agent access around seat quotas instead of raw token meters.
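
The 5x-for-3x trade-off above is simple arithmetic, and a short sketch makes the per-unit effect explicit. The function name is hypothetical, and it assumes "usage" scales linearly with the seat quota, which the blurb does not confirm.

```python
def relative_unit_cost(price_multiple: float, usage_multiple: float) -> float:
    """Cost per unit of usage, relative to the baseline seat (baseline = 1.0)."""
    if usage_multiple <= 0:
        raise ValueError("usage_multiple must be positive")
    return price_multiple / usage_multiple


# Premium seat as described in the blurb: 3x the price for 5x the usage.
premium = relative_unit_cost(price_multiple=3, usage_multiple=5)
print(f"Premium costs {premium:.0%} of the standard per-unit price")
```

Under that linear assumption, a Premium seat only pays off for users who would otherwise need more than three standard seats' worth of usage.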

NEWS · 3w ago
Claude Code resets 5-hour and weekly limits after Opus 4.8 parallel-tool bug

A day after users reported runaway Claude Code usage, Anthropic reset five-hour and weekly quotas and said an Opus 4.8 handling issue was spawning more parallel tool calls than intended. The fix matters because it turns a token-burn complaint into an acknowledged product bug with restored quotas for affected Pro and Max users.

NEWS · 3w ago
Codex raises weekly and hourly limits to 100% after 5 million users

OpenAI restored Codex weekly and hourly quotas across paid ChatGPT plans after Tibo Sottiaux said the product hit 5 million users. Watch for long-running QA loops, migration PRs, and remote desktop sessions that can still burn through quotas fast.

NEWS · 4w ago
Claude Code users report 200K context rollbacks and deleted session files

Fresh posts added 600K-to-200K context rollbacks, auto mode breaking human checkpoints, and default session-file deletion to the recent Claude Code complaint stack. Watch long sessions and review loops closely, since recovery got harder when session files disappeared.

RELEASE · 4w ago
Antigravity adds Gemini 3.5 Flash Low with ~45% fewer tokens

Antigravity added a lower-cost Gemini 3.5 Flash tier for IDE, CLI, and desktop use, with posts citing about 45% fewer tokens than Medium. Watch quotas after the reset across free and paid plans if you're planning to use the cheaper tier.

NEWS · 1mo ago
OpenAI fixes Codex cache-hit bug and resets usage limits

OpenAI said a recent Codex optimization had lowered cache-hit rates in long-running sessions and drained limits faster; it rolled the change back and reset usage limits for all accounts. That matters because compaction and cache behavior directly determine quota burn and session reliability.

RELEASE · 1mo ago
Antigravity updates Gemini 3.5 Flash with permanent 3x quotas and 2x context

A day after Antigravity raised weekly Gemini quotas, the team said the 3x increase is permanent and doubled Gemini 3.5 Flash max context in AGY. The same update batch also clarified the IDE split and shipped Windows fixes, changing day-to-day limits and workflow behavior for developers.

NEWS · 1mo ago
Antigravity raises Gemini weekly quotas 3x and resets usage

Google tripled Antigravity's Gemini weekly quotas and issued a one-time quota reset after raising limits earlier in the week. The change lets teams run more Gemini 3.5 Flash work inside Google's CLI and managed-agent workflows.

NEWS · 1mo ago
Codex fixes usage-limit sync bug after 2-hour subscriber lockout

OpenAI said a metering bug put many Codex subscribers at the wrong usage level for about two hours, then restored balances and waived usage from that window. This matters because the incident interrupted active sessions and showed how subscription sync failures can halt agent runs mid-task.

RELEASE · 1mo ago
Codex updates app with customizable shortcuts and 10-50x faster Git ops

OpenAI shipped shortcut customization, restored Git controls, cleaned up panels, and sped up large-repo operations in Codex. Paid-plan usage caps were also reset, though some accounts saw delayed propagation.

WORKFLOW · 1mo ago
Claude Code users report tmux claude-p wrappers and cache fixes after June 15

Developers published two Claude Code workarounds after users flagged metered -p mode: a tmux-backed claude-p wrapper and a setting to stop attribution headers from breaking prompt caching. Both reduce repeated-token spend in agent-heavy runs.

NEWS · 1mo ago
OpenAI fixes two GPT-5.5 issues in Codex after users report looping runs

OpenAI said Codex’s GPT-5.5 degradation over the prior 48 hours came from two issues, and that it will reset usage limits after the fix. Users had reported looping runs, higher cache burn, and unstable sessions in active coding workflows.

NEWS · 1mo ago
Claude Code users report metered -p mode and slower headless sessions after credit split

A day after developers flagged Anthropic’s SDK credit split, Claude Code users said -p work had become metered, slower, and harder to run headlessly. Anthropic reset 5-hour and weekly limits, and Claude Code 2.1.143 added projected context-cost estimates.

RELEASE · 1mo ago
OpenRouter adds multi-key BYOK routing with fallback tiers

OpenRouter updated BYOK workspaces so teams can attach multiple provider keys, scope them to specific models or users, and choose prioritized versus fallback use. It changes how rate-limit isolation, dev and prod separation, and failover routing are handled inside one workspace.
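
To illustrate the prioritized-versus-fallback idea described above, here is a minimal sketch of ordering keys for a request. It is not OpenRouter's API or its actual routing logic: `ProviderKey`, `candidate_keys`, and the key labels are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderKey:
    """Hypothetical record for one bring-your-own provider key."""
    label: str
    priority: bool                      # True = prioritized, False = fallback
    models: frozenset = frozenset()     # empty set = usable for any model


def candidate_keys(keys, model):
    """Return keys eligible for `model`, prioritized keys before fallback keys."""
    eligible = [k for k in keys if not k.models or model in k.models]
    # sorted() is stable, so keys keep their listed order within each tier.
    return sorted(eligible, key=lambda k: not k.priority)


keys = [
    ProviderKey("shared-fallback", priority=False),
    ProviderKey("prod-scoped", priority=True, models=frozenset({"model-x"})),
    ProviderKey("prod-general", priority=True),
]
print([k.label for k in candidate_keys(keys, "model-x")])
```

A caller would try the returned keys in order and move to the next one when a key is rate-limited, which is one way to keep dev and prod traffic from exhausting the same quota.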

NEWS · 1mo ago
Anthropic adds $20-$200 monthly Claude Agent SDK credits starting June 15

Anthropic will move Claude Agent SDK, claude -p, GitHub Actions, and third-party agent apps onto separate monthly credits on June 15. Watch the new bucket closely, since it changes the cost model for autonomous runs and subscription-backed harnesses.

NEWS · 1mo ago
Anthropic raises Claude Code weekly limits 50% through July 13

Anthropic increased Claude Code weekly limits 50% for Pro, Max, Team, and seat-based Enterprise users through July 13. The higher cap stacks on last week's 2x five-hour increase and applies across CLI, IDE extensions, desktop, and web.

NEWS · 1mo ago
Anthropic doubles Claude Code 5-hour limits after SpaceX Colossus 1 compute deal

Anthropic said a SpaceX compute deal will add 300+ MW and 220,000+ NVIDIA GPUs, and it doubled Claude Code 5-hour limits across paid plans. It also raised Opus API ceilings; users should still watch the unchanged weekly caps.

NEWS · 1mo ago
Claude Code users report HERMES.md extra billing and ban appeals

Users on Hacker News and Reddit reported a reproduced HERMES.md extra-usage billing bug, plus new ban appeals and repeated blame-shifting complaints. Anthropic says affected users will get refunds and credits, so teams should keep an eye on quota routing and support escalation.

NEWS · 1mo ago
Claude Code users report keyword-trigger billing after Opus 4.7 rollout

Days after Opus 4.7 launched, users reported commit-message triggers tied to OpenClaw or HERMES markers that could route requests into extra billing or refusals, alongside continued throttling complaints. Anthropic says affected users will get refunds, but repo-scanning heuristics may still affect cost and reliability in multi-harness workflows.

NEWS · 1mo ago
Opus 4.7 users report OpenClaw refusals, cache TTL spikes, and billing lockouts after launch

A day after Opus 4.7 launched, users reported OpenClaw-linked refusals, cache TTL cost spikes, and billing failures in Claude Code. Anthropic appears to have eased some limits, but behavior and spend still vary sharply across agent-heavy sessions.

NEWS · 2mo ago
Codex raises paid-plan limits after GPT-5.5 shipping week

OpenAI reset Codex rate limits across all paid plans after a week of GPT-5.5 shipping. The temporary bump changes immediate capacity for active teams, but it was announced as a celebratory reset rather than a permanent quota change.

NEWS · 2mo ago
GitHub Copilot introduces usage-based billing on June 1, 2026

GitHub says Copilot will shift from flat-rate plans to usage-based billing starting June 1 as agentic features expand. The change makes token budgeting a first-order engineering constraint and adds more pressure on teams comparing Copilot with other coding agents.

NEWS · 2mo ago
Anthropic reports Claude Code regressions after March 26 thinking bug and xhigh default shift

Anthropic said three harness-side changes degraded Claude Code quality, then reset subscriber limits and rolled out fixes in 2.1.119. The update matters because recent failures came from tool defaults and prompt handling rather than the base model alone.

NEWS · 2mo ago
Codex reaches 4 million weekly users and resets rate limits

OpenAI said Codex passed 4 million weekly users less than two weeks after clearing 3 million, and then reset usage limits again. The scale jump matters because it points to rapid coding-agent adoption and likely plan and capacity changes.

NEWS · 2mo ago
Opus 4.7 users report instruction-following misses, refusals, and ~1.3x token burn a day after launch

A day after Opus 4.7 launched, users are surfacing adaptive-thinking misses, surprise refusals, and higher token use. Engineers should recheck prompts, costs, and 4.6 fallbacks while Anthropic patches bugs and lifts limits.

NEWS · 2mo ago
Claude Code raises Opus 4.7 subscriber limits after token burn increases

Anthropic raised Claude subscriber limits and shipped Claude Code 2.1.112 after Opus 4.7's adaptive thinking and tokenizer changes increased token use. Users still report fast quota depletion and inconsistent cache or effort behavior across web and CLI sessions.

NEWS · 2mo ago
Claude Code users report 5-minute cache TTL and quota-meter regressions after March updates

GitHub issues and Hacker News threads added fresh evidence that Claude Code sessions still burn quota unexpectedly after the cache TTL change, with some users seeing usage before a prompt is sent and others recovering capacity by rolling back to 2.1.34. Watch cache reuse and metering behavior closely if you rely on long-running sessions.

NEWS · 2mo ago
Claude Code users report a 5-minute cache TTL and 5x Pro Max quota burn in 1.5 hours

Anthropic acknowledged a March 6 cache optimization change, and Pro Max users report that the shorter TTL plus hidden session context now burns through Claude Code quota much faster. Watch for 500 errors and stalled streams, and apply the 2.1.105 patch if your UI hangs.

NEWS · 2mo ago
Claude Code reports Opus 4.6 quality drop as BridgeBench retest falls to 68.3%

Fresh retests and issue threads point to worse Claude Code behavior, with Opus 4.6 falling to 68.3% on BridgeBench and users surfacing buried reasoning-effort controls. Track quota burn, hidden effort settings, and rollback reports before assigning more coding-agent work.

NEWS · 2mo ago
OpenAI launches $100 ChatGPT Pro tier with 5x more Codex usage

OpenAI added a $100 ChatGPT Pro tier with 5x more Codex usage than Plus and kept the $200 tier as the highest-capacity option. The launch comes with another Codex limit reset and temporarily doubles Pro usage through May 31.

NEWS · 2mo ago
OpenAI resets Codex usage limits after 3 million weekly users

OpenAI said Codex reached 3 million weekly users and reset usage limits, with another reset planned for each additional million users up to 10 million. ChatGPT-sign-in Codex will also retire the gpt-5.2 and gpt-5.1-era lineup on April 14, so teams should watch for model-default changes.

NEWS · 2mo ago
Anthropic cuts Claude subscription access for third-party harnesses in Apr. 4 rollout

Anthropic’s Apr. 4 cutoff for using Claude subscriptions through OpenClaw-class harnesses went live. Users report API-billing fallbacks, ACP workarounds, and restored Claude Code quota, while edge cases around claude -p and Agent SDK use remain unsettled. The change pushes heavy agent loops toward metered access.

NEWS · 2mo ago
Anthropic cuts Claude subscription access for third-party harnesses on Apr. 4

Anthropic said Claude subscriptions will stop covering third-party harnesses such as OpenClaw on Apr. 4, with discounted extra-usage bundles, refunds, and one-time plan credits. Heavy Claude-based agent workflows may need to move to API billing or extra-usage bundles because Anthropic cites subscription capacity constraints.

NEWS · 2mo ago
OpenAI resets Codex usage limits across all plans after a rate-limit spike

OpenAI reset Codex usage limits across all plans after dashboards showed more users hitting caps and the team said it still did not fully understand the trigger. Use the reset to recheck capacity assumptions, since OpenAI also said it banned abuse accounts and March’s repeated resets point to a broader capacity issue.

RELEASE · 2mo ago
Claude Code fixes prompt-cache bugs in 2.1.88 after quota-burn reports

Claude Code 2.1.88 added fixes for prompt-cache misses, repeated CLAUDE.md reinjection, and a multi-schema StructuredOutput bug after widespread reports of unexpectedly fast quota consumption. Update if you rely on long sessions, because uncached runs can burn through paid limits much faster than intended.

NEWS · 3mo ago
Claude Code limits concurrent work as users report weeklong waits and missing desktop threads

Users report stricter Claude Code request caps, weeklong cooldowns, and desktop threads disappearing after restarts. Watch quotas closely and shift to lighter models or token-cutting workflows around /context and /clear if the limits hit your workflow.

NEWS · 3mo ago
Claude Code limits concurrent agents as users report RPM caps

Users report new request-per-minute caps that trigger after three to four concurrent agents, and Boris Cherny says efficiency work is underway. The caps hit the multi-agent workflows Anthropic has been promoting and apply separately from the five-hour usage buckets.
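
One client-side way to stay under a request-per-minute cap when several agents share an account is a sliding-window limiter. The sketch below is a generic technique, not Anthropic's enforcement logic; the class name is hypothetical and the limit value is a placeholder, since the actual cap is not published in the blurb.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` within any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._stamps = deque()  # timestamps of requests still inside the window

    def try_acquire(self, now: float) -> bool:
        """Record a request at time `now` if allowed; return whether it was."""
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            self._stamps.append(now)
            return True
        return False


# Placeholder limit of 2 requests per 60 seconds, shared by all agents.
limiter = SlidingWindowLimiter(max_requests=2)
print([limiter.try_acquire(t) for t in (0.0, 1.0, 2.0, 60.0)])
```

Passing the clock in as `now` keeps the sketch deterministic; in a real harness each agent would call `try_acquire(time.monotonic())` and wait when it returns `False`.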

NEWS · 3mo ago
Anthropic limits Claude 5-hour sessions as users report 529 overloads

Anthropic confirmed new peak-time metering that burns through 5-hour Claude sessions faster, and multiple power users posted 529 overloaded errors and early exhaustion. If you rely on Max plans for coding, watch for session limits and consider moving daily work to Codex.
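
For callers who hit overload responses like the 529s described above, a common mitigation is retrying with exponential backoff. This is a minimal sketch of that general pattern, not guidance from Anthropic: `OverloadedError` is a hypothetical stand-in for however your client surfaces a 529, and the delay values are arbitrary. Jitter is left out to keep the example deterministic.

```python
import time


class OverloadedError(Exception):
    """Hypothetical stand-in for an HTTP 529 'overloaded' response."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call `fn`, retrying on OverloadedError with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except OverloadedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each retry (1s, 2s, 4s, ...) up to max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
```

The `sleep` parameter exists so tests can substitute a recorder for real waiting; production callers would normally leave it at the default and add random jitter so parallel agents do not retry in lockstep.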

NEWS · 3mo ago
Anthropic limits Claude 5-hour sessions during 5am-11am PT peak window

Anthropic said free, Pro, and Max users will hit 5-hour Claude session limits faster on weekdays from 5am to 11am PT, while weekly caps stay the same. Shift long Claude Code jobs off-peak and watch prompt-cache misses.
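
To shift jobs off-peak, a scheduler needs to know whether a given moment falls in the weekday 5am to 11am PT window. The helper below is a sketch under assumptions: the function name is hypothetical, the 11am boundary is treated as exclusive, and holidays are ignored because the blurb does not mention them.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")
PEAK_START_HOUR = 5   # 5am PT, per the blurb
PEAK_END_HOUR = 11    # 11am PT, treated as exclusive (assumption)


def is_peak(moment: datetime) -> bool:
    """Return True if a timezone-aware `moment` is in the weekday peak window."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    local = moment.astimezone(PACIFIC)
    is_weekday = local.weekday() < 5  # Monday=0 .. Friday=4
    return is_weekday and PEAK_START_HOUR <= local.hour < PEAK_END_HOUR


# Check the current moment; long jobs could be deferred while this is True.
print(is_peak(datetime.now(timezone.utc)))
```

Converting to Pacific time before checking the hour means daylight saving shifts are handled by the time zone database rather than by a fixed UTC offset.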

RELEASE · 3mo ago
Z.ai releases GLM-5-Turbo with 202K context for OpenClaw-style agent workflows

Z.ai released GLM-5-Turbo as a faster GLM-5 variant for OpenClaw-style tool use, with 202K context, OpenRouter access, and higher off-peak limits. Try it as a cheaper speed tier for agent workflows, but benchmark completion quality on your own tasks before wider use.

NEWS · 3mo ago
Anthropic raises Claude off-peak usage 2x across Free, Pro, Max, and Team through Mar. 27

Anthropic is doubling Claude usage outside peak hours from Mar. 13 to Mar. 27, with the bonus applied automatically across Free, Pro, Max, Team, and Claude Code. Shift long runs and bulk jobs to off-peak windows to stretch limits without changing plans.

NEWS · 3mo ago
Google adds Gemini API spend caps in AI Studio with project-level dollar limits

Google AI Studio now lets developers set experimental per-project spend caps for Gemini API usage. Use it as a native billing guardrail, but account for roughly 10-minute enforcement lag and possible batch-job overshoot.
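
Because enforcement lags by roughly 10 minutes, spend can continue past the cap for that long. One rough way to account for it is to set the cap below the real budget by the expected overshoot. The helper below is a back-of-the-envelope sketch, not a Google API; the function name and the constant burn-rate assumption are hypothetical, and batch jobs may overshoot by more.

```python
def cap_with_lag_margin(target_usd: float, burn_usd_per_min: float,
                        lag_min: float = 10.0) -> float:
    """Suggest a spend cap that leaves room for spend during enforcement lag."""
    # Worst case under a constant burn rate: spending continues for the
    # whole lag window after the cap is crossed.
    expected_overshoot = burn_usd_per_min * lag_min
    return max(0.0, target_usd - expected_overshoot)


# Example: $100 budget, peak burn of $0.50 per minute, 10-minute lag.
print(cap_with_lag_margin(100.0, 0.5))  # 95.0
```

If the suggested cap comes out at zero, the burn rate is too high for the cap alone to protect the budget, and client-side throttling is needed as well.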


© 2026 AI Primer. All rights reserved.