AI Primer
TOPIC 31 stories

Rate Limits

Provider limits, throttling, and capacity constraints.

NEWS 13th May
Anthropic adds $20-$200 monthly Claude Agent SDK credits starting June 15

Anthropic will move Claude Agent SDK, claude -p, GitHub Actions, and third-party agent apps onto separate monthly credits on June 15. Watch the new bucket closely, since it changes the cost model for autonomous runs and subscription-backed harnesses.

NEWS 13th May
Anthropic raises Claude Code weekly limits 50% through July 13

Anthropic increased Claude Code weekly limits by 50% for Pro, Max, Team, and seat-based Enterprise users through July 13. The higher cap stacks on last week's 2x five-hour increase and applies across CLI, IDE extensions, desktop, and web.

NEWS 1w ago
Anthropic doubles Claude Code 5-hour limits after SpaceX Colossus 1 compute deal

Anthropic said a SpaceX compute deal will add 300+ MW and 220,000+ NVIDIA GPUs, and it doubled Claude Code 5-hour limits across paid plans. It also raised Opus API ceilings; users should still watch the unchanged weekly caps.

NEWS 1w ago
Claude Code users report HERMES.md extra billing and ban appeals

Users on Hacker News and Reddit reported a reproduced HERMES.md extra-usage billing bug, along with new ban appeals and repeated complaints about blame-shifting. Anthropic says affected users will get refunds and credits, so teams should keep an eye on quota routing and support escalation.

NEWS 1w ago
Claude Code users report keyword-trigger billing after Opus 4.7 rollout

Days after Opus 4.7 launched, users reported commit-message triggers tied to OpenClaw or HERMES markers that could route requests into extra billing or refusals, alongside continued throttling complaints. Anthropic says affected users will get refunds, but repo-scanning heuristics may still affect cost and reliability in multi-harness workflows.

NEWS 2w ago
Opus 4.7 users report OpenClaw refusals, cache TTL spikes, and billing lockouts after launch

A day after Opus 4.7 launched, users reported OpenClaw-linked refusals, cache TTL cost spikes, and billing failures in Claude Code. Anthropic appears to have eased some limits, but behavior and spend still vary sharply across agent-heavy sessions.

NEWS 2w ago
GitHub Copilot introduces usage-based billing on June 1, 2026

GitHub says Copilot will shift from flat-rate plans to usage-based billing starting June 1 as agentic features expand. The change makes token budgeting a first-order engineering constraint and adds more pressure on teams comparing Copilot with other coding agents.

NEWS 2w ago
Codex raises paid-plan limits after GPT-5.5 shipping week

OpenAI reset Codex rate limits across all paid plans after a week of GPT-5.5 shipping. The temporary bump changes immediate capacity for active teams, but it was announced as a celebratory reset rather than a permanent quota change.

NEWS 3w ago
Anthropic reports Claude Code regressions after March 26 thinking bug and xhigh default shift

Anthropic said three harness-side changes degraded Claude Code quality, then reset subscriber limits and rolled out fixes in 2.1.119. The update matters because recent failures came from tool defaults and prompt handling rather than the base model alone.

NEWS 3w ago
Codex reaches 4 million weekly users and resets rate limits

OpenAI said Codex passed 4 million weekly users less than two weeks after clearing 3 million, and then reset usage limits again. The scale jump matters because it points to rapid coding-agent adoption and likely plan and capacity changes.

NEWS 3w ago
Opus 4.7 users report instruction-following misses, refusals, and ~1.3x token burn a day after launch

A day after Opus 4.7 launched, users are surfacing adaptive-thinking misses, surprise refusals, and higher token use. Engineers should recheck prompts, costs, and 4.6 fallbacks while Anthropic patches bugs and lifts limits.

NEWS 4w ago
Claude Code raises Opus 4.7 subscriber limits after token burn increases

Anthropic raised Claude subscriber limits and shipped Claude Code 2.1.112 after Opus 4.7's adaptive thinking and tokenizer changes increased token use. Users still report fast quota depletion and inconsistent cache or effort behavior across web and CLI sessions.

NEWS 4w ago
Claude Code users report 5-minute cache TTL and quota-meter regressions after March updates

GitHub issues and Hacker News threads added fresh evidence that Claude Code sessions still burn quota unexpectedly after the cache TTL change, with some users seeing usage before a prompt is sent and others recovering capacity by rolling back to 2.1.34. Watch cache reuse and metering behavior closely if you rely on long-running sessions.

NEWS 4w ago
Claude Code users report a 5-minute cache TTL and 5x Pro Max quota burn in 1.5 hours

Anthropic acknowledged a March 6 cache optimization change, and Pro Max users report that the shorter TTL plus hidden session context now burns through Claude Code quota much faster. Watch for 500 errors and stalled streams, and apply the 2.1.105 patch if your UI hangs.

NEWS 4w ago
Claude Code reports Opus 4.6 quality drop as BridgeBench retest falls to 68.3%

Fresh retests and issue threads point to worse Claude Code behavior, with Opus 4.6 falling to 68.3% on BridgeBench and users surfacing buried reasoning-effort controls. Track quota burn, hidden effort settings, and rollback reports before assigning more coding-agent work.

NEWS 1mo ago
OpenAI launches $100 ChatGPT Pro tier with 5x more Codex usage

OpenAI added a $100 ChatGPT Pro tier with 5x more Codex usage than Plus and kept the $200 tier as the highest-capacity option. The new tier resets Codex limits again and temporarily doubles Pro usage through May 31.

NEWS 1mo ago
OpenAI resets Codex usage limits after 3 million weekly users

OpenAI said Codex reached 3 million weekly users and reset usage limits, with another reset planned for each additional million users up to 10 million. ChatGPT-sign-in Codex will also retire the gpt-5.2 and gpt-5.1-era lineup on April 14, so teams should watch for model-default changes.

NEWS 1mo ago
Anthropic cuts Claude subscription access for third-party harnesses in Apr. 4 rollout

Anthropic’s Apr. 4 cutoff for using Claude subscriptions through OpenClaw-class harnesses went live. Users report API-billing fallbacks, ACP workarounds, and restored Claude Code quota, while edge cases around claude -p and Agent SDK use remain unsettled. The change pushes heavy agent loops toward metered access.

NEWS 1mo ago
Anthropic cuts Claude subscription access for third-party harnesses on Apr. 4

Anthropic said Claude subscriptions will stop covering third-party harnesses such as OpenClaw on Apr. 4, with discounted extra-usage bundles, refunds, and one-time plan credits. Heavy Claude-based agent workflows may need to move to API billing or extra-usage bundles because Anthropic cites subscription capacity constraints.

NEWS 1mo ago
OpenAI resets Codex usage limits across all plans after a rate-limit spike

OpenAI reset Codex usage limits across all plans after dashboards showed more users hitting caps and the team said it still did not fully understand the trigger. Use the reset to recheck capacity assumptions, since OpenAI also said it banned abuse accounts and March’s repeated resets point to a broader capacity issue.

RELEASE 1mo ago
Claude Code fixes prompt-cache bugs in 2.1.88 after quota-burn reports

Claude Code 2.1.88 added fixes for prompt-cache misses, repeated CLAUDE.md reinjection, and a multi-schema StructuredOutput bug after widespread reports of unexpectedly fast quota consumption. Update if you rely on long sessions, because uncached runs can burn through paid limits much faster than intended.

NEWS 1mo ago
Claude Code limits concurrent work as users report weeklong waits and missing desktop threads

Users report stricter Claude Code request caps, weeklong cooldowns, and desktop threads disappearing after restarts. Watch quotas closely and shift to lighter models or token-cutting workflows around /context and /clear if the limits hit your workflow.

NEWS 1mo ago
Claude Code limits concurrent agents as users report RPM caps

Users report new request-per-minute caps that trigger after three to four concurrent agents, and Boris Cherny says efficiency work is underway. The issue hits the multi-agent workflows Anthropic has been promoting, separate from five-hour usage buckets.

NEWS 1mo ago
Anthropic limits Claude 5-hour sessions as users report 529 overloads

Anthropic confirmed new peak-time metering that burns through 5-hour Claude sessions faster, and multiple power users posted 529 overloaded errors and early exhaustion. If you rely on Max plans for coding, watch for session limits and consider moving daily work to Codex.

NEWS 1mo ago
Anthropic limits Claude 5-hour sessions during 5am-11am PT peak window

Anthropic said free, Pro, and Max users will hit 5-hour Claude session limits faster on weekdays from 5am to 11am PT, while weekly caps stay the same. Shift long Claude Code jobs off-peak and watch prompt-cache misses.

RELEASE 2mo ago
Z.ai releases GLM-5-Turbo with 202K context for OpenClaw-style agent workflows

Z.ai released GLM-5-Turbo as a faster GLM-5 variant for OpenClaw-style tool use, with 202K context, OpenRouter access, and higher off-peak limits. Try it as a cheaper speed tier for agent workflows, but benchmark completion quality on your own tasks before wider use.

NEWS 2mo ago
Anthropic raises Claude off-peak usage 2x across Free, Pro, Max, and Team through Mar. 27

Anthropic is doubling Claude usage outside peak hours from Mar. 13 to Mar. 27, with the bonus applied automatically across Free, Pro, Max, Team, and Claude Code. Shift long runs and bulk jobs to off-peak windows to stretch limits without changing plans.

RELEASE 2mo ago
xAI releases Grok 4.20 Beta API with 2M context and $2 input pricing

xAI released Grok 4.20 Beta in the API with reasoning, non-reasoning, and multi-agent variants, a 2M-token window, and lower pricing than Grok 4. Test it for long-context and speed-sensitive workloads, but compare coding performance against top rivals on your own evals.

NEWS 2mo ago
Google adds Gemini API spend caps in AI Studio with project-level dollar limits

Google AI Studio now lets developers set experimental per-project spend caps for Gemini API usage. Use it as a native billing guardrail, but account for roughly 10-minute enforcement lag and possible batch-job overshoot.

NEWS 2mo ago
Codex reports choppy service as demand outpaces added compute

OpenAI says Codex capacity is lagging a demand spike, leaving some sessions choppy while the team adds more compute. If you depend on Codex in production workflows, plan for transient instability and keep fallback review or execution paths ready.

NEWS 2mo ago
Codex reports session hang incident and rate-limit reset after fix

OpenAI acknowledged a Codex session hang that left some requests unresponsive, later said the issue had been stable for hours, and promised a rate-limit reset. Teams relying on Codex should re-check long runs and confirm quota restoration after the incident.

Your daily guide to AI tools, workflows, and creative inspiration.

© 2026 AI Primer. All rights reserved.