Users report new request-per-minute caps that trigger after three to four concurrent agents, and Boris Cherny says efficiency work is underway. The issue hits the multi-agent workflows Anthropic has been promoting, separate from five-hour usage buckets.

The reported breakage is on request rate, not just overall usage. In the original complaint, a Claude Code user said the service had become "useless" because it now "kicks in with like 3 or 4 agents going at once," and the attached terminal screenshot shows repeated "API Error: Rate limit reached" messages while four tasks are open, with only one in progress [img:0|Rate limit screenshot].
That distinction matters for engineering workflows because Anthropic's coding product has been pushing multi-agent patterns. In a clarification, the same user said they can work around ordinary usage limits by buying more Max accounts, but not RPM-style caps: "my entire workflow is to maximize the number of concurrent agents." In a later reply, they said the change had "cratered my throughput by like 99%" and argued they were now "paying for usage" they could not use effectively.
Timing is part of the complaint. In another reply in the thread, the user said the new behavior would at least make sense during "peak business hours" but was instead hitting on "Saturday night." Separately, another developer described a "peak hours penalty" strongly enough that they were moving their schedule to avoid it.
Anthropic has not publicly documented the cap in these posts, but it has acknowledged the problem. In Boris Cherny's reply, the response was brief: "Working on improving this. A bunch of efficiency wins incoming." That reads more like backend mitigation than a policy clarification, and it leaves open whether the constraint is temporary load shedding, a permanent concurrency guardrail, or both.
The timing matters because multi-agent coding is exactly where request bursts happen. Even outside the Claude Code thread, one user describing sub-agents said "mini-swarms" make it easier to burn through allotted compute, which matches the failure mode being reported here: not a single long task, but many short overlapping requests from parallel agents. That makes RPM ceilings a much bigger operational constraint than five-hour buckets for teams using Claude Code as a high-concurrency coding surface.
This is crazy, this recent introduction of ridiculously low rate limits has basically rendered Claude Code useless to me. They really need to change this or I'm going to cancel all of my accounts soon. It kicks in with like 3 or 4 agents going at once.
Japan 🇯🇵 is amazing More than 2/3rds of the country is active on X Can you imagine the innovation that must be happening there? 15%~ of my followers are from Japan too! They are the nicest to engage with, so welcoming and kind
If you’re seeing a bunch of Japanese posts, here are some fun facts: Japan has more daily active users and more time spent on X than any other country in the world. Over two thirds of the country is monthly active on X. X in Japan has one of the highest penetration rates of
Working on improving this. A bunch of efficiency wins incoming.