Codex reports choppy service as demand outpaces added compute
OpenAI says Codex capacity is lagging a demand spike, leaving some sessions choppy while the team adds more compute. If you depend on Codex in production workflows, plan for transient instability and keep fallback review or execution paths ready.

TL;DR
- OpenAI says Codex demand has outrun newly added capacity, with product lead Thomas Sottiaux writing that the team is "adding compute as fast as we can" but service is "a little bit choppy for some" because usage is "surging faster than anticipated" (capacity update).
- A later update from the same team said the Codex "GPU fleet is still melting" and that the team was working "day (and night)" on stability, with improvement expected later that evening.
- User reports suggest the strain is showing up in real workflows, not just in background metrics: one practitioner said Codex's multi-agent setup "really improves efficiency" but needs better handling for reusing or shutting down prior agents in long-running sessions (multi-agent feedback).
- Anecdotal screenshots also point to heavy consumption and visible limits in coding-agent workflows, with one post showing a current-session bar plus weekly plan caps already at 92% for "All models" (usage screenshot).

What has OpenAI actually confirmed?
OpenAI has confirmed a straightforward capacity problem: demand for Codex is rising faster than the team can provision compute. Sottiaux's first capacity note says OpenAI is "adding compute as fast as we can" but that service may be "a little bit choppy for some," which frames the issue as infrastructure saturation rather than a feature rollback or an isolated outage.
A few hours later, the follow-up made the bottleneck more explicit by saying the "GPU fleet is still melting" and that the team was working continuously to catch up. That wording matters for engineers operating Codex-backed workflows: the observed degradation appears tied to fleet exhaustion under demand spikes, with stability described as improving but not yet fully restored at the time of posting.
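For teams that cannot pause Codex-backed automation, the practical response is to treat these errors as saturation: retry briefly with backoff, then degrade to a fallback path instead of hammering an overloaded fleet. Below is a minimal Python sketch of that pattern; `run_codex_task` and `fallback_path` are hypothetical placeholders for your own Codex call and whatever fallback you keep ready (a secondary model, or a queue for human review).

```python
import random
import time

# Hypothetical placeholders: wire these to your actual Codex client and to
# your own fallback (secondary model, manual-review queue, etc.).
def run_codex_task(prompt: str) -> str:
    raise NotImplementedError("call your Codex client here")

def fallback_path(prompt: str) -> str:
    return f"[queued for manual review] {prompt}"

def run_with_fallback(prompt: str, attempts: int = 3) -> str:
    """Retry with jittered exponential backoff, then degrade gracefully."""
    for attempt in range(attempts):
        try:
            return run_codex_task(prompt)
        except Exception:
            # Back off 1s, 2s, 4s (plus jitter) so retries from many
            # clients don't pile onto an already saturated fleet at once.
            time.sleep(2 ** attempt + random.random())
    return fallback_path(prompt)
```

The jitter is the important detail under fleet-wide saturation: synchronized retries from many clients can prolong exactly the kind of choppiness OpenAI is describing.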

How is the capacity crunch showing up in practice?
User feedback points to stress in longer, more stateful sessions. In one example, a practitioner called Codex's multi-agent setup "great," saying it "really improves efficiency," but noted that the agent needs guidance on when to reuse or shut down previous agents. The attached screenshot shows repeated "Agent spawn failed" messages alongside attempts to route work to existing agents, which suggests concurrency and agent-lifecycle behavior are part of the current rough edges under load.
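If the rough edge is agent lifecycle under load, part of the mitigation can live on the caller side: reuse idle agents and cap how many are live rather than spawning a fresh one per task. The sketch below is illustrative only; `Agent` and `AgentPool` are stand-ins for whatever handles your orchestration layer actually exposes, since the Codex multi-agent API surface isn't documented here.

```python
from dataclasses import dataclass

# Illustrative stand-in: `Agent` models whatever handle your orchestration
# layer returns; the real Codex agent API may look quite different.
@dataclass
class Agent:
    agent_id: int
    busy: bool = False

class AgentPool:
    """Prefer reusing idle agents over spawning; cap total concurrency."""

    def __init__(self, max_agents: int = 4) -> None:
        self.max_agents = max_agents
        self.agents: list[Agent] = []

    def acquire(self) -> Agent | None:
        # Reuse an idle agent first, matching the feedback that prior
        # agents should be picked up rather than left running.
        for agent in self.agents:
            if not agent.busy:
                agent.busy = True
                return agent
        # Spawn only while under the cap; repeated "Agent spawn failed"
        # errors under load are exactly what this guards against.
        if len(self.agents) < self.max_agents:
            agent = Agent(agent_id=len(self.agents), busy=True)
            self.agents.append(agent)
            return agent
        return None  # At capacity: the caller should wait or queue.

    def release(self, agent: Agent) -> None:
        agent.busy = False
```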
Other posts show how quickly users are running into visible usage ceilings. The shared screenshot shows a session at 31% of its current bucket and weekly usage at 92% for "All models," while another user described checking a weekly token budget as "sticker shock." Those are anecdotal rather than platform-wide metrics, but together they match OpenAI's own description of demand surging faster than capacity and help explain why service quality feels uneven in active coding sessions.
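If your plan surfaces a weekly percentage like the one in that screenshot, a cheap guard is to hold back new heavy sessions once usage enters a reserve band, so routine work doesn't burn the last few percent of a cap. A minimal sketch, assuming you can read that percentage from wherever you track plan usage:

```python
def should_start_heavy_session(weekly_used_pct: float,
                               reserve_pct: float = 10.0) -> bool:
    """Defer new heavy sessions once the weekly cap minus a reserve is hit.

    `weekly_used_pct` is assumed to come from your own usage tracking;
    the 92% figure in the screenshot is the kind of value fed in here.
    """
    return weekly_used_pct < 100.0 - reserve_pct

# At 92% weekly usage with a 10% reserve, defer; at 75%, proceed.
assert should_start_heavy_session(92.0) is False
assert should_start_heavy_session(75.0) is True
```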