Release | June 9, 2026

OpenAI launches GPT-5.5 with API rollout and a >20% token-speed claim

OpenAI rolled GPT-5.5 into ChatGPT, Codex, and then the API, while launch discussion focused on benchmark tradeoffs and a claim that custom scheduling heuristics improved generation speed by over 20%. Teams should watch access timing, real task cost, and Codex-based workarounds during evaluation.

TL;DR

OpenAI shipped GPT-5.5 into ChatGPT and Codex on April 23, then updated the rollout a day later to say both GPT-5.5 and GPT-5.5 Pro were live in the API, according to the launch summary.
The launch positioned GPT-5.5 as a stronger agentic coding model, while the HN discussion digest quickly fixated on a narrower question: how much of the improvement came from model quality versus harness and serving work.
One of the more interesting claims came from infrastructure notes highlighted in the HN core summary, which pointed to Codex using production-traffic analysis and custom scheduling heuristics to improve token generation speed by more than 20%.
Early discussion in the HN discussion digest also split on practical evaluation details, including missing day-one API access, Codex-based workarounds, and whether per-token pricing says much about real per-task cost.

You can read the official launch post, skim the system card, check the API model page, and then drop into the main HN thread. The weirdly useful bit is that OpenAI paired the model announcement with a concrete serving claim about load-balancing heuristics, while the API docs quietly add migration advice like a default medium reasoning setting and a 1M context window in the Responses and Chat Completions APIs.

What shipped

Introducing GPT‑5.5

OpenAI announced GPT‑5.5 as its next model release, describing it as “the smartest and most intuitive to use model yet” and positioning it as a step toward a new way of getting work done on a computer. OpenAI says GPT‑5.5 is being released with its strongest safeguards to date, including evaluations across safety and preparedness frameworks, work with internal and external red-teamers, targeted testing for advanced cybersecurity and biology capabilities, and feedback from nearly 200 trusted early-access partners. According to the post, GPT‑5.5 is rolling out in ChatGPT and Codex to Plus, Pro, Business, and Enterprise users, with GPT‑5.5 Pro rolling out to Pro, Business, and Enterprise users in ChatGPT. The post gives API pricing for both models and originally said they would reach the API “very soon”; an update dated April 24, 2026 states that GPT‑5.5 and GPT‑5.5 Pro are available in the API, with the system card updated to describe additional safeguards for API deployment.

OpenAI's launch post said GPT-5.5 was rolling out to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex, while GPT-5.5 Pro was limited to Pro, Business, and Enterprise inside ChatGPT at launch. The same post was updated on April 24 to say both models were available in the API, with additional safeguards documented in the revised system card.

The API pricing disclosed in the launch post was straightforward:

gpt-5.5: $5 per 1M input tokens, $30 per 1M output tokens, per OpenAI's launch post
gpt-5.5-pro: $30 per 1M input tokens, $180 per 1M output tokens, per OpenAI's launch post
Batch and Flex pricing were listed at half price for GPT-5.5, also in the launch post
The API surface was the Responses API and Chat Completions API, with a 1M context window, per the launch post and model page
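
Those list prices translate into per-request cost with simple arithmetic. The sketch below is illustrative only: the rates are the ones quoted above, while the function name, the 200K/10K request size, and the decision to reject the Batch/Flex discount for gpt-5.5-pro (the launch post is only cited here as stating it for GPT-5.5) are assumptions made for the example. It also ignores any long-context surcharge.

```python
# USD per 1M tokens as (input, output), quoted from the launch post above.
PRICES_PER_MILLION = {
    "gpt-5.5": (5.00, 30.00),
    "gpt-5.5-pro": (30.00, 180.00),
}


def request_cost_usd(model: str, input_tokens: int, output_tokens: int,
                     batch_or_flex: bool = False) -> float:
    """Estimate the list-price cost of a single request in USD."""
    input_rate, output_rate = PRICES_PER_MILLION[model]
    cost = (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000
    if batch_or_flex:
        # Assumption: the half-price Batch/Flex rate is only stated for gpt-5.5,
        # so this sketch refuses to guess a discount for other models.
        if model != "gpt-5.5":
            raise ValueError("Batch/Flex rate is only stated for gpt-5.5")
        cost *= 0.5
    return cost


# Hypothetical request size: 200K input tokens and 10K output tokens.
for name in PRICES_PER_MILLION:
    print(name, round(request_cost_usd(name, 200_000, 10_000), 2))
print("gpt-5.5 batch/flex",
      round(request_cost_usd("gpt-5.5", 200_000, 10_000, batch_or_flex=True), 2))
```

For that hypothetical request, the standard gpt-5.5 rate works out to about $1.30, gpt-5.5-pro to about $7.80, and gpt-5.5 on Batch or Flex to about $0.65.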

Benchmarks and throughput

Discussion around GPT-5.5

Thread discussion highlights:
- 6thbit on benchmarks vs Anthropic: Posted side-by-side benchmark numbers for GPT-5.5 vs Anthropic’s Mythos, concluding it is “still far from Mythos on SWE-bench but quite comparable otherwise.”
- simonw on API availability and tooling: Says GPT-5.5 “doesn't have API access yet,” but notes a Codex-based workaround and shows using a new plugin to access it anyway.
- minimaxir on infrastructure optimization: Highlights the claim that Codex used production traffic analysis and custom heuristics to improve token generation speed by over 20%, calling that more interesting than the benchmark claims.

OpenAI's own chart gave GPT-5.5 a clean win on Terminal-Bench 2.0 at 82.7%, and described it as the company's strongest agentic coding model to date in the launch post. The same chart put SWE-Bench Pro at 58.6%, which is where the launch thread got less celebratory.

According to the HN discussion digest, one top commenter summed up the release as still trailing Anthropic's Mythos on SWE-Bench while looking competitive elsewhere. That matches the launch-day pattern: stronger breadth, less of a knockout on the benchmark many coding teams look at first.

GPT-5.5

Relevant for engineers evaluating frontier-model tradeoffs: benchmark positioning, API rollout timing, Codex/tool access, pricing vs actual task cost, and the claim that OpenAI improved throughput with custom scheduling heuristics.

The more novel detail sat under the performance story. As the HN core summary noted, OpenAI said Codex analyzed weeks of production traffic and wrote custom load-balancing and partitioning heuristics, improving token generation speed by more than 20%; the launch post ties that work to matching GPT-5.4's per-token latency while running a more capable model.
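
It helps to be precise about what a 20% speed gain buys. The sketch below assumes "token generation speed" means tokens per second, which the source does not spell out; under that reading, a 20% throughput gain shortens time per token by about 16.7%, not 20%. The baseline of 50 tokens per second and the 3,000-token response are hypothetical numbers chosen only to make the arithmetic visible.

```python
def time_per_token_reduction(throughput_gain: float) -> float:
    """Fractional drop in time per token when tokens/sec rises by throughput_gain.

    Time per token is the reciprocal of throughput, so a gain g scales the
    time by 1 / (1 + g).
    """
    if throughput_gain <= -1.0:
        raise ValueError("throughput_gain must be greater than -1")
    return 1.0 - 1.0 / (1.0 + throughput_gain)


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock time spent generating output_tokens at a steady rate."""
    return output_tokens / tokens_per_second


# Hypothetical baseline: 50 tokens/sec, then the same workload at 60 tokens/sec.
baseline = generation_seconds(3_000, 50.0)
faster = generation_seconds(3_000, 60.0)
print(f"baseline: {baseline:.1f}s, after a 20% gain: {faster:.1f}s")
print(f"time per token falls by {time_per_token_reduction(0.20):.1%}")
```

In this hypothetical case the response takes 60 seconds before and 50 seconds after, which is the same 16.7% reduction seen per token.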

API access and pricing

The launch thread captured a small rollout snag that matters if you were trying to test on day one. According to the HN discussion digest, one practitioner noted that GPT-5.5 was not initially exposed in the public API even though it was live in Codex, and pointed to a Codex-based workaround before the April 24 API update landed.

Pricing also got an immediate reality check. As the HN core summary recounts, commenters argued that per-token rates are the wrong unit for model comparisons when models produce different amounts of reasoning and output tokens per task. That argument lands harder here because OpenAI marketed GPT-5.5 as more effective in long, tool-heavy runs, which is exactly where token totals get slippery.
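
The commenters' point is easy to see with numbers. In the sketch below, only the gpt-5.5 rates ($5 and $30 per 1M tokens) come from the launch post; the second model, its rates, and every token count are hypothetical, and the token counts are totals summed across all calls in one task.

```python
def task_cost_usd(input_rate: float, output_rate: float,
                  input_tokens: int, output_tokens: int) -> float:
    """Cost of one task in USD, given per-1M-token rates and total tokens used."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Per-task profiles. Rates for gpt-5.5 are from the launch post; everything
# else here is a made-up illustration, not measured data.
profiles = {
    "gpt-5.5": dict(input_rate=5.0, output_rate=30.0,
                    input_tokens=150_000, output_tokens=20_000),
    "hypothetical-cheaper-model": dict(input_rate=3.0, output_rate=15.0,
                                       input_tokens=350_000, output_tokens=60_000),
}

for name, profile in profiles.items():
    print(f"{name}: ${task_cost_usd(**profile):.2f} per task")
```

With these invented figures, the model with the lower per-token rates costs about $1.95 per task against about $1.35 for gpt-5.5, because it burns more tokens to finish the same job. Real comparisons need token counts measured on a team's own workloads.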

Migration defaults

The API docs tell a quieter story than the launch post. In OpenAI's GPT-5.5 usage guide, the company says to treat GPT-5.5 as a new model family to tune for rather than a drop-in replacement for GPT-5.2 or GPT-5.4.

The concrete changes called out there are worth noting:

reasoning_effort now defaults to medium, per the usage guide
OpenAI says GPT-5.5 reaches strong results with fewer reasoning tokens than prior models at the same effort level, per the same usage guide
The model page lists a 1M context window and 128,000 max output tokens
The model page also notes higher pricing for prompts above 272K input tokens, 2x on input and 1.5x on output for the full request
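
The long-context rule creates a price cliff rather than a gradual ramp, because the multipliers apply to the full request. A minimal sketch, assuming the base gpt-5.5 rates of $5 and $30 per 1M tokens from the launch post and reading "above 272K" as strictly more than 272,000 input tokens (the exact boundary handling is an assumption):

```python
LONG_CONTEXT_THRESHOLD = 272_000   # input tokens, per the model page
BASE_INPUT_RATE = 5.00             # USD per 1M input tokens (launch post)
BASE_OUTPUT_RATE = 30.00           # USD per 1M output tokens (launch post)


def gpt55_request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate one gpt-5.5 request's cost, including the long-context surcharge."""
    input_rate, output_rate = BASE_INPUT_RATE, BASE_OUTPUT_RATE
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        # The surcharge covers the whole request, not just the excess tokens.
        input_rate *= 2.0
        output_rate *= 1.5
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Two hypothetical requests that differ by a single input token.
print(f"{gpt55_request_cost_usd(272_000, 5_000):.2f}")
print(f"{gpt55_request_cost_usd(272_001, 5_000):.2f}")
```

Under these assumptions, one extra input token moves the request from about $1.51 to about $2.95.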

That last threshold is the kind of caveat that disappears in launch coverage and shows up later in bills.