release | June 13, 2026

GPT-5.5 launches with Codex access and SWE-bench gaps

HN discussion contrasted OpenAI's API language with reports that GPT-5.5 was mainly reachable through Codex-based paths, while benchmark tables still showed a SWE-bench gap versus Mythos. Track the access path and per-task cost as closely as raw scores.

TL;DR

Official launch language said GPT-5.5 was rolling out first in ChatGPT and Codex, while the HN discussion summary quickly centered on the missing direct API path and a Codex-based workaround.
According to the HN launch summary, OpenAI positioned GPT-5.5 around coding, research, and multi-tool work, but the HN discussion summary said benchmark chatter still focused on a SWE-bench gap versus Mythos.
The same-day developer announcement said GPT-5.5 had reached the Responses and Chat Completions APIs with a 1M context window, which is why the launch-day availability story looked muddled in the main HN thread.
OpenAI's API pricing page listed GPT-5.5 at $5 per 1M input tokens and $30 per 1M output tokens, while the main HN thread pushed the more useful question: whether lower token use per task actually offsets the higher sticker price.

You can read the official launch post, the same-day developer announcement, Simon Willison's Codex-backed workaround write-up, and the official Codex pricing page. The weird bit is that OpenAI's launch post initially framed API access as coming "very soon," then the developer forum announced API availability later that day, and the launch post now carries an April 24 update saying GPT-5.5 and GPT-5.5 Pro are in the API.

What shipped

OpenAI Announces the Release of GPT-5.5 Model

Released on April 23, 2026, GPT-5.5 is an AI model designed to perform complex, agentic, real-world tasks such as coding, online research, and multi-tool document creation. The model is characterized by its improved ability to understand tasks with less guidance, its effectiveness in tool utilization, and its self-correction capabilities. OpenAI launched the model with its most comprehensive safety and security safeguards to date, following extensive internal and external red-teaming and feedback from nearly 200 early-access partners. It is available in variants including GPT-5.5 Pro and is accessible via ChatGPT, Codex, and the OpenAI API.

OpenAI's day-one pitch was straightforward: GPT-5.5 for paid ChatGPT tiers, GPT-5.5 in Codex, and GPT-5.5 Pro for higher-end ChatGPT plans, per the official launch post. The same materials tied the model to coding, online research, document work, and heavier agentic tasks, which matches the HN launch summary.

The post now also includes an April 24 note saying GPT-5.5 and GPT-5.5 Pro are available in the API. That matters mainly because it clarifies a launch-day contradiction that showed up everywhere else in the story.

API wording

Discussion around GPT-5.5

Thread discussion highlights:
- 6thbit on benchmark comparisons: Shared a side-by-side table of GPT-5.5 vs Mythos on SWE-bench, Terminal-bench, GPQA, HLE, BrowseComp, and OSWorld, concluding GPT-5.5 is still behind on SWE-bench but competitive elsewhere.
- simonw on API availability: Says GPT-5.5 does not have direct API access yet, but can be reached through a Codex-based workaround/plugin, and links an example output plus a plugin for LLM.
- tedsanders on token efficiency: Argues the model is more expensive per token, but that per-task cost depends on how many tokens different models use to solve the same job, so price comparisons are not linear.

The cleanest way to read launch day is as a moving target. The launch post originally said API deployments required extra safeguards and would arrive "very soon," while the developer community announcement later said GPT-5.5 was live in the Responses and Chat Completions APIs with a 1M context window.

That gap explains why the main HN thread surfaced Simon Willison's Codex route so quickly. In his hands-on post, Willison described GPT-5.5 as available through Codex and paid ChatGPT before direct API access was clearly documented, then showed a plugin-based path that used a Codex subscription for API-like calls.
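For anyone checking which path is actually live on their own account, a reachability probe against the Responses API could look roughly like the sketch below. It is a minimal sketch under stated assumptions: the model identifier `gpt-5.5` is a guess (the sources quoted here do not give the exact string), the key is read from `OPENAI_API_KEY`, and `build_request` and `probe` are illustrative helper names rather than part of any SDK.

```python
import json
import os
import urllib.error
import urllib.request

RESPONSES_URL = "https://api.openai.com/v1/responses"
MODEL_ID = "gpt-5.5"  # ASSUMPTION: exact model identifier is not stated in the article


def build_request(prompt: str, api_key: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build (but do not send) a POST to the Responses API."""
    payload = {"model": model, "input": prompt}
    return urllib.request.Request(
        RESPONSES_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def probe(prompt: str = "Reply with the single word: ok") -> str:
    """Report whether the assumed model id answers over the direct API path."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        return "skipped: OPENAI_API_KEY is not set"
    try:
        with urllib.request.urlopen(build_request(prompt, api_key), timeout=30) as resp:
            return f"reachable: HTTP {resp.status}"
    except urllib.error.HTTPError as err:
        # An error status here may mean the model id is wrong or not enabled for this key.
        return f"not reachable: HTTP {err.code}"
    except urllib.error.URLError as err:
        return f"network error: {err.reason}"
```

Calling `print(probe())` sends one small request when a key is present. Keeping `build_request` separate from the network call makes the payload easy to inspect without spending tokens.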

Benchmarks

GPT-5.5

The useful signal for engineers is less about the headline release and more about deployment reality: benchmark positioning versus Anthropic, whether GPT-5.5 is reachable through the API/Codex path, and how token efficiency changes effective task cost and throughput. The discussion also highlights OpenAI’s own optimization of GPU scheduling and generation speed, which is relevant to inference engineering and agent workloads.

The launch materials emphasized intelligence on long-running, tool-using work, but the early engineer conversation went straight to comparison tables. According to the HN discussion summary, 6thbit's side-by-side table had GPT-5.5 trailing Mythos on SWE-bench while staying competitive on Terminal-bench, GPQA, HLE, BrowseComp, and OSWorld.

That made SWE-bench the one number people kept circling, even in a release framed around broader agentic reliability. On launch day, the benchmark story was less "OpenAI shipped a new frontier model" and more "the coding leaderboard still has a live argument."

Token math and limits

OpenAI's price card set GPT-5.5 at $5 per 1M input tokens and $30 per 1M output tokens, with lower Batch pricing and higher Priority pricing on the API pricing page. The company's defense, repeated in the developer announcement, was that GPT-5.5 needs fewer tokens per task than GPT-5.4.

That is also where the HN thread got most useful. Comments in the main HN thread argued that per-token price is the wrong unit if one model finishes the same coding job in fewer turns, while the official Codex pricing page adds a second constraint: message limits vary with model choice, task size, context length, and whether work runs locally or in the cloud.

So the real launch-day reference point was not just benchmark rank or posted API price. It was whether GPT-5.5's token efficiency and Codex throughput were good enough to overcome both the higher rate card and the tighter-feeling usage envelope engineers were already picking apart in the main HN thread.
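The per-task argument can be made concrete with a little arithmetic. The sketch below uses the GPT-5.5 rates quoted above ($5 input, $30 output per 1M tokens). Everything else is an assumption for illustration: the baseline rate card and all token counts are made up, and `RateCard` and `break_even_token_ratio` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateCard:
    """Prices in USD per 1M tokens."""
    input_per_million: float
    output_per_million: float

    def task_cost(self, input_tokens: int, output_tokens: int) -> float:
        """Cost in USD of one task that uses the given token counts."""
        return (
            input_tokens * self.input_per_million
            + output_tokens * self.output_per_million
        ) / 1_000_000


# Rates quoted in the article for GPT-5.5.
GPT_5_5 = RateCard(input_per_million=5.0, output_per_million=30.0)

# ASSUMPTION: placeholder prices for an older, cheaper model. Not real figures.
HYPOTHETICAL_BASELINE = RateCard(input_per_million=2.5, output_per_million=15.0)


def break_even_token_ratio(
    new: RateCard, old: RateCard, input_tokens: int, output_tokens: int
) -> float:
    """Fraction of the old model's tokens the new model may use before it
    costs more per task, assuming input and output shrink by the same factor."""
    return old.task_cost(input_tokens, output_tokens) / new.task_cost(
        input_tokens, output_tokens
    )


if __name__ == "__main__":
    # ASSUMPTION: invented token counts for the same coding task on each model.
    old_cost = HYPOTHETICAL_BASELINE.task_cost(100_000, 15_000)
    new_cost = GPT_5_5.task_cost(40_000, 6_000)
    ratio = break_even_token_ratio(GPT_5_5, HYPOTHETICAL_BASELINE, 100_000, 15_000)
    print(f"baseline: ${old_cost:.3f}  gpt-5.5: ${new_cost:.3f}  break-even ratio: {ratio:.2f}")
```

With these invented numbers, the pricier model wins per task only because it uses 40% of the baseline's tokens, below the 0.50 break-even ratio. Real comparisons need measured token counts per task, and the Codex message limits described above are a separate constraint this sketch does not model.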