GPT-5.5
A new class of intelligence for real work
OpenAI's GPT-5.5 is a frontier language model released on April 23, 2026 for complex professional work, with a 1,050,000-token context window and availability in ChatGPT, Codex, and the API.
Pricing
Model Intelligence
Recent stories
Ahead of GitHub Copilot's June 1 usage-based billing switch, users documented GPT-5.5 sessions hitting 60M tokens and $221 across 15 messages on the legacy per-message plan. The examples show why flat message buckets break once single requests can run for hours and consume extreme token counts.
Developers posted side-by-side reports of faster one-shot fixes, 1.7B-token workdays, and fewer limit warnings with GPT-5.5 fast mode after OpenAI added Claude Code import. The comparisons matter because they turn migration talk into a concrete workflow choice.
ValsAI found that undocumented `tool_choice` behavior was skewing Terminal Bench 2 scores when no native tools were used, then reran the evals. The correction lifted GPT-5.5 by 11% to the top slot and showed how much harness settings can move coding-agent results.
ARC Prize published frontier-model results on ARC-AGI-3 and said GPT-5.5 and Opus 4.7 both stayed below 1%, with failures in world modeling, abstraction, and reward reinforcement. That shows strong coding and benchmark models still break on novel interactive reasoning tasks, and follow-up comparisons even had Opus 4.6 slightly ahead of 4.7.
OpenAI expanded Codex with role-based work-flows, app connections, in-app previews, and the `/goal` command, while also improving browser use by about 20%. The update lets Codex keep working across docs, slides, spreadsheets, and web actions instead of staying in a single coding thread.
Multiple summaries of the UK AISI report say GPT-5.5 roughly matches Claude Mythos Preview on long-horizon cyber tasks, including 2 of 10 end-to-end TLO completions. That matters because the model is broadly usable today, shifting cyber-workflow choices toward availability and mitigations rather than gated access alone.
Codex gained background macOS control, page inspection, image generation, plugins, artifacts, and follow-up automations. That gives it one agent thread for desktop apps, frontend debugging, and recurring work.
New evals and day-three user tests show GPT-5.5 performing well at low or medium reasoning, with benchmark gains over GPT-5.4 in coding-heavy use. That matters because stronger results no longer require xhigh runs, though some users still flag sycophancy.
Independent tools and platforms shipped GPT-5.5 support within a day of the API rollout, spanning IDEs, hosted research agents, enterprise stacks, and coding agents. That shortens evaluation time because teams can test the model inside existing workflows instead of rebuilding around a single OpenAI surface.
Users and third-party evals reported shorter runs, stronger long-context scores, and faster rollout into Cursor and other tools a day after GPT-5.5 hit the API. Higher per-token pricing may be partly offset by lower loop time and fewer tool-call stalls, so watch early bench data before changing defaults.
OpenAI added GPT-5.5 and GPT-5.5 Pro to the API and Playground with 1M context and Responses support. Partners including OpenRouter, Perplexity, GitHub Copilot, Vercel, Warp, and Devin rolled it out the same day, widening access beyond Codex.
Cursor 3.2 added /multitask async subagents, improved worktrees, and multi-root workspaces, then paired the release with GPT-5.5 rollout at 72.8% on CursorBench. The update makes background agent orchestration a first-class IDE workflow instead of a blocking queue.
A day after GPT-5.5 and the new Codex workflows launched, developers reported one-shot bug fixes, longer unattended runs, and lower token use in real coding tasks. The early hands-on comparisons matter because they are already shifting some teams' default agent workflow away from Claude Code.
OpenAI rolled out GPT-5.5 and GPT-5.5 Pro in ChatGPT and Codex, with higher scores on terminal, OS, cyber, and math evals than GPT-5.4. Codex also gained browser, document, and computer-use features for longer agent workflows.