OpenAI Codex
Coding agent
OpenAI's coding agent for software engineering tasks such as generating code, fixing bugs, answering codebase questions, and reviewing changes.

Recent stories
OpenAI launched a 30-day migration offer that grants eligible enterprise customers two free months of Codex usage for new users. The promotion is meant to pull coding teams onto Codex as rival agent workflows get more expensive.
OpenAI detailed the Windows sandbox behind Codex, using local user accounts, ACLs, firewall rules, and DPAPI-protected secrets instead of a generic VM wrapper. The design gives Windows developers safer file and network controls without making coding-agent workflows unusable.
OpenAI showed Codex working across apps in the background without taking over the Mac, and early users applied it to Telegram BotFather setup and front-end testing. That matters because Codex is moving from repo-only work into authenticated desktop workflows and UI-driven task loops.
Artificial Analysis launched a Coding Agent Index for model-and-harness pairs, while OpenHands refreshed its model leaderboard. The results show harness choice matters, with cost varying over 30x and task time over 7x across stacks.
OpenAI launched Daybreak, combining GPT-5.5, Codex workflows, repo scanning, threat modeling, and patch generation for cyber-defense teams. It packages frontier models into a continuous secure-software workflow, so teams can test whether it fits their response pipeline.
Independent developers shipped new control-plane tools for long-running coding agents, including Agent FM audio monitoring, Mate phone-first remote control, and ntm for provider-agnostic multi-agent workflows. It matters because teams running many Claude Code and Codex sessions still need better visibility, handoff, and checkpointing than a single built-in session list provides.
Crabbox 0.11.0 shipped a Google Cloud provider, repo-local job workflows, AWS Windows WSL2 hydration, and a Blacksmith sync-stall guard. Recent Codex and OpenClaw posts show Crabbox already being used for reproducible bug repro and recorded QA before-and-after runs.
OpenAI staff said `/goal` is now available in the Codex app, and users posted long-running runs that fixed React Doctor scores, built iOS features, and queued weekend tasks. The update moves Codex from CLI-only planning to persistent, steerable work sessions.
Engineers shared fresh measurements on GPT-5.5 cache reuse, /fast pricing, and bug-finding budgets after comparison posts for GPT-5.5 and Opus 4.7 led the coding round-up. The reports suggest Codex cost and quality now swing on cache behavior and effort settings as much as on list prices.
Posts and screenshots from TestingCatalog, Kolt Regaskes, and others say Codex remote access is being prepared inside the ChatGPT app, but OpenAI has not confirmed a release. If real, the feature would extend the recent remote-control push from desktop sessions to phones.
User posts and HN threads compared GPT-5.5 and Opus 4.7 across plan mode, frontend work, and 120K-context sessions. The split results mean token burn and instruction discipline matter as much as raw benchmark scores.
A day after `/goal` and remote-control preview surfaced, Codex 0.130.0 shipped a simpler headless entrypoint while the app’s migration tool added Code and Cowork support. Users also showed Codex handling bug repro, long-running `/goal` sessions, and plugin-driven expense filing, which broadens its role from chat-first coding to delegated workflows.
OpenAI reports Codex can now keep pursuing a goal until an end state and is adding remote control plus a usage tab. The update matters because Codex sessions can span longer tasks and be managed across devices with less manual babysitting.
OpenAI shipped a Chrome extension for Codex on macOS and Windows that can work across logged-in sites and multiple background tabs. It should speed up testing, data entry, and other web app tasks by letting Codex run more parallel browser work.
Users posted long-running Codex `/goal` sessions with auto-continuations, `pause`/`resume`, and file-backed goals. Watch the 4,000-prompt startup cap and early-stop drift if you plan to run longer agent loops.
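The file-backed pattern users describe can be sketched generically. This is a hypothetical illustration, not Codex's implementation: the `goal.json` file, the `step` function, and the status values are all invented for the example; only the 4,000-prompt startup cap comes from user reports.

```python
import json
from pathlib import Path

# Hypothetical sketch of a file-backed goal loop: the goal and its status
# live in a JSON file, so a run can be paused and resumed across sessions.
GOAL_FILE = Path("goal.json")
MAX_PROMPTS = 4000  # the startup cap users reported

def load_goal() -> dict:
    if GOAL_FILE.exists():
        return json.loads(GOAL_FILE.read_text())
    return {"goal": "fix failing tests", "status": "running", "prompts_used": 0}

def save_goal(state: dict) -> None:
    GOAL_FILE.write_text(json.dumps(state))

def step(state: dict) -> bool:
    """One agent turn; returns False when the loop should stop."""
    if state["status"] == "paused":
        return False  # a `pause` command would flip this flag in the file
    if state["prompts_used"] >= MAX_PROMPTS:
        state["status"] = "stopped:prompt-cap"
        return False
    state["prompts_used"] += 1
    # ... call the model here and check for the goal's end state ...
    return True

state = load_goal()
while step(state):
    save_goal(state)
save_goal(state)
```

Because the state survives on disk, a later process can reopen `goal.json` and continue, which is the property that makes auto-continuation and early-stop drift observable in the first place.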
OpenAI said Auto-Review is now the default inside Codex after an internal rollout reduced required approvals by a factor of about 200. The shift moves more coding-agent work into guarded review loops with policy and egress controls.
Crabbox 0.4.0 adds throwaway machines for agent runs and cross-platform reproduction on macOS, Linux, and Windows. Use it to reproduce bugs and validate fixes without keeping long-lived cloud sessions around.
Independent builders shipped a Codex security-review pack, planning and annotation integration, and `dcg` safety-hook support in the same window. The burst matters because review, guardrail, and workflow tooling is forming around Codex beyond OpenAI’s own releases.
Developers posted side-by-side reports of faster one-shot fixes, 1.7B-token workdays, and fewer limit warnings with GPT-5.5 fast mode after OpenAI added Claude Code import. The comparisons matter because they turn migration talk into a concrete workflow choice.
OpenAI and community posts showed a new Codex pet layer built around `/hatch`, sprite-sheet generation, active-chat replies from the pet UI, and public pet galleries like Petdex. The feature matters because it turns Codex skills into a reusable UI-extension surface, not just a chat interface.
OpenAI added one-click import for settings, plugins, agents, and project config into Codex, and users reported cleaner workflows with visible subagents and in-chat CI status. That reduces setup friction for existing agent stacks, and OpenAI says Codex revenue doubled in under seven days.
OpenAI added an opt-in security mode for ChatGPT and Codex that disables password-based recovery, shortens sessions, and requires passkeys or physical keys. Higher-risk accounts get stronger phishing resistance and automatic exclusion from model training when the mode is enabled.
OpenAI expanded Codex with role-based workflows, app connections, in-app previews, and the `/goal` command, while also improving browser use by about 20%. The update lets Codex keep working across docs, slides, spreadsheets, and web actions instead of staying in a single coding thread.
OpenAI added WebSocket mode to the Responses API and says it cuts repeated work across Codex tool loops, improving end-to-end speed by up to 40%. The change reduces runtime overhead for agent workflows, not just base-model latency.
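The win from a persistent connection can be shown with a minimal sketch that has nothing to do with the actual Responses API: a local echo server stands in for the model endpoint, and we count how many connections a per-turn client versus a streaming client opens over ten tool-loop turns. All names and the protocol here are invented for illustration.

```python
import asyncio

# Count connections made by two client styles against a local echo server:
# per-turn reconnects pay the connect handshake on every tool-loop turn,
# a persistent stream pays it once.
connections = 0

async def handler(reader, writer):
    global connections
    connections += 1  # one increment per accepted connection
    while data := await reader.read(64):
        writer.write(data)  # echo one "tool call" back
        await writer.drain()
    writer.close()

async def per_turn(port, turns):
    for _ in range(turns):  # reconnect on every turn
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(b"call")
        await writer.drain()
        await reader.read(64)
        writer.close()
        await writer.wait_closed()

async def persistent(port, turns):
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    for _ in range(turns):  # one connection for the whole loop
        writer.write(b"call")
        await writer.drain()
        await reader.read(64)
    writer.close()
    await writer.wait_closed()

async def main():
    global connections
    server = await asyncio.start_server(handler, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    await per_turn(port, 10)
    per_turn_conns = connections
    connections = 0
    await persistent(port, 10)
    persistent_conns = connections
    server.close()
    await server.wait_closed()
    return per_turn_conns, persistent_conns

per_turn_conns, persistent_conns = asyncio.run(main())
print(per_turn_conns, persistent_conns)
```

Ten turns cost ten connections in the per-turn style and one in the persistent style; in a real agent loop each avoided connection also skips TLS setup and auth, which is where the reported end-to-end savings would come from.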
AWS and OpenAI moved their expanded partnership into limited preview, bringing OpenAI models, Codex, and Bedrock Managed Agents onto AWS. That gives teams a direct AWS path for OpenAI-backed agent workflows instead of waiting on the earlier coming-soon timeline.
Codex gained background macOS control, page inspection, image generation, plugins, artifacts, and follow-up automations. That gives it one agent thread for desktop apps, frontend debugging, and recurring work.
OpenAI released Symphony, an orchestration layer that turns issue trackers into Codex agent queues for PR generation and review. Early users say it can move many tickets in parallel, but token burn rises quickly when agents fan out.
OpenAI reset Codex rate limits across all paid plans a week after GPT-5.5 shipped. The temporary bump changes immediate capacity for active teams, but it was announced as a celebratory reset rather than a permanent quota change.
New evals and day-three user tests show GPT-5.5 performing well at low or medium reasoning, with benchmark gains over GPT-5.4 in coding-heavy use. That matters because stronger results no longer require xhigh runs, though some users still flag sycophancy.
OpenAI docs say Codex image generation counts against general usage and burns included limits 3-5x faster, while users showed app-server runs with 32 or 64 parallel workers. The workflow turns bulk image or research jobs into quota-backed batches, so teams should watch usage spikes closely.
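A quota-aware batch runner for this pattern can be sketched as follows. This is not OpenAI's scheduler: the `run_batch` shape, the budget, and the skip-on-overrun policy are assumptions; only the 3-5x image weight and the 32-worker parallelism come from the reports above.

```python
import asyncio

# Hypothetical quota-aware batcher: image calls are weighted heavier than
# text calls against a shared usage budget, and a semaphore caps parallel
# workers (users reported 32- or 64-worker runs).
IMAGE_WEIGHT = 4  # midpoint of the reported 3-5x burn rate
BUDGET = 100

async def run_batch(jobs, workers=32):
    used = 0
    done = []
    sem = asyncio.Semaphore(workers)

    async def worker(kind):
        nonlocal used
        cost = IMAGE_WEIGHT if kind == "image" else 1
        async with sem:
            if used + cost > BUDGET:
                return  # skip rather than blow the included limit
            used += cost
            await asyncio.sleep(0)  # the real API call would go here
            done.append(kind)

    await asyncio.gather(*(worker(k) for k in jobs))
    return used, done

jobs = ["image"] * 20 + ["text"] * 30
used, done = asyncio.run(run_batch(jobs))
print(used, len(done))
```

With a weight of 4, twenty images consume 80 of the 100-unit budget, so only 20 of the 30 text jobs fit: the batch stops exactly at the budget instead of producing a surprise usage spike.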
Users and third-party evals reported shorter runs, stronger long-context scores, and faster rollout into Cursor and other tools a day after GPT-5.5 hit the API. Higher per-token pricing may be partly offset by lower loop time and fewer tool-call stalls, so watch early bench data before changing defaults.
Users reported higher token use, partial long-document reviews, and rising spend on routine tasks after Claude Code regressions came into focus. Some developers still get strong results in constrained harnesses, but others may want to switch to Codex for long-running work.
Steipete’s maintainer bot ran 50 Codex agents in parallel and closed about 4,000 OpenClaw issues in a day. The cleanup pushed into rate limits, so use the README dashboard and Project Clowfish clustering to track large agent sweeps.
A day after GPT-5.5 and the new Codex workflows launched, developers reported one-shot bug fixes, longer unattended runs, and lower token use in real coding tasks. The early hands-on comparisons matter because they are already shifting some teams' default agent workflow away from Claude Code.
Cua Driver open-sourced a macOS driver that lets agents control apps in the background with multi-player and multi-cursor support. It matters because it turns background computer use from an app-specific feature into a reusable primitive that any agent loop can adopt.
OpenAI rolled out GPT-5.5 and GPT-5.5 Pro in ChatGPT and Codex, with higher scores on terminal, OS, cyber, and math evals than GPT-5.4. Codex also gained browser, document, and computer-use features for longer agent workflows.
OpenAI introduced shared workspace agents in ChatGPT for Business, Enterprise, Edu, and Teachers plans, with Codex-powered background work across tools like Slack and Linear. The launch turns ChatGPT from a single-session assistant into a long-running team workflow surface with approvals, scheduling, and shared context.
OpenAI said Codex passed 4 million weekly users less than two weeks after clearing 3 million, and then reset usage limits again. The scale jump matters because it points to rapid coding-agent adoption and likely plan and capacity changes.
OpenAI released GPT Image 2 in ChatGPT, Codex, and the API with thinking mode and 2K outputs. Early tests and Arena scores suggest it is usable for slides, UI mockups, and dense infographic layouts.
OpenAI added Chronicle, a Codex preview that turns recent screen context into reusable memories for errors, files, docs, and workflows. The macOS Pro-only feature stores local memory unencrypted and can burn rate limits quickly, so watch prompt-injection risk before relying on it.
Practitioners shared repeatable Codex workflows for long-lived threads, background subagents, computer-use access through MCP, and canary rollouts. Codex is being used less as a one-shot assistant and more as a persistent automation harness.
Fresh hands-on reports show Codex controlling minimized apps via macOS APIs, using a DOM-aware browser comment mode, and running for day-long sessions in the desktop app. That gives OpenAI stronger evidence that computer use is usable for daily development, though the rollout remains macOS-first and brittle around working-state changes.
OpenAI expanded Codex with background Mac computer use, an in-app browser, image generation, memory preview, automations, and 90+ plugins. The release moves Codex from terminal coding toward long-running UI and ops workflows, though some features remain macOS-first or alpha.
OpenAI launched GPT-Rosalind for biology, drug discovery, and translational medicine, plus a life sciences plugin for Codex. Access starts as a trusted preview for qualified customers, so near-term use is limited to partner and enterprise workflows.
Fresh retests and issue threads point to worse Claude Code behavior, with Opus 4.6 falling to 68.3% on BridgeBench and users surfacing buried reasoning-effort controls. Track quota burn, hidden effort settings, and rollback reports before assigning more coding-agent work.
Codex 0.120 introduced per-project memory extension files and Realtime V2 progress streaming for background agents. Separate app findings also showed an unreleased Scratchpad view that can start parallel Codex chats from a task list, which may change how teams queue work.
OpenAI said a compromised third-party developer tool affected its macOS app-signing workflow and is rotating certificates for ChatGPT Desktop, the Codex app, Codex CLI, and Atlas. The company said it found no evidence of user-data access or software tampering, and older macOS app versions will stop working after the update window.
OpenAI added a $100 ChatGPT Pro tier with 5x more Codex usage than Plus and kept the $200 tier as the highest-capacity option. The new tier resets Codex limits again and temporarily doubles Pro usage through May 31.
OpenAI said Codex reached 3 million weekly users and reset usage limits, with another reset planned for each additional million users up to 10 million. ChatGPT-sign-in Codex will also retire the gpt-5.2 and gpt-5.1-era lineup on April 14, so teams should watch for model-default changes.
OpenAI rolled out Codex-only seats with pay-as-you-go pricing for ChatGPT Business and Enterprise instead of fixed bundled access. The change lowers pilot friction for teams and ties spend directly to coding usage rather than a full ChatGPT seat.