OpenAI Codex
OpenAI's coding agent
OpenAI's Codex coding product for software development assistance and task execution.

Recent stories
OpenAI said Codex accounts were draining usage faster than intended because abuse and fraud checks were overflagging some sessions, then issued a usage reset for all users. It matters because paid Codex workflows were losing quota unexpectedly mid-run, directly affecting reliability and cost.
OpenAI published usage data showing Codex now generates 99.8% of the company's internal AI output tokens, with sharp growth in legal, support, recruiting, and finance. The report measures agent adoption as delegated parallel work, not just chat inside engineering.
Codex can now hand off an in-progress thread between local and remote machines and bring it back later. It matters because the handoff carries Git history, branches, and uncommitted changes while leaving the destination checkout untouched.
TryCua brought Cua Driver to Linux, letting Claude Code, Codex, Hermes, and custom agents control real desktop apps via CLI or MCP without taking over the main terminal. The release also adds headless SSH execution and a preview of multi-window Wayland control across supported distros.
OpenHands added Agent Client Protocol support to its Agent Canvas, SDK, and Cloud, letting teams run different coding agents through one interface across local, remote, and cloud backends. The release also underpins new OpenHands Index results, so teams can compare harness-plus-model combinations instead of model-only runs.
OpenAI added Record & Replay to Codex so users can demonstrate a repetitive computer task once and save it as a reusable skill. The first rollout is Mac-only and unavailable in the EEA, UK, and Switzerland, so teams should check access before planning adoption.
Databricks open-sourced Omnigent, a meta-harness that runs Claude Code, Codex, Cursor, Pi, and custom agents in one live session with a collaborative web UI. The release centralizes supervision, cost control, and cross-agent review instead of splitting work across separate tools.
Codex workflows can now run against open-weight models served through compatible Responses API endpoints, with Ollama and vLLM publishing direct paths for GLM-5.2 and Kimi K2.7 Code. That matters because teams can keep the Codex interface while swapping to self-hosted or lower-cost inference backends.
ENPIRE launched a physical autoresearch setup that gives eight Codex agents robots, GPUs, and real-world APIs for tasks like zip ties and part sorting. It matters because it moves long-horizon agent evaluation from browser-only loops into embodied experimentation with explicit safety controls.
OpenAI expanded Codex in Europe with Computer Use, the Chrome extension, Memory, and Chronicle. The rollout broadens browser and desktop automation outside the U.S., though some memory features remain opt-in or preview-only.
Codex users are having the agent write its own `/goal` and sub-agent goals, with OpenAI-side commentary describing that as a built-in meta-prompting pattern. The workflow turns long autonomous runs into a tighter control loop, but users still review goals first so a bad objective does not burn tokens for hours.
AI SDK canary added HarnessAgent, a unified abstraction that runs Claude Code, Codex, and Pi in sandboxed sessions with AI SDK-compatible streams. One integration can now target multiple agent harnesses without separate model-specific plumbing.
OpenAI shipped a docs agent that can hand off guides to Codex, and users published Appshots, browser-control, parallel PR, and multi-tree workflows. Watch the examples for ways to structure Codex around orchestrated tasks, while code-review and plugin gaps remain.
OpenAI started rolling out bankable Codex resets to Go, Plus, Pro, and Business users, plus a two-week referral program that can add more resets. That lets users save capacity for heavier Browser use and longer Codex sessions instead of losing resets on a fixed clock.
OpenAI said it will acquire Ona and fold its secure cloud execution and orchestration stack into Codex. The change targets agent jobs that need to keep running for hours or days after the original laptop session ends.
Users are using Fable 5 as a planner and long-run orchestrator while pushing implementation and heavy reasoning to Opus and Codex. The setup keeps Fable on supervision and planning, so teams can track execution through live status pages on larger tasks.
One day after Fable 5 launched, users reported burning through Max quotas in about 90 minutes while Anthropic told subscribers the model will leave Claude plans on June 23 until capacity improves. If you depend on Fable, plan for quota pressure and route critical jobs elsewhere.
Builders shipped OpenProse workflow files, ghzinga PR tabs, cmux terminal controls, datasette-agent-edit primitives, and an agent-optimized CLI fork. These pieces turn prompt strings into reusable files, panes, and testable edit loops for coding agents.
A community workflow broke long-running Codex goals into six required fields, then added an eight-item preflight checklist and helper tools. The structure is meant to reduce runs that drift, stop early, or claim completion without an objective verification step.
Helmor released an open-source mobile client that exposes Claude Code, Codex, OpenCode, and custom model backends behind a phone-first UI plus one-click Cloudflare Tunnel setup. The launch targets remote coding sessions from a handset instead of a laptop-only agent workflow.
Codex usage moved further into phone-first workflows, with iOS dictation loops, background voice capture, and app updates like searchable settings and restored state. The comparisons still flag rough spots in multi-thread UX, Windows support, and cases where CLI tabs or cloud agents are easier to manage.
MagicPath launched as an official Codex plugin, adding a shared canvas for interactive UI work, repo imports, design-system context, and image generation inside Codex. It matters because Codex now has a native surface for design-and-build loops instead of limiting collaboration to chat and code diffs.
OpenAI expanded the Build iOS Apps plugin so Codex can test apps in an in-app browser, open SwiftUI previews, and hot-reload edits without leaving Codex. It matters because more of the iOS iteration loop stays inside the coding agent instead of bouncing through external simulators and manual preview steps.
A day after Codex users reported outages and caps, OpenAI said the service had three separate incidents and later disclosed a bug that undercounted tokens for some Plus and Pro accounts, while users reported that paid-plan quotas had been reset. The update matters because Codex operators saw both service instability and account-limit changes in the same 24-hour window.
Uber set a $1,500 monthly limit for each AI coding tool an employee uses, covering products such as Cursor and Claude Code. The cap gives enterprises an early benchmark for coding-agent spend as token costs outgrow typical software-seat budgets.
Users reported outages, tighter 5-hour caps, and token availability problems a day after OpenAI launched Codex Sites and plugins. OpenAI reset Codex usage limits after three incidents, so teams should watch quotas and backend reliability as agent workflows ramp up.
OpenAI rolled out Codex Sites, annotations, and role-specific plugins, while weekly users topped 5 million. The release pushes Codex beyond coding into hosted workspace and app workflows for enterprise teams.
Cognition added a desktop control surface that can run Devin, Codex, Claude, and other ACP-compatible agents across local and cloud contexts. The app turns Devin from a single hosted agent into a broader orchestration surface.
OpenAI made GPT-5.4, GPT-5.5, and Codex generally available through Amazon Bedrock. AWS shops can now use OpenAI models inside existing IAM, compliance, and procurement workflows instead of adopting a separate vendor stack.
OpenAI shipped a Python SDK and app-server support for Codex with thread creation, streamed turns, session resume, image inputs, and sandbox controls. That gives teams a supported way to embed Codex inside internal tools and automation instead of driving it only through the CLI or desktop app.
Independent users compared GPT-5.5/Codex with Opus 4.8/Claude Code using DeepSWE cost charts, GBA Eval runs, and long coding sessions. The split matters because engineers choosing a daily coding stack now have external quality-versus-cost evidence instead of only vendor launch claims.
Independent developers shipped sidecars that let Claude Code, Cursor, and Codex share memory, hot-swap model providers, package local projects as apps, and automate browser QA. Try these reusable tools if you want memory, routing, QA automation, and app packaging outside editor-specific features.
OpenAI restored Codex weekly and hourly quotas across paid ChatGPT plans after Tibo Sottiaux said the product hit 5 million users. Watch for long-running QA loops, migration PRs, and remote desktop sessions that can still burn through quotas fast.
Builders added /dynamic orchestration, custom-model routing, and repo runbooks around Codex as users exposed new session lifecycle controls in the app. That makes Codex a better fit for long-running, multi-context coding work.
Codex on iOS now supports side conversations, end-of-turn diff summaries, archived remote threads, model switching, and Spotlight or Shortcuts hooks. The update brings more desktop-style task steering and change review to mobile sessions.
A day after Claude Code introduced Dynamic Workflows, builders shipped ports and clones for Codex, Conductor, and GLM-backed CC Mirror. The rapid ports turn the feature into a reusable orchestration pattern rather than an Anthropic-only runtime.
OpenAI added computer use to Codex on Windows and lets ChatGPT mobile steer tasks running on Windows PCs. The update extends Codex to existing Windows dev machines and adds remote review and debugging from mobile.
OpenAI said ChatGPT-linked Codex will drop GPT-5.2 and GPT-5.3-Codex on June 2, with GPT-5.5 becoming the default frontier model for free users. The API versions stay available, but the in-product model surface is being reduced for compute-fleet management.
OpenAI said ChatGPT, Codex, and the Responses API can reach private MCP servers over outbound-only HTTPS without inbound exposure. The same enterprise update adds workload identity federation plus admin controls for spend alerts, allowlists, retention, and hosted tools.
Cua Driver said its Windows backend is now stable, letting Claude Code, Codex, Hermes, or custom agents drive real Windows apps through MCP or CLI. The release targets Windows-only line-of-business software while keeping the desktop usable with multi-pointer support.
OpenAI and Thrive described Tax AI, a self-improving tax-prep system used across 30+ firms that processed 7,000 returns and reached up to 97% accuracy. The loop turns accountant corrections into eval targets and narrow Codex fixes, showing a concrete path to vertical agents that improve after deployment.
Microsoft Research released SkillOpt, which optimizes external skill files instead of fine-tuning model weights and reports best-or-tied results across 52 evaluation cells. The method matters because it improved Codex and Claude Code accuracy without extra inference-time calls.
Practitioners published tests-first coding-agent workflows built around red-green TDD, Hurl suites, GitHub label actions, and Codex-based execution checks. The pattern matters because verification remains the main bottleneck once generation is fast, especially in longer multi-file sessions.
Practitioners published reusable Codex workflows for project audits, memory-driven skill packaging, mobile delegation, and remote computer use. Try the prompt-and-steps patterns if you want to adapt Codex across repos and devices.
Independent Codex users published Obsidian memory setups, reusable skill prompts, auto-triage flows, and Cloudflare-backed runners for longer jobs. That matters because Codex is being wrapped into persistent workspaces and operator-defined subagents instead of one-shot chats.
Two days after Codex added locked-Mac control and Appshots, users posted end-to-end iPhone simulator debugging, Safari form-filling, and remote-control workflows. That matters because the feature is moving from launch copy into concrete computer-use tasks that can replace manual QA and repetitive UI work.
OpenAI said a recent Codex optimization lowered cache-hit rates in long-running sessions and drained limits faster, and that it rolled the change back and reset all accounts. That matters because compaction and cache behavior directly determine quota burn and session reliability.
New guides, plugins, and reusable libraries show the Agent Skills format moving beyond Claude Code into multiple coding-agent clients and runtimes. That matters because workflows are becoming portable artifacts instead of one-off prompts tied to a single harness.
Developers say Codex v0.133.0 improved compaction, remote-control workflows, and Chrome-driven Colab runs after `/goal` became default. The same update window also brought easier skill discovery and new diff options, though some users saw approval-pause regressions in full-access mode.
OpenAI shipped a Codex update that lets the mobile app control a locked Mac, adds Appshots for screen context, and graduates `/goal`. It also adds browser annotation tools, team plugin sharing, and expanded analytics for business users.