AI Primer
release

ChatGPT Images 2.0 adds gpt-image-2 with lower output pricing

OpenAI's gpt-image-2 rollout adds multilingual text, floor plans, and up to eight consistent images, while HN testers surfaced mostly unchanged API pricing with a lower output cost. The release matters for comics, infographics, and other layout-heavy work, and its stronger photorealism arrives as platforms tighten disclosure rules for published AI media.

5 min read

TL;DR

  • According to OpenAI's launch post, ChatGPT Images 2.0 ships on top of gpt-image-2, with stronger instruction following, improved multilingual text, floor plans, infographics, and up to eight consistent images from one prompt.
  • According to the HN discussion roundup, early API watchers saw pricing that was mostly unchanged from the previous model, while output token pricing moved lower and per-image costs shifted by quality and aspect ratio.
  • For creative work, the main HN thread quickly centered on comics, infographics, and other layout-heavy prompts, and that same discussion soon turned into side-by-side tests against Google Nano Banana.
  • OpenAI's ChatGPT Images 2.0 system card says the new thinking mode can use reasoning and live web search before generating, which is a bigger workflow change than a routine image-model bump.
  • A month later, YouTube's policy summary shows where this kind of photorealistic output lands next: AI labels moved directly under long-form videos and onto Shorts, and YouTube started auto-labeling some undisclosed photorealistic AI content.

You can read OpenAI's launch post, skim the developer-side gpt-image-2 announcement, and check the prompting guide for the production defaults OpenAI wants builders to use. The official system card is where the web-search and reasoning details live. Then the HN thread gets messy in the useful way, with people testing raccoons, comics, and hard comparison prompts almost immediately.

What shipped

OpenAI Launches ChatGPT Images 2.0 with Advanced Reasoning and Text Rendering Capabilities

OpenAI has launched ChatGPT Images 2.0, powered by the new gpt-image-2 model. Key enhancements include superior instruction following, improved multilingual text rendering, and the ability to generate complex visual media such as infographics, floor plans, and consistent image series. The model introduces a "thinking" mode that integrates reasoning and web search to improve output quality, allowing for the generation of up to eight consistent images from a single prompt. ChatGPT Images 2.0 is rolling out to all ChatGPT users and is available to API developers, effectively replacing the previous GPT-Image-1.5 model.

ChatGPT Images 2.0

For creatives, the main takeaway is that ChatGPT Images 2.0 claims better text rendering, stronger instruction following, and better consistency for comics, infographics, and other multi-panel or layout-heavy images. Commenters are already using it to test comics and other tricky prompts, so the thread is a useful first look at how well it handles real creative workflows.

OpenAI frames ChatGPT Images 2.0 as a broader image stack refresh, not just a new model name. The official launch claims better text rendering, stronger instruction following, more reliable series consistency, and support for dense formats like infographics, floor plans, and comics. All of that is rolling out in ChatGPT, while the developer announcement says gpt-image-2 replaces GPT-Image-1.5 for API developers.

The developer-side pitch is even more concrete. OpenAI says gpt-image-2 is the default choice for new builds that need text-heavy images, photorealism, editing, compositing, and fewer retries, and the prompting guide lists three quality tiers: low, medium, and high.
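As a rough illustration of how those quality tiers might surface in a build, here is a minimal sketch that validates a tier before assembling request parameters. The field names (`model`, `prompt`, `quality`) are assumptions modeled on OpenAI's existing Images API, and `build_image_request` is a hypothetical helper, not something from the announcement. It makes no network call.

```python
# Quality tiers named in the prompting guide, per the article.
ALLOWED_QUALITIES = ("low", "medium", "high")


def build_image_request(prompt: str, quality: str = "medium") -> dict:
    """Assemble request parameters for an image generation call.

    Hypothetical helper: the parameter names are assumed to match the
    existing Images API and may differ for gpt-image-2.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"quality must be one of {ALLOWED_QUALITIES}")
    return {"model": "gpt-image-2", "prompt": prompt, "quality": quality}


request = build_image_request("A labeled floor plan of a two-bedroom flat", "high")
print(request)
```

Validating the tier up front keeps a typo from turning into a paid, failed request; which tier to default to depends on the retry and cost trade-off of the specific project.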

Thinking mode

The weirdest part of the release is that image generation now has a reasoning layer. OpenAI's system card says thinking mode can use reasoning, tool use, and live web search before producing an image, and it can generate multiple images from a single prompt.

That lines up with the launch summary, which says ChatGPT Images 2.0 can produce up to eight consistent images in one go. For comics, slide sequences, storyboard frames, and campaign variants, that is the part worth bookmarking.

Pricing math

Discussion around ChatGPT Images 2.0

Thread discussion highlights:

  • minimaxir on API model and pricing: Model card for the API endpoint gpt-image-2 ... API Pricing is mostly unchanged from gpt-image-1.5, the output price is slightly lower ... price per image has changed.
  • simonw on hands-on prompt testing: I've been trying out the new model like this: ... `-m gpt-image-2` ... 'Where is the raccoon holding a ham radio'.
  • neom on hard prompt comparisons: Ran a bunch both on the .com and via the api, none of them are nearly as good as Nano Banana.

Hacker News got to the pricing wrinkles faster than the launch post. In the discussion roundup, minimaxir points to a model card that kept API pricing mostly in the same band as the previous release while lowering output token pricing, and the thread then broke that down into per-image comparisons by quality and aspect ratio.

The official developer announcement, posted on the OpenAI community forum, fills in the token table, with all prices per million tokens: image input at $8, cached image input at $2, text input at $5, cached text input at $1.25, text output at $10, and image output at $30. It also positions gpt-image-2 for resolutions up to 2K, which matters more for presentation decks, ads, and product comps than for square social posts.
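To make the token table concrete, here is a small sketch that turns those per-million rates into a per-request estimate. The rates come from the announcement as quoted above; the token counts in the example are invented for illustration, since the article does not say how many tokens a given quality or aspect ratio consumes.

```python
# USD per million tokens, as quoted from the developer announcement.
RATES_PER_MILLION = {
    "image_input": 8.00,
    "cached_image_input": 2.00,
    "text_input": 5.00,
    "cached_text_input": 1.25,
    "text_output": 10.00,
    "image_output": 30.00,
}


def estimate_cost(token_counts: dict) -> float:
    """Return the estimated USD cost for token counts keyed by rate name."""
    total = 0.0
    for kind, tokens in token_counts.items():
        if kind not in RATES_PER_MILLION:
            raise KeyError(f"unknown token type: {kind}")
        total += tokens * RATES_PER_MILLION[kind] / 1_000_000
    return total


# Hypothetical request: a 200-token text prompt and 4,000 image output tokens.
example = {"text_input": 200, "image_output": 4000}
print(f"${estimate_cost(example):.4f}")  # prints $0.1210
```

In this made-up example the image output tokens account for $0.12 of the $0.121 total, which is why a change to the output rate moves per-image cost far more than a change to the text input rate.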

Hands-on comparisons

The first useful read on model quality is not the launch copy but the argument in the HN thread. Simon Willison posted direct prompt tests through the API, while another commenter said the new model was still behind Nano Banana on their comparison set, and a later comment in the main thread added that some edge cases flipped the other way.

That leaves a practical early picture:

  • Text-heavy and layout-heavy work is where OpenAI is making its loudest claim, per the launch summary.
  • OpenAI's own docs position gpt-image-2 as the default for editing, compositing, and identity-sensitive work in the prompting guide.
  • Early HN testers did not converge on a universal win. According to the HN discussion roundup, some hard prompts still favored Nano Banana.
  • The same HN thread also surfaced prompt-specific reversals, where gpt-image-2 solved cases Nano Banana missed, according to the main thread.

That split matters for people making comics, explainers, and client mockups, because the gap between "better on average" and "better on my cursed prompt" is still where tool choice gets decided.

YouTube labels

YouTube Updates AI Labeling for Improved Transparency and Automated Detection

YouTube has announced updates to its AI labeling system to improve transparency and ease of use for viewers and creators. Beginning in May 2026, disclosures for photorealistic or meaningfully AI-altered content will appear in more prominent locations: directly below the video player for long-form videos and as an on-video overlay for Shorts. Additionally, YouTube is introducing internal signals to automatically detect and label significant photorealistic AI content if creators fail to disclose it. While creators retain the ability to update labels if they believe their content was misidentified, labels remain permanent for content created using YouTube's own AI tools (e.g., Veo, Dream Screen) or containing specific generative AI metadata (e.g., C2PA). Content that is unrealistic, animated, or only slightly altered will continue to have disclosures located in the expanded video description.

YouTube to automatically label AI-generated videos

If you make video content, this matters because YouTube is changing how AI disclosure is surfaced and is starting to auto-label some photorealistic AI content. The discussion focuses on creator-facing consequences: false positives, appeal workflows, and whether AI-heavy channels can be detected even when the output is mixed with human editing or archival footage.

A month after the OpenAI rollout, YouTube changed the way AI disclosure appears on finished video. The company's policy post says labels for photorealistic or meaningfully AI-altered long-form videos now sit directly below the player, Shorts get an on-video overlay, and YouTube can automatically add labels when creators do not disclose significant photorealistic AI use.

The same post adds two details with real downstream consequences: creators can dispute mislabels, but labels stay permanent for content made with YouTube's own AI tools such as Veo or Dream Screen, and for content carrying specific generative-AI metadata such as C2PA. Unrealistic, animated, or only lightly altered content stays in the less prominent expanded-description disclosure flow, according to the policy summary.

That is where this launch story connects to publishing reality. As the HN discussion around YouTube's change shows, creators immediately started asking how well automated detection will distinguish fully synthetic photoreal video from the increasingly normal mixed workflow of AI scripts, AI voice, generated b-roll, human editing, and archive footage.
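The disclosure rules from YouTube's policy post can be read as a small decision table. The sketch below encodes them as described in this article. It is an illustration only, not YouTube's detection system: the class, field names, and return values are hypothetical, and it treats "photorealistic or meaningfully AI-altered" as the single trigger for both prominent placement and possible auto-labeling, which simplifies the policy wording.

```python
from dataclasses import dataclass


@dataclass
class Video:
    """Hypothetical description of an upload that involves some AI use."""
    is_short: bool
    photoreal_or_meaningfully_altered: bool
    creator_disclosed: bool
    made_with_youtube_ai_tools: bool = False  # e.g. Veo, Dream Screen
    has_genai_metadata: bool = False          # e.g. C2PA


def label_decision(video: Video) -> dict:
    """Map a video's attributes to label placement and permanence."""
    permanent = video.made_with_youtube_ai_tools or video.has_genai_metadata
    if video.photoreal_or_meaningfully_altered:
        placement = "on-video overlay" if video.is_short else "below player"
        # YouTube may add the label itself when the creator did not disclose.
        may_be_auto_labeled = not video.creator_disclosed
    else:
        # Unrealistic, animated, or lightly altered content.
        placement = "expanded description"
        may_be_auto_labeled = False
    return {
        "placement": placement,
        "may_be_auto_labeled": may_be_auto_labeled,
        "permanent": permanent,
        "creator_can_update": not permanent,
    }


print(label_decision(Video(is_short=True,
                           photoreal_or_meaningfully_altered=True,
                           creator_disclosed=False)))
```

The mixed workflows creators asked about are exactly the inputs this table cannot classify: the hard question is how `photoreal_or_meaningfully_altered` gets decided in the first place, not what happens afterward.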
