AI Primer

ChatGPT Images 2.0 compares with Nano Banana on hard prompts

HN testers ran gpt-image-2 against Nano Banana on comics and oddball prompts and reported mixed wins and losses. The comparisons matter because stronger text rendering and image-series consistency do not remove failure cases on hard prompts.

TL;DR

  • According to the launch summary, OpenAI says gpt-image-2 improves text rendering, instruction following, and consistency for infographics, comics, and other layout-heavy images, while the HN discussion highlights show early testers immediately stress-testing those claims on hard prompts.
  • The first HN verdict was mixed: the discussion highlights quote neom saying none of their test outputs were nearly as good as Nano Banana, but the core HN summary also notes edge cases where gpt-image-2 beat Nano Banana.
  • OpenAI tied the release to a new "thinking" mode that can use reasoning and web search, and the launch summary says it can generate up to eight consistent images from one prompt.
  • For creators, the interesting gap is between structured design work and adversarial weird-prompt work: the HN core summary frames comics and text-heavy layouts as the main upgrade, while HN commenters kept finding failure cases on stranger prompts.

You can read OpenAI's launch post, the API model page, and the developer announcement. The HN thread is where the useful mess starts, including Simon Willison's hands-on tests and a quick round of Nano Banana comparisons. OpenAI also published a system card and a prompting guide, which is where the workflow knobs show up.

What OpenAI says changed

OpenAI Launches ChatGPT Images 2.0 with Advanced Reasoning and Text Rendering Capabilities

OpenAI has launched ChatGPT Images 2.0, powered by the new gpt-image-2 model. Key enhancements include superior instruction following, improved multilingual text rendering, and the ability to generate complex visual media such as infographics, floor plans, and consistent image series. The model introduces a "thinking" mode that integrates reasoning and web search to improve output quality, allowing for the generation of up to eight consistent images from a single prompt. ChatGPT Images 2.0 is rolling out to all ChatGPT users and is available to API developers, effectively replacing the previous GPT-Image-1.5 model.

OpenAI's pitch is straightforward. ChatGPT Images 2.0, powered by gpt-image-2, is supposed to deliver better multilingual text, tighter instruction following, and image sets that stay visually consistent across multiple outputs.

The official materials point to the same creative use cases over and over:

  • infographics
  • floor plans
  • charts and posters
  • comics and manga-style panels
  • slide-like or doc-ready visuals
  • image series with up to eight consistent outputs from one prompt

The developer announcement adds two concrete production details missing from most reaction posts: support for more aspect ratios and resolutions up to 2K, plus positioning around assets that need to be readable, localized, and usable without much cleanup.

Hard prompts on HN

Discussion around ChatGPT Images 2.0

Thread discussion highlights:

  • minimaxir on API model and pricing: Model card for the API endpoint gpt-image-2 ... API Pricing is mostly unchanged from gpt-image-1.5, the output price is slightly lower ... price per image has changed.
  • simonw on hands-on prompt testing: I've been trying out the new model like this: ... `-m gpt-image-2` ... 'Where is the raccoon holding a ham radio'.
  • neom on hard prompt comparisons: Ran a bunch both on the .com and via the api, none of them are nearly as good as Nano Banana.

The first-day community read was less victory lap, more prompt cage match. In the HN thread, minimaxir went straight to the model card and pricing, Simon Willison started running oddball prompts through the API, and other commenters compared results against Nano Banana almost immediately.

Three patterns surfaced fast:

  • According to the HN discussion highlights, neom ran prompts on both ChatGPT and the API and said none of them were close to Nano Banana on their test set.
  • According to the core HN summary, justani reported a split result: some prompts still failed in both systems, but at least one case that Nano Banana missed worked in gpt-image-2.
  • In the full HN thread, commenters kept using comics, menus, and weird object-scene prompts because those expose whether better typography also comes with better compositional logic.

That is a more useful first-day frame than the polished demos. The new model looks aimed at real design tasks, but HN testers treated it like a benchmark for ugly edge cases.
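
One way to keep a comparison like this honest is to log a per-prompt outcome for each model and tally the splits, rather than collapsing everything into a single verdict. The sketch below is a hypothetical harness: the prompt labels, the pass/fail values, and the `tally` helper are placeholders for illustration, not results from the HN thread.

```python
from collections import Counter

# Hypothetical placeholder outcomes:
# prompt label -> (gpt-image-2 passed, Nano Banana passed).
# These are NOT results from the thread; replace them with your own judgments.
outcomes = {
    "comic-page": (True, True),
    "menu-with-prices": (True, False),
    "odd-object-scene": (False, True),
    "constraint-puzzle": (False, False),
}


def tally(results):
    """Count prompts by outcome category for a two-model comparison."""
    counts = Counter()
    for gpt_ok, banana_ok in results.values():
        if gpt_ok and banana_ok:
            counts["both_pass"] += 1
        elif gpt_ok:
            counts["gpt_image_2_only"] += 1
        elif banana_ok:
            counts["nano_banana_only"] += 1
        else:
            counts["both_fail"] += 1
    return dict(counts)


summary = tally(outcomes)
print(summary)
```

Keeping "both fail" and "only one passed" as separate buckets preserves the kind of split result commenters described, where a single headline verdict would hide it.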

Layout versus logic

ChatGPT Images 2.0

For creatives, the main takeaway is that ChatGPT Images 2.0 claims better text rendering, stronger instruction following, and better consistency for comics, infographics, and other multi-panel or layout-heavy images. Commenters are already using it to test comics and other tricky prompts, so the thread is a useful first look at how well it handles real creative workflows.

The sharpest early takeaway is that the upgrade seems strongest where the image needs structure. The HN core summary specifically calls out comics, infographics, and other multi-panel or layout-heavy formats as the main beneficiaries of the new text rendering and consistency work.

But the HN thread also produced a cleaner failure description than the launch copy did. One commenter in the discussion argued that Nano Banana often gets the logic and punts on the art, while gpt-image-2 gets the art and punts on the logic. That is not a benchmark result, but it does fit the mixed reports in the HN discussion highlights and the core summary.

For creative workflows, that split matters more than leaderboard language. A model that can hold panel layouts, labels, and repeated characters together is useful in one class of jobs. A model that still breaks on bizarre constraint-heavy prompts has not closed the whole gap.

Pricing and workflow knobs

The last useful detail is that this was not just a ChatGPT front-end refresh. According to OpenAI's developer announcement, gpt-image-2 shipped in the API and Codex on day one, with posted token pricing and support for both generations and edits.

The prompting guide is also unusually specific about workflow tradeoffs. It recommends gpt-image-2 as the default for new builds, says low, medium, and high output quality settings are all supported, and notes that low is especially strong for latency-sensitive use cases. That gives creators a more practical picture than the launch post alone: OpenAI is selling this as a production image model, not just a prettier demo model.
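
As a rough illustration of that tradeoff, the sketch below assembles request arguments for a fast draft and a higher-quality final render. It rests on assumptions: the `quality` parameter name and the `images.generate()` call mirror the OpenAI Python SDK as used with earlier gpt-image models, and `build_image_request` and the example prompt are hypothetical. Check the API reference before relying on any of it.

```python
ALLOWED_QUALITY = ("low", "medium", "high")


def build_image_request(prompt, quality="low"):
    """Assemble keyword arguments for an image generation call.

    Assumption: the parameter names follow the OpenAI Python SDK's
    images.generate() as used with earlier gpt-image models.
    """
    if quality not in ALLOWED_QUALITY:
        raise ValueError(f"quality must be one of {ALLOWED_QUALITY}")
    return {"model": "gpt-image-2", "prompt": prompt, "quality": quality}


# Hypothetical prompt, used only for illustration.
prompt = "Four-panel comic with readable speech bubbles"

draft = build_image_request(prompt)                  # low: latency-sensitive iteration
final = build_image_request(prompt, quality="high")  # high: final render
print(draft["quality"], final["quality"])

# To send a request (needs the openai package and an API key):
#   from openai import OpenAI
#   result = OpenAI().images.generate(**draft)
```

Defaulting to low while iterating on a prompt, then switching to high only for the final render, is one plausible way to use the latency note in the guide.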
