
Curious Refuge compares GPT Image 2 and Nano Banana 2 on 4 reference-image edits

Creators ran new side-by-side tests of ChatGPT Images 2.0 and Nano Banana 2 on reference-image swaps, scene changes, and poster sketches. The split matters: GPT Image 2 held characters better, while Nano Banana 2 stayed favored for environments, natural placement, speed, and cost.


TL;DR

OpenAI's docs say GPT Image 2 is built for high-fidelity edits and multi-turn image workflows. Google's docs say Nano Banana 2 can create images in seconds, keep characters consistent, and make local edits. The fun part is that CuriousRefuge's tests landed almost exactly on that product split, while creators like AmirMushich and 0xInk_ were already turning both models into concrete design workflows.

Reference edits

CuriousRefuge ran four reference-image prompts: a character swap, a forest background change, a 180-degree camera move, and a cinematic park still. Across the set, CuriousRefuge said GPT Image 2.0 was better at character consistency, while Nano Banana 2 was better at environments and at making the subject feel less composited into the scene.

That split is easy to map onto the prompts themselves:

  • Character swap: the first test favored GPT Image 2.0 for keeping the subject on-model.
  • Background replacement: the forest edit favored Nano Banana 2 for scene continuity.
  • Viewpoint change: the camera-rotation prompt stressed spatial reasoning and identity preservation.
  • Style transfer into a new setting: the park still tested whether the model could keep grading and mood while rebuilding the background.

Speed and cost

OpenAI's model page describes GPT Image 2 as a state-of-the-art generation and editing model, and the image-generation guide highlights multi-turn editing and high-fidelity image inputs. Google's Gemini help page describes Nano Banana 2 as an image tool that works in seconds, supports local edits, and can blend multiple images while keeping character consistency.

The interesting bit is that CuriousRefuge's earlier comparison called GPT Image 2.0 the more intelligent and realistic model, but also flagged it as slow and expensive. A day later, their reference-edit follow-up still ended with Nano Banana 2 as the default pick because faster and cheaper kept mattering more than absolute precision.

Poster sketches and refinement

The workflow evidence is already getting specific. In AmirMushich's poster sketch, Nano Banana plus Figma became a typography layout exercise, with font settings and poster framing baked into the prompt instead of treated as cleanup work after the image was made.

A separate path showed up in 0xInk_'s post, where Midjourney handled the base illustration and GPT Image 2 handled refinement without breaking the original graphic style. That is a different use case from the Curious Refuge tests: not raw model-versus-model ranking, but model chaining, where one system generates the look and another tightens detail.
