OpenAI Ships gpt-image-2: The First Image Model That Thinks Before It Draws

OpenAI launched ChatGPT Images 2.0 on April 21, 2026, shipping gpt-image-2, the company's first image model with native reasoning capabilities, to all ChatGPT and Codex users, with API access available from day one. The release also starts a hard deprecation clock: DALL-E 2 and DALL-E 3 will be retired on May 12, 2026, giving developers just three weeks to migrate any existing integrations.

The model is available via the API under the identifier gpt-image-2, replacing dall-e-3 as the default image model. Pricing varies by output quality and resolution. Outputs above 2K are available in API beta.

What gpt-image-2 Actually Does

Images 2.0 operates in two modes. Instant mode delivers fast, standard generation. Thinking mode — restricted to Plus, Pro, Business, and Enterprise subscribers — enables the model to search the web for real-time information, generate up to eight coherent images from a single prompt, and cross-check its own outputs before delivering results. That self-correction loop is new for an OpenAI image model and meaningfully changes what image generation can be used for in production workflows.

"When a thinking or pro model is selected in ChatGPT, Images 2.0 can search the web for real-time information, create multiple distinct images from one prompt, and double-check its own outputs," OpenAI said in its announcement. "With thinking, the model can take on even more of the heavy lifting between idea and image, especially when accuracy, up-to-date information, consistency, and visual cohesion matter most."

Thinking mode also maintains character and object consistency across all eight images in a batch, a capability that previous models struggled with. That opens up storyboarding, manga production, and multi-scene design workflows that previously required stitching together separate generations.

The Developer API Angle

For developers, the headline detail is the API migration deadline. Any existing code that requests the dall-e-3 or dall-e-2 models will stop working after May 12. The gpt-image-2 model identifier replaces both. GPT Image 1.5 remains accessible via the API for legacy integrations but is no longer the default.
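A practical first step in a migration like this is finding every place a codebase still names the retired models. The following is a minimal sketch, not an official migration tool: the function names, the file-suffix list, and the idea of scanning source text are illustrative assumptions, and only the identifiers dall-e-2 and dall-e-3 come from the announcement.

```python
import re
from pathlib import Path

# Legacy model identifiers the article says are being retired.
LEGACY_MODEL_PATTERN = re.compile(r"\bdall-e-[23]\b")


def find_legacy_models(text: str) -> list[tuple[int, str]]:
    """Return (line_number, identifier) for each legacy model reference in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in LEGACY_MODEL_PATTERN.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


def scan_tree(
    root: str,
    suffixes: tuple[str, ...] = (".py", ".ts", ".js", ".json", ".yaml", ".yml"),
) -> dict[str, list[tuple[int, str]]]:
    """Scan files under root (hypothetical suffix list) and report legacy references."""
    report = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            hits = find_legacy_models(path.read_text(errors="ignore"))
            if hits:
                report[str(path)] = hits
    return report
```

Calling scan_tree(".") from a repository root returns a mapping from file path to the lines that still reference a retired model. Swapping the identifier alone may not be enough, since request parameters can differ between model families, so each hit is worth reviewing by hand.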

gpt-image-2 supports:

2K resolution via API (higher than DALL-E 3's 1792×1024 max)
Aspect ratios from 3:1 to 1:3, generating banner, portrait, square, and wide outputs without post-processing
Batch generation of up to 8 images with shared character and object continuity
Improved text rendering, including dense non-Latin scripts: Japanese, Korean, Chinese, Hindi, and Bengali
Knowledge cutoff of December 2025, enabling more accurate outputs for timely explainers and visual summaries
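
The aspect-ratio and batch limits in the list above lend themselves to a client-side pre-flight check. The sketch below assumes the limits exactly as reported here (ratios between 3:1 and 1:3, at most 8 images per request). The function name and the width, height, and n parameters are hypothetical, and the actual set of pixel sizes the API accepts is not stated in the announcement, so this checks only the ratio and the batch count.

```python
from fractions import Fraction

MAX_ASPECT = Fraction(3, 1)  # widest ratio cited in the article (3:1)
MIN_ASPECT = Fraction(1, 3)  # tallest ratio cited in the article (1:3)
MAX_BATCH = 8                # batch limit cited in the article


def check_request(width: int, height: int, n: int = 1) -> list[str]:
    """Return a list of problems; an empty list means the request fits the cited limits."""
    problems = []
    if width <= 0 or height <= 0:
        problems.append("width and height must be positive")
    else:
        # Exact rational arithmetic avoids float rounding at the 3:1 and 1:3 boundaries.
        ratio = Fraction(width, height)
        if not (MIN_ASPECT <= ratio <= MAX_ASPECT):
            problems.append(f"aspect ratio {width}:{height} is outside 3:1 to 1:3")
    if not (1 <= n <= MAX_BATCH):
        problems.append(f"n={n} is outside 1 to {MAX_BATCH}")
    return problems
```

For example, check_request(1536, 512, 8) returns an empty list because 1536:512 is exactly 3:1, while check_request(2048, 512) reports an out-of-range ratio.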

The model also handles the kinds of fine-grained elements that routinely broke DALL-E 3 in production: small inline text, UI elements, iconography, and dense compositions. "It can not only conceptualize more sophisticated images, but it actually brings that vision to life effectively, able to follow instructions, preserve requested details, and render the fine-grained elements that often break image models," OpenAI said.

Codex Integration

Images 2.0 is also available inside the Codex desktop app and environment, enabling visual creation within the same workspace used for app development, slide decks, and prototyping. Codex users do not need a separate API key — access is included with existing ChatGPT subscription tiers. The integration allows developers to generate UI directions and prototypes, compare options, and push results to live products without switching tools.

This makes gpt-image-2 the first OpenAI image model to sit directly inside a developer coding environment rather than requiring a separate ChatGPT session or standalone API call.

Competitive Context

The launch lands in a tight race. On the LM Arena text-to-image leaderboard as of early April, Google's Gemini image model held first place with OpenAI's gpt-image-1.5 in second. gpt-image-2 now claims the top spot with a 242-point lead across all leaderboard categories, according to Arena data. OpenAI is treating image generation as a primary interface layer — not a supplementary feature — with this release. The company notes that more than one billion images have been generated through ChatGPT to date.

Pressure is real from multiple directions: Google Gemini's image quality has been closing the gap; Adobe Firefly, Midjourney, and open-source models like FLUX have pushed text rendering forward through 2025 and 2026. Retiring DALL-E 2 and DALL-E 3 forces the ecosystem off legacy infrastructure and onto a model OpenAI believes can compete directly.

Limitations to Know

OpenAI acknowledges several areas where the model still struggles. Tasks requiring a coherent physical-world model — origami guides, Rubik's Cube diagrams, objects on reversed or angled surfaces — remain unreliable. Very fine or repetitive visual detail (grains of sand, tightly patterned textures) can exceed the model's fidelity limits. Labels and part diagrams may need manual review.

Wharton professor and AI researcher Ethan Mollick flagged a practical limitation for iterative workflows: edits work well for the first round or two, then progress stalls. His documented workaround is to drop the image into a fresh chat to reset context before continuing. For production pipelines built around iterative prompt refinement, this is worth accounting for in workflow design.
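
One way to account for this in a pipeline is to cap the number of edit rounds per conversation and carry the latest image into a fresh one. The sketch below illustrates that pattern under stated assumptions: start_session and apply_edit are hypothetical callables supplied by the caller (no real API is implied), and the default cap of two rounds simply mirrors Mollick's "first round or two" observation rather than any documented limit.

```python
from typing import Callable, TypeVar

Image = TypeVar("Image")
Session = TypeVar("Session")


def refine_with_resets(
    image: Image,
    edits: list[str],
    start_session: Callable[[Image], Session],
    apply_edit: Callable[[Session, str], Image],
    max_rounds_per_session: int = 2,
) -> Image:
    """Apply edits in order, opening a fresh session after every few rounds."""
    if max_rounds_per_session < 1:
        raise ValueError("max_rounds_per_session must be at least 1")
    session = None
    rounds_in_session = 0
    for instruction in edits:
        # Reset context by seeding a new session with the latest image.
        if session is None or rounds_in_session >= max_rounds_per_session:
            session = start_session(image)
            rounds_in_session = 0
        image = apply_edit(session, instruction)
        rounds_in_session += 1
    return image
```

Because the session handling is injected, the same loop can be exercised with stub functions in tests and wired to a real client later. The right cap is an empirical question for each workflow.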

What's Not Yet Confirmed

OpenAI has not disclosed the underlying architecture powering Images 2.0. At a press briefing this week, the company declined to say which model type drives generation: autoregressive, diffusion-based, or hybrid. API rate limits at scale have not been formally published. The full pricing table for quality and resolution tiers is available at openai.com/api/pricing, but enterprise volume commitments have not been announced.