OpenAI Ships GPT-5.5: Agentic Coding Model Now Live in Codex

OpenAI launched GPT-5.5 on April 23. Its first fully retrained base model since GPT-4.5 is now live for paid ChatGPT and Codex users. With 82.7% on Terminal-Bench 2.0 and what OpenAI reports as state-of-the-art performance on long-horizon coding tasks, it sets a new bar for agentic developer tooling.

OpenAI Ships GPT-5.5: The Agentic Model Designed to Run Your Dev Workflows

OpenAI launched GPT-5.5 on Thursday, releasing its most capable model to date across ChatGPT and Codex for paid subscribers. The release, coming just six weeks after GPT-5.4, is the first fully retrained base model since GPT-4.5 — and unlike its recent predecessors, GPT-5.5 is explicitly built for agentic work: the kind of multi-step, multi-tool workflows where an AI needs to plan, execute, check its own output, and keep going without being micromanaged.

"GPT-5.5 understands what you're trying to do faster and can carry more of the work itself," OpenAI said in the launch announcement. "Instead of carefully managing every step, you can give GPT-5.5 a messy, multi-part task and trust it to plan, use tools, check its work, navigate through ambiguity, and keep going."

The gains are concentrated in four areas OpenAI identified as requiring long-horizon reasoning: agentic coding, computer use, knowledge work, and early scientific research. For developers, that means fewer reprompts, fewer mid-task corrections, and more end-to-end task completion in a single pass.

Benchmark Numbers That Matter for Developers

The headline figure for developers is Terminal-Bench 2.0, which tests a model's ability to handle complex command-line workflows involving planning and iterative tool use. GPT-5.5 scores 82.7% — ahead of Claude Opus 4.7 at 69.4% and Gemini 3.1 Pro at 68.5%.

On SWE-Bench Pro, which measures real-world GitHub issue resolution across four programming languages, GPT-5.5 resolves 58.6% of tasks end-to-end in a single pass. Anthropic's Claude Opus 4.7 scores higher at 64.3%, though OpenAI has flagged that "labs reported signs of memorization on a subset of those problems" — a caveat worth tracking as independent evaluations surface.

OpenAI also reports results on Expert-SWE, an internal benchmark measuring tasks with a median estimated human completion time of 20 hours. GPT-5.5 outperforms GPT-5.4 on that benchmark, which is the more relevant signal for developers running large refactors or multi-session feature builds through Codex agents.

For harder mathematical reasoning, GPT-5.5 Pro scored 39.6% on FrontierMath Tier 4 (postdoctoral-level math problems), roughly 1.7 times Claude Opus 4.7's 22.9%. OpenAI says a customized version of the model also assisted researchers in discovering a new mathematical proof related to Ramsey numbers.

What Changes Inside Codex

The most immediately relevant change for developers building with Codex is efficiency. OpenAI says GPT-5.5 delivers better results with fewer tokens than GPT-5.4 for most users — and despite being a more capable model, it matches GPT-5.4's per-token latency in real-world serving. Bigger, more capable models are usually slower, so this is a notable engineering result.

OpenAI said 4 million developers are now actively using Codex every week, up from 3 million just two weeks before the announcement. The company also disclosed that 9 million businesses are paying for ChatGPT and that GPT-5.5 has already been put to internal use: one team used it in Codex to analyze six months of data, build a scoring framework, and validate an automated Slack agent, while another used it to review 24,771 K-1 tax forms spanning 71,637 pages.

The model can also automatically figure out how to use an MCP server without the user providing explicit instructions — a concrete improvement over GPT-5.4 for developers who have built tool-use integrations and want agents to navigate them without hand-holding.

Availability and Pricing

As of April 23, GPT-5.5 is live for Plus, Pro, Business, and Enterprise users in ChatGPT and Codex. GPT-5.5 Pro — a more capable, higher-accuracy version — is rolling out to Pro, Business, and Enterprise users in ChatGPT.

API access was not live at launch. OpenAI says it is coming "very soon" and that API deployments require additional safeguards that are still being finalized with partners. When the API launches, pricing will be $5 per million input tokens and $30 per million output tokens, double the cost of GPT-5.4. GPT-5.5 Pro is priced at $30 per million input tokens and $180 per million output tokens.

OpenAI says the higher price is offset by token efficiency gains, framing it as a net wash or better for most Codex workloads. That claim is worth testing against your own use patterns before assuming it holds at scale.
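One way to test that claim is a quick break-even calculation. The sketch below is illustrative only: the GPT-5.5 prices are the announced figures quoted above, the GPT-5.4 prices ($2.50 and $15 per million tokens) are an assumption inferred from the "double the cost" statement, and the workload token counts are hypothetical placeholders to replace with your own usage data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD price per million input and output tokens."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the USD cost of a workload with the given token counts."""
        total = (input_tokens * self.input_per_million
                 + output_tokens * self.output_per_million)
        return total / 1_000_000


# Announced GPT-5.5 API prices, as quoted in the article.
GPT_5_5 = Pricing(input_per_million=5.00, output_per_million=30.00)
# ASSUMPTION: half of the GPT-5.5 prices, inferred from "double the cost".
GPT_5_4_ASSUMED = Pricing(input_per_million=2.50, output_per_million=15.00)


def break_even_token_ratio(old: Pricing, new: Pricing,
                           input_tokens: int, output_tokens: int) -> float:
    """Fraction of the old token usage the new model can consume
    (scaling input and output equally) before it costs more."""
    return old.cost(input_tokens, output_tokens) / new.cost(input_tokens, output_tokens)


# Hypothetical monthly workload: 2M input tokens, 400K output tokens.
INPUT_TOKENS, OUTPUT_TOKENS = 2_000_000, 400_000

old_cost = GPT_5_4_ASSUMED.cost(INPUT_TOKENS, OUTPUT_TOKENS)
new_cost_same_usage = GPT_5_5.cost(INPUT_TOKENS, OUTPUT_TOKENS)
ratio = break_even_token_ratio(GPT_5_4_ASSUMED, GPT_5_5, INPUT_TOKENS, OUTPUT_TOKENS)

print(f"Assumed GPT-5.4 cost:        ${old_cost:.2f}")
print(f"GPT-5.5 cost, same tokens:   ${new_cost_same_usage:.2f}")
print(f"Break-even token ratio:      {ratio:.2f}")
```

Under these assumptions, GPT-5.5 has to finish the same work with no more than half the tokens GPT-5.4 used for the bill to stay flat. Measuring actual token counts per completed task on both models is the only way to know whether a given workload clears that threshold.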

Safety Posture

OpenAI classified GPT-5.5 as meeting a "High" cybersecurity risk threshold — meaning it could amplify existing threats if misused — but said it does not cross the "Critical" threshold associated with severe harm. The company conducted extensive third-party red-teaming and safeguard testing before release, and worked with nearly 200 trusted early-access partners across real use cases. The safety posture comes as AI cybersecurity capabilities have been under scrutiny across the industry following Anthropic's limited rollout of its more powerful Claude Mythos Preview model.

What's Unconfirmed

OpenAI has not disclosed a specific timeline for API availability beyond "very soon," nor has it published detailed system cards covering the full scope of GPT-5.5's capabilities and failure modes. Independent third-party evaluation of the SWE-Bench Pro and Terminal-Bench 2.0 scores, particularly the memorization caveat raised about Anthropic's competing results, has not yet been completed. Developers should treat the benchmark comparisons as directional signals rather than settled fact until external evaluations replicate them.
