OpenAI Reasoning Model Cracks 80-Year Math Problem, Signaling Codex Leap

An internal OpenAI general-purpose reasoning model disproved a famous Erdős conjecture open since 1946 — a first for autonomous AI in frontier mathematics, with direct implications for what is coming to Codex and agentic coding tools.

An internal OpenAI reasoning model has disproved a central conjecture in discrete geometry that has gone unsolved for nearly 80 years — and the result is being called the first time AI has autonomously solved a prominent open problem at the frontier of a field of mathematics.

The problem is the planar unit distance problem, first posed by Hungarian mathematician Paul Erdős in 1946. It asks: given n points in a flat plane, what is the largest number of pairs that can sit exactly one unit apart? For decades, mathematicians believed that square grid arrangements were essentially the best possible construction. The OpenAI model discovered an entirely new infinite family of constructions that outperform the grid, and proved that they do. The 125-page proof has been independently verified by a group of external mathematicians, including Fields medalist Tim Gowers and Princeton combinatorialist Noga Alon, who described the unit distance problem as "one of Erdős's favorite problems."
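To make the question concrete, here is a minimal Python sketch that counts repeated distances in a small square grid, the classical construction mentioned above. It is an illustration only and has nothing to do with the model's new constructions; the function names (`square_grid`, `squared_distance_counts`, `best_repeated_distance`) are hypothetical helpers introduced for this example. It relies on the fact that rescaling a point set turns any repeated distance into a unit distance, so the most frequent distance in the grid is the one worth counting.

```python
from collections import Counter
from itertools import combinations


def square_grid(k):
    """Return the k*k integer points (x, y) with 0 <= x, y < k."""
    return [(x, y) for x in range(k) for y in range(k)]


def squared_distance_counts(points):
    """Count how many point pairs realize each squared distance.

    Squared distances keep the arithmetic in exact integers.
    """
    return Counter(
        (x1 - x2) ** 2 + (y1 - y2) ** 2
        for (x1, y1), (x2, y2) in combinations(points, 2)
    )


def best_repeated_distance(points):
    """Return (squared_distance, pair_count) for the most frequent distance.

    Rescaling the point set so that this distance equals 1 turns
    pair_count into a number of unit-distance pairs.
    """
    return squared_distance_counts(points).most_common(1)[0]


if __name__ == "__main__":
    # Brute force over all pairs, so keep k small.
    for k in (2, 3, 10, 20):
        d2, pairs = best_repeated_distance(square_grid(k))
        print(f"{k}x{k} grid: {k * k} points, best d^2 = {d2}, pairs = {pairs}")
```

On very small grids the distance between adjacent points is the most frequent one. On larger grids a distance realized by more lattice vectors tends to overtake it, which is the intuition behind choosing the scale of the grid carefully.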

What makes the result unusual is not just what was solved, but how. The model that produced it is a new general-purpose reasoning model — not a system fine-tuned for mathematics or targeted at Erdős problems. OpenAI says it evaluated the model on a set of open Erdős problems as part of broader research work, and this result emerged from that evaluation. The unexpected connection the model identified — bridging algebraic number theory and discrete geometry in a way human experts had not previously explored — is what has the mathematical community paying attention.

What This Means for Codex and AI Coding Tools

For developers, the relevance lies not in the geometry but in what the result reveals about general reasoning capability. The model that produced a verifiable 125-page mathematical proof, through sustained multi-step reasoning on a novel problem, belongs to the same class of reasoning models that power tools like Codex. When OpenAI's reasoning models can connect ideas across distant domains, hold together complex arguments, and produce work that survives expert scrutiny, those capabilities can be expected to flow into the next generation of coding tools.

Codex today runs on GPT-5.5, which already showed significant jumps over prior generations on long-horizon coding tasks and large-scale refactors. If the model behind this mathematical breakthrough is part of the next reasoning stack, developers should expect meaningful capability upgrades in the areas Codex already handles — debugging, codebase migration, and multi-step autonomous task completion — when that model is deployed into production.

OpenAI explicitly noted the broader implication: "If a model can keep a complicated argument coherent, connect ideas across distant areas of knowledge, and produce work that survives expert scrutiny, those are also useful abilities in biology, physics, materials science, engineering, and medicine — and they are part of our longer-term path toward more automated research."

How the Proof Was Found

The model was not scaffolded specifically for the problem. OpenAI describes testing a general-purpose reasoning model across a variety of Erdős problems as part of evaluating whether AI can contribute to frontier research. The unit distance problem was one of them. Princeton mathematician Will Sawin independently refined the model's result, showing that the improvement over the square grid could be expressed with a fixed positive exponent — which means the discovery opens a new branch of geometry, not just a single edge case.
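In symbols, the "fixed positive exponent" claim can be read roughly as follows. This is a hedged paraphrase, not a statement taken from the proof: u(n) denotes the largest number of unit-distance pairs among n points in the plane, c is the constant from Erdős's 1946 grid construction, and δ is a placeholder for the fixed exponent, whose value is not given in this article.

```latex
% u(n): maximum number of pairs at distance exactly 1 among n points in the plane.
% Erdős (1946), suitably scaled square grid, for some constant c > 0:
u(n) \;\ge\; n^{\,1 + c/\log\log n}
% Conjecture (now reported disproved): u(n) \le n^{\,1 + o(1)}.
% New constructions, with \delta > 0 fixed (value not stated here):
u(n) \;\ge\; n^{\,1 + \delta} \quad \text{for infinitely many } n
```

Because c / log log n tends to zero as n grows, any fixed δ > 0 eventually exceeds it, so a bound of the second form beats the grid bound by a polynomial factor rather than a slowly shrinking one.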

Tim Gowers commented that the result has likely taught the field something new about the role of algebraic number theory in geometric combinatorics, and predicted that algebraic number theorists will now revisit other open problems in discrete geometry. The companion paper written by external mathematicians paints, in Gowers' words, "a substantially richer picture than the original solution alone."

OpenAI's Track Record on Math Claims

It is worth applying appropriate skepticism here. In October 2025, OpenAI's then-VP Kevin Weil claimed on X that GPT-5 had solved ten previously unsolved Erdős problems. Mathematicians quickly challenged the claim, among them Thomas Bloom, who has now vouched for this result. Weil deleted the post. He left OpenAI in April 2026.

This time, OpenAI published the full proof, commissioned companion commentary from multiple respected mathematicians, and had the work verified before the announcement. The presence of Gowers and Bloom, two of the researchers who pushed back hardest on the prior claim, as supporting voices carries real weight. The proof has not yet completed formal peer review, but external verification ahead of the announcement is a meaningful step.

For developers, the geometry itself is secondary. The signal is that OpenAI's reasoning stack is now producing verifiable novel research results — and those same reasoning capabilities will be coming to your coding agent.
