Google I/O 2026: Gemini 3.5 Flash Tops 3.1 Pro on Coding Benchmarks

Google opened its annual I/O developer conference on May 19 with the launch of Gemini 3.5 Flash, a model the company calls its "strongest agentic and coding model yet." It is available now through the Gemini API, Google AI Studio, Android Studio, and Antigravity, and it is the new default model in the Gemini app and in AI Mode in Google Search worldwide.

The announcement is a direct bid for the developer workloads that have shifted toward agentic coding over the past year. Google is explicitly positioning 3.5 Flash for long-horizon task execution, coding agents, MCP-based tool orchestration, and multi-step workflows — the exact terrain where Claude Code, OpenAI Codex, and Windsurf have been competing.

Benchmark Numbers That Matter

Google published three benchmark results alongside the launch:

Terminal-Bench 2.1 (coding): 76.2%
GDPval-AA Elo (real-world agentic tasks): 1,656 — up sharply from Gemini 3.1 Pro's 1,317
MCP Atlas (scaled tool use): 83.6%

The GDPval-AA jump is the most significant number here. Going from 1,317 to 1,656 is a 339-point improvement, and a gain of that size on the primary agentic benchmark in roughly three months is a step change, not incremental progress. For developers building agent-driven products on the Gemini API, it should translate into better task completion rates and fewer broken multi-step workflows in production.
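To put the 339-point gap in more familiar terms, the sketch below converts an Elo difference into an expected head-to-head score. It assumes GDPval-AA ratings follow the standard logistic Elo model with a 400-point scale, which the launch materials cited here do not confirm; the function name is illustrative.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard logistic Elo model.

    Assumption: GDPval-AA uses the conventional 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


flash_35 = 1656  # GDPval-AA Elo reported for Gemini 3.5 Flash
pro_31 = 1317    # GDPval-AA Elo reported for Gemini 3.1 Pro

gap = flash_35 - pro_31
expected = elo_expected_score(flash_35, pro_31)

print(f"Elo gap: {gap}")
print(f"Expected head-to-head score for 3.5 Flash: {expected:.3f}")
```

Under that assumption, a 339-point lead corresponds to an expected score of roughly 0.88, meaning 3.5 Flash would be preferred in close to nine of ten pairwise comparisons against 3.1 Pro.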

Google also claims 3.5 Flash generates output tokens four times faster than comparable frontier models and costs less than half as much. The company did not publish exact per-token pricing at launch, but described the efficiency case as meaningful enough that enterprises shifting 80 percent of AI workloads to a Flash-heavy model mix could save more than $1 billion a year at very high token volumes.
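The savings claim can be sanity-checked with simple arithmetic. The sketch below is a back-of-the-envelope model, not Google's calculation: it assumes the Flash price is exactly half the baseline price (the upper bound of "less than half"), that token volume stays the same after the shift, and that speed has no effect on cost. The function and constant names are hypothetical.

```python
def blended_savings_fraction(flash_share: float, price_ratio: float) -> float:
    """Fraction of baseline spend saved when `flash_share` of token volume
    moves to a model priced at `price_ratio` times the baseline model."""
    if not 0.0 <= flash_share <= 1.0:
        raise ValueError("flash_share must be between 0 and 1")
    if price_ratio < 0.0:
        raise ValueError("price_ratio must be non-negative")
    return flash_share * (1.0 - price_ratio)


FLASH_SHARE = 0.80   # from the article: 80 percent of workloads shifted
PRICE_RATIO = 0.50   # assumption: upper bound of "less than half the cost"
TARGET_SAVINGS = 1_000_000_000  # from the article: $1 billion a year

saved = blended_savings_fraction(FLASH_SHARE, PRICE_RATIO)
implied_baseline = TARGET_SAVINGS / saved

print(f"Savings as a share of baseline spend: {saved:.0%}")
print(f"Baseline annual spend needed to save $1B: ${implied_baseline / 1e9:.2f}B")
```

With these inputs the shift saves 40 percent of baseline spend, so the $1 billion figure implies annual AI spend of about $2.5 billion. A price ratio below 0.5 lowers that threshold, which is consistent with Google's "very high token volumes" qualifier.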

Available Now Across Google's Developer Stack

Gemini 3.5 Flash is generally available today through:

Gemini API in Google AI Studio
Android Studio
Google Antigravity — Google's agentic coding harness for orchestrating subagents in parallel across long-running tasks
Gemini Enterprise Agent Platform and Gemini Enterprise for business teams
Gemini app and AI Mode in Search worldwide for all users

Antigravity gets particular attention in Google's launch materials. The 3.5 Flash model is explicitly designed to work with Antigravity's subagent framework, which can deploy multiple agents in parallel and sustain complex workflows over extended periods. Google cited a real-world example of two agents collaborating to transform a legacy codebase to Next.js as a demonstration of what the model enables at the task-planning layer. Developers already using Antigravity are getting a meaningful capability upgrade, not just a model version bump.

Gemini 3.5 Pro Is Coming Next Month

Google confirmed a Gemini 3.5 Pro model is in development and already in internal use. The company said it expects to roll it out next month but gave no benchmark figures or pricing for the Pro tier. Given that 3.5 Flash already surpasses 3.1 Pro on coding and agentic tests, the Pro release will likely push the frontier further. Developers planning model selection for upcoming projects should factor the Pro release into their timeline.

Gemini Spark: Personal Agent Launches Alongside

Google also announced Gemini Spark at I/O — a general-purpose AI agent built on 3.5 Flash that takes action across a user's connected apps. Spark integrates with Gmail, Docs, and other Workspace tools, with MCP-based connections to third-party services arriving later this summer.

Spark is in beta and rolling out next week to Google AI Ultra subscribers in the US. Alongside this, Google repriced its AI Ultra subscription: it now starts at $100 per month, with a higher-end tier at $200 per month.

Scale That Changes the Distribution Math

Google reported that more than 8.5 million developers build with its AI models each month. Gemini's overall user base has grown from 400 million monthly active users at last year's I/O to over 900 million today across 230 countries and 70 languages. That installed base is Google's structural advantage: model upgrades ship instantly to a developer population already embedded in its toolchain.

For teams currently on the Gemini API, the upgrade path is straightforward: Google is promising better coding and agent performance at lower cost and higher speed, with no migration required. The benchmark gap between 3.5 Flash and 3.1 Pro is wide enough to make the evaluation worth running even if you are not already building on Google's stack.
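One lightweight way to run that evaluation is to score two models on the same set of tasks and compare pass rates. The sketch below is a minimal harness under stated assumptions: the two model functions are local stubs standing in for real API calls, and the tasks and checkers are toy placeholders, not anything from Google's benchmarks.

```python
from typing import Callable, Iterable

# A task pairs a prompt with a checker that decides whether the output passes.
Task = tuple[str, Callable[[str], bool]]


def pass_rate(model: Callable[[str], str], tasks: Iterable[Task]) -> float:
    """Run every task through `model` and return the fraction that pass."""
    results = [check(model(prompt)) for prompt, check in tasks]
    return sum(results) / len(results) if results else 0.0


# Hypothetical stand-ins: replace these with calls to the models under test.
def candidate_model(prompt: str) -> str:
    return prompt.upper()


def baseline_model(prompt: str) -> str:
    return prompt


# Toy tasks; a real suite would use your own production prompts and checks.
tasks: list[Task] = [
    ("abc", lambda out: out == "ABC"),
    ("hello", lambda out: out.isupper()),
    ("42", lambda out: out == "42"),
]

candidate_rate = pass_rate(candidate_model, tasks)
baseline_rate = pass_rate(baseline_model, tasks)

print(f"candidate: {candidate_rate:.2f}, baseline: {baseline_rate:.2f}")
```

Keeping the task list and checkers fixed while swapping only the model function makes the comparison apples to apples, and the same harness can be reused when Gemini 3.5 Pro arrives.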