AI Agents, News & Updates

Google I/O 2026: Gemini 3.5 Flash Tops Pro on Coding Benchmarks

Google ships Gemini 3.5 Flash at I/O 2026 — outperforming Gemini 3.1 Pro on coding and agentic benchmarks at 4× the speed and less than half the cost, available now in the Gemini API and Antigravity.

3 min read
Google I/O 2026: Gemini 3.5 Flash Tops Pro on Coding Benchmarks

Image by Google

Google I/O 2026: Gemini 3.5 Flash Tops Pro on Coding Benchmarks

Google opened its annual I/O developer conference on May 19 with the launch of Gemini 3.5 Flash, a model the company is calling its "strongest agentic and coding model yet." It is available now via the Gemini API, Google AI Studio, Android Studio, and Antigravity — and it is now the default model inside the Gemini app and AI Mode in Google Search worldwide.

The announcement is a direct bid for the developer workloads that have shifted toward agentic coding over the past year. Google is explicitly positioning 3.5 Flash for long-horizon task execution, coding agents, MCP-based tool orchestration, and multi-step workflows — the exact terrain where Claude Code, OpenAI Codex, and Windsurf have been competing.

Benchmark Numbers That Matter

Google published three benchmark results alongside the launch:

  • Terminal-Bench 2.1 (coding): 76.2%
  • GDPval-AA Elo (real-world agentic tasks): 1,656 — up sharply from Gemini 3.1 Pro's 1,317
  • MCP Atlas (scaled tool use): 83.6%

The GDPval-AA jump is the most significant number here. A 339-point improvement on the primary agentic benchmark in roughly three months is a step-change, not incremental progress. For developers building agent-driven products on the Gemini API, this translates directly into better task completion rates and fewer broken multi-step workflows in production.

Google also claims 3.5 Flash runs at four times the output token speed of other comparable frontier models and is priced at less than half the cost. The company did not publish exact per-token pricing at launch, but described the efficiency case as meaningful enough that enterprises shifting 80 percent of AI workloads to a Flash-heavy model mix could save more than $1 billion a year at very high token volumes.

Available Now Across Google's Developer Stack

Gemini 3.5 Flash is generally available today through:

  • Gemini API in Google AI Studio
  • Android Studio
  • Google Antigravity — Google's agentic coding harness for orchestrating subagents in parallel across long-running tasks
  • Gemini Enterprise Agent Platform and Gemini Enterprise for business teams
  • Gemini app and AI Mode in Search worldwide for all users

Antigravity gets particular attention in Google's launch materials. The 3.5 Flash model is explicitly designed to work with Antigravity's subagent framework, which can deploy multiple agents in parallel and sustain complex workflows over extended periods. Google cited a real-world example of two agents collaborating to transform a legacy codebase to Next.js as a demonstration of what the model enables at the task-planning layer. Developers already using Antigravity are getting a meaningful capability upgrade, not just a model version bump.

Gemini 3.5 Pro Is Coming Next Month

Google confirmed a Gemini 3.5 Pro model is in development and already in internal use. The company said it expects to roll it out next month but gave no benchmark figures or pricing for the Pro tier. Given that 3.5 Flash already surpasses 3.1 Pro on coding and agentic tests, the Pro release will likely push the frontier further. Developers planning model selection for upcoming projects should factor the Pro release into their timeline.

Gemini Spark: Personal Agent Launches Alongside

Google also announced Gemini Spark at I/O — a general-purpose AI agent built on 3.5 Flash that takes action across a user's connected apps. Spark integrates with Gmail, Docs, and other Workspace tools, with MCP-based connections to third-party services arriving later this summer.

Spark is in beta and rolling out next week to Google AI Ultra subscribers in the US. Alongside this, Google repriced its AI Ultra subscription: it now starts at $100 per month, with a higher-end tier at $200 per month.

Scale That Changes the Distribution Math

Google reported that more than 8.5 million developers build with its AI models each month. Gemini's overall user base has grown from 400 million monthly active users at last year's I/O to over 900 million today across 230 countries and 70 languages. That installed base is Google's structural advantage: model upgrades ship instantly to a developer population already embedded in its toolchain.

For teams currently on the Gemini API, the upgrade path is straightforward — better coding and agent performance at lower cost and higher speed, with no migration required. The benchmark gap between 3.5 Flash and 3.1 Pro is wide enough to make the evaluation worth running even if you are not already building on Google's stack.

Share:

Other Latest News

Gemini Spark Comes to Mac With Local File Access and MCP Support
AI Agents, News & Updates

Gemini Spark Comes to Mac With Local File Access and MCP Support

Google has added Gemini Spark to the Gemini desktop app for macOS, giving it access to local files, MCP server connections, real-time topic tracking, and new integrations with Canva, Dropbox, and more.

Jul 3, 2026
Google Drops ADK 2.0 and Genkit Agents for AI App Builders
AI Agents, News & Updates

Google Drops ADK 2.0 and Genkit Agents for AI App Builders

Google published three developer-facing launches in 24 hours: ADK 2.0's graph-based workflow runtime, the Genkit Agents API for full-stack conversational AI, and a Google Cloud Workbench VS Code extension — a coordinated push to own the agent-development stack.

Jul 3, 2026
Fable 5 Returns Globally as US Lifts Export Controls on Anthropic
News & Updates, AI Agents, Security

Fable 5 Returns Globally as US Lifts Export Controls on Anthropic

Anthropic restored global access to Claude Fable 5 on July 1 after the US Commerce Department lifted the 19-day export controls. Claude Code and Claude.ai users get the model back with a new 99%+ jailbreak classifier and an industry-wide safety framework now in development.

Jul 2, 2026
Anthropic Launches Claude Sonnet 5 With Near-Opus Coding Performance
AI Agents, News & Updates

Anthropic Launches Claude Sonnet 5 With Near-Opus Coding Performance

Anthropic's new Claude Sonnet 5 brings near-Opus 4.8 agentic coding performance at Sonnet pricing, with a 1M-token context window and introductory rates of $2/$10 per million tokens through August 31.

Jul 1, 2026
GPT-5.6 Is Already Running in Some Codex Sessions
News & Updates, AI Agents

GPT-5.6 Is Already Running in Some Codex Sessions

Developers have found a technique to detect whether GPT-5.6 Sol is already serving their Codex sessions — and some sessions are returning signals consistent with the new model before any public rollout.

Jun 30, 2026
Cursor Launches iOS App in Public Beta for All Paid Plans
Code Editors, News & Updates, AI Agents

Cursor Launches iOS App in Public Beta for All Paid Plans

Cursor's native iOS app is now in public beta on all paid plans, letting developers launch cloud agents, remote-control desktop sessions, and merge PRs directly from their phone.

Jun 30, 2026
← Scroll for more →