AI Agents, News & Updates

Google I/O 2026: Gemini 3.5 Flash Tops Pro on Coding Benchmarks

Google ships Gemini 3.5 Flash at I/O 2026 — outperforming Gemini 3.1 Pro on coding and agentic benchmarks at 4× the speed and less than half the cost, available now in the Gemini API and Antigravity.

3 min read
Google I/O 2026: Gemini 3.5 Flash Tops Pro on Coding Benchmarks

Image by Google

Google I/O 2026: Gemini 3.5 Flash Tops Pro on Coding Benchmarks

Google opened its annual I/O developer conference on May 19 with the launch of Gemini 3.5 Flash, a model the company is calling its "strongest agentic and coding model yet." It is available now via the Gemini API, Google AI Studio, Android Studio, and Antigravity — and it is now the default model inside the Gemini app and AI Mode in Google Search worldwide.

The announcement is a direct bid for the developer workloads that have shifted toward agentic coding over the past year. Google is explicitly positioning 3.5 Flash for long-horizon task execution, coding agents, MCP-based tool orchestration, and multi-step workflows — the exact terrain where Claude Code, OpenAI Codex, and Windsurf have been competing.

Benchmark Numbers That Matter

Google published three benchmark results alongside the launch:

  • Terminal-Bench 2.1 (coding): 76.2%
  • GDPval-AA Elo (real-world agentic tasks): 1,656 — up sharply from Gemini 3.1 Pro's 1,317
  • MCP Atlas (scaled tool use): 83.6%

The GDPval-AA jump is the most significant number here. A 339-point improvement on the primary agentic benchmark in roughly three months is a step-change, not incremental progress. For developers building agent-driven products on the Gemini API, this translates directly into better task completion rates and fewer broken multi-step workflows in production.

Google also claims 3.5 Flash runs at four times the output token speed of other comparable frontier models and is priced at less than half the cost. The company did not publish exact per-token pricing at launch, but described the efficiency case as meaningful enough that enterprises shifting 80 percent of AI workloads to a Flash-heavy model mix could save more than $1 billion a year at very high token volumes.

Available Now Across Google's Developer Stack

Gemini 3.5 Flash is generally available today through:

  • Gemini API in Google AI Studio
  • Android Studio
  • Google Antigravity — Google's agentic coding harness for orchestrating subagents in parallel across long-running tasks
  • Gemini Enterprise Agent Platform and Gemini Enterprise for business teams
  • Gemini app and AI Mode in Search worldwide for all users

Antigravity gets particular attention in Google's launch materials. The 3.5 Flash model is explicitly designed to work with Antigravity's subagent framework, which can deploy multiple agents in parallel and sustain complex workflows over extended periods. Google cited a real-world example of two agents collaborating to transform a legacy codebase to Next.js as a demonstration of what the model enables at the task-planning layer. Developers already using Antigravity are getting a meaningful capability upgrade, not just a model version bump.

Gemini 3.5 Pro Is Coming Next Month

Google confirmed a Gemini 3.5 Pro model is in development and already in internal use. The company said it expects to roll it out next month but gave no benchmark figures or pricing for the Pro tier. Given that 3.5 Flash already surpasses 3.1 Pro on coding and agentic tests, the Pro release will likely push the frontier further. Developers planning model selection for upcoming projects should factor the Pro release into their timeline.

Gemini Spark: Personal Agent Launches Alongside

Google also announced Gemini Spark at I/O — a general-purpose AI agent built on 3.5 Flash that takes action across a user's connected apps. Spark integrates with Gmail, Docs, and other Workspace tools, with MCP-based connections to third-party services arriving later this summer.

Spark is in beta and rolling out next week to Google AI Ultra subscribers in the US. Alongside this, Google repriced its AI Ultra subscription: it now starts at $100 per month, with a higher-end tier at $200 per month.

Scale That Changes the Distribution Math

Google reported that more than 8.5 million developers build with its AI models each month. Gemini's overall user base has grown from 400 million monthly active users at last year's I/O to over 900 million today across 230 countries and 70 languages. That installed base is Google's structural advantage: model upgrades ship instantly to a developer population already embedded in its toolchain.

For teams currently on the Gemini API, the upgrade path is straightforward — better coding and agent performance at lower cost and higher speed, with no migration required. The benchmark gap between 3.5 Flash and 3.1 Pro is wide enough to make the evaluation worth running even if you are not already building on Google's stack.

Share:

Other Latest News

Cursor Brings Cloud Agents to Jira With Native Work Item Integration
AI Agents, News & Updates, Code Editors

Cursor Brings Cloud Agents to Jira With Native Work Item Integration

Cursor now lets teams assign Jira tickets directly to a cloud agent or mention @Cursor in any comment to trigger a task — completing the loop between where work is tracked and where it gets done.

May 20, 2026
Anthropic Launches MCP Tunnels and Self-Hosted Agent Sandboxes
AI Agents, News & Updates

Anthropic Launches MCP Tunnels and Self-Hosted Agent Sandboxes

Anthropic debuts self-hosted sandboxes and MCP tunnels for Claude Managed Agents at Code with Claude London, letting enterprises run agent tool execution inside their own infrastructure perimeter.

May 20, 2026
Cursor Ships Composer 2.5: Smarter Agent Model for Long-Running Tasks
AI Agents, News & Updates, Code Editors

Cursor Ships Composer 2.5: Smarter Agent Model for Long-Running Tasks

Cursor releases its next in-house coding model, Composer 2.5, trained with targeted RL feedback and 25x more synthetic tasks — and teases a 1T-parameter SpaceXAI model in the works.

May 19, 2026
Google I/O 2026 Opens Today: Agentic Coding and New Gemini on Tap
AI Agents, News & Updates

Google I/O 2026 Opens Today: Agentic Coding and New Gemini on Tap

Google I/O 2026 kicks off today at 10am PT with agentic coding and a major Gemini model update officially on the agenda, as Google challenges Claude Code, Cursor, and OpenAI Codex for developer toolchain dominance.

May 19, 2026
Anthropic Closes In on $900B Valuation as Claude Code Hits $2.5B ARR
AI Agents, News & Updates

Anthropic Closes In on $900B Valuation as Claude Code Hits $2.5B ARR

Anthropic has agreed terms on a new $30B round at a $900B+ valuation led by Dragoneer, Greenoaks, Sequoia, and Altimeter — set to overtake OpenAI — driven by Claude Code's $2.5B ARR and explosive enterprise developer adoption.

May 19, 2026
OpenAI Merges ChatGPT, Codex, and API Into One Agentic Platform
AI Agents, News & Updates

OpenAI Merges ChatGPT, Codex, and API Into One Agentic Platform

OpenAI is collapsing ChatGPT, Codex, and its developer API into a single product team under Greg Brockman, with Codex chief Thibault Sottiaux leading a unified super app ahead of a potential IPO.

May 18, 2026
← Scroll for more →