Anthropic Releases Claude Sonnet 4.6 with Major Coding and Computer Use Improvements

The upgraded Sonnet model approaches Opus-level performance at a lower price point, with developers preferring it over previous flagship models in early testing.

Anthropic released Claude Sonnet 4.6 on Tuesday, calling it the company's "most capable Sonnet model yet" and citing significant upgrades in coding, computer use, and long-context reasoning.

The model is now the default for free and Pro plan users on claude.ai and Claude Cowork. Pricing is unchanged from its predecessor: $3 per million input tokens and $15 per million output tokens.
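For a rough sense of what those rates mean per request, the arithmetic can be sketched as below. This is a minimal illustration that assumes only the two list prices quoted above; the function name and the example token counts are hypothetical, and it ignores any discounts, caching, or tiered rates, none of which are described here.

```python
# Prices quoted in the article, in US dollars per one million tokens.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the quoted list prices.

    Hypothetical helper for illustration only; not part of any SDK.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK
    return input_cost + output_cost


# Hypothetical request: 200,000 input tokens and 4,000 output tokens.
example_cost = estimate_cost_usd(200_000, 4_000)
print(f"${example_cost:.2f}")  # prints "$0.66"
```

In this example the 200,000 input tokens cost $0.60 and the 4,000 output tokens cost $0.06, so input volume dominates the bill even though output tokens are priced five times higher.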

Developer Reception

In Anthropic's early-access testing, developers preferred Sonnet 4.6 over Sonnet 4.5 roughly 70% of the time. More notably, they preferred the new model over Claude Opus 4.5, Anthropic's frontier model from November, 59% of the time.

Users reported that Sonnet 4.6 "more effectively read the context before modifying code and consolidated shared logic rather than duplicating it," according to Anthropic. Testing also showed the model was "significantly less prone to overengineering and 'laziness,' and meaningfully better at instruction following."

Computer Use Advances

The model shows marked improvement on OSWorld, the standard benchmark for AI computer use that tests tasks across real software like Chrome, LibreOffice, and VS Code. Anthropic noted that early users are "seeing human-level capability in tasks like navigating a complex spreadsheet or filling out a multi-step web form."

Pace, an AI-powered insurance company, reported that "Claude Sonnet 4.6 hit 94% on our insurance benchmark, making it the highest-performing model we've tested for computer use."

Benchmark Performance

The model scored 79.6% on SWE-bench Verified, 89.9% on GPQA Diamond, and 58.3% on ARC-AGI-2. On Anthropic's benchmarks for agentic financial analysis and office tasks, Sonnet 4.6 outperformed competitors including Google's Gemini 3 Pro and OpenAI's GPT-5.2.

Replit noted that "the performance-to-cost ratio of Claude Sonnet 4.6 is extraordinary—it's hard to overstate how fast Claude models have been evolving in recent months."

Technical Features

Sonnet 4.6 includes a 1-million-token context window in beta, enough to hold entire codebases or dozens of research papers in a single request. The model also supports adaptive thinking, extended thinking, and context compaction, which automatically summarizes older context as a conversation approaches the context limit.
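The general idea behind context compaction can be illustrated with a toy sketch. This is not Anthropic's implementation or API: the word-count "tokenizer", the `compact` function, its parameters, and the placeholder summarizer are all assumptions made for illustration. It only shows the pattern described above, in which older messages are folded into a summary once the conversation nears a limit.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def compact(
    messages: List[str],
    limit: int,
    keep_recent: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Fold older messages into one summary when the total exceeds `limit`.

    Hypothetical helper: the newest `keep_recent` messages are kept verbatim,
    and everything before them is replaced by a single summary string.
    """
    total = sum(count_tokens(m) for m in messages)
    if total <= limit or len(messages) <= keep_recent:
        return messages  # within budget, or nothing old enough to fold
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    return [summarize(older)] + recent


def placeholder_summary(older: List[str]) -> str:
    """Stand-in for a model-written summary (assumption, not a real API call)."""
    return f"[summary of {len(older)} earlier messages]"


history = ["one two three", "four five six", "seven eight", "nine ten"]
compacted = compact(history, limit=8, keep_recent=2, summarize=placeholder_summary)
print(compacted)
# ['[summary of 2 earlier messages]', 'seven eight', 'nine ten']
```

In a real system the summary would be written by the model itself and token counts would come from the model's tokenizer; the sketch keeps both as simple stand-ins so the control flow is easy to follow.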

Anthropic's safety evaluations concluded that Sonnet 4.6 shows "a broadly warm, honest, prosocial, and at times funny character, very strong safety behaviors, and no signs of major concerns around high-stakes forms of misalignment."

The company acknowledged the model "still lags behind the most skilled humans at using computers" but emphasized that the rate of progress suggests "substantially more capable models are within reach."
