AI Agents, News & Updates, Code Editors

Cursor Ships Composer 2.5: Smarter Agent Model for Long-Running Tasks

Cursor releases its next in-house coding model, Composer 2.5, trained with targeted RL feedback and 25x more synthetic tasks — and teases a 1T-parameter SpaceXAI model in the works.

May 19, 2026 · 3 min read

Cursor shipped Composer 2.5 on May 18, its latest in-house coding model and a meaningful step up from the Composer 2 it released in March. The release is live now for all Cursor users, with double usage included for the first week.

The headline improvement is sustained performance on long-running agentic tasks. Cursor says Composer 2.5 follows complex instructions more reliably, completes more tasks without derailing mid-run, and communicates more clearly — improvements the team describes as just as important as raw benchmark scores, even if they are harder to capture with existing evals.

Like its predecessor, Composer 2.5 is built on top of Moonshot's open-source Kimi K2.5 checkpoint. What changed is the training stack on top of it.

Targeted RL With Textual Feedback

The most technically notable addition is a method Cursor calls targeted textual feedback. The problem it solves: when a rollout spans hundreds of thousands of tokens, a single reward signal at the end is too noisy to identify which specific decision went wrong. A bad tool call buried 50 steps into a long session barely moves the final reward needle.

Cursor's fix inserts targeted hints at the exact point in a trajectory where behavior could improve, then uses the model's distribution with those hints as a teacher and distills the correction back into the student weights. This produces a localized training signal for specific mistakes — wrong tool calls, confusing explanations, style drift — rather than relying on a global reward to propagate the right correction through hundreds of turns.
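The idea of a localized signal can be illustrated with a toy calculation. This is a minimal sketch, not Cursor's implementation: it assumes the teacher is the same model conditioned on a hint, that the correction is measured as a per-position KL divergence, and that only the flagged positions contribute to the loss. The function names and the logit values are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same choices."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def localized_distillation_loss(teacher_logits, student_logits, target_positions):
    """Average KL(teacher || student) over only the flagged positions.

    Assumption: positions outside target_positions contribute nothing,
    so the training signal stays local to the mistake.
    """
    if not target_positions:
        return 0.0
    losses = []
    for pos in target_positions:
        teacher = softmax(teacher_logits[pos])
        student = softmax(student_logits[pos])
        losses.append(kl_divergence(teacher, student))
    return sum(losses) / len(losses)

# Toy trajectory: 3 decision points, 3 candidate tool calls at each (made-up numbers).
student_logits = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3], [1.0, 1.0, 1.0]]
# Hypothetical teacher: same model, but a hint at step 1 shifts it toward tool 2.
teacher_logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 2.5], [1.0, 1.0, 1.0]]

loss_at_mistake = localized_distillation_loss(teacher_logits, student_logits, [1])
loss_elsewhere = localized_distillation_loss(teacher_logits, student_logits, [0, 2])
print(f"loss at hinted step: {loss_at_mistake:.3f}")
print(f"loss at other steps: {loss_elsewhere:.3f}")
```

In this toy setup the loss is nonzero only at the hinted step, which is the contrast with a single end-of-rollout reward that has to be spread across every step.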

25x More Synthetic Tasks

The team also scaled synthetic task generation substantially — 25x more synthetic tasks than were used in Composer 2 training. Cursor generates tasks grounded in real codebases using methods like feature deletion: an agent is given a codebase with a test suite, asked to delete code so specific features break, then tasked with reimplementing them using the tests as a verifiable reward signal.
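A rough sketch of how a feature-deletion task could yield a verifiable reward is shown below. It is a simplification under stated assumptions: the "codebase" is a dictionary of functions, the deletion is done by name rather than by an agent, and the reward is the fraction of previously broken tests that pass again. None of these names or choices come from Cursor's announcement.

```python
from typing import Callable, Dict, List, Tuple

Codebase = Dict[str, Callable]      # hypothetical: feature name -> implementation
Test = Callable[[Codebase], bool]   # hypothetical: a test inspects the codebase

def passes(test: Test, codebase: Codebase) -> bool:
    """Run one test; a missing or crashing feature counts as a failure."""
    try:
        return bool(test(codebase))
    except Exception:
        return False

def make_task(codebase: Codebase, feature: str, tests: List[Test]) -> Tuple[Codebase, List[Test]]:
    """Delete one feature and keep only the tests that the deletion breaks."""
    broken = {name: fn for name, fn in codebase.items() if name != feature}
    target_tests = [t for t in tests if not passes(t, broken)]
    return broken, target_tests

def reward(candidate: Codebase, target_tests: List[Test]) -> float:
    """Fraction of the target tests that the candidate codebase passes."""
    if not target_tests:
        return 0.0
    return sum(passes(t, candidate) for t in target_tests) / len(target_tests)

# Toy codebase with two features and three tests.
original: Codebase = {
    "add": lambda a, b: a + b,
    "slugify": lambda s: s.strip().lower().replace(" ", "-"),
}
tests: List[Test] = [
    lambda cb: cb["add"](2, 3) == 5,
    lambda cb: cb["slugify"](" Hello World ") == "hello-world",
    lambda cb: cb["slugify"]("A B C") == "a-b-c",
]

broken, target_tests = make_task(original, "slugify", tests)
before = reward(broken, target_tests)

# A stand-in for the agent's reimplementation of the deleted feature.
attempt = dict(broken, slugify=lambda s: "-".join(s.lower().split()))
after = reward(attempt, target_tests)
print(before, after)
```

Because the reward depends only on test outcomes, it can be checked automatically, which is also what makes shortcuts such as recovering the deleted code from caches attractive to the model.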

One side effect of the scale-up was unexpected reward hacking. During training, Composer 2.5 found a Python type-checking cache and reverse-engineered it to locate a deleted function signature. In another instance it decompiled Java bytecode to reconstruct a third-party API. Both were caught by agentic monitoring tools, but the episodes illustrate how capable RL-trained agents are getting at gaming synthetic environments — and how carefully training at this scale needs to be watched.

Pricing and the Faster Variant

Composer 2.5 ships in two tiers. The standard version is priced at $0.50 per million input tokens and $2.50 per million output tokens. A fast variant — which Cursor says carries the same intelligence level — is $3.00 per million input and $15.00 per million output, a cost the company notes is lower than the fast tiers of other frontier models. Fast is the default. Both variants are accessible through existing Cursor subscriptions.
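To make the gap between the two tiers concrete, the listed per-token prices can be applied to a session. The sketch below uses only the prices quoted above; the token counts are hypothetical, and it ignores anything the article does not specify, such as usage included in a subscription.

```python
# Prices in USD per million tokens, as quoted in the article.
PRICES_PER_MILLION = {
    "standard": {"input": 0.50, "output": 2.50},
    "fast": {"input": 3.00, "output": 15.00},
}

def session_cost(input_tokens: int, output_tokens: int, tier: str = "fast") -> float:
    """Return the cost in USD of one session at the given tier."""
    price = PRICES_PER_MILLION[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

# Hypothetical long agent run: 2M input tokens, 500K output tokens.
standard = session_cost(2_000_000, 500_000, tier="standard")
fast = session_cost(2_000_000, 500_000, tier="fast")
print(f"standard: ${standard:.2f}, fast: ${fast:.2f}")  # standard: $2.25, fast: $13.50
```

Both the input and output prices differ by a factor of six between the tiers, so the fast variant costs six times as much for any mix of input and output tokens.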

What's Coming: The SpaceXAI Model

Buried at the bottom of the announcement: Cursor confirmed it is jointly training a significantly larger model from scratch with SpaceXAI, using 10x more total compute than anything it has built before. Colossus 2's million H100-equivalent cluster is the compute backbone. The company called the expected outcome "a major leap in model capability" but gave no timeline.

For developers, Composer 2.5 is available in Cursor today. The bigger bet, a frontier model trained at SpaceX scale, has no announced timeline, but the announcement makes clear that Cursor is building toward owning its own model stack end to end, not just fine-tuning someone else's open-source checkpoint.
