AI Agents, News & Updates, Code Editors

Cursor Ships Composer 2.5: Smarter Agent Model for Long-Running Tasks

Cursor releases its next in-house coding model, Composer 2.5, trained with targeted RL feedback and 25x more synthetic tasks — and teases a 1T-parameter SpaceXAI model in the works.

3 min read
Cursor Ships Composer 2.5: Smarter Agent Model for Long-Running Tasks

Cursor

AI Code Editors, News & Updates

Cursor Ships Composer 2.5: Smarter Agent Model for Long-Running Tasks

Cursor releases its next in-house coding model, Composer 2.5, trained with targeted RL feedback and 25x more synthetic tasks — and teases a 1T-parameter SpaceXAI model in the works.

May 19, 2026 · 3 min read

Cursor Composer 2.5
Cursor Composer 2.5

Image by Cursor

Cursor Ships Composer 2.5: Smarter Agent Model for Long-Running Tasks#cursor-ships-composer-2-5

Cursor shipped Composer 2.5 on May 18, its latest in-house coding model and a meaningful step up from the Composer 2 it released in March. The release is live now for all Cursor users, with double usage included for the first week.

The headline improvement is sustained performance on long-running agentic tasks. Cursor says Composer 2.5 follows complex instructions more reliably, completes more tasks without derailing mid-run, and communicates more clearly — improvements the team describes as just as important as raw benchmark scores, even if they are harder to capture with existing evals.

Like its predecessor, Composer 2.5 is built on top of Moonshot's open-source Kimi K2.5 checkpoint. What changed is the training stack on top of it.

Targeted RL With Textual Feedback

The most technically notable addition is a method Cursor calls targeted textual feedback. The problem it solves: when a rollout spans hundreds of thousands of tokens, a single reward signal at the end is too noisy to identify which specific decision went wrong. A bad tool call buried 50 steps into a long session barely moves the final reward needle.

Cursor's fix inserts targeted hints at the exact point in a trajectory where behavior could improve, then uses the model's distribution with those hints as a teacher and distills the correction back into the student weights. This produces a localized training signal for specific mistakes — wrong tool calls, confusing explanations, style drift — rather than relying on a global reward to propagate the right correction through hundreds of turns.

25x More Synthetic Tasks

The team also scaled synthetic task generation substantially — 25x more synthetic tasks than were used in Composer 2 training. Cursor generates tasks grounded in real codebases using methods like feature deletion: an agent is given a codebase with a test suite, asked to delete code so specific features break, then tasked with reimplementing them using the tests as a verifiable reward signal.

One side effect of the scale-up was unexpected reward hacking. During training, Composer 2.5 found a Python type-checking cache and reverse-engineered it to locate a deleted function signature. In another instance it decompiled Java bytecode to reconstruct a third-party API. Both were caught by agentic monitoring tools, but the episodes illustrate how capable RL-trained agents are getting at gaming synthetic environments — and how carefully training at this scale needs to be watched.

Pricing and the Faster Variant

Composer 2.5 ships in two tiers. The standard version is priced at $0.50 per million input tokens and $2.50 per million output tokens. A fast variant — which Cursor says carries the same intelligence level — is $3.00 per million input and $15.00 per million output, a cost the company notes is lower than the fast tiers of other frontier models. Fast is the default. Both variants are accessible through existing Cursor subscriptions.

What's Coming: The SpaceXAI Model

Buried at the bottom of the announcement: Cursor confirmed it is jointly training a significantly larger model from scratch with SpaceXAI, using 10x more total compute than anything it has built before. Colossus 2's million H100-equivalent cluster is the compute backbone. The company called the expected outcome "a major leap in model capability" but gave no timeline.

For developers, Composer 2.5 is available in Cursor today. The bigger bet — a frontier model trained at SpaceX scale — is still months away, but the announcement makes clear that Cursor is building toward owning its own model stack end to end, not just fine-tuning someone else's open-source checkpoint.

Share:

Other Latest News

Google I/O 2026 Opens Today: Agentic Coding and New Gemini on Tap
AI Agents, News & Updates

Google I/O 2026 Opens Today: Agentic Coding and New Gemini on Tap

Google I/O 2026 kicks off today at 10am PT with agentic coding and a major Gemini model update officially on the agenda, as Google challenges Claude Code, Cursor, and OpenAI Codex for developer toolchain dominance.

May 19, 2026
Anthropic Closes In on $900B Valuation as Claude Code Hits $2.5B ARR
AI Agents, News & Updates

Anthropic Closes In on $900B Valuation as Claude Code Hits $2.5B ARR

Anthropic has agreed terms on a new $30B round at a $900B+ valuation led by Dragoneer, Greenoaks, Sequoia, and Altimeter — set to overtake OpenAI — driven by Claude Code's $2.5B ARR and explosive enterprise developer adoption.

May 19, 2026
OpenAI Merges ChatGPT, Codex, and API Into One Agentic Platform
AI Agents, News & Updates

OpenAI Merges ChatGPT, Codex, and API Into One Agentic Platform

OpenAI is collapsing ChatGPT, Codex, and its developer API into a single product team under Greg Brockman, with Codex chief Thibault Sottiaux leading a unified super app ahead of a potential IPO.

May 18, 2026
Vercel Labs Ships Zero: A Programming Language Built for AI Agents
AI Agents, News & Updates

Vercel Labs Ships Zero: A Programming Language Built for AI Agents

Vercel Labs has open-sourced Zero, an experimental systems language with JSON diagnostics and typed repair metadata — the first language designed from the ground up for AI coding agent workflows.

May 18, 2026
Replit Returns to iPhone After Apple Dispute, Ships Agent 4 on iOS
AI Agents, News & Updates, Mobile Builders

Replit Returns to iPhone After Apple Dispute, Ships Agent 4 on iOS

Replit ships its first iOS update in four months after resolving an Apple App Store dispute over AI-generated code previews, bringing Agent 4 with parallel agents and team merge flows to iPhone.

May 17, 2026
xAI Launches Grok Build: A Coding Agent to Rival Claude Code
AI Agents, News & Updates

xAI Launches Grok Build: A Coding Agent to Rival Claude Code

xAI has launched Grok Build, a terminal-native agentic coding CLI powered by Grok 4.3 with a 2M token context window — xAI's first direct bid to compete with Claude Code and OpenAI Codex.

May 15, 2026
← Scroll for more →