DeepSeek Drops V4: 1.6T-Parameter Open Model Targets Frontier
DeepSeek released open-source V4 today — a 1.6 trillion-parameter flagship and a 284 billion-parameter flash variant — claiming competitive performance with closed models from OpenAI and Google DeepMind, with a 1 million-token context window and aggressive pricing.

Chinese AI lab DeepSeek today published preview versions of its long-awaited V4 model family on Hugging Face, marking the company's most significant release since its R1 reasoning model shook global markets in January 2025. The launch introduces two variants: V4-Pro, with 1.6 trillion parameters — the company's largest model by that metric — and V4-Flash at 284 billion parameters.
Both models ship with a 1 million-token context window, a dramatic jump from the 128,000-token limit on V3. DeepSeek says the extended window was achieved with "world-leading" cost efficiency, a claim aimed squarely at the larger proprietary frontier models from OpenAI and Google. Like its predecessors, V4 is fully open-source, meaning developers can download both variants, run them locally, and modify them.
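Pulling the weights once they appear on Hugging Face should be a one-liner with the huggingface_hub client. The repository id below is an assumption based on DeepSeek's naming for V3; the actual path may differ:

```python
# Assumed repo id, following the deepseek-ai/DeepSeek-V3 naming pattern;
# not yet confirmed by DeepSeek.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="deepseek-ai/DeepSeek-V4",
    local_dir="deepseek-v4",  # weights land here for local serving
)
```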
The Architecture That Matters
DeepSeek's technical report highlights a new Hybrid Attention Architecture that improves the model's ability to retain context across long conversations — a key upgrade for agentic and multi-step coding workflows where coherence over many tool calls has historically degraded. The V4-Pro model also leverages a Mixture-of-Experts (MoE) design that activates only a fraction of its 1.6 trillion parameters per token, keeping inference costs manageable despite the scale.
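DeepSeek has not published V4's routing internals, but the general MoE mechanic is easy to sketch. The toy PyTorch layer below routes each token to its top-k experts, so only a fraction of the layer's parameters run per token; every dimension and expert count here is invented for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: each token runs through top_k of n_experts."""

    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # learned routing scores
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model), nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                               # x: (tokens, d_model)
        scores = self.router(x)                         # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # best experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only top_k experts execute per token; the rest stay idle,
        # which is how total parameters can far exceed per-token compute.
        for slot in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

layer = MoELayer()
y = layer(torch.randn(16, 64))  # 16 tokens, each through 2 of 8 experts
```

At V4-Pro's scale, the same idea is what keeps a 1.6 trillion-parameter model serviceable: total parameters set capacity, while active parameters set per-token cost.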
The company touted top-tier performance on coding benchmarks and marked improvements in reasoning and agent-based tasks. Both Huawei and AI chipmaker Cambricon Technologies moved quickly to announce compatibility with the new models, with Huawei declaring "full support" for serving V4 inference on its Ascend AI processors.
The Developer Impact
For developers, V4 arrives with two headline properties: a context window large enough to fit entire mid-size codebases in a single prompt, and pricing that undercuts most proprietary alternatives. V4-Flash's token pricing is reportedly set to match DeepSeek's V2 model from June 2024 — among the cheapest rates for any frontier-grade model. V4-Pro pricing is currently constrained by hardware supply, but DeepSeek said costs will "drop significantly" in the second half of 2026 once Huawei Ascend 950PR super nodes scale up.
The 1M-token context window is especially significant for AI-assisted development. Tools like Cursor, Windsurf, and Claude Code, which compete on deep codebase understanding, benefit directly from longer context. Developers running self-hosted or API-based coding assistants on DeepSeek models stand to gain markedly better coherence on large-repo tasks, refactors, and long agent runs.
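To gauge whether a repo actually fits, a back-of-the-envelope token count is enough. The sketch below uses a rough four-characters-per-token heuristic; real tokenizer ratios vary by language and content:

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # rough heuristic, not DeepSeek's actual tokenizer ratio

def estimate_repo_tokens(root: str, exts=(".py", ".ts", ".go", ".md")) -> int:
    """Approximate token count for all source files under `root`."""
    total_chars = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in exts:
            total_chars += len(path.read_text(errors="ignore"))
    return total_chars // CHARS_PER_TOKEN

tokens = estimate_repo_tokens(".")
print(f"~{tokens:,} tokens; fits in 1M window: {tokens < 1_000_000}")
```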
The open-source licensing also means that third-party integrations — MCP servers, local inference setups via Ollama, and custom API proxies — will likely surface within days of the Hugging Face publication, as happened with V3.
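Once a community build lands in Ollama, for instance, the model would be reachable through Ollama's OpenAI-compatible endpoint. The model tag below is a guess; check what the community actually publishes:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # placeholder; Ollama ignores the key
)

resp = client.chat.completions.create(
    model="deepseek-v4-flash",  # assumed tag, not confirmed by DeepSeek or Ollama
    messages=[{"role": "user", "content": "Summarize this repo's build steps."}],
)
print(resp.choices[0].message.content)
```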
Hardware and Geopolitical Context
V4's chip story is conspicuously incomplete. DeepSeek did not disclose what hardware was used to train the model. Its technical report acknowledges writing GPU kernels adapted to both Nvidia and Huawei chips — a careful framing that sidesteps accusations from US officials who had alleged the company used banned Nvidia Blackwell chips in prior training runs.
Huawei's fast confirmation of Ascend 950PR support signals that the model was at least partially designed to run on domestic Chinese silicon, reinforcing DeepSeek's position as the Chinese lab best placed to compete with US frontier labs without relying on US hardware.
What's Unconfirmed
DeepSeek has not released independently verified benchmark scores for V4 as of publication. The company's internal claims of competitive performance with closed models from OpenAI and Google DeepMind have not been reproduced by third parties. Benchmark figures from DeepSeek's own report — including cited coding and reasoning scores — should be treated as preliminary until external evaluations confirm them.
Full details of the architecture and training compute also remain undisclosed. For developers evaluating V4 for production use, independent evals on representative tasks are a more reliable signal than launch-day claims.
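A minimal harness for that kind of spot check needs only an OpenAI-compatible client and a handful of tasks you care about. The endpoint, API key, model name, and tasks below are all placeholders to swap for your own:

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")

TASKS = [  # (prompt, substring the answer must contain); add your own representative tasks
    ("What does `git rebase -i` let you do? Answer in one word.", "edit"),
]

def run_eval(model: str) -> float:
    """Fraction of tasks whose response contains the expected substring."""
    passed = 0
    for prompt, expected in TASKS:
        out = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message.content
        passed += expected.lower() in out.lower()
    return passed / len(TASKS)

print(f"pass rate: {run_eval('deepseek-chat'):.0%}")  # model name is a placeholder
```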