DeepSeek Launches V4 with 1 Million Token Context as Standard
DeepSeek has released its V4 models, which make a 1 million token context window the baseline rather than a premium feature. The V4-Pro model has 1.6 trillion total parameters with 49 billion active, while V4-Flash has 284 billion total parameters with 13 billion active; both are open-weight and accessible via the API and chat.deepseek.com.
The key innovation is DeepSeek’s redesigned attention mechanism, which combines token compression with DeepSeek Sparse Attention to make long-context usage affordable in practice. V4-Pro is priced at $0.145 per million input tokens and $3.48 per million output tokens, while V4-Flash is far cheaper at $0.028 input and $0.28 output.
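At those per-token rates, the cost of even a very long-context request is easy to estimate. A minimal sketch in Python, using the prices quoted above (the token counts are illustrative, not from the announcement):

```python
# Per-million-token prices quoted in the announcement (USD)
PRICES = {
    "v4-pro":   {"input": 0.145, "output": 3.48},
    "v4-flash": {"input": 0.028, "output": 0.28},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted rates."""
    p = PRICES[model]
    return (input_tokens / 1_000_000) * p["input"] + \
           (output_tokens / 1_000_000) * p["output"]

# Example: a 500k-token prompt with a 4k-token reply (illustrative sizes)
pro_cost = estimate_cost("v4-pro", 500_000, 4_000)      # ~ $0.0864
flash_cost = estimate_cost("v4-flash", 500_000, 4_000)  # ~ $0.0151
```

Even a half-million-token prompt on the Pro model costs under a dime at these rates, which is the point of making long context the default tier.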
These models support OpenAI ChatCompletions and Anthropic API formats with Thinking and Non-Thinking modes. DeepSeek plans to leverage Huawei’s Ascend-powered infrastructure later in 2026 to further reduce costs.
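Because the models speak the OpenAI ChatCompletions format, an existing OpenAI-style client can simply be pointed at DeepSeek's endpoint. The sketch below builds such a request with the standard library; the base URL, the model id `deepseek-v4-flash`, and the `thinking` flag are all assumptions for illustration, not confirmed identifiers from the announcement:

```python
import json
import urllib.request

# Endpoint and model id are assumptions; consult the DeepSeek V4
# Tech Report / API docs for the real identifiers.
API_URL = "https://api.deepseek.com/chat/completions"

def build_request(prompt: str, thinking: bool) -> urllib.request.Request:
    """Build (but do not send) an OpenAI ChatCompletions-style request."""
    body = {
        "model": "deepseek-v4-flash",  # assumed model id
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical switch for the Thinking / Non-Thinking modes;
        # the actual parameter name may differ.
        "thinking": thinking,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer YOUR_API_KEY",
        },
        method="POST",
    )

req = build_request("Summarize the attached log file.", thinking=False)
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) would return a standard ChatCompletions-shaped JSON response, so existing OpenAI or Anthropic client code should need only a base-URL and key change.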
For technical details and model access, see the DeepSeek V4 Tech Report and Open Weights on Hugging Face.
Source: @ai_machinelearning_big_data Twitter thread