DeepSeek is about to make noise again. The company that rattled the AI industry with R1 back in early 2025 is reportedly weeks away from launching V4, and the specs being discussed are genuinely ambitious. Late April is the current target. Whether it actually hits that window is another question, given V4 has already been delayed twice.
5 Key Takeaways:
- DeepSeek V4 is expected to launch in late April 2026, according to Reuters citing The Information — but the model has already missed two earlier target windows
- The model is reported to feature roughly 1 trillion total parameters using a Mixture-of-Experts architecture, with only 32–37 billion parameters active per token — keeping inference costs manageable
- A 1 million token context window is widely reported but not yet officially confirmed by DeepSeek — treat it as a strong rumor, not a guaranteed spec
- Reuters confirmed V4 will run on Huawei's Ascend chips, making it the first frontier-class AI model built around Chinese domestic semiconductor infrastructure
- DeepSeek deliberately denied NVIDIA and AMD early optimization access while giving Chinese chipmakers that window — a strategic signal about the direction of China's AI hardware ecosystem
What a Trillion Parameters Actually Means in Practice
Big parameter counts sound impressive. They're also frequently misunderstood.
V4 is reportedly built on a Mixture-of-Experts architecture. That means the full model has roughly 1 trillion parameters in total, but only around 32–37 billion are active for any given token. It's the same approach DeepSeek used with V3, which had 671 billion total parameters, of which about 37 billion were active per token.
The practical result? Per-token compute tracks the active parameters rather than the total, so inference costs stay relatively flat even as the total model size grows dramatically. You get the capability gains from a much larger model without the proportional compute bill, although the full set of parameters still has to be held in memory. That's the whole point of MoE at this scale, and it's why DeepSeek's pricing has been so aggressive compared to competitors.
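To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the reported V4 figures (1 trillion total, 32–37 billion active), which DeepSeek has not confirmed, and uses the common rule of thumb that a forward pass costs about 2 FLOPs per active parameter per token. The function names and the dense 1-trillion-parameter comparison model are hypothetical, included only for illustration.

```python
# Back-of-the-envelope MoE arithmetic. The V4 numbers are reported figures,
# not confirmed specs; the 2-FLOPs-per-parameter rule is a rough approximation
# that ignores attention and routing overhead.

def active_fraction(total_params: float, active_params: float) -> float:
    """Share of all parameters that are used to process one token."""
    return active_params / total_params


def forward_flops_per_token(active_params: float) -> float:
    """Approximate forward-pass cost: about 2 FLOPs per active parameter."""
    return 2.0 * active_params


# (label, total parameters, active parameters per token)
MODELS = [
    ("V3 (published)", 671e9, 37e9),
    ("V4 (reported, low end)", 1e12, 32e9),
    ("V4 (reported, high end)", 1e12, 37e9),
    ("Hypothetical dense 1T", 1e12, 1e12),  # illustration only
]

for label, total, active in MODELS:
    share = active_fraction(total, active)
    flops = forward_flops_per_token(active)
    print(f"{label:26s} active share {share:6.1%}  ~{flops:.1e} FLOPs/token")
```

Under these assumptions, V4 would spend about the same compute per token as V3 despite carrying roughly 50% more total parameters, while a dense model of the same size would need roughly 27 to 31 times more. This is only an estimate of compute, not of memory or serving cost.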
The Huawei Chip Detail Is the Real Story
Reuters confirmed this directly: V4 runs on Huawei's Ascend processors.
Not NVIDIA. Not AMD. Huawei. DeepSeek reportedly gave Chinese domestic chipmakers early optimization access while deliberately denying that window to Western silicon suppliers. That's not an accident. That's a statement about where China's AI infrastructure is heading.
US export controls have cut off Chinese AI developers from the most advanced NVIDIA hardware. V4 being built and optimized for Huawei's Ascend chips, and performing at frontier level, would directly counter the idea that export controls are meaningfully slowing China's AI progress.
If the benchmarks hold up under independent testing, this matters well beyond DeepSeek itself.
The 1 Million Token Context Window — Real, but Unverified
The 1 million token context window is the most-discussed spec and the least confirmed.
DeepSeek hasn't published official V4 documentation yet. The API still lists V3.2 as the current model. The 1M token claim comes from leaks, research paper trails, and community analysis — not an official spec sheet.
That said, the underlying architecture supports it. DeepSeek published research on Engram — a conditional memory system designed specifically for long-context retrieval — in January 2026. V4-Lite, a smaller variant already live on some API nodes, is showing dramatically improved context recall in early testing. The full model extending that to 1 million tokens is plausible. Just not confirmed.
Two Delays Already. Will Late April Actually Happen?
Honestly? Maybe. The original target was February. Then it slipped again. Reuters now says "next few weeks" as of early April — which points to the last two weeks of the month as the most likely window.
V4-Lite has been live-tested on API infrastructure since early April, which usually signals the full model is close. When DeepSeek starts stress-testing infrastructure, a launch tends to follow quickly.
The AI industry watches every DeepSeek release closely now. V4 arriving on Huawei chips, at frontier performance, at DeepSeek's typically aggressive pricing — that combination would send another shockwave through the market. The only question is exactly when the clock runs out on "soon."