DeepSeek Releases V4 Open-Source Model Series with 1.6T Parameters and MIT License

Gate News message, April 24: DeepSeek has released the V4 series of open-source models under the MIT License, with weights now available on Hugging Face and ModelScope. The series includes two mixture-of-experts (MoE) models: V4-Pro, with 1.6 trillion total parameters and 49 billion activated per token, and V4-Flash, with 284 billion total parameters and 13 billion activated per token. Both support a 1-million-token context window.
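To make the MoE figures concrete, the short sketch below computes what share of each model's parameters is activated per token. It uses only the parameter counts reported above; the names `MODELS` and `active_fraction` are illustrative and do not come from DeepSeek.

```python
# Parameter counts as reported in the article, in billions.
# The dictionary layout and function name are illustrative assumptions.
MODELS = {
    "V4-Pro": {"total_b": 1600, "active_b": 49},
    "V4-Flash": {"total_b": 284, "active_b": 13},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters activated for each token."""
    return active_b / total_b


for name, params in MODELS.items():
    frac = active_fraction(params["total_b"], params["active_b"])
    print(f"{name}: {frac:.1%} of parameters active per token")
```

By this arithmetic, V4-Pro activates roughly 3% of its parameters per token and V4-Flash roughly 4.6%, which is why per-token compute is far lower than the total parameter counts suggest.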

The architecture introduces three key upgrades. First, a hybrid attention mechanism combines compressed sparse attention (CSA) with heavily compressed attention (HCA) to significantly reduce long-context overhead: at a 1M-token context, V4-Pro's inference FLOPs are just 27% of V3.2's, and its KV cache (the memory that holds keys and values for earlier tokens during inference) is only 10% of V3.2's. Second, manifold-constrained hyperconnections (mHC) replace traditional residual connections to stabilize cross-layer signal propagation. Third, the Muon optimizer speeds up training convergence. Pre-training used more than 32 trillion tokens of data.
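The reported ratios can be restated as reduction factors relative to V3.2. The sketch below is plain arithmetic on the two percentages quoted above; the constant and function names are illustrative, and no absolute FLOP or memory figures are assumed because the article gives none.

```python
# Ratios of V4-Pro to V3.2 at a 1M-token context, as reported in the article.
FLOPS_RATIO = 0.27      # inference FLOPs
KV_CACHE_RATIO = 0.10   # KV cache size


def reduction_factor(ratio: float) -> float:
    """Convert a 'fraction of baseline' ratio into an 'N times smaller' factor."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in the interval (0, 1]")
    return 1 / ratio


print(f"Inference FLOPs: about {reduction_factor(FLOPS_RATIO):.1f}x lower than V3.2")
print(f"KV cache: about {reduction_factor(KV_CACHE_RATIO):.0f}x smaller than V3.2")
```

In other words, the quoted figures correspond to roughly a 3.7x cut in inference FLOPs and a 10x cut in KV cache at the full context length.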

Post-training uses a two-stage approach: domain-specific experts are first trained via supervised fine-tuning (SFT) and GRPO reinforcement learning, then merged into a single model through online distillation. V4-Pro-Max (the highest inference mode) is billed as the strongest open-source model, with top-tier coding benchmark results and a significantly narrower gap to closed-source frontier models on reasoning and agent tasks. V4-Flash-Max reaches Pro-level reasoning performance when given a sufficient compute budget, but its smaller parameter count limits it on pure knowledge and complex agent tasks. Weights are stored in mixed FP4+FP8 precision.
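For readers unfamiliar with GRPO, its core idea is to score each sampled response relative to the other responses drawn for the same prompt, rather than against a learned value function. The sketch below shows that group-relative advantage step only, in the commonly published form (reward minus group mean, divided by group standard deviation). The article does not describe DeepSeek's exact V4 formulation, so the function name, the `eps` term, and the use of population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize rewards within one group of responses to the same prompt.

    Assumption: advantages are (reward - group mean) / (group std + eps).
    The eps term is a hypothetical guard against division by zero when
    every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: four sampled answers to one prompt, two judged correct (reward 1.0).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Responses that beat the group average receive positive advantages and are reinforced, while below-average responses receive negative ones; when all rewards in a group are equal, every advantage is zero and the group contributes no learning signal.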

