Gate News message, April 25 — DeepSeek released preview versions of V4-Pro and V4-Flash on April 24, both open-weight models with one million token context windows. V4-Pro features 1.6 trillion total parameters but activates only 49 billion per inference pass using a Mixture-of-Experts architecture. V4-Flash has 284 billion total parameters with 13 billion active.
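To put the Mixture-of-Experts figures in perspective, the short sketch below works out what share of each model's weights is active in a single pass. It uses only the parameter counts quoted above; the `MODELS` table and the `active_fraction` helper are illustrative names, not part of any DeepSeek tooling.

```python
# Parameter counts as quoted in the article (total vs. active per pass).
MODELS = {
    "V4-Pro": {"total": 1.6e12, "active": 49e9},
    "V4-Flash": {"total": 284e9, "active": 13e9},
}


def active_fraction(total: float, active: float) -> float:
    """Share of a sparse model's parameters that is used in one pass."""
    return active / total


for name, params in MODELS.items():
    share = active_fraction(params["total"], params["active"])
    print(f"{name}: {share:.1%} of weights active")
```

By this arithmetic, each model touches only a few percent of its weights per pass, which is where the gap between headline size and serving cost comes from.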
Pricing is well below that of competitors. V4-Pro costs $1.74 per million input tokens and $3.48 per million output tokens, about 94% less on input and 98% less on output than OpenAI’s GPT-5.5 Pro ($30 input, $180 output), and roughly one-twentieth the cost of Claude Opus 4.7. V4-Flash is priced at $0.14 input and $0.28 output per million tokens. Both models are open-source under the MIT license, so the weights can be downloaded and run locally without licensing fees.
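As a rough illustration of what these list prices mean for a real bill, the sketch below costs out one workload at the per-million-token rates quoted above. The workload size (2 million input tokens, 500,000 output tokens) is an assumption chosen for the example, and `job_cost` and `savings_pct` are hypothetical helper names.

```python
# USD per million tokens (input, output), as quoted in the article.
PRICES = {
    "V4-Pro": (1.74, 3.48),
    "V4-Flash": (0.14, 0.28),
    "GPT-5.5 Pro": (30.00, 180.00),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one job at the listed per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def savings_pct(cheaper: str, pricier: str, input_tokens: int, output_tokens: int) -> float:
    """Percentage saved by running the same job on the cheaper model."""
    low = job_cost(cheaper, input_tokens, output_tokens)
    high = job_cost(pricier, input_tokens, output_tokens)
    return 100 * (1 - low / high)


# Assumed workload, for illustration only.
IN_TOKENS, OUT_TOKENS = 2_000_000, 500_000

for model in PRICES:
    print(f"{model}: ${job_cost(model, IN_TOKENS, OUT_TOKENS):.2f}")
print(f"V4-Pro saving vs GPT-5.5 Pro: "
      f"{savings_pct('V4-Pro', 'GPT-5.5 Pro', IN_TOKENS, OUT_TOKENS):.1f}%")
```

The blended saving depends on the input-to-output ratio: input-heavy jobs land nearer the input-side discount, output-heavy jobs nearer the output-side one.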
DeepSeek achieved efficiency gains through two new attention mechanisms, Compressed Sparse Attention and Heavily Compressed Attention. Relative to the predecessor V3.2, these cut compute costs to 27% for V4-Pro and to 10% for V4-Flash. The company trained V4 partly on Huawei Ascend chips, circumventing U.S. export restrictions on advanced Nvidia processors. DeepSeek stated that once 950 new supernodes come online later in 2026, pricing will drop further.
On performance benchmarks, V4-Pro-Max ranks first on Codeforces competitive programming (a rating of 3,206, placing around 23rd among human contestants) and scores 90.2% on Apex Shortlist math problems versus Claude Opus 4.6’s 85.9%. However, it trails on multitask knowledge and reasoning benchmarks: MMLU-Pro (87.5% vs Gemini-3.1-Pro’s 91.0%) and Humanity’s Last Exam (37.7% vs 44.4%). On long-context tasks, V4-Pro leads open-source models but loses to Claude Opus 4.6 on MRCR retrieval tests.
V4-Pro introduces “interleaved thinking,” allowing agent workflows to retain reasoning context across multiple tool calls without flushing between steps. Both models support coding integrations with Claude Code and OpenCode. According to DeepSeek’s developer survey of 85 users, 52% said V4-Pro was ready as their default coding agent, with 39% leaning toward adoption. The old deepseek-chat and deepseek-reasoner endpoints will retire on July 24, 2026.
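The idea behind interleaved thinking can be pictured with a toy agent loop. The sketch below is purely conceptual and does not use DeepSeek's API: `Turn`, `AgentContext`, and `run` are hypothetical names, and the "model" is simulated with fixed strings. It only contrasts a context that keeps reasoning across tool calls with one that discards it after every step.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    kind: str      # "reasoning", "tool_call", or "tool_result"
    content: str


@dataclass
class AgentContext:
    keep_reasoning: bool                      # True models interleaved thinking
    turns: list = field(default_factory=list)

    def add(self, kind: str, content: str) -> None:
        self.turns.append(Turn(kind, content))

    def after_tool_call(self) -> None:
        """Runs once a tool result arrives, before the next model step."""
        if not self.keep_reasoning:
            # Flushing behaviour: earlier reasoning is dropped between steps.
            self.turns = [t for t in self.turns if t.kind != "reasoning"]


def run(keep_reasoning: bool, steps: int = 3) -> AgentContext:
    """Simulate a fixed number of reason -> call tool -> read result steps."""
    ctx = AgentContext(keep_reasoning)
    for i in range(steps):
        ctx.add("reasoning", f"plan for step {i}")
        ctx.add("tool_call", f"tool_{i}()")
        ctx.add("tool_result", f"result_{i}")
        ctx.after_tool_call()
    return ctx


interleaved = run(keep_reasoning=True)
flushed = run(keep_reasoning=False)
print(sum(t.kind == "reasoning" for t in interleaved.turns))  # 3
print(sum(t.kind == "reasoning" for t in flushed.turns))      # 0
```

In the retained case, the plan from step 0 is still visible at step 2, so later steps can build on earlier reasoning instead of re-deriving it from tool results alone.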