Gate News, April 27 — Xiaomi’s MiMo team has open-sourced the MiMo-V2.5 series of large language models under the MIT license, permitting commercial deployment, continued training, and fine-tuning. The series comprises two models, both with a 1-million-token context window: MiMo-V2.5-Pro, a text-only mixture-of-experts (MoE) model with 1.02 trillion total parameters and 42 billion active parameters, and MiMo-V2.5, a native multimodal model with 310 billion total parameters and 15 billion active parameters that supports text, image, video, and audio understanding.
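The gap between total and active parameters comes from MoE routing: a small router picks a few experts per token, so only a fraction of the weights run on any forward pass (for V2.5-Pro, roughly 42B of 1,020B, about 4%). Xiaomi has not published MiMo's router design; the sketch below is a generic top-k routing illustration, and the expert count and k are assumptions, not MiMo's actual configuration.

```python
import numpy as np

def topk_route(logits: np.ndarray, k: int):
    """Generic top-k MoE routing: select the k highest-scoring experts
    for a token and renormalize their gate weights with a softmax."""
    idx = np.argsort(logits)[::-1][:k]          # indices of the k largest scores
    e = np.exp(logits[idx] - logits[idx].max()) # numerically stable softmax
    return idx, e / e.sum()

rng = np.random.default_rng(0)
logits = rng.standard_normal(64)        # router scores over 64 experts (illustrative count)
experts, weights = topk_route(logits, k=4)
# Only these 4 experts' parameters are "active" for this token;
# the other 60 experts contribute nothing to the forward pass.
```

With a per-layer routing scheme like this, total parameter count scales with the number of experts while per-token compute scales only with k, which is why a trillion-parameter model can run with tens of billions of active parameters.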
MiMo-V2.5-Pro targets complex agent and programming tasks. In ClawEval benchmarks, it achieved 64% Pass@3 while consuming approximately 70,000 tokens per task trajectory—40% to 60% fewer tokens than Claude Opus, Gemini 3.1 Pro, and GPT-5.4. The model scored 78.9 on SWE-bench Verified. In a demonstration, V2.5-Pro independently implemented a complete SysY-to-RISC-V compiler for a Peking University compiler course project in 4.3 hours with 672 tool calls, achieving a perfect score of 233/233 on hidden test sets.
MiMo-V2.5 is designed for multimodal agent scenarios, equipped with a dedicated 729-million-parameter vision encoder and a 261-million-parameter audio encoder, and scores 62.3 on the ClawEval general subset. Both models employ a hybrid architecture that combines sliding window attention (SWA) and global attention (GA), paired with a 3-layer multi-token prediction (MTP) module for accelerated inference. Model weights are available on Hugging Face.
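Xiaomi has not detailed how the SWA and GA layers are interleaved. As a minimal sketch of what the hybrid means in practice, the two attention patterns can be written as causal masks; the window size and the alternation scheme below are assumptions for illustration only.

```python
import numpy as np

def swa_mask(n: int, window: int) -> np.ndarray:
    """Sliding-window causal mask: token i attends only to the last
    `window` tokens ending at itself, giving O(n * window) attention cost."""
    m = np.zeros((n, n), dtype=bool)
    for i in range(n):
        m[i, max(0, i - window + 1):i + 1] = True
    return m

def ga_mask(n: int) -> np.ndarray:
    """Global causal mask: token i attends to every token 0..i,
    giving full O(n^2) attention cost."""
    return np.tril(np.ones((n, n), dtype=bool))

# Hypothetical hybrid layout: alternate SWA and GA layers so most layers
# are cheap and a few layers retain full long-range access.
n, window = 8, 3
layer_masks = [swa_mask(n, window) if i % 2 == 0 else ga_mask(n)
               for i in range(4)]
```

The design motivation is cost: at a 1-million-token context, full attention at every layer is prohibitive, so windowed layers handle local structure cheaply while the global layers preserve long-range retrieval.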
Alongside the open-source release, the MiMo team launched the “Orbit Quadrillion Token Creator Incentive Program,” offering 100 quadrillion tokens free over 30 days to global users. Individual developers, teams, and enterprises can apply via the program page with an evaluation cycle of approximately 3 business days; approved benefits are distributed as Token Plans or direct credits, compatible with tools like Claude Code and Cursor.