Xiaomi Launches Powerful AI Model MiMo-V2 Pro With 1 Trillion Parameters, 1 Million Token Context Window

The MiMo-V2 lineup includes three models: MiMo-V2-Pro, MiMo-V2-Omni, and MiMo-V2-TTS.

Madhur Chaturvedi
Technology
- Published On Mar 19, 2026 11:47 am IST
- Last Updated On Mar 19, 2026 1:02 pm IST


A mysterious artificial intelligence model, which surfaced anonymously on a developer platform last week under the codename “Hunter Alpha,” is none other than the second iteration of Chinese tech major Xiaomi's MiMo AI model. The company launched its self-developed MiMo-V2 series of large AI models (succeeding the 2025 release of Xiaomi MiMo) in a surprise release, marking an aggressive entry into the “agent era” of AI.

The MiMo-V2 lineup includes three models: MiMo-V2-Pro, MiMo-V2-Omni, and MiMo-V2-TTS. While some native integrations in apps like Xiaomi Browser and Kingsoft Office are limited to China, the models are browser-based and accessible worldwide through Xiaomi MiMo Studio or the official API website.

MiMo-V2-Pro

The MiMo-V2-Pro is the flagship model. It targets complex workflows that need minimal human input and focuses on logical reasoning and task planning. It has 1 trillion total parameters and uses a mixed-attention architecture to support a 1-million-token context window.

Tested as “Hunter Alpha,” it achieved an average score of 75.7 on the Claw-Eval benchmark (top three globally, behind Anthropic's Claude Opus 4.6) and scored 49 on the Artificial Analysis Intelligence Index (second in China and eighth worldwide, ahead of models such as Gemini 3 Flash and Grok 4.20). The model reportedly matches the coding capabilities of Claude Opus 4 at a lower cost.

MiMo-V2-Omni

The MiMo-V2-Omni handles image, video, audio, and text inputs. Codenamed “Healer Alpha,” it topped the PinchBench leaderboard and reportedly beat competitors such as Claude Opus 4.6 and Gemini 3 Pro in speech reasoning (94 on BigBench Audio), audio understanding (69.4 on MMAU-Pro), and video-based future event forecasting (66.7 on FutureOmni). It can plan and execute tasks on its own across modalities.

MiMo-V2-TTS

This speech synthesis model is trained on audio data, which makes it capable of emotional transitions, mid-sentence tone shifts, singing, and synthesis of regional dialects like Sichuan, Cantonese, Henan, and Taiwanese.
