Alibaba Qwen launches the Qwen3.5-Omni omni-modal model, supporting speech recognition in 113 languages

Gate News, March 30: Alibaba Qianwen announced the launch of its omni-modal large model, Qwen3.5-Omni. The series includes Instruct versions in three sizes (Plus, Flash, and Light) and supports a 256k-token long context window. The model accepts more than 10 hours of audio input, or more than 400 seconds of 720p audio-video input sampled at 1 FPS. It was natively pre-trained as a multimodal model on large-scale text and visual data plus more than 100 million hours of audio-video data, and demonstrates strong omni-modal perception and generation capabilities. Compared with the previous-generation Qwen3-Omni, Qwen3.5-Omni greatly improves multilingual capabilities, supporting speech recognition in 113 languages and dialects and speech generation in 36 languages and dialects.
