Google AI developer relations lead Logan Kilpatrick announced on April 15 the release of Gemini 3.1 Flash TTS, Google's latest text-to-speech model. The model supports 70 languages, fine-grained control over scene direction and individual speakers, and audio tags. It is available now in the audio playground in Google AI Studio and through the Gemini API.
Four core features
Gemini 3.1 Flash TTS comes with four notable upgrades compared with its predecessor:
Scene Direction — You can set a context for the voice, such as “speaking softly in a noisy café” or “excitedly announcing good news,” and the model will adjust tone, speaking pace, and emotion based on the scene
Speaker-Level Specificity — In multi-role conversations, you can set different voice characteristics for each character
Audio Tags — Supports inserting sound-effect instructions into text to control details like pauses and tone changes
Support for 70 languages — Significantly expands multilingual coverage, including Chinese
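Taken together, these controls suggest a single prompt can carry a scene direction, per-speaker lines, and inline audio tags. The sketch below shows one way such a prompt might be assembled; the layout, the `Scene:` prefix, and the `[pause]` tag syntax are illustrative assumptions, not a documented Google format:

```python
def build_tts_prompt(scene: str, lines: list[tuple[str, str]]) -> str:
    """Assemble a scene-directed, multi-speaker TTS script.

    `scene` is free-text scene direction ("speaking softly in a noisy
    café"); `lines` is a list of (speaker, text) pairs. Audio tags such
    as [pause] can be embedded directly in the line text. The overall
    layout here is an assumption about how such prompts are commonly
    structured, not Google's published spec.
    """
    parts = [f"Scene: {scene}"]
    for speaker, text in lines:
        parts.append(f"{speaker}: {text}")
    return "\n".join(parts)

prompt = build_tts_prompt(
    "speaking softly in a noisy café",
    [
        ("Narrator", "She leaned in closer. [pause] The news was good."),
        ("Mira", "You won't believe what just happened!"),
    ],
)
```

Each speaker keeps its own line, so a model that supports speaker-level specificity could map distinct voice settings onto `Narrator` and `Mira` separately.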
More natural, more expressive voices
Google emphasized improvements in voice naturalness with this model. Traditional TTS models are often criticized for output that “sounds like AI.” Gemini 3.1 Flash TTS aims to narrow the gap with human speech through richer prosodic variation and emotional expression. Kilpatrick noted that the progress from Gemini 2.5 to 3.1 is “very significant.”
How developers can use it
Developers can use it in two ways:
Google AI Studio Audio Playground — Test and preview voice effects directly in the web interface
Gemini API — Integrate into applications for scenarios such as voice assistants, audiobooks, automatic podcast generation, and multilingual customer service
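For the API path, a request would likely follow the pattern of Google's existing Gemini TTS endpoints: a `generateContent` call that asks for an AUDIO response modality and names a voice. The sketch below builds such a request body; the model ID `gemini-3.1-flash-tts` is a hypothetical placeholder (the article does not give the identifier), and the config field names mirror Google's published Gemini TTS REST examples rather than documentation for this specific model:

```python
import json

# Hypothetical model ID for the new release; check Google AI Studio
# for the actual identifier before relying on it.
MODEL = "gemini-3.1-flash-tts"

def make_tts_request(text: str, voice: str = "Kore") -> dict:
    """Build a generateContent-style request body asking for audio output.

    The shape follows Google's existing Gemini TTS REST examples
    (responseModalities + speechConfig); treat the field names as a
    sketch, not authoritative documentation for this model.
    """
    return {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {
                    "prebuiltVoiceConfig": {"voiceName": voice}
                }
            },
        },
    }

body = make_tts_request("Excitedly announcing good news: we shipped!")
# This JSON body would be POSTed to the generateContent endpoint for
# the chosen model, e.g.
# https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent
payload = json.dumps(body)
```

The response in the existing TTS endpoints returns base64-encoded PCM audio in the candidate's `inlineData`, which the caller then decodes and writes to a WAV file; presumably the same applies here.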
Gemini product line keeps expanding
Flash TTS is part of the recent flurry of releases in the Gemini 3.1 series. Google previously rolled out Gemini Robotics ER 1.6 (robot vision reasoning), Tab Tab Tab (Vibe Coding prompt completion), and design preview features. Google is expanding Gemini from a “chat model” into a full multimodal AI platform spanning text, speech, vision, and robotics.
This article, “Google releases Gemini 3.1 Flash TTS: Supports 70 languages and scene direction, for more natural AI voices,” first appeared on Liannews ABMedia.