MODEL RELEASE · AUDIO · release · Google
Gemini 3.1 Flash TTS Preview
Google's cost-efficient, expressive, and steerable text-to-speech model for generating audio from text.
Pricing
Official site · Apr 16, 2026, 6:22 AM
Input / 1M text tokens: $1.00
Output / 1M audio tokens: $20.00
Audio output is metered at 25 audio tokens per second of generated audio. The pricing page lists the model under Gemini-TTS as Gemini 3.1 Flash TTS (Preview).
Google’s official Text-to-Speech pricing page lists Gemini 3.1 Flash TTS (Preview) at $1.00 per 1 million text input tokens and $20.00 per 1 million audio output tokens, with no free usage limit. This is a first-party public price for the named preview model.
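As a rough illustration of how these figures combine, the sketch below estimates the cost of a single synthesis request from the listed prices and the rate of 25 audio tokens per second. The function name, constant names, and the 200-token prompt are hypothetical and chosen only for this example. It assumes billing is strictly proportional to token counts, with no rounding or minimum charge.

```python
# Prices and audio token rate as listed above.
# All names here are illustrative, not part of any Google API.
INPUT_USD_PER_1M_TEXT_TOKENS = 1.00
OUTPUT_USD_PER_1M_AUDIO_TOKENS = 20.00
AUDIO_TOKENS_PER_SECOND = 25


def estimate_tts_cost(text_input_tokens: int, audio_seconds: float) -> float:
    """Return the estimated USD cost of one synthesis request.

    Assumes cost scales linearly with token counts.
    """
    audio_tokens = audio_seconds * AUDIO_TOKENS_PER_SECOND
    input_cost = text_input_tokens / 1_000_000 * INPUT_USD_PER_1M_TEXT_TOKENS
    output_cost = audio_tokens / 1_000_000 * OUTPUT_USD_PER_1M_AUDIO_TOKENS
    return input_cost + output_cost


# One minute of audio generated from a hypothetical 200-token prompt.
cost = estimate_tts_cost(text_input_tokens=200, audio_seconds=60)
print(f"${cost:.4f}")  # prints "$0.0302"
```

Under these assumptions, audio output alone works out to about $0.0005 per second, $0.03 per minute, or $1.80 per hour of generated audio. The text input cost is small by comparison for typical prompt lengths.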
Model Intelligence
Context window: 8,192 tokens
Benchmarkable: Yes
Model level: release