AI Primer
Model release · Audio · Google

Gemini 3.1 Flash TTS Preview

Cost-efficient, expressive, and steerable text-to-speech model.

Google's cost-efficient, expressive, and steerable text-to-speech model for generating audio from text.

Pricing

Official site · Apr 16, 2026, 6:22 AM
Input / 1M tokens: $1.00
Output / 1M tokens: $20.00

Audio output is metered at 25 tokens per second of audio. The pricing page lists the model under Gemini-TTS as Gemini 3.1 Flash TTS (Preview).

Google’s official Text-to-Speech pricing page lists Gemini 3.1 Flash TTS (Preview) at $1.00 per 1 million text input tokens and $20.00 per 1 million audio output tokens, with no free usage limit. This is a first-party public price for the named preview model.
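The listed rates can be turned into a rough per-request estimate. The sketch below is only arithmetic on the figures quoted above ($1.00 per 1M text input tokens, $20.00 per 1M audio output tokens, 25 audio tokens per second); the function name `estimate_tts_cost` and its parameters are hypothetical, and actual billing may differ from this simple model.

```python
# Rates as quoted on this page; treat them as assumptions that may change.
INPUT_USD_PER_MILLION = 1.00    # text input tokens
OUTPUT_USD_PER_MILLION = 20.00  # audio output tokens
AUDIO_TOKENS_PER_SECOND = 25    # audio tokens per second of generated audio


def estimate_tts_cost(input_text_tokens: int, audio_seconds: float) -> float:
    """Return an estimated cost in USD for one request (hypothetical helper)."""
    audio_tokens = audio_seconds * AUDIO_TOKENS_PER_SECOND
    input_cost = input_text_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = audio_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # 200 input tokens producing one minute of audio (1,500 audio tokens).
    print(f"${estimate_tts_cost(200, 60):.4f}")
```

Under these figures, one minute of audio is 1,500 output tokens (about $0.03) and one hour is 90,000 output tokens (about $1.80), so output audio dominates the cost for typical inputs.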

Model Intelligence

Context window: 8,192 tokens
Benchmarkable: Yes
Model level: release

AI Primer

Your daily guide to AI tools, workflows, and creative inspiration.

© 2026 AI Primer. All rights reserved.