Our best model in terms of price-performance, offering well-rounded capabilities.
Multimodal Gemini 2.5 Flash release for large-scale, low-latency, high-volume tasks. It accepts text, image, video, and audio input and produces text output, and it supports thinking, caching, and function calling with a 1,048,576-token input context.
Standard pay-as-you-go pricing on Vertex AI. The same source page lists audio input separately at $1.00 per 1M tokens and also shows priority and flex/batch pricing.
The first-party Google Cloud pricing page lists Gemini 2.5 Flash under Standard pricing: multimodal input (text, image, video) at $0.30 per 1M tokens, audio input at $1.00 per 1M tokens, text output at $2.50 per 1M tokens, and cached input at $0.03 per 1M tokens.
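
As a rough illustration of how the Standard rates above combine, the following Python sketch estimates the cost of a single request. The function name `estimate_cost` and the dictionary keys are hypothetical, chosen only for this example. It assumes the four token counts are disjoint (cached input tokens are not also counted as regular input) and it ignores priority and flex/batch pricing, for which no rates are given here.

```python
from decimal import Decimal

# Standard rates in USD per 1M tokens, copied from the pricing notes above.
# The key names are illustrative, not part of any Google API.
RATES_PER_MILLION = {
    "input": Decimal("0.30"),         # text, image, and video input
    "audio_input": Decimal("1.00"),   # audio input
    "output": Decimal("2.50"),        # text output
    "cached_input": Decimal("0.03"),  # cached input
}

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(tokens: dict[str, int]) -> Decimal:
    """Return the estimated USD cost for the given token counts.

    Assumes the counts are disjoint: a token billed as cached input
    is not also present in the regular input count.
    """
    total = Decimal("0")
    for kind, count in tokens.items():
        if kind not in RATES_PER_MILLION:
            raise KeyError(f"no rate listed for token kind: {kind}")
        total += RATES_PER_MILLION[kind] * count / TOKENS_PER_MILLION
    return total


if __name__ == "__main__":
    request = {
        "input": 200_000,
        "audio_input": 50_000,
        "output": 10_000,
        "cached_input": 800_000,
    }
    print(f"Estimated cost: ${estimate_cost(request)}")
```

For the sample request, the four terms are $0.06, $0.05, $0.025, and $0.024, for a total of $0.159. `Decimal` is used instead of floats so that small per-token amounts add up without binary rounding error.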