Skip to content
AI Primer
release

Google releases Gemini 3.1 Flash Lite GA with 1M context and $0.25 input pricing

Google moved Gemini 3.1 Flash Lite from preview to GA, and OpenRouter added the model with 1 million context and low-cost multimodal pricing. The preview endpoint now has a shutdown schedule, and users should verify whether the GA model differs from the March preview.

4 min read
Google releases Gemini 3.1 Flash Lite GA with 1M context and $0.25 input pricing
Google releases Gemini 3.1 Flash Lite GA with 1M context and $0.25 input pricing

TL;DR

Google shipped the cheap model and the API shape change on the same day. You can jump from the official GA post to the interactions migration guide, and OpenRouter's post already exposes the practical details engineers usually want first: context window, modality coverage, pricing, and an endpoint you can hit today.

What shipped

The launch facts are short and unusually concrete. Google described Flash Lite as its cheapest model, and testingcatalog's AI Studio screenshot shows the public model card in AI Studio with text, image, and video input support plus Jan 2025 knowledge cutoff.

The rollout details from OpenRouter's post are the ones most likely to matter in practice:

  • Multimodal input: text, image, video, audio, and PDF to text
  • Context window: 1 million tokens
  • Pricing: $0.25 per million input tokens, $1.50 per million output tokens
  • Selectable thinking levels

Google's own launch thread links the official blog post, but the community's first question was whether GA meant a new model or just a label change.

Preview versus GA

llm-gemini 0.31

Release: llm-gemini 0.31 gemini-3.1-flash-lite is no longer a preview. Here's my write-up of the Gemini 3.1 Flash-Lite Preview model back in March. I don't believe this new non-preview model has changed since then. Tags: llm-release, gemini, llm, google, generative-ai, ai, llms

The cleanest counterweight to the launch framing came from Simon Willison. In his tweet, he said the preview model had already been available since March 3 and that pricing appeared unchanged; in his release note, he added that he does not believe the non-preview model changed since then.

That leaves one unresolved but reportable fact: Google announced GA, but the evidence in hand does not show a public capability delta from preview. For this story, GA looks more like a status change than a visible version bump.

Interactions API

The more substantial engineering change may be the request format around the model. According to GoogleAIStudio's thread, Gemini's Interactions API now represents each action as its own step instead of squeezing everything into strict user and model roles.

The step types shown in the migration graphic are:

  • user_input
  • thought
  • function_call
  • function_call_result
  • google_search_call
  • google_search_result
  • model_output

Google linked dedicated docs for the change in the interactions breaking changes guide. For teams building multi-step agentic runs, that doc probably matters more than the GA badge.

Preview shutdown dates

The old preview endpoint already has a clock on it. OpenRouter's notice said gemini-3.1-flash-lite-preview is deprecated starting May 11 and shuts down May 25, pointing to Google's deprecations schedule.

OpenRouter also used the launch to surface one extra integration detail: its follow-up post says Google models work with OpenRouter's service_tier parameter, which exposes standard, flex, and priority cost-latency tradeoffs and returns the served tier in the response.

Further reading

Discussion across the web

Where this story is being discussed, in original context.

On X· 1 thread
What shipped1 post
Share on X