AI Primer

OpenCode, Kilo, Replicate, and Mastra support Gemini 3.5 Flash on day one

OpenCode, Kilo, Replicate, and Mastra exposed Gemini 3.5 Flash on launch day across coding agents, routers, and hosted APIs. The fast uptake gives engineers multiple harnesses to test Google's 1M-context model despite mixed first-party app reports.

TL;DR

You can read Google's launch post, browse Mastra's Google provider page, hit the new Replicate model endpoint, and then jump straight into the weirder part, Antigravity's OS-building demo, where Google says 93 subagents burned through 2.6B tokens to build a working OS.

What shipped

Google positioned Gemini 3.5 Flash as its strongest coding and agentic model so far, not a cheap lite tier. The public launch materials paired that claim with a 1,048,576-token input context, a 65,536-token output limit, and API pricing of $1.50 per million input tokens and $9 per million output tokens, as noted in Google's launch post and Simon Willison's launch writeup.
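To make those numbers concrete, here is a minimal sketch of a list-price cost estimate. The function name `estimate_cost` and the constant names are hypothetical; the prices and limits are the ones quoted above, and the sketch assumes flat per-token billing with no caching or batch discounts.

```python
# List prices quoted in the launch coverage (USD per million tokens).
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00

# Limits quoted in the launch materials.
MAX_INPUT_TOKENS = 1_048_576
MAX_OUTPUT_TOKENS = 65_536


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD for one request (hypothetical helper)."""
    if not 0 <= input_tokens <= MAX_INPUT_TOKENS:
        raise ValueError("input_tokens outside the quoted context window")
    if not 0 <= output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("output_tokens outside the quoted output limit")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A request that fills both the input window and the output limit.
    full = estimate_cost(MAX_INPUT_TOKENS, MAX_OUTPUT_TOKENS)
    print(f"Full-context request at list price: ${full:.2f}")
```

Under these assumptions, a single request that fills the whole context window and the whole output limit comes to a little over $2 at list price.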

In Google's own comparison, Gemini 3.5 Flash beat Gemini 3.1 Pro on three flagship evals.

The catch is price. Before launch, scaling01's pricing post and AiBattle_'s price comparison both clocked the new Flash tier at triple the price of Gemini 3 Flash.

OpenCode and Kilo

The fast part of this story is not the model card; it is how quickly coding tools started wiring the slug in. OpenCode advertised Gemini 3.5 Flash as live with 1M context and pricing comparable to GLM, Kimi, and DeepSeek Pro.

Kilo moved just as fast. kilocode's launch post said Gemini 3.5 Flash was live in Kilo before I/O ended, with an initial 74.2% result on PinchBench and a claim of roughly 4x faster performance than comparable frontier models.

Kilo's follow-up posts matter because they show the harness Google seems to want for this model family, not just a dropdown entry. kilocode's Cloud Agent post pitched remote repo sessions with auto-commits and idle spin-down, while kilocode's Code Reviewer post pushed free PR review with inline comments.

Replicate and Mastra

Hosted API and router support also appeared immediately. replicate exposed Gemini 3.5 Flash through Replicate's API, and mastra added google/gemini-3.5-flash to Mastra's router the same day.

That gives the model three very different day-one entry points: coding agents (OpenCode and Kilo), a hosted API (Replicate), and a router (Mastra).

The ecosystem kept expanding through the evening. warpdotdev added Gemini 3.5 Flash to Warp Agent, Cognition's Windsurf repost signaled Windsurf support, and testingcatalog's AI/ML API post pointed to a 24-hour free testing window through AI/ML API.
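The google/gemini-3.5-flash slug mentioned above follows the common provider/model naming pattern. As a rough illustration of how such a slug can be split, here is a small sketch; `ModelRef` and `parse_model_slug` are hypothetical names, not part of Mastra's or any other tool's API.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    """Provider and model name taken from a 'provider/model' slug."""
    provider: str
    model: str


def parse_model_slug(slug: str) -> ModelRef:
    """Split a slug at the first '/' (hypothetical helper, not a library call)."""
    provider, sep, model = slug.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {slug!r}")
    return ModelRef(provider, model)


if __name__ == "__main__":
    ref = parse_model_slug("google/gemini-3.5-flash")
    print(ref.provider, ref.model)
```

Splitting only at the first slash keeps any later slashes inside the model name, which is an assumption about the naming scheme rather than a documented rule.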

Antigravity teamwork

Google's most concrete demo for why it thinks 3.5 Flash belongs in agent harnesses was not inside Gemini the app. It was inside Antigravity 2.0, where Google's OS demo thread said agents built a functioning operating system from a single prompt in 12 hours using 93 parallel subagents, more than 15,000 model requests, and 2.6B processed tokens for less than $1,000 in API credits.
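The demo figures are easier to judge as averages. The sketch below does only arithmetic on the numbers quoted in the thread; all names are hypothetical, and because the request count is a floor ("more than 15,000") and the cost is a ceiling ("less than $1,000"), each result is a bound rather than an exact value.

```python
# Figures quoted in Google's OS demo thread.
SUBAGENTS = 93
MODEL_REQUESTS = 15_000           # quoted as "more than 15,000"
PROCESSED_TOKENS = 2_600_000_000  # quoted as 2.6B
API_COST_USD = 1_000              # quoted as "less than $1,000"
WALL_CLOCK_HOURS = 12


def demo_averages() -> dict[str, float]:
    """Derive rough per-unit averages from the quoted totals."""
    return {
        # Upper bound: more requests would lower this average.
        "tokens_per_request": PROCESSED_TOKENS / MODEL_REQUESTS,
        # Lower bound: more requests would raise this average.
        "requests_per_subagent": MODEL_REQUESTS / SUBAGENTS,
        # Upper bound: a lower bill would lower this rate.
        "usd_per_million_tokens": API_COST_USD / (PROCESSED_TOKENS / 1_000_000),
        "tokens_per_hour": PROCESSED_TOKENS / WALL_CLOCK_HOURS,
    }


if __name__ == "__main__":
    for name, value in demo_averages().items():
        print(f"{name}: {value:,.2f}")
```

On these inputs the blended rate works out to at most about $0.38 per million processed tokens, below the $1.50 input list price. The thread does not break down how the tokens were billed, so the sketch makes no attempt to explain the gap.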

Mirrokni's thread broke the coordination pattern into four roles for /teamwork-preview, which he said is now in research preview:

  • Orchestrators
  • Explorers
  • Workers
  • Critics

According to mirrokni's access note, that preview is currently limited to Google AI Ultra subscribers. That makes the day-one story two-tiered: broad model availability almost everywhere, and the most aggressive multi-agent harness still behind a research-preview gate.
