releaseMay 20, 2026

Google DeepMind releases Gemini 3.5 Flash; claims No. 1 on Zapier Automation Bench

Google DeepMind released Gemini 3.5 Flash and later claimed a No. 1 result on Zapier's Automation Bench. Try it for cheap, fast early concepts, but watch for weaker results on dense layout, motion, and typography decisions.

5 min read

Google DeepMind releases Gemini 3.5 Flash; claims No. 1 on Zapier Automation Bench

TL;DR

Google rolled out Gemini 3.5 Flash across the Gemini API, Google AI Studio, Antigravity, AI Mode in Search, and the Gemini app, according to GoogleDeepMind's availability post and OfficialLoganK's rollout note.
In Google's own launch framing, GoogleDeepMind's benchmark post says 3.5 Flash beats Gemini 3.1 Pro on Terminal-Bench 2.1, GDPval-AA, and MCP Atlas, while OfficialLoganK's Automation Bench post adds a later claim that it ranks No. 1 on Zapier's Automation Bench.
Early creative use looks strongest on fast concepting and animated site generation, with viktoroddy's tutorial post showing a 25-minute animated website workflow and viktoroddy's stack post putting Gemini 3.5 Flash at the center of a broader design stack.
The first hands-on caveat came fast: after 50 design runs, MengTo's review said Gemini 3.5 Flash was cheaper than GPT-5.5 but weaker than Gemini 3.1 Pro on dense layout, responsiveness, scroll behavior, WebGL, and animation complexity.
Google is clearly shipping the harness around the model too, with GoogleDeepMind's Antigravity post outlining Antigravity 2.0, a CLI, and an SDK, and OfficialLoganK's rate-limit update saying the team tripled Antigravity rate limits a day later.

You can open Google's launch page, compare it with OfficialLoganK's benchmark table, watch GoogleDeepMind's city-building demo where 3.5 Flash deploys subagents inside Antigravity, and then jump to the Science Skills page for a very different use case. The weirdly useful detail is how quickly the creative crowd turned it into a website tool, with viktoroddy's tutorial post and his prompt pack appearing almost immediately.

What shipped

Google positioned Gemini 3.5 as a model family, but the first public release is 3.5 Flash. In GoogleDeepMind's launch thread, the company called it its strongest model yet for agents and coding.

Availability was broad on day one. GoogleDeepMind's availability post put it in the Gemini app, AI Mode in Search, Antigravity, and Google AI Studio, while OfficialLoganK's rollout note added the Gemini API and said it was live "everywhere now."

Automation Bench and Google's own chart

Google's launch numbers mostly compare 3.5 Flash to Gemini 3.1 Pro. In GoogleDeepMind's benchmark post, Terminal-Bench 2.1 moves from 70.3% to 76.2%, GDPval-AA from 1314 to 1656 Elo, and MCP Atlas from 78.2% to 83.6%.

The later claim that got the most attention outside Google's own chart came from OfficialLoganK's Automation Bench post, which said Gemini 3.5 Flash ranked No. 1 on Zapier's Automation Bench while costing much less than other frontier models.

OfficialLoganK's benchmark table also widened the picture beyond coding and agents. That table lists 78.4% on OSWorld-Verified, 84.2% on CharXiv Reasoning, 77.3% on MRCR v2 (8-needle), and 72.1% on ARC-AGI-2.

Creative hands-on

The fastest real use case was animated website work. In viktoroddy's tutorial post, Gemini 3.5 Flash is the engine behind a 25-minute workflow for generating a high-polish site concept inside Google AI Studio.

That post was not a one-off demo. viktoroddy's stack post placed Gemini 3.5 Flash in a working stack with Figma AI, Bolt, GPT Image 2, Seedance 2.0, Screen Studio, Shaders, and MotionSites AI.

The creator pattern here is pretty clear from the evidence:

fast code-first website concepts from prompts, per viktoroddy's tutorial post
pairing with visual tools instead of replacing them, per viktoroddy's stack post
using a shared prompt library to keep outputs repeatable, via MotionSites AI

Antigravity and AI Studio

Google did not ship 3.5 Flash as a naked API story. GoogleDeepMind's Antigravity post framed the release around three new Antigravity surfaces:

Antigravity 2.0, a mission control where agents can work simultaneously on one project
CLI, a terminal interface for working with agents
SDK, a toolkit for wiring software into those agents

In OfficialLoganK's AI Studio update, OfficialLoganK added the adjacent pieces that matter for builders:

managed agents built on the Antigravity harness
native Android app creation inside AI Studio
native Workspace integrations
one-click export from AI Studio into Antigravity

Google's own demo push leaned hard into subagents. GoogleDeepMind's city-building demo showed 3.5 Flash using multiple subagents to design and build an entire city, which lines up with GoogleDeepMind's launch thread describing parallel subagent work over long horizons.

Caveats from early users

The most specific early critique came from MengTo's review, which is useful precisely because it is narrow. After 50 designs, MengTo said 3.5 Flash was "as good as GPT 5.5 but way cheaper," but still worse than Gemini 3.1 Pro on layout grids, responsiveness, scroll behaviors, WebGL, and animation complexity.

That splits the early creative read in two. viktoroddy's tutorial post makes it look great for fast animated concepts, while MengTo's review suggests the model still loses sharpness when the job turns into dense systems design rather than flashy first-pass generation.

There were also early questions around cost and freshness. stevibe's ModelClock post estimated a 2025-03-19 knowledge cutoff for Gemini 3.5 Flash, while bennash's quoted complaint amplified criticism that the new model costs more than older Gemini tiers.

Google reacted to usage pressure almost immediately. In OfficialLoganK's rate-limit update, OfficialLoganK said Antigravity rate limits were tripled across all tiers so people could push 3.5 Flash harder.

Science Skills

The most interesting late addition was not a design demo. In GoogleDeepMind's Science Skills post, Google tied Gemini 3.5 Flash to a new Science Skills package for Antigravity that pulls from more than 30 life sciences sources, including UniProt and the AlphaFold Database.

That same post says the team used the stack on an AK2 mutation case and got structural-analysis results faster than usual, with "novel insights" into the disease mechanism. It is a reminder that Google is pitching 3.5 Flash as both a creative model and a tool-using research model, not just a cheaper coding assistant.