Google opens Gemini 3.5 Flash Computer Use in Gemini API with explicit confirmations
A day after Gemini 3.5 Flash Computer Use surfaced as a launch story, Google formally opened it through the Gemini API and Enterprise Agent Platform. Explicit user confirmation, automated task stopping, and an Android adb quickstart make the rollout concrete for agent builders.

TL;DR
- Google opened Gemini 3.5 Flash's built-in Computer Use through the Gemini API and Gemini Enterprise Agent Platform, according to Google's availability post and Google's launch post.
- The rollout ships with two named safeguards, as Google's safety post describes them: explicit user confirmation for sensitive actions, and automated task stopping when the system detects a potential threat.
- _philschmid's Android quickstart turned the launch into something builders can copy, with a one-script emulator install, an adb control loop, and support for remote devices.
- On Google's own OSWorld-Verified chart in testingcatalog's screenshot, Gemini 3.5 Flash scored 78.4. In Cua's KiCad EDA eval, trycua's early-access report put it at 0.267 mean reward, ahead of the pack on that metric.
- The most useful extra detail came from _philschmid's Interactions API guide: the agent loop uses previous_interaction_id, streaming responses, local function calls, and remote sandboxes.
Google's announcement, _philschmid's guide, and Cua's benchmark page are the three tabs worth opening first. _philschmid's Android example also exposed a small but practical quirk: the model preferred English UI text and used a click action to trigger translation. The public-preview framing in Wes Roth's summary matters because Google is shipping both the API surface and the enterprise guardrails at the same time.
Availability
Google's core change is simple: Computer Use is now a native tool inside Gemini 3.5 Flash, not a separate orchestration layer wrapped around it.
- Google's launch post says the model can see and act across browser, mobile, and desktop environments.
- Google's follow-up says developers can access it through the Gemini API, while enterprises can access it through the Gemini Enterprise Agent Platform.
- Wes Roth's summary adds that the rollout is in public preview.
Safeguards
Google attached two concrete control points to this release instead of leaving safety guidance at the usual high level.
- Google's safety post says Explicit User Confirmation requires human sign-off before sensitive or irreversible actions.
- The same Google safety post says Automated Task Stopping halts a run when a potential threat is identified.
- _philschmid's launch thread adds that Google also trained the model against prompt injection.
- Google's safety post still recommends sandboxing, human review, and strict access controls around live environments.
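For builders wiring their own harness, the two control points above can be approximated on the client side as a gate that runs before any action executes. This is a minimal sketch and not Google's implementation: the names ProposedAction, SENSITIVE_KINDS, TaskStopped, and gate are hypothetical, and the set of action kinds treated as sensitive is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical categories; what counts as "sensitive" in the real
# product is defined by Google, not by this list.
SENSITIVE_KINDS = {"purchase", "delete", "send_message", "submit_form"}


@dataclass
class ProposedAction:
    kind: str         # e.g. "click", "type", "purchase"
    description: str  # human-readable summary shown to the reviewer


class TaskStopped(Exception):
    """Raised when the run must halt instead of continuing."""


def gate(action: ProposedAction,
         confirm: Callable[[ProposedAction], bool],
         threat_detected: bool = False) -> bool:
    """Return True if the harness may execute the action."""
    if threat_detected:
        # Analogue of automated task stopping: halt the whole run.
        raise TaskStopped(f"halted before: {action.description}")
    if action.kind in SENSITIVE_KINDS:
        # Analogue of explicit user confirmation: ask a human first.
        return confirm(action)
    # Non-sensitive actions pass through without a prompt.
    return True
```

In practice the confirm callback would prompt a person, for example through a CLI prompt or a review queue, while threat_detected would come from whatever signal the harness or the API exposes.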
Android quickstart
The launch got a useful day-two companion in _philschmid's Android quickstart, which shows the same Computer Use pattern against a phone instead of a browser tab.
- One script installs an Android emulator from the terminal.
- The agent loop uses the Interactions API plus adb to control the device.
- Remote devices work through adb connect <ip>:5555.
- _philschmid's quickstart thread says the same pattern should port to iOS via tools like simctl.
Android emulator controlled through Gemini and adb
_philschmid's translation example showed another practical wrinkle: on at least one website, the model performed better after switching the page into English.
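The device-control half of that loop comes down to a handful of standard adb commands. The sketch below is not taken from the quickstart: the helper names (adb_base, connect_cmd, tap_cmd, screenshot_cmd, run) are hypothetical, and it assumes adb is on the PATH. It builds argument lists for connecting to a remote device, capturing a screenshot, and tapping a pixel coordinate.

```python
import subprocess
from typing import List, Optional


def adb_base(serial: Optional[str] = None) -> List[str]:
    """Base adb argv, optionally pinned to one device with -s."""
    return ["adb"] + (["-s", serial] if serial else [])


def connect_cmd(ip: str, port: int = 5555) -> List[str]:
    """argv for attaching to a remote device over TCP."""
    return ["adb", "connect", f"{ip}:{port}"]


def tap_cmd(x: int, y: int, serial: Optional[str] = None) -> List[str]:
    """argv for a single tap at pixel coordinates (x, y)."""
    return adb_base(serial) + ["shell", "input", "tap", str(x), str(y)]


def screenshot_cmd(serial: Optional[str] = None) -> List[str]:
    """argv that writes a PNG screenshot to stdout."""
    return adb_base(serial) + ["exec-out", "screencap", "-p"]


def run(cmd: List[str]) -> bytes:
    """Execute an adb command and return its raw stdout."""
    return subprocess.run(cmd, check=True, capture_output=True).stdout
```

A harness would call run(screenshot_cmd()) to get PNG bytes for the model, then translate each action the model returns into a command such as tap_cmd(x, y). Keeping command construction separate from execution makes the mapping easy to test without a device attached.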
Benchmarks
Google's own chart, visible in testingcatalog's screenshot, puts Gemini 3.5 Flash at 78.4 on OSWorld-Verified. That ties Sonnet 4.6, lands just behind GPT-5.5 at 78.7, and trails Opus 4.8 at 83.4.
Cua's early-access eval used a very different setup. trycua's report and the methodology thread say the KiCad EDA suite covered 25 professional CAD tasks with a matched 200-step budget across seven models.
- Gemini 3.5 Flash posted 0.267 mean reward, per trycua's top-line result.
- trycua's leaderboard note says GPT-5.5 solved more tasks outright, 6 of 25, but earned no partial credit.
- trycua's execution-quality note attributes Gemini's edge to pixel-accurate grounding in a dense CAD UI.
- trycua's caveat says the same failure mode remains: design-from-scratch runs still time out inside the 200-step budget.
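A short calculation shows how a model can solve fewer tasks outright and still lead on mean reward. It rests on an assumption the source does not state: that rewards lie between 0 and 1 and a fully solved task earns 1.0. Under that assumption, 6 solved tasks out of 25 with no partial credit works out to a mean of 0.24, below the reported 0.267.

```python
from typing import Sequence


def mean_reward(rewards: Sequence[float]) -> float:
    """Average per-task reward across a suite."""
    if not rewards:
        raise ValueError("no tasks")
    return sum(rewards) / len(rewards)


# Assumption: binary scoring, 6 solved tasks at 1.0 and 19 at 0.0.
binary_run = [1.0] * 6 + [0.0] * 19
binary_mean = mean_reward(binary_run)  # 6 / 25

# Total reward implied by a 0.267 mean over the same 25 tasks.
implied_total = 0.267 * 25

print(round(binary_mean, 3), round(implied_total, 3))
```

If the suite scores on a different scale, the comparison would need to be redone with the real per-task rewards.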
Interactions API loop
_philschmid's developer guide is where the agent harness becomes legible.
- It chains turns with previous_interaction_id.
- It supports streaming responses.
- It shows the full execution loop for local function calls.
- It covers running Antigravity Agent in remote sandboxes.
That makes this release more concrete than a benchmark drop. The model ships with a named tool, an enterprise safety layer, and a documented interaction loop that already spans browser sessions, Android control through adb, and remote sandbox execution.
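The chaining pattern from the Interactions API loop can be sketched without touching the real SDK. Everything here is a stand-in: Interaction, FakeClient, and run_agent are hypothetical names, and the create method only imitates the idea of passing previous_interaction_id between turns. The actual client, parameters, and response shapes are in _philschmid's guide and Google's documentation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Interaction:
    id: str
    # Each call is a dict like {"name": ..., "args": {...}}.
    function_calls: List[dict] = field(default_factory=list)
    text: str = ""


class FakeClient:
    """Stand-in for a real API client; replays scripted turns."""

    def __init__(self, script: List[Interaction]):
        self._script = list(script)
        self.seen_previous_ids: List[Optional[str]] = []

    def create(self, input, previous_interaction_id=None) -> Interaction:
        # Record the id so the chaining can be inspected afterwards.
        self.seen_previous_ids.append(previous_interaction_id)
        return self._script.pop(0)


def run_agent(client, task: str,
              tools: Dict[str, Callable[..., str]],
              max_turns: int = 10) -> str:
    previous_id = None
    payload = task
    for _ in range(max_turns):
        turn = client.create(input=payload,
                             previous_interaction_id=previous_id)
        if not turn.function_calls:
            return turn.text  # no more actions requested: finished
        # Execute each requested function locally and return results.
        payload = [
            {"name": c["name"], "result": tools[c["name"]](**c["args"])}
            for c in turn.function_calls
        ]
        previous_id = turn.id  # chain the next turn to this one
    raise RuntimeError("turn budget exhausted")
```

Because each request carries only the new tool results plus the prior turn's id, the harness does not resend the full history itself. The max_turns cap is a design choice in this sketch so that a stuck run fails loudly.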