Gemini
Google Gemini model family.
Stories
Filter storiesA day after Gemini 3.5 Flash Computer Use surfaced as a launch story, Google formally opened it through the Gemini API and Enterprise Agent Platform. Explicit user confirmation, automated task stopping, and an Android adb quickstart make the rollout concrete for agent builders.
Google released built-in Computer Use for Gemini 3.5 Flash across browser, mobile, and desktop. Try it for agent workflows, but watch for timeout issues on long design-from-scratch runs.
Google put the Interactions API into GA as the new default for Gemini, adding background execution, managed agents, remote sandboxes, and multimodal tools. Builders now get one stateful interface for models, long-running jobs, and future Gemini Omni support.
Google released Gemini 3.5 Live Translate for low-latency speech translation across 70+ languages in the Gemini Live API, AI Studio, and Google Translate. The same model is also heading to Google Meet in private preview for Workspace customers.
Posts from WWDC say Apple Intelligence now combines Apple Foundation and Gemini models, and Siri gains visual, on-screen, and app-level actions. Watch for the beta rollout later this year; multiple posts say it will not ship in the EU at launch.
Gemini Managed Agents can spin up a sandboxed Linux environment with code execution, web access, and file I/O from one API call, and early examples now include W&B and LlamaIndex workflows. That gives builders a higher-level runtime for long tasks while third-party templates start to define the first production use cases.
Google moved Nano Banana 2 and Nano Banana Pro to GA in AI Studio and the Gemini Enterprise Agent Platform. Nano Banana 2 also takes video as input, giving image pipelines published per-image pricing and a production API.
Google said AI Studio users created more than 250,000 native Android apps in the first week after app generation launched. The number matters because it is the first adoption signal for Google's free no-code Android builder and device-testing workflow.
A new paper says AlphaProof Nexus resolved 9 of 353 open Erdős problems and 44 OEIS conjectures using Gemini-guided search plus Lean checks. The strongest results came where Lean libraries are already mature, so those libraries remain the bottleneck to watch.
Antigravity added a lower-cost Gemini 3.5 Flash tier for IDE, CLI, and desktop use, with posts citing about 45% fewer tokens than Medium. Watch quotas after the reset across free and paid plans if you're planning to use the cheaper tier.
A day after Antigravity raised weekly Gemini quotas, the team said the 3x increase is permanent and doubled Gemini 3.5 Flash max context in AGY. The same update batch also clarified the IDE split and shipped Windows fixes, changing day-to-day limits and workflow behavior for developers.
Google opened iOS pre-registration for the AI Studio mobile app and confirmed native iOS and Android clients for AI Studio workflows. The rollout matters because it extends Google’s developer-facing Gemini environment beyond the browser into a mobile form factor for prototyping and testing.
Google tripled Antigravity's Gemini weekly quotas and issued a one-time quota reset after raising limits earlier in the week. The change lets teams run more Gemini 3.5 Flash work inside Google's CLI and managed-agent workflows.
Users reported failed harness runs, benchmark misses, broken Calendar and video-editing flows, and later a tripled Antigravity rate limit after Gemini 3.5 Flash launched. Watch real agent workflows closely, because the speed gains are arriving with higher spend and unstable behavior.
OpenCode, Kilo, Replicate, and Mastra exposed Gemini 3.5 Flash on launch day across coding agents, routers, and hosted APIs. The fast uptake gives engineers multiple harnesses to test Google's 1M-context model despite mixed first-party app reports.
Google shipped Gemini 3.5 Flash as a GA model with 1M context, 65K max output, and stronger agentic benchmarks than Gemini 3.1 Pro. Watch task-level cost, since third-party evals show it can exceed Gemini 3.1 Pro and GPT-5.5 Medium on some jobs.
A day after leaks previewed Spark, Google officially launched Gemini Spark as a persistent personal agent that runs on dedicated cloud VMs and will connect to MCP tools. It matters because Google is moving Gemini from chat responses toward long-running delegated work across consumer and enterprise surfaces.
Google expanded AI Studio with native Android app building, on-device testing, Workspace integrations, and export into Antigravity, while its mobile app entered pre-registration. It matters because AI Studio is becoming a fuller build surface instead of just a prompt playground.
Google launched Gemini Omni Flash as its first shipping any-input-to-video model, with character consistency, physics-aware scenes, and conversational video editing. Use it in Gemini, Flow, and YouTube surfaces first, and wait for API access if you need programmatic integration.
Leak videos and tester reports pointed to a larger Gemini desktop app with Stream to Cursor, Spark local-file access, Live, and Omni ahead of I/O. Independent testers also reported faster 3.2 and 3.5 Flash checkpoints, but Google had not announced the features publicly.
Multiple users posted reproducible steps and videos showing Gemini app UI changes, Thinking Level rollout, and Fast mode or Canvas sessions that look like 3.2 or 3.5-class routing. This matters because Google appears to be testing new model paths and app surfaces in production ahead of I/O, though the exact model names remain unconfirmed.
Google unveiled Gemini Intelligence at the Android Show with cross-app task automation, Gemini in Chrome, Rambler voice cleanup, custom widgets, and AppFunctions. The rollout moves Gemini into core Android workflows on Pixel and Galaxy devices this summer.
Google DeepMind published Gemini pointer experiments in AI Studio that act on whatever the cursor highlights, turning PDFs, tables, images, and recipes into direct actions. The shift matters because it moves assistant UX from separate chat panes into in-place pointing and voice commands.
Google moved Gemini 3.1 Flash Lite from preview to GA, and OpenRouter added the model with 1 million context and low-cost multimodal pricing. The preview endpoint now has a shutdown schedule, and users should verify whether the GA model differs from the March preview.
Google is replacing the Gemini Interactions API’s older outputs-and-roles structure with a steps schema for multi-step agent workflows. The change matters because SDK upgrades, migration work, and schema assumptions in existing tooling may break before the new interface reaches GA.
Google expanded Gemini API File Search to index text and images together, add custom metadata filtering, and return page-level citations. RAG builders can use it for tighter retrieval control and more auditable answers.
Google added Webhooks to the Gemini API and upgraded Interactions API errors with exact field paths, bad values, enum lists, and type mismatches. The changes target long-running tasks and agent integrations where polling and opaque validation failures slow debugging.
Google AI Studio added multi-chat threads and web search grounding to Build mode, so Gemini coding sessions can branch while pulling live docs into the workspace. The feature improves in-browser prototyping loops, but it is currently scoped to AI Studio rather than the Gemini API itself.
Gemini models can now use Grounding with Exa to search websites, technical docs, papers, people, and companies through Exa's index. That gives Gemini a new agent-style grounding path alongside Google's first-party search tooling.
Google introduced Gemini Enterprise Agent Platform as the evolution of Vertex AI, with Agent Studio, shared agent management, and Model Garden access to 200-plus models. Enterprises now get one stack for building, governing, and deploying agents across Gemini and Workspace surfaces.
Google added Deep Research and Deep Research Max to the Gemini API with collaborative planning, multimodal inputs, MCP support, and native charts. The agents push cited web-plus-private-data reports into developer workflows, and Max is tuned for slower overnight runs.
Google enabled Pro and Ultra subscriptions inside AI Studio, turning consumer plans into a higher-quota bridge before direct API billing. The rollout still has quota bugs and does not yet support Workspace accounts, so check access before migrating.
Google released Gemini 3.1 Flash TTS with inline Audio Tags, multi-speaker control and 70+ languages, and opened preview access through the Gemini API and AI Studio with rollout to Vertex AI and Google Vids. Independent evals ranked it near the top of current speech leaderboards, but it runs slower and costs more than the leading system.
Google DeepMind shipped Gemini Robotics-ER 1.6 to the Gemini API and AI Studio with better visual-spatial reasoning, multi-view success detection, and gauge reading. The model's 93% instrument-reading score targets robots that need to reason over cluttered scenes and physical constraints.
Google released Veo 3.1 Lite in Gemini API and AI Studio with 720p and 1080p output, 4-8 second clips, and text-to-video plus image-to-video support. Watch the April 7 Veo 3.1 Fast pricing drop if you need lower video generation costs.
Hankweave added short aliases that route the same prompt and code job into Anthropic's Agents SDK, Codex, or Gemini-style harnesses with unified logs and control. The release treats harness choice as a first-class variable instead of forcing teams to rebuild orchestration for each model stack.
Google launched Gemini 3.1 Flash Live in AI Studio, the API, and Gemini Live with stronger audio tool use, lower latency, and 128K context. Voice-agent teams should benchmark quality, latency, and thinking settings before switching.
ARC-AGI-3 swaps static puzzles for interactive game-like environments and posts initial frontier scores below 1%, with Gemini 3.1 Pro at 0.37%. Teams can use it to inspect agent reasoning, but score interpretation still depends heavily on the human-efficiency metric and no-harness setup.
Lyria 3 Pro and Lyria 3 Clip are now in Gemini API and AI Studio, with Lyria 3 Pro priced at $0.08 per song and able to structure tracks into verses and choruses. That gives developers a clearer path to longer-form music features, with watermarking and prompt design built in.
Google extended its OpenAI compatibility layer so existing OpenAI SDK code can call Veo 3.1 video generation and Gemini image models with only base URL and model changes. It lowers migration cost for teams that want multimodal fallbacks without rewriting client code.
Google rebuilt AI Studio Build around the Antigravity coding agent, one-click Firebase auth and databases, multiplayer backends, and persistent sessions. It pushes AI Studio closer to production app scaffolding and gives Firebase Studio users a clear migration path.
Google now lets Gemini chain built-in tools like Search, Maps, File Search, and URL Context with custom functions inside a single API call. This removes orchestration glue for agent builders and brings Maps grounding into AI Studio for faster prototyping.
Google is adding Grounding with Google Maps to AI Studio’s UI, and a Google reply says the Maps grounding capability already exists in the API. If you build location-heavy Gemini apps, start designing around map lookups instead of stitching search and geocoding manually.
Reports say Meta pushed Avocado from March to at least May after internal reasoning, coding, and writing tests missed current frontier targets. Expect more delayed launches at the top end, and watch for products that route some features through competitor models.
Google launched Gemini Embedding 2 in preview, unifying multiple modalities and 100+ languages in one embedding space with flexible output dimensions. Try it to simplify cross-modal RAG and search pipelines, but compare it with late-interaction systems before committing.
Google AI Studio now lets developers set experimental per-project spend caps for Gemini API usage. Use it as a native billing guardrail, but account for roughly 10-minute enforcement lag and possible batch-job overshoot.
Google rolled Gemini deeper into Docs, Sheets, Slides, and Drive, including spreadsheet task execution, editable slide generation, and grounded file search. Teams should test approval flows, context sourcing, and current feature limits before broad rollout.
Google put Gemini Embedding 2 into public preview with one vector space for text, images, video, audio, and PDFs, plus 3072, 1536, and 768 output sizes. Use it to replace multi-model retrieval pipelines with one API for RAG and cross-media search.
A Google bot-authored LiteRT-LM pull request references Gemma4 and AIcore NPU support, while multiple posts claim a largest version around 120B total and 15B active parameters. Engineers targeting on-device inference should wait for a formal model card before locking plans.