Gemini
Google's multimodal AI model family
Gemini is Google's multimodal AI model family from Google DeepMind, surfaced through the Gemini API and related Google AI products.
Pricing
Model Intelligence
Recent stories
A day after Gemini 3.5 Flash Computer Use surfaced as a launch story, Google formally opened it through the Gemini API and Enterprise Agent Platform. Explicit user confirmation, automated task stopping, and an Android adb quickstart make the rollout concrete for agent builders.
Google released built-in Computer Use for Gemini 3.5 Flash across browser, mobile, and desktop. Try it for agent workflows, but watch for timeout issues on long design-from-scratch runs.
Google put the Interactions API into GA as the new default for Gemini, adding background execution, managed agents, remote sandboxes, and multimodal tools. Builders now get one stateful interface for models, long-running jobs, and future Gemini Omni support.
Posts from WWDC say Apple Intelligence now combines Apple Foundation and Gemini models, and Siri gains visual, on-screen, and app-level actions. Watch for the beta rollout later this year; multiple posts say it will not ship in the EU at launch.
Gemini Managed Agents can spin up a sandboxed Linux environment with code execution, web access, and file I/O from one API call, and early examples now include W&B and LlamaIndex workflows. That gives builders a higher-level runtime for long tasks while third-party templates start to define the first production use cases.
Google moved Nano Banana 2 and Nano Banana Pro to GA in AI Studio and the Gemini Enterprise Agent Platform. Nano Banana 2 also takes video as input, giving image pipelines published per-image pricing and a production API.
Hyperbrowser launched AgentRank, an open-source tool that runs Claude, GPT, and Gemini agents against a site to show where they get stuck. It matters because teams can turn agent website compatibility into a repeatable eval instead of an anecdotal demo.
Google said AI Studio users created more than 250,000 native Android apps in the first week after app generation launched. The number matters because it is the first adoption signal for Google's free no-code Android builder and device-testing workflow.
Antigravity added a lower-cost Gemini 3.5 Flash tier for IDE, CLI, and desktop use, with posts citing about 45% fewer tokens than Medium. Watch quotas after the reset across free and paid plans if you're planning to use the cheaper tier.
A new paper says AlphaProof Nexus resolved 9 of 353 open Erdős problems and 44 OEIS conjectures using Gemini-guided search plus Lean checks. The strongest results came where Lean libraries are already mature, so those libraries remain the bottleneck to watch.
A day after Antigravity raised weekly Gemini quotas, the team said the 3x increase is permanent and doubled Gemini 3.5 Flash max context in AGY. The same update batch also clarified the IDE split and shipped Windows fixes, changing day-to-day limits and workflow behavior for developers.
Google opened iOS pre-registration for the AI Studio mobile app and confirmed native iOS and Android clients for AI Studio workflows. The rollout matters because it extends Google’s developer-facing Gemini environment beyond the browser into a mobile form factor for prototyping and testing.
Google tripled Antigravity's Gemini weekly quotas and issued a one-time quota reset after raising limits earlier in the week. The change lets teams run more Gemini 3.5 Flash work inside Google's CLI and managed-agent workflows.
OpenCode, Kilo, Replicate, and Mastra exposed Gemini 3.5 Flash on launch day across coding agents, routers, and hosted APIs. The fast uptake gives engineers multiple harnesses to test Google's 1M-context model despite mixed first-party app reports.
Users reported failed harness runs, benchmark misses, broken Calendar and video-editing flows, and later a tripled Antigravity rate limit after Gemini 3.5 Flash launched. Watch real agent workflows closely, because the speed gains are arriving with higher spend and unstable behavior.
Google launched Gemini Omni Flash as its first shipping any-input-to-video model, with character consistency, physics-aware scenes, and conversational video editing. Use it in Gemini, Flow, and YouTube surfaces first, and wait for API access if you need programmatic integration.
A day after leaks previewed Spark, Google officially launched Gemini Spark as a persistent personal agent that runs on dedicated cloud VMs and will connect to MCP tools. It matters because Google is moving Gemini from chat responses toward long-running delegated work across consumer and enterprise surfaces.
Leak videos and tester reports pointed to a larger Gemini desktop app with Stream to Cursor, Spark local-file access, Live, and Omni ahead of I/O. Independent testers also reported faster 3.2 and 3.5 Flash checkpoints, but Google had not announced the features publicly.
Multiple users posted reproducible steps and videos showing Gemini app UI changes, Thinking Level rollout, and Fast mode or Canvas sessions that look like 3.2 or 3.5-class routing. This matters because Google appears to be testing new model paths and app surfaces in production ahead of I/O, though the exact model names remain unconfirmed.
Google DeepMind published Gemini pointer experiments in AI Studio that act on whatever the cursor highlights, turning PDFs, tables, images, and recipes into direct actions. The shift matters because it moves assistant UX from separate chat panes into in-place pointing and voice commands.
Google unveiled Gemini Intelligence at the Android Show with cross-app task automation, Gemini in Chrome, Rambler voice cleanup, custom widgets, and AppFunctions. The rollout moves Gemini into core Android workflows on Pixel and Galaxy devices this summer.
Google is replacing the Gemini Interactions API’s older outputs-and-roles structure with a steps schema for multi-step agent workflows. The change matters because SDK upgrades, migration work, and schema assumptions in existing tooling may break before the new interface reaches GA.
Google moved Gemini 3.1 Flash Lite from preview to GA, and OpenRouter added the model with 1 million context and low-cost multimodal pricing. The preview endpoint now has a shutdown schedule, and users should verify whether the GA model differs from the March preview.
Google expanded Gemini API File Search to index text and images together, add custom metadata filtering, and return page-level citations. RAG builders can use it for tighter retrieval control and more auditable answers.
Google added Webhooks to the Gemini API and upgraded Interactions API errors with exact field paths, bad values, enum lists, and type mismatches. The changes target long-running tasks and agent integrations where polling and opaque validation failures slow debugging.
Google AI Studio added multi-chat threads and web search grounding to Build mode, so Gemini coding sessions can branch while pulling live docs into the workspace. The feature improves in-browser prototyping loops, but it is currently scoped to AI Studio rather than the Gemini API itself.
Gemini models can now use Grounding with Exa to search websites, technical docs, papers, people, and companies through Exa's index. That gives Gemini a new agent-style grounding path alongside Google's first-party search tooling.
Google introduced Gemini Enterprise Agent Platform as the evolution of Vertex AI, with Agent Studio, shared agent management, and Model Garden access to 200-plus models. Enterprises now get one stack for building, governing, and deploying agents across Gemini and Workspace surfaces.
Google added Deep Research and Deep Research Max to the Gemini API with collaborative planning, multimodal inputs, MCP support, and native charts. The agents push cited web-plus-private-data reports into developer workflows, and Max is tuned for slower overnight runs.
Google enabled Pro and Ultra subscriptions inside AI Studio, turning consumer plans into a higher-quota bridge before direct API billing. The rollout still has quota bugs and does not yet support Workspace accounts, so check access before migrating.
OpenClaw 2026.4.15 adds Anthropic Opus 4.7, bundled Gemini TTS, bounded memory reads, and transport self-heal fixes. The release targets context and reliability issues users had been reporting this week.
Google released Gemini 3.1 Flash TTS with inline Audio Tags, multi-speaker control and 70+ languages, and opened preview access through the Gemini API and AI Studio with rollout to Vertex AI and Google Vids. Independent evals ranked it near the top of current speech leaderboards, but it runs slower and costs more than the leading system.
Google DeepMind shipped Gemini Robotics-ER 1.6 to the Gemini API and AI Studio with better visual-spatial reasoning, multi-view success detection, and gauge reading. The model's 93% instrument-reading score targets robots that need to reason over cluttered scenes and physical constraints.
Google released Veo 3.1 Lite in Gemini API and AI Studio with 720p and 1080p output, 4-8 second clips, and text-to-video plus image-to-video support. Watch the April 7 Veo 3.1 Fast pricing drop if you need lower video generation costs.
Google launched Gemini 3.1 Flash Live in AI Studio, the API, and Gemini Live with stronger audio tool use, lower latency, and 128K context. Voice-agent teams should benchmark quality, latency, and thinking settings before switching.
Google extended its OpenAI compatibility layer so existing OpenAI SDK code can call Veo 3.1 video generation and Gemini image models with only base URL and model changes. It lowers migration cost for teams that want multimodal fallbacks without rewriting client code.