Gemini
Google DeepMind's family of multimodal AI models.
Recent stories
Google DeepMind published Gemini pointer experiments in AI Studio that act on whatever the cursor highlights, turning PDFs, tables, images, and recipes into direct actions. The shift matters because it moves assistant UX from separate chat panes into in-place pointing and voice commands.
Google unveiled Gemini Intelligence at the Android Show with cross-app task automation, Gemini in Chrome, Rambler voice cleanup, custom widgets, and AppFunctions. The rollout moves Gemini into core Android workflows on Pixel and Galaxy devices this summer.
Google is replacing the Gemini Interactions API’s older outputs-and-roles structure with a steps schema for multi-step agent workflows. The change matters because SDK upgrades, migration work, and schema assumptions in existing tooling may break before the new interface reaches GA.
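A migration like that can be sketched as a small transform. Every field name below (`outputs`, `steps`, `role`, `content`) is an assumption for illustration; the story above does not spell out either schema.

```python
# Hypothetical sketch of an outputs-and-roles -> steps migration.
# All field names are assumptions, not the published Interactions API schema.

def to_steps(legacy: dict) -> dict:
    """Convert a legacy outputs/roles payload into a steps list (assumed shape)."""
    steps = []
    for item in legacy.get("outputs", []):
        steps.append({
            "role": item.get("role", "model"),
            "content": item.get("content", ""),
        })
    return {"steps": steps}

legacy = {"outputs": [{"role": "model", "content": "plan"},
                      {"role": "tool", "content": "result"}]}
print(to_steps(legacy))
```

Auditing existing tooling for hard-coded `outputs` lookups before the preview endpoint changes is the cheap part of the migration; a shim like this buys time either way.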
Google moved Gemini 3.1 Flash Lite from preview to GA, and OpenRouter added the model with a 1-million-token context window and low-cost multimodal pricing. The preview endpoint now has a shutdown schedule, and users should verify whether the GA model differs from the March preview.
Google expanded Gemini API File Search to index text and images together, add custom metadata filtering, and return page-level citations. RAG builders can use it for tighter retrieval control and more auditable answers.
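A metadata filter plus page-level citations might look like the following sketch; the request and response shapes are assumptions, not the documented File Search API.

```python
# Hedged sketch: request/response shapes below are assumptions made for
# illustration, not the documented File Search interface.

def build_search_request(query: str, filters: dict) -> dict:
    """Assemble a hypothetical File Search request with custom metadata filters."""
    return {
        "query": query,
        "metadata_filter": {k: {"equals": v} for k, v in filters.items()},
    }

def cite(chunks: list) -> list:
    """Format page-level citations from retrieved chunks (assumed fields)."""
    return [f"{c['file']} p.{c['page']}" for c in chunks]

req = build_search_request("refund policy", {"team": "support"})
print(req["metadata_filter"]["team"])                # {'equals': 'support'}
print(cite([{"file": "handbook.pdf", "page": 12}]))  # ['handbook.pdf p.12']
```

For RAG auditing, the win is that each answer can carry a file-and-page trail rather than an opaque chunk id.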
Google added Webhooks to the Gemini API and upgraded Interactions API errors with exact field paths, bad values, enum lists, and type mismatches. The changes target long-running tasks and agent integrations where polling and opaque validation failures slow debugging.
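The story does not describe a signing scheme, so the sketch below shows the generic HMAC-SHA256 verification pattern most webhook receivers use; the secret format is an assumption, not a documented Gemini detail.

```python
import hmac
import hashlib

# Generic webhook-verification pattern (HMAC-SHA256). Whether the Gemini API
# signs payloads this way is an assumption; treat this as the usual shape of
# a receiver, not Google's spec.

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time check of an HMAC-SHA256 webhook signature."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

secret = b"whsec_example"          # placeholder secret
body = b'{"task": "done"}'
sig = hmac.new(secret, body, hashlib.sha256).hexdigest()
print(verify_signature(secret, body, sig))  # True
```

Replacing a polling loop with a verified push like this is where long-running tasks stop burning request quota.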
Google AI Studio added multi-chat threads and web search grounding to Build mode, so Gemini coding sessions can branch while pulling live docs into the workspace. The feature improves in-browser prototyping loops, but it is currently scoped to AI Studio rather than the Gemini API itself.
Gemini models can now use Grounding with Exa to search websites, technical docs, papers, people, and companies through Exa's index. That gives Gemini a new agent-style grounding path alongside Google's first-party search tooling.
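Wiring Exa grounding into a request might look like this sketch; the tool name `exa_search` and its config keys are assumptions, since the story only says Gemini can ground through Exa's index.

```python
# Hedged sketch: tool name and config keys are assumptions for illustration,
# not the documented Grounding with Exa configuration.

def exa_grounding_tool(categories=None):
    """Build a hypothetical tool entry enabling Exa-backed grounding."""
    tool = {"exa_search": {}}
    if categories:
        tool["exa_search"]["categories"] = list(categories)
    return tool

request = {
    "model": "gemini-flash",  # placeholder model id
    "contents": "Latest papers on speculative decoding",
    "tools": [exa_grounding_tool(["papers", "companies"])],
}
print(request["tools"][0])
```

The practical question is routing: when to send a query to Exa's index versus Google's first-party search tooling, which this config-level switch makes explicit.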
Google introduced Gemini Enterprise Agent Platform as the evolution of Vertex AI, with Agent Studio, shared agent management, and Model Garden access to 200-plus models. Enterprises now get one stack for building, governing, and deploying agents across Gemini and Workspace surfaces.
Google added Deep Research and Deep Research Max to the Gemini API with collaborative planning, multimodal inputs, MCP support, and native charts. The agents push cited web-plus-private-data reports into developer workflows, and Max is tuned for slower overnight runs.
Google enabled Pro and Ultra subscriptions inside AI Studio, turning consumer plans into a higher-quota bridge before direct API billing. The rollout still has quota bugs and does not yet support Workspace accounts, so check access before migrating.
OpenClaw 2026.4.15 adds Anthropic Opus 4.7, bundled Gemini TTS, bounded memory reads, and transport self-heal fixes. The release targets context and reliability issues users had been reporting this week.
Google released Gemini 3.1 Flash TTS with inline Audio Tags, multi-speaker control, and 70+ languages, and opened preview access through the Gemini API and AI Studio with rollout to Vertex AI and Google Vids. Independent evals ranked it near the top of current speech leaderboards, but it runs slower and costs more than the leading system.
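A multi-speaker script with inline tags might be assembled like the sketch below; the tag syntax and payload fields are assumptions, not the documented TTS request format.

```python
# Hedged sketch: tag syntax ("[cheerful]") and payload fields are assumptions
# for illustration, not the documented Gemini TTS request format.

def tts_request(turns):
    """Build a hypothetical multi-speaker TTS payload with inline Audio Tags."""
    script = "\n".join(f"{speaker}: {text}" for speaker, text in turns)
    return {"input": script, "speakers": sorted({s for s, _ in turns})}

req = tts_request([
    ("Ana", "[cheerful] Welcome back!"),
    ("Raj", "[whispering] Thanks."),
])
print(req["speakers"])  # ['Ana', 'Raj']
```

Keeping delivery cues inline with the dialogue, rather than in a separate voice config, is what makes tagged scripts easy to iterate on.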
Google DeepMind shipped Gemini Robotics-ER 1.6 to the Gemini API and AI Studio with better visual-spatial reasoning, multi-view success detection, and gauge reading. The model's 93% instrument-reading score targets robots that need to reason over cluttered scenes and physical constraints.
Google released Veo 3.1 Lite in Gemini API and AI Studio with 720p and 1080p output, 4-8 second clips, and text-to-video plus image-to-video support. Watch for the April 7 Veo 3.1 Fast price drop if you need lower video-generation costs.
Google launched Gemini 3.1 Flash Live in AI Studio, the API, and Gemini Live with stronger audio tool use, lower latency, and 128K context. Voice-agent teams should benchmark quality, latency, and thinking settings before switching.
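Benchmarking before switching can start with a tiny latency harness like this; `call_model` is a stub standing in for a real Live API turn, not an actual client call.

```python
import time

# Minimal latency harness for comparing voice-agent backends.
# `call_model` is a stub; swap in a real session turn to measure for real.

def call_model(prompt: str) -> str:
    time.sleep(0.01)  # stand-in for a real round trip
    return f"reply to {prompt}"

def p50_latency(fn, prompt: str, runs: int = 5) -> float:
    """Median wall-clock latency over several runs, in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(prompt)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]

print(round(p50_latency(call_model, "hello"), 3))
```

Running the same harness across thinking settings makes the latency-versus-quality trade-off concrete before any cutover.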
Google extended its OpenAI compatibility layer so existing OpenAI SDK code can call Veo 3.1 video generation and Gemini image models with only base URL and model changes. It lowers migration cost for teams that want multimodal fallbacks without rewriting client code.
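The base-URL swap is the whole migration. The compatibility endpoint below matches Google's documented OpenAI-compatible path; the model id is a placeholder from the story, not a verified identifier.

```python
# The only two knobs that change when pointing OpenAI SDK code at Gemini's
# compatibility layer: base_url and model. Model id below is a placeholder.

OPENAI_BASE = "https://api.openai.com/v1"
GEMINI_COMPAT_BASE = "https://generativelanguage.googleapis.com/v1beta/openai/"

def client_config(use_gemini: bool, model: str) -> dict:
    """Return the provider-specific settings an OpenAI-SDK client needs."""
    return {
        "base_url": GEMINI_COMPAT_BASE if use_gemini else OPENAI_BASE,
        "model": model,
    }

cfg = client_config(True, "veo-3.1")  # placeholder model id
print(cfg["base_url"].startswith("https://generativelanguage"))  # True
```

Because everything else in the client code stays identical, a fallback path can be a one-line config change rather than a rewrite.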