Breaking | May 12, 2026

Google DeepMind tests Gemini pointer demos in AI Studio with PDF bullets and recipe doubling

Google DeepMind published Gemini pointer experiments in AI Studio that act on whatever the cursor highlights, turning PDFs, tables, images, and recipes into direct actions. The shift matters because it moves assistant UX from separate chat panes into in-place pointing and voice commands.

TL;DR

Google DeepMind is testing an "AI-enabled pointer" that uses cursor position as context, so Gemini can act on the exact word, image, table, or code block under the mouse, according to GoogleDeepMind's launch thread and follow-up demos.
The demo set is about in-place actions, not a separate chat pane: GoogleDeepMind's launch thread shows PDF summarization and recipe scaling, while its scribble demo adds note-to-checklist and paused-video-to-booking flows.
Google frames speech plus pointing as a shorthand interface, where commands like "fix this" or "move that" inherit meaning from whatever the pointer is hovering over, per GoogleDeepMind's shorthand explanation.
The experiments are already exposed through Google AI Studio, and koltregaskes' pointer roundup also links the work to a separate Gemini in Chrome post.

You can try the experiments through Google AI Studio, read the official DeepMind blog post, and cross-check the Chrome angle in Google's Gemini in Chrome post. GoogleDeepMind's main demo shows the pointer turning highlighted text into a tweet draft, while its second demo applies the same pattern to handwritten notes and paused video frames.

Pointing as context

The core claim is simple: the cursor stops being a location marker and becomes a selector for model context. In GoogleDeepMind's code-block example, hovering over part of a code snippet is enough to trigger an explanation flow for that exact block.

Google's examples all follow the same interaction loop:

point at something on screen
let Gemini infer the local context
issue a short spoken or typed command
get the result inline, next to the pointer or in a side panel

That is a cleaner UX than the usual copy-paste into a chat box, which is why kimmonismus' reaction immediately framed it as a possible break from classic chatbot windows.
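The interaction loop described above can be sketched in a few lines of Python. This is only an illustration, not Google's implementation: `ScreenElement`, `element_under_pointer`, and `build_request` are hypothetical names, and the returned dictionary stands in for whatever payload the real system sends to Gemini, which the source does not describe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreenElement:
    """Hypothetical on-screen item with a bounding box (assumed, not from the source)."""
    kind: str       # e.g. "text", "code", "table", "image"
    content: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def element_under_pointer(elements, px, py) -> Optional[ScreenElement]:
    # Assumption: later elements are drawn on top, so search in reverse order.
    for element in reversed(elements):
        if element.contains(px, py):
            return element
    return None


def build_request(elements, px, py, command):
    """Pair a short command with the content under the pointer."""
    target = element_under_pointer(elements, px, py)
    if target is None:
        return None  # nothing under the pointer, so there is no local context
    return {
        "command": command,
        "target_kind": target.kind,
        "context": target.content,
    }
```

The point of the sketch is that the command itself stays short; the pointer position decides which piece of the screen travels with it.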

Shorthand actions

The interesting part is the action inventory. Across GoogleDeepMind's launch thread and follow-up, the demos show at least six concrete transforms:

PDF passage to bullet points for an email
Table to pie chart
Recipe to doubled ingredient list
Scribbled note to interactive to-do list
Paused restaurant video frame to booking link
Selected text to tweet draft

Google also says the interface is built around natural shorthand instead of full prompts. In GoogleDeepMind's shorthand explanation, the examples are short imperative phrases like "fix this," "move that," and "book this table," with the pointer supplying the missing nouns.
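One way to picture "the pointer supplying the missing nouns" is a substitution step that swaps a deictic word for a description of the pointer target. The sketch below is a toy under that assumption; `resolve_shorthand` and the target labels are hypothetical, and the real system presumably relies on the model rather than a regular expression.

```python
import re

# Deictic words that lean on the pointer for their meaning (illustrative list).
DEICTIC = re.compile(r"\b(this|that)\b", re.IGNORECASE)


def resolve_shorthand(command: str, target_label: str) -> str:
    """Replace the first 'this' or 'that' with a label for the pointer target."""
    # A function replacement avoids backslash escapes in the label being
    # interpreted by re.sub.
    return DEICTIC.sub(lambda _match: target_label, command, count=1)


print(resolve_shorthand("fix this", "the highlighted code block"))
print(resolve_shorthand("move that", "the selected image"))
```

A command without a deictic word passes through unchanged, which matches the idea that the pointer only fills in what the speaker left out.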

AI Studio and Chrome

The product status is still experimental. demishassabis called it a prototype, while GoogleDeepMind's thread sends users to Google AI Studio rather than announcing a standalone product.

One extra wrinkle is surface area. koltregaskes' roundup pairs the pointer blog with Google's separate Gemini in Chrome post, which suggests the broader story is Gemini moving closer to the browser and the cursor, not just a one-off research video.
