Computer Use
Stories, products, and related signals connected to this tag in Explore.
Stories
Google DeepMind showed an experimental pointer that lets Gemini act directly on screen elements with motion, speech, and shorthand commands. The demos move assistance from chat into live workspace control, but the feature was presented as an experiment rather than a shipped product.
UI-TARS resurfaced as an open-source desktop-control stack while Opendesk described using accessibility APIs and marked elements instead of raw pixel guesses. The approach makes computer-use workflows more repeatable, but it still depends on human-oriented interfaces.
Levelsio added a Cursor-style sidebar to Photo AI that uses Grok 4 to take photos, run packs, and remix uploads by operating the web app’s own controls. The demo stored chat history in localStorage and used on-screen context to translate requests into UI actions.
Codex App Server added a Fedora RPM package for Linux installs as users pushed Codex into browser control, 3D-print setup, and rapid game prototypes. Watch for more repeatable desktop workflows as Codex moves beyond chat-only experiments.
OpenAI updated Codex with Mac app control, background computer use, image tools, ongoing tasks, and 90+ plugins, while Remotion added a one-click skill. Agents can now work inside desktop creative apps and stacks without blocking the visible cursor.