Browser Automation
Stories, products, and related signals connected to this tag in Explore.
Stories
Google introduced WebMCP as a proposed bridge between websites and coding agents, and paired it with Chrome DevTools support for agent debugging plus Modern Web Guidance. It matters because Google is trying to standardize browser-facing agent behavior, not just model APIs.
OpenClaw 2026.5.18 shipped Grok OAuth and sidecar auth fixes, realtime Android Talk Mode, Telegram forum-topic delivery fixes, and better browser dialog handling. The release removes several auth and UI dead-ends that can stall long agent runs.
Kimi released Web Bridge, a browser extension that lets agents search, scroll, click, type, and save repeatable skills across websites. The bridge works with Kimi Code CLI plus Claude Code, Cursor, Codex, Hermes, and other agents.
Google unveiled Gemini Intelligence at the Android Show with cross-app task automation, Gemini in Chrome, Rambler voice cleanup, custom widgets, and AppFunctions. The rollout moves Gemini into core Android workflows on Pixel and Galaxy devices this summer.
Hyperbrowser shipped a CLI that exposes sandbox lifecycle, web fetch/search/crawl, and snapshotting from the terminal. The tool matters because it turns browser automation and forkable state into shell primitives for agent workflows.
OpenAI shipped a Chrome extension for Codex on macOS and Windows that can work across logged-in sites and multiple background tabs. It should speed up testing, data entry, and other web app tasks by letting Codex run more parallel browser work.
Yutori rolled out Navigator n1.5 as a web computer-use model and said it improves the tradeoff between accuracy, latency, and cost for browser tasks. The launch matters because related environment-generation work is aimed at the long-horizon web workflows that make computer-use agents expensive and brittle.
LangChain shipped a Browserbase integration that gives Deep Agents dedicated search, fetch, and browser subagents with dashboard observability. That turns web navigation into a first-class tool path for agent workflows instead of a custom one-off browser loop.
Sigma added a private AI browser mode that runs OpenClaw with local models such as Gemma 4, Qwen, and Nemotron on-device. That matters because browser automation and page context can stay local instead of being routed through a hosted agent service.
Factory launched Automated QA in Droids, adding /install-qa and /qa to drive apps like a real user and attach screenshots, traces, and logs to PRs. The feature packages browser-based regression testing as a built-in agent workflow.
Browser Use launched Browser Use Box, a 24/7 Browser Harness environment with persistent logins and Telegram control. It moves browser agents off laptops and into always-on remote sessions for long-running web tasks.
OpenClaw shipped a release that routes realtime voice queries to the full agent, defaults new users to V4 Flash, and adds coordinate clicks plus stale-lock recovery for browser automation. It also fixes Telegram, Slack, MCP session, and TTS issues, so update if those flows matter to your setup.
Cua Driver open-sourced a macOS driver that lets agents control apps in the background with multi-player and multi-cursor support. It matters because it turns background computer use from an app-specific feature into a reusable primitive that any agent loop can adopt.
Hermes Agent added Tool Gateway, bundling 300+ models with web, browser, image, terminal, and TTS tools behind one subscription. Firecrawl, Browser Use, Fal image models, and Gemini Voice shipped at launch.