AI Primer
TOPIC · 50 stories

DX Tooling

Stories about IDE features, CLI ergonomics, memory/context handling, or other day-to-day tool ergonomics that change how an engineer works (Cursor rules, Claude Code memory, Codex CLI features).

RELEASE · 1st June
Perplexity launches Search as Code in Agent API with WANDR 0.386 and Python search pipelines

Perplexity replaced one-shot search calls with Search as Code, a Python-based search runtime in its Agent API that is also now the default in Computer. The change matters because agents can batch, rank, filter, and aggregate search steps inside code, and Perplexity says the system scored 0.386 on WANDR versus 0.152 for the next system.
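
As a rough illustration of what batching, filtering, aggregating, and ranking search steps inside code can look like, here is a minimal sketch. It does not use Perplexity's Agent API: `fake_search`, the hit fields (`url`, `title`, `score`), and the thresholds are all hypothetical stand-ins so the example runs offline.

```python
from collections import Counter
from typing import Callable

# Hypothetical hit shape: a dict with "url", "title", and "score" keys.
Hit = dict


def fake_search(query: str) -> list[Hit]:
    """Stand-in for a real search call; returns canned hits (assumption)."""
    corpus = {
        "rust async runtime": [
            {"url": "https://example.com/a", "title": "Async in Rust", "score": 0.9},
            {"url": "https://example.com/b", "title": "Old forum thread", "score": 0.2},
        ],
        "tokio vs async-std": [
            {"url": "https://example.com/a", "title": "Async in Rust", "score": 0.7},
            {"url": "https://example.com/c", "title": "Runtime comparison", "score": 0.8},
        ],
    }
    return corpus.get(query, [])


def search_pipeline(
    queries: list[str],
    search: Callable[[str], list[Hit]],
    min_score: float = 0.5,
    top_k: int = 2,
) -> list[Hit]:
    # Batch: issue every query and pool the hits.
    hits = [hit for q in queries for hit in search(q)]
    # Filter: drop low-scoring hits.
    kept = [h for h in hits if h["score"] >= min_score]
    # Aggregate: merge duplicates by URL, keeping the best-scoring copy
    # and counting how many kept hits pointed at each URL.
    seen: Counter = Counter()
    best: dict[str, Hit] = {}
    for h in kept:
        seen[h["url"]] += 1
        if h["url"] not in best or h["score"] > best[h["url"]]["score"]:
            best[h["url"]] = h
    # Rank: prefer URLs surfaced more often, then higher scores.
    ranked = sorted(
        best.values(),
        key=lambda h: (seen[h["url"]], h["score"]),
        reverse=True,
    )
    return ranked[:top_k]


results = search_pipeline(["rust async runtime", "tokio vs async-std"], fake_search)
```

Swapping `fake_search` for a real search client would leave the pipeline shape unchanged, which is the point of keeping the steps in ordinary code.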

RELEASE · 1st June
Files SDK 1.7 adds resumable uploads, provider sync, and read-only clients

Files SDK 1.7 adds resumable uploads, provider-to-provider sync, read-only clients, directory-style list(), and MCP adapter hardening. The release matters for long-running transfer jobs and safer file access patterns in agent workflows.

RELEASE · 1st June
Lovable introduces TanStack Start output with SSR, server functions, and type safety

Lovable moved newly generated apps onto TanStack Start, adding route-level SSR, SSG, CSR, server functions, and stricter type-safe boundaries to its generated stack. The migration matters because framework primitives become guardrails for both generated-code quality and deploy-anywhere app behavior.

RELEASE · 1st June
Codex releases Python SDK with thread control, session resume, and sandbox access

OpenAI shipped a Python SDK and app-server support for Codex with thread creation, streamed turns, session resume, image inputs, and sandbox controls. That gives teams a supported way to embed Codex inside internal tools and automation instead of driving it only through the CLI or desktop app.

NEWS · 31st May
Codex raises weekly and hourly limits to 100% after 5 million users

OpenAI restored Codex weekly and hourly quotas across paid ChatGPT plans after Tibo Sottiaux said the product hit 5 million users. Watch for long-running QA loops, migration PRs, and remote desktop sessions that can still burn through quotas fast.

NEWS · 31st May
Coding-agent builders add shared memory, provider routing, and app launchers

Independent developers shipped sidecars that let Claude Code, Cursor, and Codex share memory, hot-swap model providers, package local projects as apps, and automate browser QA. Try these reusable tools if you want memory, routing, QA automation, and app packaging outside editor-specific features.

RELEASE · 31st May
Hermes Agent adds native Windows support with PowerShell install

Nous Research moved Hermes Agent's native Windows build out of beta with direct PowerShell installation and a dedicated guide. Windows users now have a first-party install path instead of relying on WSL or other workarounds.

RELEASE · 31st May
CopilotKit integrates Claude Agent SDK with AG-UI for React and mobile frontends

CopilotKit shipped an AG-UI integration that streams Claude Agent SDK agents into web and mobile frontends with generative UI and approval checkpoints. The adapter lets teams embed terminal-first Claude agents in React, Vue, Angular, and React Native without rewriting transport or state plumbing.

WORKFLOW · 30th May
Codex community ships /dynamic swarms, session lifecycles, and model routing

Builders added /dynamic orchestration, custom-model routing, and repo runbooks around Codex as users exposed new session lifecycle controls in the app. That makes Codex a better fit for long-running, multi-context coding work.

NEWS · 30th May
Hermes ecosystem ships Web UI, Control Room, and 14% lower read_file tokens

Builders released a chat-first Web UI and a multi-agent Control Room template around Hermes Agent, while core updates cut read_file input tokens by 14% and fixed TUI startup hangs. Use the new controls to manage local multi-agent setups while reducing routine token burn.

RELEASE · 30th May
OpenClaw releases 2026.5.28 with Opus 4.8 support and faster turns

OpenClaw 2026.5.28 added Claude Opus 4.8 and Krea support while cutting fresh-install size 52.8% and speeding both cold and warm turns. It also expanded /subagents inspection, which should make delegated runs easier to debug.

WORKFLOW · 30th May
Pi ecosystem adds /goal tasks, acceptance gates, and Lovely Dev Tools

Three independent Pi builders shipped a goal runner, contract-style subagent acceptance gates, and a new Lovely Dev Tools extension in the same window. That gives Pi users more deterministic long-running loops and cleaner local tool interfaces without starting from an empty harness.

RELEASE · 29th May
OpenAI Codex adds Windows computer use and ChatGPT mobile remote control

OpenAI added computer use to Codex on Windows and lets ChatGPT mobile steer tasks running on Windows PCs. The update extends Codex to existing Windows dev machines and adds remote review and debugging from mobile.

RELEASE · 29th May
Claude Code 2.1.158 adds auto mode for Bedrock, Vertex, and Foundry

Anthropic followed Claude Code 2.1.157 with 2.1.158, enabling auto mode on Bedrock, Vertex, and Foundry for Opus 4.7 and 4.8. The paired releases also add local plugin scaffolding and auto-load plus fixes for image handling and sandbox permission prompts.

RELEASE · 29th May
Codex iOS adds /side conversations, diff summaries, and Spotlight shortcuts

Codex on iOS now supports side conversations, end-of-turn diff summaries, archived remote threads, model switching, and Spotlight or Shortcuts hooks. The update brings more desktop-style task steering and change review to mobile sessions.

RELEASE · 29th May
Vercel Sandbox adds Docker support with persistent images and isolated container runs

Vercel Sandbox can now build and run Docker containers, persist images and installs across sessions, and host databases or full apps inside the sandbox. That broadens what coding agents and preview environments can validate without leaving Vercel.

RELEASE · 29th May
Gemini API adds Managed Agents with sandboxed Linux, web access, and file I/O

Gemini Managed Agents can spin up a sandboxed Linux environment with code execution, web access, and file I/O from one API call, and early examples now include W&B and LlamaIndex workflows. That gives builders a higher-level runtime for long tasks while third-party templates start to define the first production use cases.

RELEASE · 29th May
llama.cpp launches official site with one-line installer and unified `llama` CLI

llama.cpp now has an official website and a single-line installer that provides one `llama` entrypoint for running, serving, and agent integrations. The packaging change simplifies local setup while reusing GGUF models already on disk.

RELEASE · 29th May
Cursor adds auto-review mode with classifier subagent and fewer approval prompts

Cursor shipped auto-review mode, letting agents run more tool calls with fewer approval prompts and sending unsafe or unsandboxed actions to a classifier subagent. The change lowers review friction while keeping a separate path for higher-risk calls.

NEWS · 28th May
Agent tools add Claude Opus 4.8 to Cursor, Warp, OpenRouter, and Perplexity on day one

Independent IDEs, gateways, and agent runtimes rolled out Claude Opus 4.8 within hours of launch, including Cursor, Warp, OpenRouter, and Perplexity. That matters because teams can benchmark or swap the model into existing workflows without waiting for connector lag.

RELEASE · 28th May
OpenAI updates GPT-5.5 Instant with writing blocks and less bullet-heavy replies

OpenAI rolled a new GPT-5.5 Instant into ChatGPT and the API with less bullet-heavy output, better pacing, and higher multilingual quality. The update also replaces Canvas in GPT-5.5 Instant and Thinking with in-chat writing and code blocks, so users should migrate workflows while legacy models still keep Canvas temporarily.

RELEASE · 28th May
Vercel CLI ships experimental native binaries with ~80% smaller footprint

Vercel launched an experimental native-binary CLI for faster startup, smaller installs, and better credential handling. Native packaging is a prerequisite for signed binaries and OS-backed secret storage, both of which help defend against infostealers and supply-chain credential theft.

RELEASE · 28th May
Linear launches Diffs with AI-guided PR reviews and realtime updates

Linear launched Diffs, a PR review workflow inside Linear with realtime updates, threaded comments, focused notifications, and beta AI guidance. It keeps review closer to issue tracking, though teams still need GitHub for some PR discovery.

NEWS · 28th May
Cursor reports input tokens make up 70% of coding-agent costs

Cursor's Developer Habits Report says input tokens account for about 70% of price-equivalent coding-agent costs as agents read more context. The report also says auto-accepted code is up 5x since the start of the year, so teams should watch context usage and review rates.
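
To check how input tokens dominate spend in your own usage, a back-of-the-envelope calculation like the one below can help. The token counts and per-million-token prices are hypothetical, chosen only so the result lands on a 70% input share; they are not figures from Cursor's report.

```python
def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of a token volume at a given price per million tokens."""
    return tokens * price_per_million / 1_000_000


def input_cost_share(
    input_tokens: int,
    output_tokens: int,
    input_price: float,
    output_price: float,
) -> float:
    """Fraction of total spend attributable to input tokens."""
    input_cost = cost_usd(input_tokens, input_price)
    output_cost = cost_usd(output_tokens, output_price)
    return input_cost / (input_cost + output_cost)


# Hypothetical workload: an agent reads far more context than it writes.
share = input_cost_share(
    input_tokens=7_000_000,   # assumed volume
    output_tokens=300_000,    # assumed volume
    input_price=1.0,          # assumed USD per million input tokens
    output_price=10.0,        # assumed USD per million output tokens
)
print(f"input share of cost: {share:.0%}")
```

Even with output priced ten times higher per token in this made-up example, the sheer volume of context read keeps input as the larger line item.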

NEWS · 27th May
Codex removes GPT-5.2 and GPT-5.3-Codex on June 2

OpenAI said ChatGPT-linked Codex will drop GPT-5.2 and GPT-5.3-Codex on June 2, with GPT-5.5 becoming the default frontier model for free users. The API versions stay available, but OpenAI is trimming the in-product model list to manage its compute fleet.

NEWS · 26th May
Grok Build Beta adds Toad and Kilo Code integrations plus a web Build tab

xAI broadened Grok Build Beta while Toad and Kilo Code shipped direct support and published concrete build demos. That matters because Grok Build is moving from a standalone beta into terminal, editor, and web workflows engineers can actually wire into daily use.

RELEASE · 26th May
Warp Agent adds OpenRouter URLs and /model aliases for custom endpoints

Warp now lets agents connect directly to an OpenRouter endpoint and switch providers through remembered model aliases. The change reduces endpoint setup friction for teams routing across hosted models inside Warp Agent.

NEWS · 26th May
Firecrawl integrates into Vercel Marketplace with scraping, search, and dynamic-site access

Firecrawl is now available through Vercel Marketplace and Agent Marketplace for apps and agents that need live web data. The integration reduces setup friction for teams adding scraping, search, and structured retrieval to deployed AI workflows.

RELEASE · 26th May
Weights & Biases launches MCP server with 20 tools for schema-first queries

Weights & Biases released an MCP server that exposes experiment data to Claude Code, Cursor, Codex, Gemini CLI, and Le Chat. The schema-first design helps agents inspect available metrics before pulling rows, which can prevent preview runs from overflowing context windows.

WORKFLOW · 1w ago
Developers compare 128GB workstations, M5 Max laptops, and 20/80 local-cloud agent splits

Developers published new local-first agent setups spanning 128GB workstations, M5 Max laptops, local-model checkers, and 20/80 local-cloud splits. The pattern matters because teams are moving extraction, coordination, and offline tasks off frontier APIs while keeping harder reasoning in the cloud.

WORKFLOW · 1w ago
Developers ship Chrome MCP, repo-graph search, and token compression for Claude Code and Codex

Independent developers released browser-control MCP tooling, repo-context graphing and packaging utilities, and token-compression helpers for coding agents. The cluster matters because agent workflows are now adding browser control, context packing, and cost controls as external infrastructure instead of waiting on raw model upgrades alone.

NEWS · 1w ago
Google AI Studio reports 250,000 native Android apps in its first week

Google said AI Studio users created more than 250,000 native Android apps in the first week after app generation launched. The number matters because it is the first adoption signal for Google's free no-code Android builder and device-testing workflow.

RELEASE · 1w ago
Files SDK 1.6 adds transfer() streaming and byte-range downloads

Files SDK 1.6 added cross-provider transfer() streaming and byte-range downloads for partial reads. The release matters because large-file migrations, resumable flows, and media-style UIs no longer need full-file buffering.
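
The idea behind partial reads and streaming transfers can be sketched with the standard library alone. This is not the Files SDK API: `range_header`, `read_range`, and `stream_copy` are hypothetical helper names, and in-memory `io.BytesIO` objects stand in for remote providers.

```python
import io
from typing import BinaryIO


def range_header(start: int, end: int) -> dict[str, str]:
    """Build an HTTP Range header for an inclusive byte range."""
    return {"Range": f"bytes={start}-{end}"}


def read_range(stream: BinaryIO, start: int, end: int) -> bytes:
    """Read bytes start..end (inclusive) without loading the whole stream."""
    stream.seek(start)
    return stream.read(end - start + 1)


def stream_copy(src: BinaryIO, dst: BinaryIO, chunk_size: int = 4) -> int:
    """Copy src to dst one chunk at a time; returns total bytes copied."""
    total = 0
    while chunk := src.read(chunk_size):
        dst.write(chunk)
        total += len(chunk)
    return total


# In-memory stand-ins for a source object and a destination provider.
source = io.BytesIO(b"0123456789abcdef")
partial = read_range(source, 4, 7)  # only four bytes are read

source.seek(0)
dest = io.BytesIO()
copied = stream_copy(source, dest)
```

Because `stream_copy` holds only one chunk in memory at a time, the same loop shape works for files far larger than available RAM; a real transfer would use a much larger `chunk_size` than the toy value here.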

WORKFLOW · 1w ago
Agent Skills ecosystem ships handoff docs, htmx v4 packs, and Project Think support

Independent builders published reusable skills infrastructure across coding agents, including Project Think preview support, handoff docs, and an htmx v4 skill pack. That matters because skills are starting to work like portable workflow units instead of one-off prompt snippets inside a single tool.

NEWS · 1w ago
Grok Build opens CLI access to SuperGrok and X Premium+ users

Rollout posts say Grok Build CLI is reaching SuperGrok and X Premium+ users beyond the earlier higher tier. That broadens access to xAI's command-line agent and X search client without a new API launch.

RELEASE · 1w ago
Datasette 1.0a30 adds a slash Jump To menu and plugin hook

Datasette 1.0a30 introduced a slash-triggered Jump To menu plus a hook for plugin-supplied search items. Simon Willison used it in datasette-agent 0.1a4 to start agent chats from the same menu, so plugin authors can wire in their own actions.

WORKFLOW · 1w ago
Agent Skills supports Codex, Cursor, Gemini CLI, and VS Code through new libraries and plugins

New guides, plugins, and reusable libraries show the Agent Skills format moving beyond Claude Code into multiple coding-agent clients and runtimes. That matters because workflows are becoming portable artifacts instead of one-off prompts tied to a single harness.

RELEASE · 1w ago
OpenClaw releases 2026.5.22 with ~5ms /models startup

OpenClaw 2026.5.22 shipped leaner gateway and model startup paths, bringing /models to about 5 ms, while also adding locked dependency shrinkwraps and safer Windows rollbacks. That matters because it targets both startup latency and release-install trust for local agent operators.

RELEASE · 1w ago
Pi releases v0.75.5 with collapsed read cards and Ctrl+O expansion

Pi v0.75.5 now shows only the read line in collapsed tool cards while keeping the full inspected range behind Ctrl+O. That matters because long read outputs were obscuring edits and steering signals in collaborative coding sessions.

RELEASE · 1w ago
Grok Build 0.1.218 ships Ctrl+X help fixes

Grok Build 0.1.218 shipped shortcut and help fixes, while early testers reported strong terminal UX but missing long-run control, browser use, and reliable self-verification. That matters because xAI is already competitive on TUI ergonomics even as core agent controls remain incomplete.

WORKFLOW · 1w ago
Codex users report iPhone simulator bug-bashes, Appshots form fills, and locked-Mac runs

Two days after Codex added locked-Mac control and Appshots, users posted end-to-end iPhone simulator debugging, Safari form-filling, and remote-control workflows. That matters because the feature is moving from launch copy into concrete computer-use tasks that can replace manual QA and repetitive UI work.

WORKFLOW · 1w ago
Codex users report better compaction and Colab control after v0.133.0

Developers say Codex v0.133.0 improved compaction, remote-control workflows, and Chrome-driven Colab runs after `/goal` became default. The same update window also brought easier skill discovery and new diff options, though some users saw approval-pause regressions in full-access mode.

RELEASE · 1w ago
Hermes Agent adds Bitwarden Secrets Manager for key rotation and team access

Hermes Agent now supports Bitwarden Secrets Manager, giving users a managed way to store, rotate, and share agent credentials. That matters because secret handling becomes a real operational problem once agents move beyond solo local setups.

RELEASE · 1w ago
Antigravity updates Gemini 3.5 Flash with permanent 3x quotas and 2x context

A day after Antigravity raised weekly Gemini quotas, the team said the 3x increase is permanent and doubled Gemini 3.5 Flash max context in AGY. The same update batch also clarified the IDE split and shipped Windows fixes, changing day-to-day limits and workflow behavior for developers.

RELEASE · 1w ago
Letta Code adds embedded local server with Ollama and LM Studio support

Letta Code can now run fully locally with an embedded server, removing the login and Docker requirements while keeping memory sync via `/memory-repository`. That gives developers a local-first agent harness with optional Ollama and LM Studio support instead of forcing everything through Letta’s hosted API.

RELEASE · 1w ago
Cursor releases Composer 2.5 SDK for Python and TypeScript

Cursor opened a Python and TypeScript SDK for building custom agents on Composer 2.5 and paired the launch with a 90% usage discount for the long weekend. Artificial Analysis data still shows Composer 2.5 leading on cost per task, making the SDK launch an efficiency play for builders.

RELEASE · 1w ago
Warp adds BYOK to Warp Agent with OpenAI-compatible endpoints

Warp Agent now accepts user-supplied OpenAI, Anthropic, and Gemini keys plus OpenAI-compatible endpoints such as OpenRouter and DeepSeek. The change removes the paid-plan requirement for inference access and gives terminal users more routing options.

RELEASE · 1w ago
OpenAI updates Codex with locked-Mac control and Appshots

OpenAI shipped a Codex update that lets the mobile app control a locked Mac, adds Appshots for screen context, and graduates /goal. It also adds browser annotation tools, team plugin sharing, and expanded analytics for business users.

RELEASE · 1w ago
Cognition adds Windows VMs to Devin for MSBuild, IIS, and .NET migrations

Cognition added native Windows VMs to Devin so it can build, run, and test Windows applications with MSBuild, IIS, PowerShell, and SQL Server. The rollout lets Devin handle enterprise codebases where Linux sandboxes are not enough.

RELEASE · 1w ago
Datasette Agent releases 0.1a3 with SQL chat, charts, and Fly sandbox plugins

Simon Willison shipped the first Datasette Agent release and companion chart and Fly sandbox plugins for conversational SQLite workflows. The stack combines live SQL inspection, chart rendering, and optional command execution inside an extensible local data assistant.

© 2026 AI Primer. All rights reserved.