TOPIC 50 stories

Developer Experience

How tools and workflows affect engineers using AI daily.

Stories

Claude Code updates hidden-features guide with /loop, hooks, and /batch

Claude Code lead Boris Cherny published a feature guide covering mobile handoff, /teleport, /remote-control, scheduled loops, hooks, Dispatch, browser testing, worktrees, /batch, and CLI flags like --bare. The guide shows Claude Code being used as a persistent automation surface, so teams can evaluate whether to lean on remote sessions and repo-scale fan-out.

NEWS 1d ago

LiteLLM guidance adds source-to-release checks after PyPI backdoor

Fresh discussion after the compromised LiteLLM wheels focused on two concrete fixes: publicly verifiable source-to-release correspondence and stronger separation of agent runtimes, credentials, and network egress. The incident matters because the attack path ran through CI tooling and install-time execution, so teams should harden build provenance and runtime isolation.

NEWS 1d ago

OpenCode supports zero-retention for all Go models

OpenCode said all Go models now run under zero-retention agreements, then clarified that hosted routes use the same providers customers can reach directly and explained why higher subscription tiers are risky to price. The clarification matters for users debating telemetry, proxying, and how local the web UI really is, so teams should verify their data path.

NEWS 1d ago

GitHub Copilot updates Free and Pro training opt-out before Apr. 24

Unless Free, Pro, and Pro+ users opt out by Apr. 24, GitHub will use Copilot interaction data for model training rather than excluding it by default. The discussion focused on shared-repo edge cases, since prompts, accepted outputs, filenames, and navigation traces can cross team boundaries even when repo data at rest is excluded.

NEWS 1d ago

Jai clarifies .jai persistence and selective home-directory mounts

HN follow-up on Stanford's jai sandbox emphasized that agent changes persist under .jai by default, with explicit mounts back into the real home directory when needed. That clarification matters for teams comparing dev containers, bubblewrap, podman, and LXC, so they can decide how much host state an agent should be allowed to keep or touch.

WORKFLOW 1d ago

Claude Code supports Figma MCP sketch-to-mockup workflow

Users showed a flow where a rough Figma sketch and file link are handed to Claude Code, which fleshes out styled mockups and extra components inside Figma. The handoff keeps UI iteration in the design tool before code generation, so teams can keep design review upstream of frontend implementation.

RELEASE 2d ago

Hermes Agent ships v0.5.0 with 400+ Portal models and Exa support

Hermes Agent v0.5.0 adds 400+ models via Nous Portal, Hugging Face access, Exa support, GPT-5.4 behavior tweaks, and a published changelog. The release broadens provider coverage and hardens the runtime without changing the terminal-first workflow.

WORKFLOW 2d ago

Claude Code guides compare `.claude/` commands, agents, and global rules

Two new guides map how Claude Code teams are using `.claude/`, `CLAUDE.md`, commands, agents, skills, and global rules. The overlap matters because commenters favor short instructions and a small number of repeatable guardrails over larger prompt stacks.

WORKFLOW 2d ago

FutureSearch reports 72-minute response to LiteLLM .pth malware

A published transcript shows a 72-minute response to the malicious LiteLLM wheel, from spotting a frozen laptop to reporting the `.pth` credential stealer and posting disclosure. It turns the compromise into a concrete incident-response playbook for Python AI tooling.
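
For readers who want to check their own machines, here is a minimal sketch of a `.pth` audit. It relies on one documented behavior of Python's `site` module: a `.pth` line that starts with `import` followed by a space or tab is executed at interpreter startup. The function names are illustrative, and this is not the tooling used in the transcript.

```python
import site
import sysconfig
from pathlib import Path


def executable_pth_lines(directory):
    """Return (file name, line) pairs for .pth lines that Python would execute."""
    hits = []
    for pth in sorted(Path(directory).glob("*.pth")):
        for line in pth.read_text(errors="replace").splitlines():
            # site.py executes lines starting with "import" plus a space or tab;
            # other non-comment lines are treated as paths to add to sys.path.
            if line.startswith(("import ", "import\t")):
                hits.append((pth.name, line.strip()))
    return hits


def site_directories():
    """Collect the site-packages directories of the current interpreter."""
    paths = sysconfig.get_paths()
    dirs = {paths["purelib"], paths["platlib"], site.getusersitepackages()}
    return sorted(d for d in dirs if Path(d).is_dir())


if __name__ == "__main__":
    for directory in site_directories():
        for name, line in executable_pth_lines(directory):
            print(f"{directory}/{name}: {line[:120]}")
```

A hit is not proof of compromise: some legitimate packages ship executable `.pth` lines, so each result still needs a manual look.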

NEWS 2d ago

CopilotKit introduces AG-UI as an event-driven protocol for agent UIs

CopilotKit published a walkthrough of AG-UI, an event-driven protocol that standardizes how agent frameworks stream text, tool calls, lifecycle events, and state to applications. The protocol aims to let teams swap agent backends without rewriting the UI contract.
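
To make the "UI contract" idea concrete, here is a rough sketch of a client that folds a stream of typed events into UI state. The event names and fields (`run_started`, `text_delta`, `tool_call`, `state_update`, `run_finished`) are hypothetical placeholders chosen for illustration, not AG-UI's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class UiState:
    text: str = ""
    tool_calls: list = field(default_factory=list)
    shared_state: dict = field(default_factory=dict)
    running: bool = False


def apply_event(state, event):
    """Fold one streamed event into the UI state; unknown types are ignored."""
    kind = event["type"]  # hypothetical event names, not the real protocol
    if kind == "run_started":
        state.running = True
    elif kind == "text_delta":
        state.text += event["delta"]
    elif kind == "tool_call":
        state.tool_calls.append((event["name"], event["args"]))
    elif kind == "state_update":
        state.shared_state.update(event["patch"])
    elif kind == "run_finished":
        state.running = False
    return state


def render(events):
    """Apply a whole event stream, as any backend might emit it."""
    state = UiState()
    for event in events:
        apply_event(state, event)
    return state
```

Because the UI only depends on the event shapes, any backend that emits the same events could drive it unchanged, which is the swap-ability the walkthrough describes.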

RELEASE 2d ago

tinybox ships red v2 with 4x 9070 XT and 64 GB GPU RAM for $12,000

tiny corp is shipping tinybox red v2 at $12,000 with four 9070 XT GPUs and 64 GB of GPU memory, alongside higher-end Blackwell systems. Buyers are weighing the bundled tinygrad stack against DIY rigs, model-fit limits, and cloud economics.

NEWS 3d ago

GitHub updates Copilot policy: private-repo interactions train models by default on Apr. 24

GitHub said Copilot Free, Pro, and Pro+ interaction data will train models by default from Apr. 24 unless users opt out, while private repo content at rest stays excluded. Teams should review per-user enforcement, enterprise coverage, and repo privacy settings before the change lands.

NEWS 3d ago

Claude Code adds scheduled cloud tasks for PR reviews and `/schedule` runs

Claude Code can now run recurring prompts and background pull-request work on Anthropic-managed cloud environments from the web, desktop, or `/schedule`. That makes long-running repo tasks less dependent on a local machine, but users report task caps and restricted egress.

WORKFLOW 3d ago

Codex adds use-case gallery and official plugins for shadcn/ui and Box

OpenAI published a Codex use-case gallery with one-click workflows, and shadcn/ui and Box shipped official plugins. Teams can now install reusable app and web workflows directly instead of wiring each integration by hand.

RELEASE 3d ago

Composio launches Universal CLI for terminal-native tool access

Composio shipped Universal CLI as a shell-first interface to its integrations, moving install, search, and agent workflows out of MCP setup. The release targets users who want simpler agent tool access after complaints that MCP stacks are harder to install, slower, and less stable.

RELEASE 3d ago

OpenCode launches open-source coding agent with `opencode serve` and multi-backend web UI

OpenCode shipped terminal, desktop, and `opencode serve` workflows for an open-source coding agent with LSP support, plugins, and more than 75 providers. Users should look at the multi-backend web sessions, IPC plugins, and sandboxed local setup as the main differentiators.

RELEASE 3d ago

Hermes Agent adds Hugging Face provider with 28 curated models

Hermes Agent now treats Hugging Face as a first-class inference provider and surfaces 28 curated models in its picker, plus a custom path to the broader catalog. That broadens model choice for a persistent local agent workflow without requiring users to wire a provider manually.

RELEASE 4d ago

Codex launches plugins for Slack, Figma, Gmail, and Google Drive

OpenAI rolled out Codex plugins across the app, CLI, and IDE extensions, with app auth, reusable skills, and optional MCP servers. Teams should test plugin-backed workflows and permission models before broad rollout.

RELEASE 4d ago

Cline launches Kanban with worktree-linked parallel CLI agents

Cline launched Kanban, a local multi-agent board that runs Claude, Codex, and Cline CLI tasks in isolated worktrees with dependency chains and diffs. Teams can use it as a visual control layer for parallel coding agents on repo chores that split cleanly.
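
As a rough illustration of what "dependency chains" mean for parallel agents, the sketch below groups tasks into batches that can run at the same time once their prerequisites are done. It uses Python's standard `graphlib` module; the task names are hypothetical, and this is not how Kanban itself is implemented.

```python
from graphlib import TopologicalSorter


def parallel_batches(dependencies):
    """Group tasks into batches; tasks in one batch have no unmet dependencies."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the chain has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished so dependents unlock
    return batches


if __name__ == "__main__":
    # Hypothetical repo chores: each key depends on the tasks in its set.
    tasks = {
        "update-deps": set(),
        "fix-lint": set(),
        "regenerate-types": {"update-deps"},
        "run-tests": {"fix-lint", "regenerate-types"},
    }
    for number, batch in enumerate(parallel_batches(tasks), start=1):
        print(number, batch)
```

Each batch is a set of chores that "split cleanly": they could each go to a separate agent and worktree without waiting on one another.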

RELEASE 4d ago

Claude Code 2.1.85 releases with conditional hooks and /compact overflow fix

Claude Code 2.1.85 adds `if` filters for hooks, new MCP header env vars, transcript timestamps, and fixes for /compact overflow, remote leaks, auth flow, and terminal bugs. Upgrade if your workflow depends on hooks or long sessions, and use the new cloud auto-fix flow for unattended PR cleanup.

NEWS 4d ago

Every launches Plus One, a hosted OpenClaw for Slack

Every opened Plus One, a hosted OpenClaw that lives in Slack, comes preloaded with internal skills, and works with a ChatGPT subscription or other API keys. It lowers the ops burden of deploying agent coworkers, so teams can test packaged agents before building their own stack.

RELEASE 4d ago

Chroma launches Context-1, a 20B search agent with Apache 2.0 weights

Chroma released Context-1, a 20B search agent it says pushes the speed-cost-accuracy frontier for agentic search, with open weights on Hugging Face. Benchmark it against your current search stack before wiring it into production.

RELEASE 4d ago

Rork launches Max Publishing to auto-fill App Store listings and submit builds

Rork added Max Publishing to generate icons, screenshots, listing text, review metadata, and submission steps for App Store releases, and also shipped an App Store MCP. Use it first on non-critical apps and keep a manual review gate.

RELEASE 4d ago

Imbue launches Latchkey: local agents call HTTP APIs without exposing tokens

Imbue released Latchkey, a library that is prepended to ordinary curl calls so local agents can use SaaS and internal APIs while credentials stay on the developer machine. Try it where agents need many HTTP integrations but should not see raw secrets.

RELEASE 5d ago

Claude Code releases 2.1.84: PowerShell preview, task hooks, idle-return clearing

Claude Code 2.1.84 adds an opt-in PowerShell tool, new task and worktree hooks, safer MCP limits, and better startup and prompt-cache behavior. Anthropic also documented auto mode’s action classifier and added iMessage as a channel, so teams should review permissions and remote-control workflows.

RELEASE 5d ago

OpenCode adds remote sandboxes and syncs agent state across devices

OpenCode is adding remote sandboxes, synced state across laptop, server, and cloud, and more product surface inside its plugin system. That makes long-running off-laptop workflows more practical, but operators should still review telemetry, sandbox, and exposure defaults.

NEWS 5d ago

LiteLLM reports credential-stealing code in 1.82.7 and 1.82.8

Malicious LiteLLM 1.82.7 and 1.82.8 releases executed .pth startup code to steal credentials and were quarantined after disclosure. Rotate secrets, audit transitive AI-tooling dependencies, and add package-age controls before letting agents install packages autonomously.
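
Two of those steps are easy to automate. The sketch below checks whether the installed `litellm` matches one of the two reported versions and shows a simple package-age gate. The function names are illustrative and the seven-day default is an assumed threshold, not a recommendation from the report; the upload timestamp would have to come from your package index or lockfile tooling.

```python
from datetime import datetime, timedelta, timezone
from importlib import metadata

# Versions named in the report above.
COMPROMISED_LITELLM = {"1.82.7", "1.82.8"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_compromised(version):
    """True if the version is one of the reported malicious releases."""
    return version in COMPROMISED_LITELLM


def old_enough(uploaded_at, min_age_days=7, now=None):
    """Package-age gate: accept a release only after it has been public a while.

    min_age_days=7 is an assumed default; uploaded_at must be timezone-aware.
    """
    now = now or datetime.now(timezone.utc)
    return now - uploaded_at >= timedelta(days=min_age_days)


if __name__ == "__main__":
    version = installed_version("litellm")
    if version is None:
        print("litellm is not installed")
    elif is_compromised(version):
        print(f"litellm {version} is a reported bad release: remove it and rotate secrets")
    else:
        print(f"litellm {version} is not one of the reported releases")
```

A clean version check does not rule out exposure: if a bad release was installed at any point, its startup code may already have run, so secrets should still be rotated.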

RELEASE 5d ago

Google launches Lyria 3 Pro API at $0.08 per song

Lyria 3 Pro and Lyria 3 Clip are now in Gemini API and AI Studio, with Lyria 3 Pro priced at $0.08 per song and able to structure tracks into verses and choruses. That gives developers a clearer path to longer-form music features, with watermarking and prompt design built in.

RELEASE 5d ago

Data Agent Benchmark launches with 54 queries and 38% pass@1

Data Agent Benchmark launches with 54 enterprise-style queries across 12 datasets, nine domains, and four database systems, while the best frontier model reaches only 38% pass@1. It gives teams a stronger eval for cross-database agents than text-to-SQL-only benchmarks.
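
For readers unfamiliar with the metric, here is a short sketch of how pass@k is commonly estimated: for each query, sample n attempts, count the c correct ones, and compute the chance that at least one of k randomly drawn attempts is correct. Whether this benchmark scores exactly this way is an assumption. With k = 1 the formula reduces to c / n, so 38% pass@1 over 54 queries works out to roughly 20 or 21 queries solved per attempt on average.

```python
from math import comb


def pass_at_k(n, c, k):
    """Estimate pass@k for one task from n sampled attempts, c of them correct."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct attempt
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k=1):
    """Average per-task pass@k over (n, c) pairs, one pair per query."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

The inputs here are per-query attempt counts you would collect from your own runs; no benchmark data is included.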

RELEASE 5d ago

Firecrawl launches /interact for natural-language browser actions

Firecrawl’s new /interact endpoint lets agents click, fill, scroll, and keep live browser sessions right after /scrape. It shortens the path from page extraction to web automation, but Playwright remains the better fit when you need deterministic full-session control.

NEWS 5d ago

Claude adds Figma, Canva, and Amplitude tools to mobile apps

Claude mobile apps now expose work tools like Figma, Canva, and Amplitude, letting users inspect designs, slides, and dashboards from a phone. Anthropic is turning Claude into a mobile front end for workplace agents, so teams should review auth and data-boundary rules.

RELEASE 5d ago

Expect launches CLI to QA apps in a real browser and record bug videos

Expect wraps browser QA for Claude Code, Codex, or Cursor into a CLI that records bug videos and feeds failures back into a fix loop. It gives coding agents a tighter UI validation cycle without requiring a custom browser harness.

NEWS 5d ago

GitHub updates Copilot policy to train on Free, Pro, and Pro+ interactions

GitHub will start using Copilot interaction data from Free, Pro, and Pro+ tiers for model training unless users opt out, while Business and Enterprise remain excluded. Engineers should recheck privacy settings and keep personal and company repository usage boundaries explicit.

RELEASE 1w ago

Cursor adds Instant Grep: 13ms regex search across millions of files

Cursor shipped Instant Grep, a local regex index built from n-grams, inverted indexes, and Bloom filters that drops large-repo searches from seconds to milliseconds. Faster candidate retrieval shortens the coding-agent loop, especially when ripgrep-style scans become the bottleneck.
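
The core idea behind n-gram indexes can be shown in a few lines. The toy sketch below builds an inverted index from trigrams to files, uses it to shortlist candidates, and runs the real regex only on that shortlist. It is a simplification under stated assumptions: the caller supplies a literal the regex must contain (a real engine would extract it from the pattern), Bloom filters are omitted, and none of this reflects Cursor's actual implementation.

```python
import re
from collections import defaultdict


def trigrams(text):
    """All overlapping 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> set of file names
        self.files = {}                   # file name -> content

    def add(self, name, content):
        self.files[name] = content
        for gram in trigrams(content):
            self.postings[gram].add(name)

    def candidates(self, literal):
        """Files containing every trigram of a literal the regex must match."""
        grams = trigrams(literal)
        if not grams:
            return set(self.files)  # literal too short to filter; scan everything
        return set.intersection(*(self.postings.get(g, set()) for g in grams))

    def search(self, pattern, required_literal):
        """Run the regex only on files that survive the trigram filter."""
        regex = re.compile(pattern)
        return sorted(
            name for name in self.candidates(required_literal)
            if regex.search(self.files[name])
        )
```

The filter can return false positives (a file may hold all the trigrams without the full literal), which is why the regex still runs on every candidate; it never drops a true match.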

NEWS 1w ago

ChatGPT adds Library tab for reusable file uploads across conversations

ChatGPT now saves uploaded and generated files into an account-level Library that can be reused across conversations from the web sidebar or recent-files picker. It removes repetitive re-uploading and makes past PDFs, spreadsheets, and images part of a persistent working context.

NEWS 1w ago

Claude Code adds macOS computer use with app control and permission prompts

Claude can now drive macOS apps, browser tabs, the keyboard, and the mouse from Claude Cowork and Claude Code, with permission prompts when it needs direct screen access. That makes legacy desktop workflows automatable, and Anthropic is pairing the push with more background-task support for longer agent loops.

NEWS 1w ago

PlayerZero launches AI production engineer and claims 92.6% accuracy on test cases

PlayerZero launched an AI production engineer and claims its world model can simulate failures before release, trace incidents to exact PRs, and beat existing tools on real production test cases. If the claimed 92.6% accuracy holds, the interesting shift is from code generation to debugging, testing, and observability after code ships.

RELEASE 1w ago

Agent Computer launches cloud computers in under 0.5s with SSH access

Agent Computer launched cloud desktops that boot in under half a second and expose persistent disks, shared credentials, SSH access, and ACP control for agents. It gives coding agents a faster place to run tools and reuse auth, but teams still need to design safe session and credential boundaries.

RELEASE 1w ago

Vercel Emulate adds programmatic API for creating, resetting, and closing local emulators

Vercel Emulate added a programmatic API for creating, resetting, and closing local GitHub, Vercel, and Google emulators inside automated tests. That makes deterministic integration tests easier to wire into CI and agent loops without manual setup.

RELEASE 1w ago

Claude Code tests /init interview flow with CLAUDE_CODE_NEW_INIT=1

Anthropic is testing a new /init flow that interviews users and configures `CLAUDE.md`, hooks, and skills in new or existing repos. Try it in a sandbox repo, then watch for skills behavior differences between chat and web surfaces.

RELEASE 1w ago

CopilotKit adds useAgentContext and useFrontendTool for UI-aware agents

CopilotKit shipped hooks that let agents inspect app state and call frontend actions, then paired them with Shadify for ShadCN-based UI composition. It gives embedded agents a cleaner path from chat to in-app behavior.

RELEASE 1w ago

Flash-MoE claims 4.4 tokens/sec on Qwen3.5-397B on 48GB M3 Max

A pure C and Metal engine streams 209GB of MoE weights from SSD and reports tool-calling support in 4-bit mode on a laptop-class Mac. It is a concrete benchmark for teams exploring expert streaming, quantization, and page-cache tricks on consumer hardware.

RELEASE 1w ago

KittenTTS releases 15M-to-80M ONNX voice models for CPU deployment

KittenTTS released nano, micro, and mini ONNX TTS models sized for CPU-first deployment instead of GPU-heavy stacks. Voice-agent builders should benchmark both dependency weight and real-time latency before treating tiny size as enough.

RELEASE 1w ago

Vercel Labs releases Emulate for stateful GitHub, Vercel, and Google API tests

Vercel Labs published a stateful service emulator for GitHub, Vercel, and Google integrations instead of relying on brittle mocks. It is useful when agents or CI need deterministic auth and third-party API flows in local or sandboxed runs.

WORKFLOW 1w ago

Claude tests 25 Capacitor screens daily through Android CDP and iOS accessibility

A solo developer wired Claude into emulators and simulators to inspect 25 Capacitor screens daily and file bugs across web, Android, and iOS. The writeup is a solid template for unattended QA, but it also shows where iOS tooling and agent reliability still crack.

RELEASE 1w ago

Claude Code adds scheduled cloud tasks on remote machines with MCP access

Claude Code can now run scheduled cloud tasks against remote repos and MCP-connected tools, while Anthropic is also pushing reusable agent SDK and skill controls. Test remote automation paths carefully, because messaging and multi-repo edge cases still surface in practice.

NEWS 1w ago

Cursor Composer 2 ranks #2 on Next.js evals, ahead of Opus and Gemini

Vercel's Next.js evals place Composer 2 second, ahead of Opus and Gemini despite the recent Kimi-base controversy. The result matters because it separates base-model branding from measured task performance on a real framework workflow.

WORKFLOW 1w ago

OpenCode adds CloudShell workflow that reuses AWS auth and Bedrock models

OpenCode can now run from AWS CloudShell via npx and inherit AWS auth plus Bedrock models; the same update also brought Firecrawl, India billing, and heap-snapshot debugging. It is becoming a real ops workflow, not just a local terminal toy.

RELEASE 1w ago

OpenAI adds container pools to Responses API for 10x faster agent spin-up

OpenAI says Responses API requests can reuse warm containers for skills, shell, and code interpreter, cutting startup times by about 10x. Faster execution matters more now that Codex is spreading to free users, students, and subagent-heavy workflows.

RELEASE 1w ago

Conductor adds plan mode, fast mode, and skills for Codex workflows

Conductor now bundles plan mode, fast mode, skills, repo quick start, and an experimental merge-conflict UI around Codex sessions. Try it if you want a higher-level harness for long-running code agents, but watch the foreground chat UX on larger tasks.