workflow · June 7, 2026

Agent tooling adds .prose.md programs, PR panes, and exact-edit primitives

Builders shipped OpenProse workflow files, ghzinga PR tabs, cmux terminal controls, datasette-agent-edit primitives, and an agent-optimized CLI fork. These pieces turn prompt strings into reusable files, panes, and testable edit loops for coding agents.

TL;DR

TheTuringPost's OpenProse thread captured the biggest shift of the week: workflow logic is moving out of disposable prompts and into reviewable .prose.md contracts, while the OpenProse repo frames the coding agent itself as the compiler for those contracts.
Exact-edit tools are getting standardized. Simon Willison's release note and weblog item both describe a three-tool editing pattern (view, str_replace, and insert) in place of freeform rewrite prompts.
Agent-first CLI design is getting more opinionated. dbreunig's botmap thread pushed four rules: no interactive modes, rich failure context, bundled docs and skills, and JSON everywhere. The Hugging Face hf CLI post, meanwhile, argued that agent-optimized commands can cut token use by up to 6x versus hand-rolled API calls.
Terminal surfaces are filling in missing UI. daniel_mac8's cmux post and the cmux repo point to browser panes, notifications, SSH, and cookie-aware workflows, while onusoz's ghzinga demo shows PRs and issues turning into tabs beside the agent instead of browser sprawl.
Higher-level harnesses are converging on the same idea from different angles: rohanpaul_ai's Spec Kit thread makes specs executable, aibuilderclub_'s /goal thread decomposes long Codex runs into six required fields, and pvncher's RepoPrompt CE link packages orchestration and review into a dedicated workspace.

You can browse OpenProse, check the Spec Kit command flow, and read Hugging Face's agent CLI benchmark writeup. Simon Willison's editing plugin post quietly turns Claude-style text editing into a reusable primitive, while the cmux repo and ghzinga repo show how much of the remaining work is just building the panes, tabs, and sidecars agents have been missing.

OpenProse and Spec Kit

OpenProse and Spec Kit landed on the same answer from opposite ends. The OpenProse README describes declarative *.prose.md contracts for an "ideal world state," with optional ProseScript when the run order actually matters, while TheTuringPost's summary translated that into a practical claim: less babysitting, explicit tool and skill dependencies, isolated sub-agent sessions, and reusable run artifacts.

Spec Kit is less language-like, but the shape is similar. The Spec Kit repo breaks the path into constitution, spec, plan, tasks, and implementation commands, and rohanpaul_ai's thread framed the result as an executable development contract rather than loose documentation.

A lot of the week's discourse compressed this into slogans. dbreunig's reply called it "program, don't prompt," and daniel_mac8's post gave it the more meme-ready name "Loop Engineering."

Exact-edit primitives

datasette-agent-edit 0.1a0

Release: datasette-agent-edit 0.1a0

I'm planning several plugins for Datasette Agent which can make edits to existing pieces of text - things like collaborative Markdown editing, updating large SQL queries, and editing SVG files. Agentic editing of text is a little tricky to get right. My favorite published design for this is for the Claude text editor, which implements the following tools:

view - view sections of a file, with line numbers added to every line.
str_replace - find an exact old_str and replace it with new_str - fail if the original string is not unique
insert - insert the specified text after the specified line number

Rather than recreate these patterns for every plugin that needs them I decided to create this base plugin, datasette-agent-edit, which implements the core tools in a way that allows them to be adapted for other plugins.

The cleanest low-level contribution came from Simon Willison's release note: wrap text editing in three exact tools instead of asking an agent to regenerate a file and hope the diff is sane.

The tool surface is short:

view, show file slices with line numbers
str_replace, replace an exact unique string or fail
insert, add text after a specified line

That design is small enough to copy everywhere. Simon's post says the plugin is storage-agnostic and meant as a base for Markdown, SQL, and SVG editing plugins, which is a more interesting claim than the release itself: agent text editing is starting to look like a protocol.
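To make the three-tool surface concrete, here is a minimal sketch in Python that operates on an in-memory string. It is not the datasette-agent-edit implementation or the Claude text editor API: the function signatures, the `EditError` class, and the `N: ` line-number format are assumptions chosen for illustration. Only the behavior (numbered views, exact unique replacement, insert after a line) follows the description above.

```python
from typing import Optional


class EditError(Exception):
    """Hypothetical error type: raised when an edit cannot be applied exactly."""


def view(text: str, start: int = 1, end: Optional[int] = None) -> str:
    """Return lines start..end (1-indexed, inclusive), each prefixed with its number."""
    lines = text.splitlines()
    start = max(start, 1)
    end = len(lines) if end is None else min(end, len(lines))
    return "\n".join(f"{n}: {lines[n - 1]}" for n in range(start, end + 1))


def str_replace(text: str, old_str: str, new_str: str) -> str:
    """Replace old_str with new_str, failing unless old_str occurs exactly once."""
    if not old_str:
        raise EditError("old_str must not be empty")
    count = text.count(old_str)
    if count != 1:
        raise EditError(f"old_str matched {count} times; expected exactly 1")
    return text.replace(old_str, new_str, 1)


def insert(text: str, line_number: int, new_text: str) -> str:
    """Insert new_text after the given line number (0 inserts at the top)."""
    lines = text.splitlines()
    if not 0 <= line_number <= len(lines):
        raise EditError(f"line {line_number} is outside 0..{len(lines)}")
    lines[line_number:line_number] = new_text.splitlines()
    return "\n".join(lines)


# Usage: edit a small SQL query the way an agent would, one exact step at a time.
doc = "SELECT id\nFROM users\nWHERE active = 1"
doc = str_replace(doc, "active = 1", "active = 0")
doc = insert(doc, 3, "LIMIT 10")
print(view(doc))
```

The failure on a non-unique match is the important design choice: the agent gets an explicit error it can react to, where a freeform rewrite could silently change the wrong occurrence.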

That same bias toward constrained edits showed up elsewhere. WesRoth's Codex update roundup was mostly app polish, but state persistence, structured prompt-link pills, and the fix for stopping the correct chat all point at the same operational problem: long-running sessions need deterministic controls, not just better model output.

Agent-first CLIs

The botmap fork reads like a design checklist for the next wave of CLIs. dbreunig's thread listed four blunt rules:

No interactive modes or progress bars
Return context when commands fail
Ship docs and a skill with the binary
Prefer JSON everywhere

The botmap repo adds the missing piece: test the CLI by feeding prompts to an agent and watching for predictable mistakes, like downloading data locally instead of calling the right command. dbreunig's follow-up made the broader point explicit: agent UX and human UX can improve together.
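Three of the four rules (no interactive modes, context on failure, JSON everywhere) can be sketched with Python's standard `argparse` and `json` modules. This is not botmap's code: the `demo` program name, the `count` command, the `--dataset` flag, and the `DATASETS` catalog are all hypothetical stand-ins. The fourth rule, shipping docs and a skill with the binary, is a packaging concern and is not shown.

```python
import argparse
import json

# Hypothetical in-memory catalog standing in for a real backend.
DATASETS = {"roads": 1204, "rivers": 311}


class JsonArgumentParser(argparse.ArgumentParser):
    """Turn usage errors into exceptions so they can be reported as JSON."""

    def error(self, message):
        raise ValueError(message)


def run(argv):
    """Run the CLI without prompting and return (exit_code, json_payload)."""
    parser = JsonArgumentParser(prog="demo", add_help=False)
    parser.add_argument("command", choices=["count"])
    parser.add_argument("--dataset", required=True)
    try:
        args = parser.parse_args(argv)
    except ValueError as exc:
        # Usage failure: include the usage string so the agent can self-correct.
        return 2, json.dumps({
            "ok": False,
            "error": str(exc),
            "usage": parser.format_usage().strip(),
        })
    if args.dataset not in DATASETS:
        # Domain failure: return the valid options instead of a bare error.
        return 1, json.dumps({
            "ok": False,
            "error": f"unknown dataset {args.dataset!r}",
            "available": sorted(DATASETS),
            "hint": "retry with one of the available dataset names",
        })
    return 0, json.dumps({
        "ok": True,
        "dataset": args.dataset,
        "count": DATASETS[args.dataset],
    })


# Usage: a failing call still returns machine-readable context.
code, payload = run(["count", "--dataset", "lakes"])
print(code, payload)
```

Because `run` takes an argument list and returns a value, the same function can be driven from a test harness that replays the commands an agent actually tried.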

Hugging Face tried to quantify the same argument. In its hf CLI post, the company said Claude Code and Codex used up to 6x more tokens and posted lower success rates when forced to hand-roll curl or SDK calls instead of using the hf CLI, which matches Clement Delangue's thread about abstractions acting as cached intelligence for agents.

Terminals, panes, and sidecars

The UI layer is getting specialized just as fast as the workflow layer. According to the cmux repo, the app is a Ghostty-based macOS terminal built for AI coding agents, with vertical tabs, attention rings, a notification panel, an in-app browser with a scriptable API, SSH support, and browser-cookie sync. daniel_mac8's GitHub link post tied that repo to the demo, which called out cookie import, notifications, and remote-machine access.

ghzinga is much narrower, which is why it feels useful. The ghzinga repo describes a clickable TUI for a single PR or issue with comments, checks, files, links, and auto-refresh, and onusoz's link post paired that with a new tabs view so agents can open multiple relevant PRs or issues in a side pane.

There is a plugin version of the same pattern too. skirano's MagicPath thread described an official Codex plugin that gives the agent a multiplayer canvas, which is another way of saying the coding surface is no longer just a chat box plus a shell.

Harnesses, goals, and orchestrators

The orchestration layer is getting formal enough that people are naming its subparts. pvncher's RepoPrompt thread described sub-agents as threads you can open and steer, while the RepoPrompt CE repo pitches a native workspace for context engineering, agent orchestration, and reviewable handoffs.

The most concrete prompt-level guidance came from the aibuilderclub thread on Codex /goal. aibuilderclub_'s six-element post said a durable goal needs six fields:

Outcome
Verification
Constraints
Boundaries
Iteration policy
Stop condition

aibuilderclub_'s example post then contrasted a vague "improve performance" request with a benchmarked p95 target, while aibuilderclub_'s checklist post added eight pre-launch checks and aibuilderclub_'s tools post recommended /grill-me plus goalbuddy for clarifying ambiguous work.
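One way to keep the six fields from being skipped is to treat a goal as a small data structure that refuses to render when a field is blank. The sketch below is an assumption-heavy illustration, not the /goal syntax: the `Goal` class, its method names, and every example value (including the 200 ms figure) are made up for this example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Goal:
    """Hypothetical container for the six fields listed above."""

    outcome: str
    verification: str
    constraints: str
    boundaries: str
    iteration_policy: str
    stop_condition: str

    def missing_fields(self):
        """Return the names of fields that are empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self):
        """Render the goal as labeled lines, refusing if any field is blank."""
        missing = self.missing_fields()
        if missing:
            raise ValueError(f"goal is missing: {', '.join(missing)}")
        return "\n".join(
            f"{f.name.replace('_', ' ').capitalize()}: {getattr(self, f.name)}"
            for f in fields(self)
        )


# Usage: all values below are invented for illustration.
goal = Goal(
    outcome="p95 latency of the search endpoint under 200 ms",
    verification="run the benchmark suite and compare p95 before and after",
    constraints="no new dependencies; public API unchanged",
    boundaries="only touch files under the search module",
    iteration_policy="one change per iteration, re-run the benchmark each time",
    stop_condition="target met, or five iterations without improvement",
)
print(goal.render())
```

A vague request like "improve performance" fails this check immediately, because it leaves verification and the stop condition empty.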

The more maximal version of this idea showed up in the background commentary. pvncher's orchestrator post argued that you can describe the loops you want and let the model carry them out, zachtratar's heartbeat reply reduced the pattern to agents pulling from a task list on a heartbeat, and steipete's loop-design post said the file he uses is a VISION.md.
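The heartbeat pattern is simple enough to sketch in a few lines. The code below is a guess at the general shape, not anyone's actual harness: `run_heartbeat`, its parameters, and the `fake_agent` handler are hypothetical, and a real version would call an agent where the handler is.

```python
import time
from collections import deque
from typing import Callable, Deque, List


def run_heartbeat(
    tasks: Deque[str],
    handle: Callable[[str], str],
    interval_s: float = 0.0,
    max_beats: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """On each beat, pull one task from the front of the list and handle it.

    Stops when the task list is empty or max_beats is reached.
    """
    results = []
    for _ in range(max_beats):
        if not tasks:
            break
        results.append(handle(tasks.popleft()))
        sleep(interval_s)  # wait for the next heartbeat
    return results


def fake_agent(task: str) -> str:
    """Stand-in for an agent call; a real handler would run the task."""
    return f"done: {task}"


# Usage: three queued tasks, processed one per beat.
queue = deque(["fix lint errors", "update changelog", "bump version"])
print(run_heartbeat(queue, fake_agent))
```

The `max_beats` cap is there so a handler that keeps re-queueing work cannot run forever.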

Skills are becoming trainable assets

The last interesting shift is that skill files themselves are starting to look mutable and testable, not static prompt cargo. Hermes v0.16.0 shipped a desktop GUI, dashboard overhaul, and a leaner built-in skillset according to Teknium's release thread, while Teknium's dashboard update separately called out a more comprehensive skills-hub search.

On the research side, Microsoft's SkillOpt paper treats the skill document as the trainable state of a frozen agent. According to AlphaSignalAI's summary, the optimizer model proposes bounded add, delete, or replace edits and accepts a candidate only if it beats the prior version on held-out evaluation. The summary reported wins or ties on 52 setups, including gains of up to 24.8 points on coding harnesses.
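The accept-only-if-better loop described in that summary can be sketched as plain hill climbing. This is not SkillOpt's algorithm or code: a random proposer stands in for the optimizer model, a toy scoring function stands in for held-out evaluation, and all names (`propose_edit`, `optimize`, `USEFUL`, `VOCABULARY`) are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Skill = List[str]  # a skill document modeled as a list of instruction lines


def propose_edit(skill: Skill, rng: random.Random, vocabulary: List[str]) -> Skill:
    """Return a copy of the skill with one bounded add, delete, or replace edit."""
    candidate = list(skill)
    op = rng.choice(["add", "delete", "replace"])
    if op == "add" or not candidate:
        candidate.insert(rng.randrange(len(candidate) + 1), rng.choice(vocabulary))
    elif op == "delete":
        del candidate[rng.randrange(len(candidate))]
    else:
        candidate[rng.randrange(len(candidate))] = rng.choice(vocabulary)
    return candidate


def optimize(
    skill: Skill,
    evaluate: Callable[[Skill], float],
    vocabulary: List[str],
    steps: int = 200,
    seed: int = 0,
) -> Tuple[Skill, float]:
    """Keep a candidate only when it scores strictly higher than the current best."""
    rng = random.Random(seed)
    best, best_score = list(skill), evaluate(skill)
    for _ in range(steps):
        candidate = propose_edit(best, rng, vocabulary)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Toy stand-in for held-out evaluation: reward useful lines, penalize length.
USEFUL = {"run tests before committing", "prefer small diffs"}
VOCABULARY = sorted(USEFUL) + ["always rewrite the whole file", "skip code review"]


def evaluate(skill: Skill) -> float:
    return len(USEFUL & set(skill)) - 0.1 * len(skill)


start = ["always rewrite the whole file"]
best, score = optimize(start, evaluate, VOCABULARY)
print(best, score)
```

Because a candidate is only kept on a strict improvement, the returned score can never be lower than the starting skill's score, which is the property the acceptance rule is meant to guarantee.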

That lands neatly beside the week's other projects. OpenProse turns workflows into versioned files, Spec Kit turns requirements into commandable artifacts, and SkillOpt pushes one level deeper by trying to optimize the instruction file itself.