workflow · June 28, 2026

Codex users report /goal, /rewind, and /compact workflows after launch

A day after /goal and thread automations landed in Codex, practitioners started standardizing on /goal specs, /fork or /side detours, and /rewind plus /compact recovery. The pattern matters because verifier design and compaction timing now control how well long runs hold together.

TL;DR

Practitioners are converging on /goal as a long-run wrapper for work that starts with a spec, a rubric, and a verifier, according to Vtrivedy10 on /goal defaults, Vtrivedy10's fine-tuning checklist, and Jerry Liu on goal and eval engineering.
The most repeated failure mode is context drift, not model quality: Matt Pocock on the dumb zone puts the drop-off around 120K tokens on frontier models, while Matt Pocock on mid-phase compaction says a badly timed /compact can derail the run.
Users are separating real branch work from quick interruptions, with aibuilderclub_ on /fork and /side framing /fork as a persistent alternate path and /side or /btw as a temporary detour.
Codex's new thread automations are being treated as scheduled wake-ups for an existing thread, while /goal is being treated as the thing that actually executes toward completion, per jxnlco on thread automations and jxnlco on loop versus goal.
Remote session visibility is part of the appeal too: HamelHusain on Codex remote access says Codex exposes running sessions across devices, and OpenAIDevs on long-thread navigation shows OpenAI still smoothing the UI around long-running work.

You can see the pattern hardening in public almost in real time: TheZachMueller setting a goal used /goal for a live Matrix agent run, dejavucoder on checking a running goal described how to inspect or pause a loop mid-flight, and reach_vb on project-focused threads argued that spinout threads plus compaction change how much users think about context at all.

/goal specs and verifiers

The center of gravity moved fast from prompts to task design. Vtrivedy10 on /goal defaults says every message outside /goal should mostly be in service of building the spec and rubric that the run will use.

Across Vtrivedy10's posts, the recurring ingredients are concrete enough to turn into a checklist:

context files and tool access, so the agent can assemble what it needs itself (Vtrivedy10's fine-tuning checklist)
explicit verification instructions, including target numbers or holdout sets (Vtrivedy10's fine-tuning checklist)
a rubric for done-ness, often binary (Vtrivedy10's fine-tuning checklist)
small exploratory runs before the big run, because good rubrics are discovered in traces, not guessed upfront (Vtrivedy10 on discovering rubrics)
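
One way to picture that checklist is as a small data structure that refuses to call a run done until every rubric check passes. The sketch below is purely illustrative: `GoalSpec`, its field names, and the example thresholds are assumptions made for this article, not a Codex file format or anything Vtrivedy10 published.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class GoalSpec:
    """Hypothetical container for the checklist items; not a Codex format."""
    objective: str
    context_files: List[str]
    tools: List[str]
    verification: str  # explicit instructions, e.g. target numbers or holdout sets
    rubric: Dict[str, Callable[[dict], bool]] = field(default_factory=dict)

    def missing(self) -> List[str]:
        """Return the names of checklist items that are still empty."""
        gaps = []
        if not self.context_files:
            gaps.append("context_files")
        if not self.tools:
            gaps.append("tools")
        if not self.verification.strip():
            gaps.append("verification")
        if not self.rubric:
            gaps.append("rubric")
        return gaps

    def is_done(self, result: dict) -> bool:
        """Binary done-ness: every rubric check must pass."""
        return bool(self.rubric) and all(check(result) for check in self.rubric.values())


# Example values are invented for illustration.
spec = GoalSpec(
    objective="Fine-tune a classifier",
    context_files=["README.md", "data/train.jsonl"],
    tools=["shell", "python"],
    verification="Report accuracy on the holdout set; target >= 0.90.",
    rubric={
        "holdout_accuracy": lambda r: r.get("holdout_accuracy", 0.0) >= 0.90,
        "tests_pass": lambda r: r.get("tests_pass") is True,
    },
)
```

Keeping the rubric as named boolean checks makes it easy to revise after a small exploratory run, which is where the posts say good rubrics tend to come from.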

Jerry Liu, founder of LlamaIndex, used nearly the same framing: Jerry Liu on goal and eval engineering says the shift is from prompt engineering toward goal and eval engineering. That maps neatly onto Vtrivedy10's more operational version, where the agent succeeds when the subtask is verifiable and the verifier matches what the human means by done (Vtrivedy10 on self-verifying specs).

/fork, /side, and shallow delegation

The workflow around /goal is not one giant uninterrupted thread. aibuilderclub_ on /fork and /side draws a clean line between /fork for real implementation branches and /side or /btw for disposable side questions.

That distinction matters because people are trying to keep the main run clean. dejavucoder on /side as an ephemeral fork calls /side more of an ephemeral fork, while Vtrivedy10 on shallow depth says wrapper setups that spawn /goal workers seem to work best when recursion stays shallow and control returns to the main agent.
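
The shallow-depth idea can be sketched as a depth cap on delegation. This is a toy model only: `Task`, `delegate`, and the default `max_depth=1` are hypothetical names and values, and nothing here reflects how Codex or any wrapper actually spawns /goal workers.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Task:
    """Hypothetical task tree used only for this illustration."""
    name: str
    subtasks: List["Task"] = field(default_factory=list)


def delegate(task: Task, depth: int = 0, max_depth: int = 1,
             log: Optional[List[Tuple[int, str]]] = None) -> List[Tuple[int, str]]:
    """Record which worker depth handles each task.

    Work nested deeper than max_depth stays with the worker at max_depth
    instead of spawning another layer of workers.
    """
    log = [] if log is None else log
    worker_depth = min(depth, max_depth)
    log.append((worker_depth, task.name))
    for sub in task.subtasks:
        delegate(sub, depth + 1, max_depth, log)
    return log


tree = Task("main", [Task("a", [Task("a1")]), Task("b")])
plan = delegate(tree)
```

With the default cap, the nested task `a1` is handled at depth 1 alongside `a` and `b`, so control never moves more than one hop away from the main agent.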

The thread split is starting to replace heavier workflow wiring. reach_vb on project-focused threads says project-focused threads and spinout threads made context management feel less manual, and Jerry Liu on goal and eval engineering argues that the model can increasingly infer the intermediate workflow if the goal and evaluation are clear enough.

/rewind and /compact

The sharpest shared mechanic in this batch of posts is recovery after a long run gets weird. Matt Pocock on the dumb zone suggests a simple loop: rewind to before the bad step, compact when it makes sense, then rerun and compare.

Matt Pocock's thread adds three concrete caveats:

the "dumb zone" can start around 120K tokens on state-of-the-art models (Matt Pocock on the dumb zone)
for a 200K context window, one discussion in the comments put the danger zone around 40 percent of the window, roughly 80K tokens, though Pocock says it varies by harness, model, and task (Matt Pocock on the 40 percent rule)
a mid-phase compaction can make the agent lose track of the task, so compaction timing matters as much as compaction itself (Matt Pocock on mid-phase compaction)
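
The arithmetic behind those caveats is simple enough to write down. The sketch below assumes you already know the token count and whether the agent is mid-phase; `danger_threshold` and `next_action` are hypothetical helpers, and the 40 percent default is just the figure from the thread, which Pocock says varies by harness, model, and task.

```python
def danger_threshold(context_window: int, percent: int = 40) -> int:
    """Token count at which the thread's rule of thumb says quality may drop."""
    return context_window * percent // 100


def next_action(tokens_used: int, context_window: int,
                mid_phase: bool, percent: int = 40) -> str:
    """Suggest an action based on context usage and phase position.

    Compaction is deferred while a phase is in flight, because the posts
    warn that a mid-phase compaction can make the agent lose the task.
    """
    if tokens_used < danger_threshold(context_window, percent):
        return "continue"
    if mid_phase:
        return "finish-phase-then-compact"
    return "compact"
```

For a 200K window, `danger_threshold(200_000)` gives 80,000 tokens, matching the rough figure quoted in the discussion.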

Users are also treating /goal runs as something to supervise, not just fire and forget. dejavucoder on checking a running goal says you can queue an instruction or use /btw while a loop is running, and pressing escape pauses the goal rather than killing the thread.

Thread automations

Codex now has a second long-run primitive, and users are already separating it from /goal. jxnlco on thread automations describes thread automations as heartbeat-style recurring wake-up calls attached to the current thread.

The practical split in the tweets looks like this:

thread automations preserve an existing thread's context and revisit it on a schedule (jxnlco on thread automations)
/goal is for a bounded task the agent should push toward completion (Vtrivedy10 on /goal defaults)
some users now want a dedicated /loop command that sits between those two behaviors (jxnlco asking for a /loop command)
jxnlco's shorthand was that "loop is a scheduled" and "goal would have to learn to sleep" (jxnlco on loop versus goal)

That makes the new Codex surface look less like one feature and more like a small control plane for different kinds of persistence.
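
A toy model makes the contrast concrete: one primitive runs until a verifier passes or a budget runs out, the other just wakes up on a schedule. Both functions below are hypothetical illustrations of that split, with invented names and parameters; they do not describe how Codex implements /goal or thread automations.

```python
from typing import Callable, List


def run_goal(step: Callable[[int], dict], verifier: Callable[[dict], bool],
             max_steps: int) -> dict:
    """Bounded run: keep stepping until the verifier passes or the budget ends."""
    result: dict = {}
    for i in range(max_steps):
        result = step(i)
        if verifier(result):
            return {"done": True, "steps": i + 1, "result": result}
    return {"done": False, "steps": max_steps, "result": result}


def wake_times(start: int, interval: int, count: int) -> List[int]:
    """Scheduled wake-ups: the same thread is revisited at fixed intervals."""
    return [start + interval * k for k in range(1, count + 1)]


# Goal-style: stop as soon as the (made-up) score reaches 2.
outcome = run_goal(lambda i: {"score": i}, lambda r: r["score"] >= 2, max_steps=5)

# Automation-style: three wake-ups, 60 time units apart.
schedule = wake_times(start=0, interval=60, count=3)
```

In this framing, the /loop command some users are asking for would sit somewhere between the two: a goal-style run that is allowed to pause between steps.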

Remote sessions

One more reveal from the first wave of usage reports: Codex is being used as a remote command center, not just a local coding chat. HamelHusain on Codex remote access says the app exposes all running sessions across devices, including mobile.

The screenshots attached to HamelHusain on Codex remote access show separate Remote and Local session lists plus a mobile view of active projects. OpenAI's own dev account added a quieter but relevant quality-of-life note: OpenAIDevs on long-thread navigation says long threads now scroll more smoothly and keep your place. That suggests the company is already sanding down the UI for exactly the kind of long-running, many-turn workflows these /goal and automation posts are stress-testing.