AI Primer

Codex users ship durable-memory workspaces and auto-triage flows

Independent Codex users published Obsidian memory setups, reusable skill prompts, auto-triage flows, and Cloudflare-backed runners for longer jobs. That matters because Codex is being wrapped into persistent workspaces and operator-defined subagents instead of one-shot chats.


TL;DR

You can open the goal mode docs, skim the locked-use docs, install OpenAI's Codex CLI release, inspect the community agentmemory repo, and browse Kepano's obsidian-skills for a clearer picture of where the harness is going. Christmas came early for coding-agent tinkerers.

Obsidian memory

The cleanest pattern in this batch is file-based memory outside any one repo. In daniel_mac8's prompt template, the vault structure is explicit: top-level instructions in AGENTS.md, durable task state in TODO.md, plus folders for people, projects, agent state, and notes.

daniel_mac8's custom instructions are equally concrete:

  • skim AGENTS.md at the start of substantial sessions
  • write small Markdown updates instead of transcript dumps
  • run a memory closeout before ending a session
  • record unresolved loops in TODO.md or agent/open-loops.md
  • keep secrets and sensitive personal details out of the store
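
As a rough illustration of the vault layout and the closeout step above, here is a minimal Python sketch. It is not daniel_mac8's implementation: the folder names, the `init_vault` and `memory_closeout` helpers, and the keyword-based sensitivity check are all assumptions made for the example. Only AGENTS.md, TODO.md, and agent/open-loops.md come from the posts.

```python
from datetime import date
from pathlib import Path

# Hypothetical layout: folder names are guesses based on the description above.
VAULT_DIRS = ("people", "projects", "agent", "notes")
VAULT_FILES = ("AGENTS.md", "TODO.md", "agent/open-loops.md")


def init_vault(root: Path) -> None:
    """Create the vault folders and seed files if they do not exist yet."""
    for name in VAULT_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in VAULT_FILES:
        path = root / name
        if not path.exists():
            path.write_text(f"# {path.stem}\n\n", encoding="utf-8")


def looks_sensitive(text: str) -> bool:
    """Very rough keyword guard; a real setup would need a proper secret scanner."""
    markers = ("api_key", "password", "secret", "token")
    lowered = text.lower()
    return any(marker in lowered for marker in markers)


def memory_closeout(root: Path, open_loops: list[str]) -> int:
    """Append unresolved loops as dated checklist items; return how many were written."""
    kept = [loop for loop in open_loops if not looks_sensitive(loop)]
    if not kept:
        return 0
    today = date.today().isoformat()
    lines = [f"- [ ] {today} {loop}" for loop in kept]
    with (root / "agent" / "open-loops.md").open("a", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return len(kept)
```

Calling `init_vault(Path("vault"))` once and then `memory_closeout(...)` at the end of each session keeps updates small and append-only, which is the opposite of dumping a transcript into the store.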

That matches the framing in dl_weekly's link post, which packages the approach as durable threads, file-based memory, verifiable goals, and self-scheduling loops rather than prompt tricks.

Skills and subagents

The second pattern is turning repeated asks into reusable operator code. reach_vb's tip makes a simple split: use a skill for a repeatable workflow, use a subagent for a bounded delegated job.

reach_vb's reusable prompt lists the tasks users are actually extracting from session history:

  • CI failures
  • PR reviews
  • changelogs
  • docs updates
  • release prep
  • debugging
  • test triage

jxnlco's note about the openai/skills repo adds an official distribution point for that pattern, while daniel_mac8's follow-up shows the same memory-backed workflow surfacing eight recommended skills tied to active projects. dkundel's weekly refinement idea pushes one step further by having Codex revisit and tune those skills on a schedule.

Auto-triage and autoreview

steipete's autotriage workflow is the strongest evidence that users are building a harness around Codex, not just chatting with it. The gating logic sits outside the model and checks whether an issue or PR:

  • fits the project's vision
  • is inferable in code with high confidence
  • has a clear fix
  • can be live tested

Once a task passes that filter, Codex gets a VM plus computer vision verification and works autonomously until a human reviews the suggestion. The project guidance lives in VISION.md, and the review logic can be pulled into a reusable skill file such as the linked autoreview skill.
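
The four gating checks listed above can be sketched as a plain function that runs before any agent is dispatched. This is an illustrative Python sketch, not steipete's code: the `TriageSignals` type, the `failed_checks` and `should_dispatch` helpers, and the 0.9 confidence threshold are all hypothetical, and how each signal is produced is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriageSignals:
    """Hypothetical inputs to the gate, one field per check in the list above."""
    fits_vision: bool
    inferable_confidence: float  # 0.0 to 1.0, however the harness estimates it
    has_clear_fix: bool
    can_live_test: bool


def failed_checks(signals: TriageSignals, min_confidence: float = 0.9) -> list[str]:
    """Return the names of the checks that did not pass (empty means all passed)."""
    checks = [
        ("fits the project's vision", signals.fits_vision),
        ("inferable with high confidence", signals.inferable_confidence >= min_confidence),
        ("has a clear fix", signals.has_clear_fix),
        ("can be live tested", signals.can_live_test),
    ]
    return [name for name, passed in checks if not passed]


def should_dispatch(signals: TriageSignals, min_confidence: float = 0.9) -> bool:
    """Only hand the issue or PR to an autonomous run when every check passes."""
    return not failed_checks(signals, min_confidence)
```

Returning the names of the failed checks, rather than a bare boolean, gives a human reviewer a reason for every item the gate turned away.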

steipete's scratch-log tip fills in another missing operator artifact: a running decision log for big refactors, including tradeoffs and review fixes. That turns long agentic runs into something inspectable after the fact.
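
One way to picture such a decision log is an append-only Markdown file with one dated entry per decision. The Python sketch below is an assumption-laden example, not the format from the tip: the `Decision` fields, the entry layout, and the `append_decision` helper are invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class Decision:
    """Hypothetical record for one refactor decision."""
    summary: str
    tradeoffs: list[str] = field(default_factory=list)
    review_fixes: list[str] = field(default_factory=list)


def render_entry(decision: Decision, when: datetime) -> str:
    """Format one decision as a small Markdown section stamped in UTC."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M")
    lines = [f"## {stamp} UTC: {decision.summary}"]
    if decision.tradeoffs:
        lines.append("Tradeoffs:")
        lines.extend(f"- {item}" for item in decision.tradeoffs)
    if decision.review_fixes:
        lines.append("Review fixes:")
        lines.extend(f"- {item}" for item in decision.review_fixes)
    return "\n".join(lines) + "\n\n"


def append_decision(log_path: Path, decision: Decision,
                    when: Optional[datetime] = None) -> None:
    """Append the entry to the log, creating the file and parent folder if needed."""
    when = when or datetime.now(timezone.utc)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(render_entry(decision, when))
```

Because entries are only ever appended, the file reads top to bottom as the order in which decisions were made during the run.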

Cloud runners

Some of the most interesting work here is infrastructural. steipete's cloud runner post describes running Codex on Cloudflare Firecracker boxes, offloading heavier tests elsewhere, and exposing the session through Ghostty in WebAssembly.

jxnlco's post about watching workers get created and jxnlco's browser-driven website build land on the same idea from the user side: Codex is provisioning cloud resources, setting up D1, and editing worker bindings through a browser session. The Cloudsail repost points in the same direction with per-task sandboxes that ship shell access, Codex, and GitHub together.

The official product surface now supports parts of that workflow directly. OpenAIDevs' goal mode post says Codex can keep working toward a milestone for hours or days across the app, IDE extension, and CLI, and the goal mode docs formalize that interface. OpenAIDevs' post on locked Mac control and the locked-use docs extend the same long-run model to remote computer use from a phone, even with the Mac locked.

Computer-use verification

The verification loop is no longer limited to terminal output. gdb's simulator demo and the iPhone simulator bug bash repost show Codex driving an iPhone simulator end to end, including bug-bashing a feature it just built.

That matters because it closes the loop on the auto-triage pattern above: a runner can patch code, then use computer control to test UI behavior instead of stopping at unit tests. The Codex Thursday rollout around Appshots, annotation mode, and remote computer use made that surface more productized, but the practitioner demos show the operational shape more clearly than the launch copy does.

Obsidian-native agent skills

The last interesting wrinkle is that memory is turning into ecosystem-specific capability, not just a generic folder of notes. pauliusztin_'s post on obsidian-skills points to Kepano's obsidian-skills, which teaches agents Obsidian-flavored Markdown, .base files, JSON Canvas structures, CLI workflows, and web-to-markdown extraction.

That means the memory layer can become tool-aware. LLMpsycho's agentmemory post makes the same broader claim from another angle, advertising persistent memory across Claude Code, Cursor, Codex CLI, Hermes, OpenCode, and other MCP clients. The useful shift is not just persistence. It is persistence with a schema and file semantics that a harness can actually operate on.
