Breaking · June 16, 2026

ENPIRE launches 8-agent Codex robot fleet for physical autoresearch

ENPIRE launched a physical autoresearch setup that gives eight Codex agents robots, GPUs, and real-world APIs for tasks like tying zip-ties and sorting small parts. It matters because it moves long-horizon agent evaluation from browser-only loops into embodied experimentation with explicit safety controls.

TL;DR

DrJimFan's launch thread says ENPIRE hands eight Codex agents a shared fleet of robots, GPUs, and a token budget, then lets them pursue a single goal with humans out of the loop.
According to the launch thread, the agents used that setup to solve hardware tasks including tying zip-ties, organizing fine pins, and installing GPUs.
DrJimFan's thread framed the main research result as "physical scaling," where eight robots exploring in parallel improved faster than smaller setups, and deredleritt3r's quote post amplified that claim.
In a reply about safety, DrJimFan said the robot safety measures are set up and verified manually before autoresearch starts.
DrJimFan's project-site post says a project site and technical thread are live, while the launch thread says the team plans to open-source the whole stack.

The project site is live, the launch post shows the lab running on real hardware, and DrJimFan's benchmark reply claims the paper's results ranked Codex ahead of Claude and Kimi. One stray but useful detail sits in a reply about the harness, where DrJimFan says the harness design space is "huge." That detail may end up mattering more than the robot demo itself.

ENPIRE

ENPIRE is a physical autoresearch loop, not a one-off scripted robotics demo. DrJimFan's thread describes eight Codex agents sharing robots, compute, and a token budget under a simple objective: finish the task quickly, keep the robots busy, stay safe, and avoid wasting GPUs.
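
One way to read that objective is as a single score over a run. The sketch below is purely illustrative: the `RunStats` fields, the weights, and the hard safety penalty are assumptions made for this example, not anything the thread specifies.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Hypothetical summary of one autoresearch run (all fields are assumptions)."""
    task_done: bool
    wall_clock_hours: float
    robot_busy_fraction: float  # 0.0 to 1.0, share of time the robots were in use
    gpu_idle_fraction: float    # 0.0 to 1.0, share of allocated GPU time left unused
    safety_violations: int


def objective_score(stats: RunStats) -> float:
    """Fold the four stated goals into one number; higher is better.

    The weights are illustrative and not taken from ENPIRE.
    """
    if stats.safety_violations > 0:
        # Treat safety as a hard constraint rather than a trade-off.
        return float("-inf")
    score = 0.0
    if stats.task_done:
        score += 10.0 / (1.0 + stats.wall_clock_hours)  # finish the task quickly
    score += 2.0 * stats.robot_busy_fraction            # keep the robots busy
    score -= 2.0 * stats.gpu_idle_fraction              # avoid wasting GPUs
    return score
```

Under these assumed weights, a run that finishes in one hour outscores an otherwise identical run that takes nine, and any safety violation disqualifies the run outright.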

The thread also sketches what the agents actually do once released into the lab:

look for visual clues
reset the scene
practice novel skills
tinker with the control stack
read papers online
debate and reflect
retry directly on hardware

That combination is the interesting bit. The agents are not confined to browser research or simulator-only loops, because the launch thread says they can act on "the world of atoms" through a real-world API.

Physical tasks

DrJimFan's launch thread names three tasks ENPIRE completed on its own: tying zip-ties, organizing fine pins, and installing GPUs. Those are fiddly, high-precision tasks where resetting the scene and trying again matters as much as the final motion plan.

The overnight setup is also part of the pitch. DrJimFan's thread says part of NVIDIA's GEAR lab now self-improves through the night, with humans reading the reports the next morning, while DrJimFan's overnight reply summarizes the vibe more plainly: "Robots humming tirelessly at night."

Physical scaling

The headline claim is that more parallel hardware improved learning speed. DrJimFan's launch thread says eight robots exploring in parallel "improves significantly faster than fewer ones," and deredleritt3r's quote post pulled out that line as the key reveal.

That gives ENPIRE a different scaling axis from the usual agent story. Instead of adding only more inference or longer context, the system adds more physical trials, more resets, and more simultaneous opportunities to discover a working behavior.
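
A toy calculation shows why more parallel trials could help. It assumes each robot attempt succeeds independently with the same probability `p`, which is a simplification and not a figure from the thread. Under that assumption, the chance that at least one of `n` robots finds a working behavior in a round is `1 - (1 - p)**n`.

```python
def p_any_success(p: float, n_robots: int) -> float:
    """Probability that at least one of n independent attempts succeeds."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n_robots < 0:
        raise ValueError("n_robots must be non-negative")
    return 1.0 - (1.0 - p) ** n_robots


# Illustrative only: p = 0.05 is an assumed per-attempt success rate.
for n in (1, 2, 4, 8):
    print(n, round(p_any_success(0.05, n), 3))
```

With the assumed 5% per-attempt rate, the per-round chance of a first success rises from 0.05 with one robot to roughly 0.34 with eight. The toy model ignores anything the agents learn from each other's attempts, so it says nothing about the size of the effect the thread reports.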

Safety and harness

Safety is still front-loaded by humans. In a direct reply about safeguards, DrJimFan says the team manually sets up and verifies the robots' safety measures before autoresearch happens.
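
That ordering, human verification first and autonomous work second, can be pictured as a gate in front of the run. The sketch below is an illustration under assumptions: `SafetyChecklist`, its fields, and `start_autoresearch` are hypothetical names, and the thread does not describe which checks the team actually performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyChecklist:
    """Hypothetical per-robot sign-off recorded by a human operator."""
    robot_id: str
    verified_by: str = ""        # empty string means nobody has signed off
    checks_passed: bool = False


def unverified_robots(checklists: list[SafetyChecklist]) -> list[str]:
    """Return IDs of robots lacking a human sign-off or a passing check."""
    return [c.robot_id for c in checklists
            if not (c.verified_by and c.checks_passed)]


def start_autoresearch(checklists: list[SafetyChecklist]) -> str:
    """Refuse to start unless every robot has been manually verified."""
    blocked = unverified_robots(checklists)
    if blocked:
        raise RuntimeError(f"manual safety verification missing for: {blocked}")
    return "autoresearch started"
```

The design choice here is that the agents never get to decide whether the checks are sufficient: the gate depends only on records a human created before the run began.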

The other buried detail is the harness. In a reply about the design space, DrJimFan says the harness design space is "huge," which suggests the orchestration layer around the model, tools, budgets, and robot interfaces is still a major variable, not just the choice of Codex.

Benchmark and open-source plans

One extra claim arrived in replies rather than the launch post. DrJimFan's benchmark reply says the paper's results ranked "Codex > Claude > Kimi," and frames ENPIRE as a benchmark "that can't be hacked in the physical world."

The release plan is broader than a demo video. DrJimFan's launch thread says the team will open-source everything so others can run a self-operating robot lab at home, and a follow-up reply adds that more is still coming.