Shopify has open-sourced the /autoresearch plugin after an autoresearch loop produced a 53% faster parse-and-render path and 61% fewer allocations in Liquid. Try it if you want agent-driven optimization backed by tests and measurable performance targets.

Shopify says it has open-sourced the /autoresearch plugin for pi, with the launch post framing it simply as “tell it what you want, it will do the rest.” The public release landed alongside a concrete case study: the Liquid thread says the loop was run against Shopify’s 20-year-old Liquid template engine and produced a 53% faster parse-and-render path plus 61% fewer allocations.
The technical pattern was iterative search, not architectural surgery. As the breakdown describes it, the agent "proposes one small change," benchmarks it, keeps it if it improves the metric, and reverts it if it does not. The accepted changes were small but cumulative: scanning for }} directly instead of invoking regex repeatedly, freezing and reusing string objects in comparisons, detecting single-condition if statements up front, splitting product.title once at parse time instead of on every render, skipping per-iteration loop-limit checks when no limit exists, and parsing simple filter names without the full lexer.
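To make the first accepted change concrete, here is a minimal Python analogue (Liquid itself is Ruby, and these function names are invented for illustration): locating the closing `}}` of an output tag with a plain `str.find` instead of a compiled regex gives the same answer while avoiding regex-engine overhead on a hot path.

```python
import re

OUTPUT_END = re.compile(r"\}\}")

def end_of_output_regex(template: str, start: int) -> int:
    # Regex-based scan: flexible, but pays per-call regex-engine overhead.
    m = OUTPUT_END.search(template, start)
    return m.start() if m else -1

def end_of_output_find(template: str, start: int) -> int:
    # Direct scan for the literal delimiter: same result, cheaper hot path.
    return template.find("}}", start)
```

Both return the index of the closing `}}` for a `{{ ... }}` output tag, or -1 when none exists; the point of the pattern is that the fast version is behaviorally identical, so the test suite can confirm the swap is safe.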
The bigger engineering takeaway is that the workflow was only practical because Liquid already had a strong validation harness. In his notes, Simon Willison highlights "974 unit tests" as the unlock for safely letting an agent try many small performance edits, and he argues that a benchmarking script makes "make it faster" an actionable objective instead of a vague prompt. The notes screenshot attached to his post makes the same point: autoresearch works when the agent can repeatedly test, measure, and discard bad ideas.
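A minimal sketch of that harness pattern, with all names hypothetical (the real loop lives inside the /autoresearch plugin, not in this code): a proposed edit is kept only if the full test suite still passes and the benchmark improves by more than a noise threshold.

```python
import time

def benchmark(render, iterations: int = 1000) -> float:
    # Crude wall-clock timing of the workload; real harnesses would
    # take the median of several runs to reduce noise.
    start = time.perf_counter()
    for _ in range(iterations):
        render()
    return time.perf_counter() - start

def accept_change(tests_pass: bool, old_time: float, new_time: float,
                  min_gain: float = 0.01) -> bool:
    # Keep a proposed edit only if correctness holds AND the metric
    # improves by more than min_gain (1% here, an arbitrary threshold);
    # otherwise the agent reverts and tries the next idea.
    if not tests_pass:
        return False
    return new_time < old_time * (1.0 - min_gain)
```

The gate is what turns "make it faster" into a decidable question: every candidate edit either clears both checks and accumulates, or is reverted with no harm done.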
That also explains why this is more interesting than a one-off CEO coding anecdote. The thread says the same plugin can target other measurable objectives such as test speed, bundle size, build times, and Lighthouse scores, which makes it a reusable optimization loop for mature codebases with tests and benchmarks already in place. Even the reaction thread around the PR stayed focused on that pattern: one summary called out a 20-year-old production engine improving by more than 50%, while Willison’s writeup treats the result as evidence that coding agents are now effective at systematic benchmark-driven cleanup in legacy systems.
And the most important part: we open sourced the /autoresearch plugin for pi. Just tell it what you want, it will do the rest. github.com/davebcn87/pi-a…
The meta-insight here is devastating: Liquid has been battle-hardened by thousands of engineers over 20 years. An AI ran a loop for a couple days and found 6+ architectural inefficiencies none of them caught. Not because it's smarter. Because it never got used to "good …
Published some notes on @tobi's autoresearch PR that improved the performance benchmark scores of the Liquid template language (which Tobi created for Shopify 20 years ago) by a hefty 53% simonwillison.net/2026/Mar/13/li…