Navigator n1.5 claims web computer-use Pareto gains on accuracy, latency, and cost
Yutori rolled out Navigator n1.5 as a web computer-use model and said it improves the tradeoff between accuracy, latency, and cost for browser tasks. The launch matters because related environment-generation work is aimed at the long-horizon web workflows that make computer-use agents expensive and brittle.

TL;DR
- Yutori introduced Navigator n1.5 as a web computer-use model, and yutori_ai's launch repost framed it as the company's new top-end browser agent.
- The benchmark image in yutori_ai's Pareto chart repost places Navigator n1.5 at 88% accuracy on Navi-bench v2 at $1.50 per 1M input tokens, ahead of the other points shown on both accuracy and price.
- The same chart in the Navi-bench v2 image puts Navigator n1 at 72% accuracy at $0.75 per 1M input tokens, which lets Yutori claim n1 and n1.5 together cover a new cost-accuracy frontier.
- A short demo clip in MVXMXM's n1.5 video post suggests the pitch is not just benchmark movement, but fast enough browser interaction to feel like a live web agent.
- Shahules786's follow-up ties the launch to a separate stack of environment-generation work, including Cloning Bench, PA Bench, Tau2-infinity, and Helix, aimed at synthesizing harder web and tool-use workflows.
You can inspect the full accuracy-vs-price chart, watch a short n1.5 browser demo, and follow the adjacent environment-generation work through Cloning Bench and PA Bench. The interesting bit is that Yutori is pitching both ends of the curve at once: n1 and n1.5 are presented as distinct operating points on the same frontier, while the Vibrant Labs thread points to synthetic multi-tab task generation as part of the infrastructure underneath those gains.
Navi-bench v2
The launch claim lives in one chart. In the benchmark image, Navigator n1.5 is plotted at 88% accuracy and $1.50 per 1M input tokens, while Navigator n1 is plotted at 72% and $0.75.
The same image places the comparison points at:
- GPT-5.4: 68% at $2.50
- Gemini 2.5 Pro: 41.5% at $1.25
- Claude Opus 4.6: 83.5% at $5.00
- Claude Opus 4.7: 80.5% at $5.00
- GPT-5.5: 75% at $5.00
That is the basis for the Pareto-domination framing in the launch repost and the follow-on claim in the companion repost that n1 and n1.5 together define a new frontier.
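As a sanity check on that framing, here is a minimal sketch that applies the standard Pareto-domination test (maximize accuracy, minimize price) to the chart numbers quoted above. The model names and figures come from the reported chart; the filtering code itself is an illustration, not anything Yutori published:

```python
# Points from the Navi-bench v2 chart as reported in the launch repost:
# (name, accuracy %, USD per 1M input tokens).
POINTS = [
    ("Navigator n1.5", 88.0, 1.50),
    ("Navigator n1", 72.0, 0.75),
    ("GPT-5.4", 68.0, 2.50),
    ("Gemini 2.5 Pro", 41.5, 1.25),
    ("Claude Opus 4.6", 83.5, 5.00),
    ("Claude Opus 4.7", 80.5, 5.00),
    ("GPT-5.5", 75.0, 5.00),
]

def dominates(a, b):
    """True if a is at least as accurate and at least as cheap as b,
    and strictly better on at least one of the two axes."""
    _, acc_a, price_a = a
    _, acc_b, price_b = b
    return (acc_a >= acc_b and price_a <= price_b
            and (acc_a > acc_b or price_a < price_b))

def pareto_frontier(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

if __name__ == "__main__":
    for name, acc, price in pareto_frontier(POINTS):
        print(f"{name}: {acc}% @ ${price}/1M input tokens")
```

Running this leaves exactly two points standing, Navigator n1.5 and Navigator n1, which is the arithmetic behind the claim that the two models jointly define the cost-accuracy frontier on this chart: every other plotted model is both less accurate and more expensive than n1.5.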
Browser-use demo
The public evidence for latency is lighter than the evidence for pricing, but the demo clip shows n1.5 moving through a browser workflow in real time. The launch copy in the main repost pairs that clip-level impression with the broader claim that n1.5 improves accuracy, latency, and cost at the same time.
That combination is the whole pitch: Yutori is not selling a general model here; it is selling a web operator.
Environment generation
The most useful extra detail came from Shahules786, who said more would be shared about post-training data and listed four open projects tied to web and computer-use agents:
- Cloning Bench, described in the thread as scaling web app cloning to reduce sim-to-real gap.
- PA Bench, described as scalable world generation for multi-tab workflows.
- Tau2-infinity, described in the original thread as scaling task and verifier generation for tool-use workflows.
- Helix, described in the same thread as a system for scaling task and verifier generation plus QA for web and computer-use agents.
That list matters because it points at the expensive part of browser agents: not only inference, but generating enough realistic environments and long-horizon tasks to train and evaluate them. The follow-up post is also the only evidence here that explicitly mentions post-training data support behind n1.5.