releaseJune 1, 2026

Perplexity launches Search as Code in Agent API with WANDR 0.386 and Python search pipelines

Perplexity replaced one-shot search calls with Search as Code, a Python-based search runtime in its Agent API that is also now the default in Computer. The change matters because agents can batch, rank, filter, and aggregate search steps inside code, and Perplexity says the system scored 0.386 on WANDR versus 0.152 for the next system.

5 min read

Perplexity launches Search as Code in Agent API with WANDR 0.386 and Python search pipelines

TL;DR

Perplexity shipped Search as Code, a new search architecture where the model writes Python that calls search primitives directly, and perplexity_ai's launch post says it is already available in the Agent API and the default in Computer.
In Perplexity's research post, the company says the point is to replace serial tool-call loops with task-specific retrieval programs that can rank, filter, fan out, and aggregate inside one sandboxed run, a framing that denisyarats' explanation repeats in plainer language.
Perplexity's claimed benchmark win is large: perplexity_ai's WANDR post puts Search as Code at 0.386 versus 0.152 for the next-best system, with the benchmark set for release in the coming weeks.
The runtime is not just "LLM calls search": TheTuringPost's architecture summary breaks it into a model control plane, a secure compute sandbox, and an Agentic Search SDK that exposes lower-level search primitives.
The docs tie the architecture to a billable code-execution surface, because perplexity_ai's docs link lands on the Agent API sandbox docs, where Perplexity says sandbox sessions are in preview and priced at $0.03 per session.

Perplexity's own material is worth reading directly: the research post is unusually explicit about failure modes in tool-call search, the sandbox docs show how code execution is wired into the Agent API, and the pricing page makes clear that token charges still sit underneath the new runtime. One useful buried detail from the research post is that Perplexity says some Computer tasks already trigger hundreds or thousands of retrieval operations in a few minutes, which is the workload it thinks monolithic search APIs no longer fit.

Search as Code

Perplexity is replacing one-shot search calls with a code-driven runtime. In the official announcement and research post, the model generates Python, runs it in a secure sandbox, and uses that code to assemble a retrieval pipeline for the specific task instead of repeatedly calling one fixed search endpoint.

That rollout is already product-level, not just a paper architecture. perplexity_ai's launch post says Search as Code is live in the Perplexity Agent API and now the default in Computer.

Tool calls versus Python

Perplexity's argument is that search breaks down when it is forced through serial function calls or MCP-style turns. In Perplexity's research post, the company lists three failure modes from that setup: coarse context, inability to use task-specific domain knowledge, and control-flow overhead that pollutes context with noisy intermediate state.

Across the two leadership posts and the research writeup, the claimed gains cluster around a few concrete mechanics:

Custom ranking pipelines: model-written code can blend retrieval and ranking steps per task, instead of accepting one fixed search pipeline.
Wide and grid search: denisyarats says the SDK can fan out large-scale searches directly from generated Python.
Map-reduce over the index: the same thread says agents can run distributed-style search workflows over Perplexity's search stack.
Lower context pollution: the research post says noisy intermediate results can stay inside code rather than getting pushed back into model-visible context every turn.
Fewer round trips: denisyarats' explanation says collapsing many search steps into one program is the main token-efficiency win.

Agentic Search SDK

The useful way to picture Search as Code is as three layers, a structure Perplexity describes in its research post and that TheTuringPost summarizes cleanly:

Model as control plane: the model decides the search strategy and writes the Python.
Secure sandbox: the generated code runs in an isolated execution environment.
Agentic Search SDK: the SDK exposes primitives such as retrieval, ranking, filtering, fanout, parsing, rendering, and intermediate search signals.

That third layer is the real architectural shift. Perplexity says the SDK exposes the search stack at a much lower level than a traditional search API, including candidate lists and ranking signals, so the agent can modify the pipeline while it is running rather than only adjust query parameters at the front door.

WANDR

Perplexity paired the launch with a new internal benchmark called WANDR, described by perplexity_ai's WANDR post as a wide benchmark meant to mirror professional research workloads. The headline number is 0.386 for Search as Code versus 0.152 for the next-best system.

The company has not published the benchmark yet. perplexity_ai's WANDR post says WANDR is coming in the next few weeks, and the research post says the suite is "far from saturated," which is Perplexity's way of arguing that the headroom is still large.

Without the released benchmark, the most concrete takeaway is what Perplexity chose to measure. WANDR is framed around professional research workflows, not consumer QA, which lines up with the rest of the launch language around wide fanouts, asynchronous retrieval, and long-running agent tasks.

Sandbox sessions

The docs make clear that Search as Code rides on top of Perplexity's sandbox tool, not a hidden internal-only runtime. The sandbox documentation says developers enable it by adding sandbox to the Agent API tools array, after which the model can decide when to run code during a request.

A few implementation details matter:

Preview status: the docs say sandbox availability, quotas, and pricing may change.
Background execution: long-running sandbox calls can continue in the background and be polled by ID.
Billing model: the docs price sandbox at $0.03 per session, with one session covering up to 20 minutes of active use, plus token charges and any tool usage incurred inside the sandbox.
Output shape: sandbox results come back with code, status, stdout, stderr, exit code, and duration fields.

That pricing wrinkle is the most concrete operational detail Perplexity published on day one. Search as Code changes retrieval architecture, but in the Agent API it also means search-heavy agents are now explicitly tied to a metered code-execution surface rather than just model tokens or search calls.