AI Primer

Kilo Code launches Auto Efficient routing with KiloBench model selection

Kilo Code added an Auto Efficient mode that routes each request to the cheapest model that clears its benchmark bar using public KiloBench results. The router stays session-aware and falls back to stronger paid models when confidence is low.


TL;DR

You can read the public benchmark writeup from kilocode's follow-up, inspect the KiloBench leaderboard via kilocode's reply about public data, and even watch a two-minute demo from kilocode's demo post. The interesting bit is not model routing by itself. It is that Kilo is exposing the scorecard it routes on, then letting users override the policy when they want a different cost-quality tradeoff.

Auto Efficient

Kilo's pitch is brutally simple: stop paying top-tier model prices for low-stakes edits. kilocode's launch thread frames the split as cheap models for tasks like variable renames, stronger models for harder work like migration planning.

The product is live now. kilocode's availability post says Auto Efficient appears directly in the model picker, where the screenshot marks it as the recommended default.

KiloBench

The routing logic is tied to KiloBench, not a hidden heuristic. kilocode's benchmark explainer says KiloBench runs continuously across the model catalog on tasks pulled from real Kilo usage, and kilocode's leaderboard note says users can inspect the same rankings themselves.

That makes this more legible than the usual black-box router. The evidence points to a loop with three parts:

A week earlier, kilocode's eight-model test also argued for the premise behind the feature: the cheapest model in one controlled code review run was the only one that caught every bug.
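
The selection rule described in this section, pick the cheapest model that clears a benchmark bar, can be sketched in a few lines. This is a minimal illustration, not Kilo's implementation: the `ModelScore` type, the model names, the prices, the pass rates, and the fallback to the highest-scoring model when nothing clears the bar are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelScore:
    name: str             # hypothetical model identifier
    cost_per_mtok: float  # hypothetical price, USD per million tokens
    pass_rate: float      # hypothetical benchmark pass rate, 0.0 to 1.0


def pick_cheapest_passing(models, bar):
    """Return the cheapest model whose pass rate meets the bar.

    Assumption: when no model clears the bar, fall back to the
    highest-scoring model rather than failing the request.
    """
    if not models:
        raise ValueError("models must not be empty")
    passing = [m for m in models if m.pass_rate >= bar]
    if passing:
        # Break price ties in favor of the higher pass rate.
        return min(passing, key=lambda m: (m.cost_per_mtok, -m.pass_rate))
    return max(models, key=lambda m: m.pass_rate)


# Invented catalog for illustration only; these are not KiloBench numbers.
CATALOG = [
    ModelScore("small-model", 0.20, 0.62),
    ModelScore("mid-model", 1.00, 0.81),
    ModelScore("large-model", 8.00, 0.93),
]

easy_task = pick_cheapest_passing(CATALOG, bar=0.60)
hard_task = pick_cheapest_passing(CATALOG, bar=0.90)
```

With this invented catalog, a low bar selects `small-model` and a high bar selects `large-model`, which mirrors the rename-versus-migration split in the launch thread.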

Session-aware routing

Kilo is not claiming per-turn roulette. kilocode's session-aware note says the router stays on a model that is already working and only switches when a cheaper option clearly fits, which is meant to avoid context loss and inconsistent output inside a thread.
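
One way to picture that stickiness is a rule that keeps the current model unless a cheaper passing model saves a large share of the cost. The sketch below is an assumption-heavy illustration: the `next_model` function, the `min_saving` threshold, and the sample scores are hypothetical, and Kilo has not published how it decides that a cheaper option "clearly fits".

```python
from typing import Optional

# Hypothetical score table: model name -> (cost, benchmark score).
Scores = dict[str, tuple[float, float]]


def next_model(current: Optional[str], scores: Scores, bar: float,
               min_saving: float = 0.5) -> str:
    """Pick the model for the next turn of a session.

    Assumed policy: stay on the current model while it clears the bar,
    and switch only if the cheapest passing model costs at least
    `min_saving` (a fraction) less than the current one.
    """
    passing = {name: cs for name, cs in scores.items() if cs[1] >= bar}
    if not passing:
        # Nothing clears the bar: assume a fallback to the top scorer.
        return max(scores, key=lambda name: scores[name][1])
    cheapest = min(passing, key=lambda name: passing[name][0])
    if current is None or current not in passing:
        return cheapest
    current_cost = passing[current][0]
    cheapest_cost = passing[cheapest][0]
    if cheapest_cost <= current_cost * (1 - min_saving):
        return cheapest  # the saving is large enough to justify a switch
    return current       # otherwise keep the session on its current model


# Invented numbers for illustration only.
close_prices = {"model-a": (1.0, 0.85), "model-b": (0.8, 0.82)}
wide_prices = {"model-a": (8.0, 0.93), "model-b": (1.0, 0.81)}

stays = next_model("model-a", close_prices, bar=0.8)
switches = next_model("model-a", wide_prices, bar=0.8)
```

In the first case the 20 percent saving is below the threshold, so the session stays on `model-a`. In the second the cheaper model costs one eighth as much, so the sketch switches.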

The fallback policy matters as much as the cheap path: when confidence is low, the router falls back to stronger paid models.

Routing modes

Kilo also exposed a policy knob instead of hard-coding one cost target. kilocode's settings post shows three routing choices in settings: use the default, optimize for best accuracy per dollar, or optimize for best accuracy.

That means the router is not just choosing a model. It is choosing against a user-selected objective. The launch thread does not spell out the exact threshold math, but the UI evidence shows Kilo treating routing policy as a project-level setting rather than a global account switch.
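
Since the threshold math is not public, the following is only a sketch of how three objectives could map to three selection rules. The `RoutingMode` enum, the `select_model` function, the `default_bar` value, and the sample scores are all hypothetical. In particular, treating the default mode as "cheapest model above a bar" is an assumption, not something the settings post states.

```python
from enum import Enum


class RoutingMode(Enum):
    # Hypothetical names mirroring the three choices shown in settings.
    DEFAULT = "default"
    BEST_ACCURACY_PER_DOLLAR = "best_accuracy_per_dollar"
    BEST_ACCURACY = "best_accuracy"


def select_model(scores, mode, default_bar=0.8):
    """Select a model name from `scores` (name -> (cost, score)).

    Costs must be positive, because one mode divides by cost.
    """
    if mode is RoutingMode.BEST_ACCURACY:
        return max(scores, key=lambda n: scores[n][1])
    if mode is RoutingMode.BEST_ACCURACY_PER_DOLLAR:
        return max(scores, key=lambda n: scores[n][1] / scores[n][0])
    # DEFAULT (assumption): cheapest model that clears the bar,
    # or the top scorer when none does.
    passing = [n for n in scores if scores[n][1] >= default_bar]
    if passing:
        return min(passing, key=lambda n: scores[n][0])
    return max(scores, key=lambda n: scores[n][1])


# Invented numbers for illustration only.
SCORES = {
    "small-model": (0.20, 0.62),
    "mid-model": (1.00, 0.81),
    "large-model": (8.00, 0.93),
}

by_accuracy = select_model(SCORES, RoutingMode.BEST_ACCURACY)
by_value = select_model(SCORES, RoutingMode.BEST_ACCURACY_PER_DOLLAR)
by_default = select_model(SCORES, RoutingMode.DEFAULT)
```

The same score table yields three different answers depending on the objective, which is the practical meaning of a policy knob.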

Visibility and limits

One useful wrinkle surfaced in follow-up replies. In kilocode's reply about public data, Kilo says there is no API endpoint that exposes routing choices directly, but the benchmark inputs to Auto Efficient are public and users can inspect usage history to see which models were used and at what volume.

So the transparency story is partial, not total. The benchmark is public and the usage log is visible, but the routing engine's decisions can only be inferred from product behavior, not queried through a dedicated API.
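
Without a routing API, the practical audit path is to tally usage history by model. The sketch below assumes usage records are available as a list of dictionaries with `model` and `tokens` keys. That record shape is hypothetical, since the thread does not describe an export format.

```python
from collections import Counter


def tally_usage(records):
    """Summarize request count and token volume per model.

    `records` is an iterable of dicts with hypothetical keys
    "model" (str) and "tokens" (int).
    """
    requests = Counter()
    tokens = Counter()
    for record in records:
        requests[record["model"]] += 1
        tokens[record["model"]] += record["tokens"]
    return {
        model: {"requests": requests[model], "tokens": tokens[model]}
        for model in requests
    }


# Invented records for illustration only.
sample_records = [
    {"model": "small-model", "tokens": 1200},
    {"model": "small-model", "tokens": 800},
    {"model": "large-model", "tokens": 5000},
]

summary = tally_usage(sample_records)
```

A summary like this shows which models handled the traffic and at what volume, but not why the router picked them.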
