Sakana Fugu Ultra opens on Vercel AI Gateway
Sakana made Fugu Ultra available through Vercel AI Gateway, while new technical writeups described the trained routing head and multi-step orchestration behind it. The integration matters because teams can invoke Fugu’s model-selection workflow through existing gateway plumbing instead of standing up custom routing.

TL;DR
- Sakana put SakanaAILabs' Vercel AI Gateway announcement alongside its earlier OpenRouter rollout, so Fugu-Ultra is now exposed through two existing gateway surfaces instead of a custom Sakana-only endpoint.
- According to rohanpaul_ai's summary of the technical report, regular Fugu is a single-step router, while Fugu-Ultra can assemble per-task multi-model workflows where models solve, critique, and merge answers.
- rohanpaul_ai's routing-head breakdown says the fast path uses a lightweight head over hidden states to score worker models, with only a small slice of weights tuned for routing.
- In SakanaAILabs' podcast thread, CEO David Ha described orchestration across many models as a longer-run alternative to betting on one frontier model, and said Japanese megabanks are moving some AI work from PoCs into production.
You can read the paper, jump to [OpenRouter's release notes](OpenRouter release notes), and compare Sakana's Vercel AI Gateway post with its earlier OpenRouter launch card. The useful bit is not just that Fugu-Ultra shipped on another surface, but that Sakana is describing a trained router and a variable workflow builder, not a fixed committee pattern.
Gateway rollout
The rollout is simple: Sakana says Fugu-Ultra is now available on Vercel AI Gateway in its Vercel post, after making it live on OpenRouter in its earlier OpenRouter thread. That gives teams two existing aggregation layers for invoking the system.
Sakana's own phrasing in the OpenRouter launch thread is the tell. It frames Fugu as "the collective intelligence of the world's best models working together," which matches the architecture described in the arXiv report.
Routing head
The regular Fugu path is a router, not an answering model. In the thread summarizing the diagram, a lightweight head reads the manager model's hidden state, scores each worker model, and sends the task to the top choice.
That breakdown also claims a narrow tuning strategy: the red diagonal in the figure marks a small weight adjustment used to improve routing quality, instead of retraining a full model stack routing-head breakdown. For a gateway integration story, that matters יותר than branding, because the fast mode can look like one model call from the outside while doing model selection underneath.
Ultra workflows
Fugu-Ultra adds a second mode. According to the report summary, it can generate a task-specific workflow instead of choosing one worker once.
That workflow can include:
- one model to draft an answer technical-report summary
- another model to check or critique it technical-report summary
- a different model to attack the same task from another angle technical-report summary
- a final selection or merge step across the candidate outputs technical-report summary
The paper summary in the same thread contrasts this with simpler multi-model patterns like static voting or hardcoded domain routing. Sakana's claim is that the teamwork pattern is chosen at run time, per request, rather than fixed in advance.
Production framing
The broader product pitch shows up in SakanaAILabs' podcast summary, where David Ha says Japanese megabanks have moved some AI workflows from proof-of-concept toward production, and describes orchestration as more rational long term than relying on one huge frontier model.
That thread also adds two pieces missing from the gateway posts. First, Sakana says Fugu was trained with reinforcement learning to route multi-step tasks across different LLMs podcast summary. Second, Ha ties the product to a sovereignty argument: domestic advantage comes from the ability to develop, adapt, and run AI within a global supply chain, not from owning every layer outright podcast summary.