
Poolside releases Laguna M.1 and XS.2 coding models with 225B/23B and 33B/3B MoEs

Poolside has released Laguna M.1 and Laguna XS.2, its first public coding models, with Apache 2.0 weights and same-day provider support. That gives teams open coding models that can run locally or through standard serving stacks.


TL;DR

You can already hit OpenRouter's Laguna M.1 page, call its XS.2 endpoint, or pull Laguna XS.2 on Ollama. The weirdly practical part is the serving glue: the vLLM recipe includes a dedicated vllm/vllm-openai:laguna image, parser flags for tool calls and reasoning, and a verified H200 run command on day one.

Day-zero distribution

Poolside's first public release landed across three different access patterns at once.

  • Managed deployment: Baseten says both Laguna XS.2 and M.1 are live with inference optimizations for production serving.
  • API access: OpenRouter's launch post and its XS.2 follow-up exposed both models immediately, and framed the free access window as limited.
  • Local runtime: Ollama's model-page post says XS.2 is available under the Apache 2.0 license through Ollama.

That matters less as launch theater than as format compatibility. Teams can test the same model family through hosted APIs, standard inference providers, or a local Ollama pull without waiting for a second ecosystem wave.

Laguna M.1 and XS.2

The two-model lineup splits cleanly by scale: Laguna M.1 is a 225B-parameter mixture-of-experts model with 23B active parameters, while Laguna XS.2 is a 33B MoE with 3B active.

The interesting part is not just that Poolside finally shipped weights. It shipped small-active MoE coding models with enough surrounding detail to drop into existing inference stacks instead of living as a research artifact.

Benchmark chart

The public chart in Aymeric Roucher's post compares Laguna XS.2 against Devstral Small 2, Gemma 4, Qwen 3.5, Qwen 3.6, Claude Haiku 4.5, and GPT-5.4 Nano across four coding-heavy evaluations.

From the screenshot:

  • SWE-bench Verified: XS.2 lands around the high 60s, close to Devstral Small 2 and a few points behind Qwen 3.5 and Qwen 3.6.
  • SWE-bench Multilingual: XS.2 appears in the low 60s, with Qwen 3.6 higher.
  • SWE-bench Pro: the pack compresses more tightly, with XS.2 in the middle of the compared group.
  • Terminal-Bench 2.0: XS.2 is visibly lower than Qwen 3.6 and GPT-5.4 Nano.

So the launch claim is not blanket SOTA across every chart. The stronger reading from the screenshot is that XS.2 enters the real open coding-model set on SWE-bench style tasks, while terminal-style agent evaluation still shows more separation.

vLLM and local serving

The fastest path from weights to usable infra showed up in the serving ecosystem, not in a long official blog post.

According to the vLLM project's recipe, XS.2 already has the following (a combined launch command is sketched after the list):

  • a dedicated Docker image, vllm/vllm-openai:laguna
  • support in vLLM nightly
  • --enable-auto-tool-choice
  • a poolside_v1 tool-call parser
  • a poolside_v1 reasoning parser
  • a verified H200 serving recipe

That makes the Poolside release more than a Hugging Face curiosity. Between Ollama's local packaging and vLLM's parser support, XS.2 arrived with both a laptop-friendly entry point and a conventional OpenAI-compatible serving lane.
