Poolside releases Laguna M.1 and XS.2 coding models with 225B/23B and 33B/3B MoEs
Poolside released Laguna M.1 and Laguna XS.2 as its first public coding models, with Apache 2.0 weights and same-day provider support. That gives teams open coding models they can run locally or through standard serving stacks.

TL;DR
- Poolside shipped its first public models as two Apache 2.0 open-weight coding MoEs: Aymeric Roucher's launch summary described Laguna M.1 as 225B parameters with 23B active, while OpenRouter's XS.2 post sized Laguna XS.2 at 33B with 3B active.
- The rollout was unusually day-zero friendly: Baseten's announcement put both models into managed inference, OpenRouter's launch thread exposed them through an API surface, and Ollama's model-page post made XS.2 available for local pulls.
- The vLLM project's recipe post shows XS.2 already wired into vLLM nightly with tool-calling and reasoning parsers, which means Poolside did not just publish weights: it landed an immediate serving path for OpenAI-compatible stacks.
- On the public benchmark chart in Aymeric Roucher's screenshot, XS.2 sits near Qwen 3.5 and Qwen 3.6 on SWE-bench Verified, but trails the top pack more clearly on Terminal-Bench 2.0.
You can already hit OpenRouter's Laguna M.1 page, spin up OpenRouter's XS.2 endpoint, or pull Laguna XS.2 on Ollama. The weirdly practical part is the serving glue: the vLLM recipe includes a dedicated vllm/vllm-openai:laguna image, parser flags for tool calls and reasoning, and a verified H200 run command on day one.
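If you want to sanity-check the API lane first, a minimal sketch against OpenRouter's OpenAI-compatible endpoint looks like the following. The model slug below is a placeholder for illustration; the real ID is on OpenRouter's Laguna pages.

```python
# Minimal sketch: calling Laguna XS.2 through OpenRouter's
# OpenAI-compatible API. "poolside/laguna-xs-2" is a placeholder slug,
# not a confirmed ID; check OpenRouter's model page for the real one.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_KEY",
)

resp = client.chat.completions.create(
    model="poolside/laguna-xs-2",  # placeholder slug
    messages=[{"role": "user", "content": "Write a Python function that reverses a linked list."}],
)
print(resp.choices[0].message.content)
```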
Day-zero distribution
Poolside's first public release landed across three different access patterns at once.
- Managed deployment: Baseten says both Laguna XS.2 and M.1 are live with inference optimizations for production serving.
- API access: OpenRouter's launch thread and its XS.2 follow-up exposed both models immediately, and framed the free access window as limited.
- Local runtime: Ollama's model-page post says XS.2 is available under the Apache 2.0 license through Ollama.
That matters less as launch theater than as format compatibility. Teams can test the same model family through hosted APIs, standard inference providers, or a local Ollama pull without waiting for a second ecosystem wave.
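For the local lane, a hedged sketch with the ollama Python client looks like this; the model tag is hypothetical, since the real tag lives on Ollama's model page.

```python
# Sketch of a local pull-and-chat with Laguna XS.2 via the ollama
# Python client. "laguna-xs" is a hypothetical tag; substitute the
# tag from Ollama's model page.
import ollama

ollama.pull("laguna-xs")  # one-time download of the open weights
reply = ollama.chat(
    model="laguna-xs",
    messages=[{"role": "user", "content": "Explain Python's GIL in two sentences."}],
)
print(reply["message"]["content"])
```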
Laguna M.1 and XS.2
The two-model lineup splits cleanly by scale.
- Laguna M.1: 225B total parameters, 23B active, according to Aymeric Roucher's post and OpenRouter's M.1 thread.
- Laguna XS.2: 33B total parameters, 3B active, according to the same launch summary and OpenRouter's XS.2 post.
- Attention design: Aymeric Roucher's post says Poolside uses hybrid attention, mixing global and sliding-window attention in a 3:1 ratio.
- KV cache: the same summary says the KV cache is quantized in FP8.
- Context: the vLLM recipe screenshot labels XS.2 as a 128K-context model.
The interesting part is not just that Poolside finally shipped weights. It shipped small-active MoE coding models with enough surrounding detail to drop into existing inference stacks instead of living as a research artifact.
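The 3:1 hybrid-attention detail is the one architectural claim concrete enough to sketch. Read literally from the launch summary (which does not say whether the "3" is the global or the sliding-window side), a layer schedule could look like the following; the layer count and window size are invented for illustration, and none of this is Poolside's actual code.

```python
# Illustrative 3:1 hybrid-attention layer schedule, taking the launch
# summary's wording at face value: three global layers for every
# sliding-window layer. Flip PATTERN if Poolside means the reverse.
N_LAYERS = 32            # invented for illustration
SLIDING_WINDOW = 4096    # invented for illustration
PATTERN = ["global", "global", "global", "sliding"]  # the 3:1 mix

layer_types = [PATTERN[i % len(PATTERN)] for i in range(N_LAYERS)]

def attention_window(layer_idx: int) -> int | None:
    """None means full global attention; otherwise the window size."""
    return None if layer_types[layer_idx] == "global" else SLIDING_WINDOW

print(layer_types[:8])  # ['global', 'global', 'global', 'sliding', ...]
```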
Benchmark chart
The public chart in Aymeric Roucher's post compares Laguna XS.2 against Devstral Small 2, Gemma 4, Qwen 3.5, Qwen 3.6, Claude Haiku 4.5, and GPT-5.4 Nano across four coding-heavy evaluations.
From the screenshot:
- SWE-bench Verified: XS.2 lands around the high 60s, close to Devstral Small 2 and a few points behind Qwen 3.5 and Qwen 3.6.
- SWE-bench Multilingual: XS.2 appears in the low 60s, with Qwen 3.6 higher.
- SWE-bench Pro: the pack compresses more tightly, with XS.2 in the middle of the compared group.
- Terminal-Bench 2.0: XS.2 is visibly lower than Qwen 3.6 and GPT-5.4 Nano.
So the launch claim is not blanket SOTA across every chart. The stronger reading from the screenshot is that XS.2 enters the real open coding-model set on SWE-bench style tasks, while terminal-style agent evaluation still shows more separation.
vLLM and local serving
The fastest path from weights to usable infra showed up in the serving ecosystem, not in a long official blog post.
According to the vLLM project's recipe, XS.2 already has:
- a dedicated Docker image, vllm/vllm-openai:laguna
- support in vLLM nightly
- the --enable-auto-tool-choice flag
- a poolside_v1 tool-call parser
- a poolside_v1 reasoning parser
- a verified H200 serving recipe
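To see what the parser flags buy you in practice, here is a sketch of a tool-call round trip against a locally served XS.2. The serve command in the comment paraphrases the recipe's flags rather than quoting it, and the model repo id is a placeholder; the verified invocation is in the vLLM recipe itself.

```python
# Sketch of tool calling against a local vLLM server started roughly as:
#   vllm serve <poolside-laguna-xs-2-repo> \
#     --enable-auto-tool-choice --tool-call-parser poolside_v1 \
#     --reasoning-parser poolside_v1
# The repo id and model name below are placeholders, not confirmed names.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return the output.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

resp = client.chat.completions.create(
    model="poolside/laguna-xs-2",  # placeholder; use the name you served
    messages=[{"role": "user", "content": "Run the tests under tests/unit."}],
    tools=tools,
)
# With --enable-auto-tool-choice, the poolside_v1 parser turns the model's
# tool-call tokens into structured tool_calls on the response message.
print(resp.choices[0].message.tool_calls)
```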
That makes the Poolside release more than a Hugging Face curiosity. Between Ollama's local packaging and vLLM's parser support, XS.2 arrived with both a laptop-friendly entry point and a conventional OpenAI-compatible serving lane.