releaseJune 18, 2026

Poolside releases Laguna M.1 open weights with 225B MoE and 256K context

Poolside released Apache 2.0 weights for Laguna M.1 and XS.2, its long-horizon coding models, with M.1 shipping at 225B total parameters, 23B active, and 256K context. SGLang and vLLM support on day one lets teams run and fine-tune the models in existing agent stacks immediately.

4 min read

Poolside releases Laguna M.1 open weights with 225B MoE and 256K context

TL;DR

poolsideai's release post says Poolside has open-sourced Laguna M.1 under Apache 2.0, with both base and post-trained checkpoints, a 256K context window, and Hugging Face distribution.
According to Cedric Chee's release summary, M.1 ships as a 225.8B-parameter MoE with 23.4B active parameters, while XS.2 lands at 33.4B total and 3B active.
lmsysorg's SGLang announcement and vllm_project's vLLM announcement put day-one support into the two inference stacks most teams already use for self-hosted model serving.
poolsideai's availability post says M.1 and XS.2 stay free on Poolside's API and OpenRouter, while dedicated paid OpenRouter endpoints are launching for heavier workloads.
poolsideai's MLX post claims someone already got a 3-bit Apple Silicon build running locally at about 26 tok/s with roughly 100 GB peak memory on an M3 Max.

You can pull the official blog post, read the technical report, and even try the new pool agent harness that Poolside says doubles as both an ACP server and client. SGLang and vLLM also surfaced more architectural detail than the launch tweet itself, including 70 layers, 256 experts, top-k=16 routing, and interleaved reasoning between tool calls.

Open weights and model sizes

The headline is simple: Poolside moved its coding model from hosted access to open weights. According to poolsideai's launch post, both the base and post-trained M.1 checkpoints are now on Hugging Face under Apache 2.0.

The model card details surfaced fastest through ecosystem posts. Cedric Chee's summary sizes M.1 at 225.8B total parameters with 23.4B activated, and XS.2 at 33.4B total with 3B activated.

Poolside also framed this as a default policy shift. In a follow-up availability post, the company said open weights are now its default and linked both the model release and API access.

Day-one inference support

The practical story is the day-zero runtime support. lmsysorg's post says Laguna M.1 is live in SGLang, while vllm_project's post says support shipped in vLLM v0.21.0.

Those posts also expose the architecture in scan-friendly form:

70 layers total, according to SGLang's architecture list
3 dense SwiGLU layers plus 67 sparse MoE layers, per the same SGLang post
256 experts with top-k=16 routing, according to SGLang and vLLM
256K context, per vLLM's support note
Native interleaved reasoning between tool calls, toggleable per request, according to SGLang and vLLM

That means self-hosting teams do not need to wait for custom integration work before benchmarking or slotting M.1 into an existing agent stack.

Pool harness and OpenRouter access

Poolside paired the weights with a lightweight agent harness called pool. In its product post, the company says pool works as both an ACP server and client, which makes the release feel more like a runnable stack than a bare checkpoint dump.

Access now spans three surfaces:

Free use on Poolside's API, according to poolsideai's availability post
Free use through OpenRouter, also according to that same post
Dedicated paid OpenRouter endpoints for higher-demand work, per poolsideai

Poolside also says M.1 has already been used on OpenRouter since April through coding agents including Kilo Code and Hermes Agent, according to poolsideai's OpenRouter thread. Kilo Code's reply confirmed it had been one of those agent surfaces and highlighted the Apache 2.0 license as the main unlock for verification and extension.

A Mac run showed the deployment floor

One of the more useful early data points came after release, when poolsideai's follow-up said a community member produced a 3-bit MLX build that ran locally on Apple Silicon.

The reported setup was specific enough to bookmark:

Around 26 tok/s, per Poolside's post
Roughly 100 GB peak memory, per the same post
M3 Max with 128 GB unified memory, also per Poolside

That does not make M.1 a casual laptop model. It does show the open release immediately triggered quantization and local-run experiments, which is usually the first real sign that a weights drop will get used rather than merely announced.

TL;DR

Open weights and model sizes

Day-one inference support

Pool harness and OpenRouter access

A Mac run showed the deployment floor

Discussion across the web