MiMo-V2.5 opens under MIT with 1M context and SGLang vLLM support
Xiaomi opened MiMo-V2.5 and MiMo-V2.5-Pro under MIT: a 1M-context multimodal model and a 42B-active, agent-oriented Pro variant. SGLang and vLLM published day-one recipes, making the series immediately deployable.

TL;DR
- Xiaomi opened the MiMo-V2.5 series under MIT, with huggingface's repost of XiaomiMiMo's announcement pointing to commercial deployment and continued training rights, and _akhaliq's Hugging Face post linking to the public collection.
- The split is unusually clean: lmsysorg's SGLang support thread describes MiMo-V2.5-Pro as a 1.02T-parameter MoE with 42B active parameters, hybrid attention, and up to 1M context, while the same post says MiMo-V2.5 is a multimodal 310B-parameter MoE with 15B active parameters and the same 1M window.
- Serving landed immediately in the two stacks most infra teams already care about, with the SGLang thread announcing day-0 support in SGLang and sglang-jax, and vllm_project's day-0 support post shipping a vLLM recipe for MiMo-V2.5-Pro.
- Xiaomi is pitching Pro as an agent model, and vllm_project's support post highlights 1000-plus tool calls, ultra-long context coherence, and frontier coding, while _akhaliq's benchmark image places MiMo-V2.5 and MiMo-V2.5-Pro near the upper-left of a Pass^3 vs. token-efficiency chart.
- Day one was not fully uniform: lmsysorg's follow-up update says MiMo-V2.5 omni support was still being finalized because the upstream weights were being updated.
You can browse the Hugging Face collection, jump straight to the SGLang cookbook, and inspect the sglang-jax usage page. vLLM also published its own MiMo-V2.5-Pro recipe for a same-day path to serving.
MiMo-V2.5 under MIT
The headline is not just another weights drop. huggingface's repost of XiaomiMiMo's announcement says the series is under MIT, which is the part infra teams will remember because it explicitly permits commercial deployment and continued training.
The public collection linked by _akhaliq's post puts both models in one place on Hugging Face. That makes the release feel more like a real open deployment target than a paper-adjacent checkpoint dump.
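For teams that want to treat it that way, the first step is just pulling the weights. The sketch below uses huggingface_hub's snapshot_download; the repo id is an assumption based on the XiaomiMiMo org name, so check the linked collection for the exact model ids before running it.

```python
# Minimal sketch: pulling the open weights locally with huggingface_hub.
# The repo id is an assumption based on the XiaomiMiMo org name; the Pro
# variant lives in a separate repo. Verify ids against the collection.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="XiaomiMiMo/MiMo-V2.5",  # assumed id
    local_dir="./mimo-v2.5",
)
print(f"weights downloaded to {local_dir}")
```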
Two models, two jobs
lmsysorg's launch thread lays out the product split in one pass:
- MiMo-V2.5-Pro: 1.02T total parameters, 42B active, hybrid attention, up to 1M context.
- MiMo-V2.5: multimodal across text, image, video, and audio, 310B total parameters, 15B active, up to 1M context.
- Common framing: both shipped into the inference ecosystem immediately, not as a later follow-up.
That split matters because Xiaomi did not collapse everything into one flagship SKU. The Pro model is framed around long-horizon tool use and coding, while the base MiMo-V2.5 entry is the wider multimodal model.
Day-one serving in SGLang and vLLM
The fastest signal that a model is likely to get real usage is whether inference stacks pick it up immediately. lmsysorg's thread says SGLang support was live on day zero, including sglang-jax, and links both a cookbook and a dedicated usage page.
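Once a server is up per the cookbook, querying it looks like any other OpenAI-compatible endpoint. This is a minimal sketch, not the cookbook's own example: port 30000 is SGLang's default, and the served model name is an assumption.

```python
# Minimal sketch: querying a locally launched SGLang server through its
# OpenAI-compatible endpoint. Port 30000 is SGLang's default; the model id
# is an assumption -- follow the linked cookbook for the actual launch flags.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="XiaomiMiMo/MiMo-V2.5-Pro",  # assumed served model name
    messages=[{"role": "user", "content": "Summarize this repo's build system."}],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```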
vllm_project's post adds the other half of the deployment story. Its recipe screenshot shows MiMo-V2.5-Pro running in vllm-openai, with --tool-call-parser mimo, tensor parallel size 8, and an MTP speculative decoding config.
The same vLLM post also surfaces the model card details most likely to matter in practice:
- Native FP8 weights
- Hybrid attention
- 1,048,576-token context
- Text-only serving path for Pro in the published recipe
- B200, TP=8, FP8 as the illustrated environment
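The published recipe targets the vllm-openai server, but a rough offline-engine equivalent is easy to sketch for local experimentation. The tensor-parallel size mirrors the TP=8 setting in the recipe; the model id is an assumption, and the MTP speculative decoding and tool-call parser options are server-side flags omitted here.

```python
# Rough offline-engine sketch mirroring the published server recipe's TP=8.
# Model id is an assumption; weights are described as native FP8, so no
# extra quantization flag is passed. Spec-decoding and tool-call parsing
# from the recipe are server-side options not shown here.
from vllm import LLM, SamplingParams

llm = LLM(
    model="XiaomiMiMo/MiMo-V2.5-Pro",  # assumed repo id
    tensor_parallel_size=8,
)

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(["Write a unit test for a ring buffer."], params)
print(outputs[0].outputs[0].text)
```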
Agent and coding positioning
Xiaomi's collaborators are not pitching Pro as a generic chat model. vllm_project's day-0 support post describes MiMo-V2.5-Pro as an agent-oriented model aimed at long-horizon tool use and frontier coding.
The claims called out in that post are specific enough to treat as the intended workload profile:
- Long-horizon task execution across 1000-plus tool calls
- Stronger instruction following in agentic harnesses
- Coherence across ultra-long contexts
- Frontier-tier complex software engineering
That is a pretty distinct target compared with the broader multimodal MiMo-V2.5 model described by lmsysorg's launch thread.
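To make the tool-call framing concrete, a minimal agent loop against an OpenAI-compatible endpoint looks like the sketch below. The endpoint, served model name, and the single tool are illustrative assumptions, not part of Xiaomi's or vLLM's published material; real harnesses run far longer horizons than the ten-step cap shown.

```python
# Illustrative agent loop: the model is offered one hypothetical tool and the
# loop keeps executing tool calls until the model replies in plain text.
# Endpoint, model id, and the tool itself are assumptions for this sketch.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "read_file",  # hypothetical tool
        "description": "Read a file from the workspace",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

messages = [{"role": "user", "content": "Find the version string in setup.py."}]

for _ in range(10):  # real agentic runs go much longer
    resp = client.chat.completions.create(
        model="XiaomiMiMo/MiMo-V2.5-Pro",  # assumed served model name
        messages=messages,
        tools=tools,
    )
    msg = resp.choices[0].message
    if not msg.tool_calls:
        print(msg.content)
        break
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        result = open(args["path"]).read()  # stand-in tool execution
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```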
ClawEval placement
The attached chart in _akhaliq's post plots Pass^3 against average tokens per trajectory on ClawEval, with upper-left marked as better. MiMo-V2.5-Pro sits in that favored corner, and MiMo-V2.5 also lands near the top cluster alongside GLM 5.1, GLM 5 Turbo, and Gemini 3 Flash.
The interesting part is the shape of the claim, not just the rank. Xiaomi is pushing a joint story about capability and token efficiency, rather than raw pass rate alone.
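The posts do not spell out how ClawEval computes Pass^3, but if it follows the usual pass^k convention from agentic benchmarks (a task counts only when all k sampled trajectories succeed), the y-axis rewards consistency rather than a single lucky pass:

```latex
% Assumption: the standard pass^k convention, where p_t is task t's
% per-attempt success probability and k = 3 in the chart.
\mathrm{Pass}^{k} \;=\; \mathbb{E}_{t \sim \text{tasks}}\!\left[\, p_t^{\,k} \,\right]
```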
Omni support was still catching up
The clean launch story had one visible wrinkle. lmsysorg's follow-up update says MiMo-V2.5 omni support was still being finalized in SGLang because the upstream weights were being updated.
So the day-one deployment path looked strongest for Pro, while the broader multimodal path was still settling even as the open release and serving integrations went live.