Mistral
Mistral model family.
Stories
Mistral shipped Medium 3.5 as a 128B dense model with 256K context, configurable reasoning, remote agents in Vibe, and Work Mode in Le Chat. The release broadens Mistral’s agent stack, though early comparisons question its price-performance against newer open rivals.
Voxtral TTS uses separate semantic and acoustic token models, a 2.14 kbps codec, and 3-25 second reference audio for cloning across nine languages. Try it if you want a hybrid speech pipeline with more control and faster acoustic synthesis than all-autoregressive generation.
Mistral released open-weight Voxtral TTS with low-latency streaming, voice cloning, and cross-lingual adaptation, and vLLM Omni shipped day-0 support. Voice-agent teams should compare quality, latency, and serving cost against closed APIs.
Mistral Small 4 combines reasoning and non-reasoning modes in one 119B MoE, adds native image input, and expands context to 256K at $0.15/$0.60 per million input/output tokens. It improves sharply over Small 3.2, but still trails similarly sized open peers on several evals.
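At those listed rates, a quick back-of-the-envelope estimate shows what a workload would cost; the request volume and token counts below are illustrative assumptions, not figures from the release.

```python
# Rates as listed: $0.15 per million input tokens, $0.60 per million output tokens.
INPUT_RATE = 0.15 / 1_000_000   # USD per input token
OUTPUT_RATE = 0.60 / 1_000_000  # USD per output token

def monthly_cost(requests_per_day, input_tokens, output_tokens, days=30):
    """Estimated monthly spend for a fixed-shape workload (illustrative)."""
    per_request = input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE
    return requests_per_day * per_request * days

# Example: 10,000 requests/day, 2K-token prompts, 500-token replies
# works out to roughly $180/month at these rates.
print(round(monthly_cost(10_000, 2_000, 500), 2))
```

Swapping in your own prompt and completion lengths is usually enough to decide whether the price-performance question raised above matters for your workload.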
Mistral introduced Forge, a platform for enterprises to pre-train, post-train, and reinforce models on internal code, policies, and operational data, including on-prem deployments. Consider it when retrieval alone is not enough and you need weights tuned to private workflows.
Mistral shipped Mistral Small 4, a 119B MoE model with 6.5B active parameters, multimodal input, configurable reasoning, and Apache 2.0 weights. Deploy it quickly in existing stacks if you use SGLang or vLLM, which added day-one support.
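Since vLLM exposes an OpenAI-compatible HTTP server, calling a locally served model reduces to posting a standard chat-completions payload. A minimal sketch, assuming a local `vllm serve` endpoint on port 8000 and a hypothetical model identifier (check the actual Hugging Face repo name before deploying):

```python
import json

MODEL_ID = "mistralai/Mistral-Small-4"  # assumed identifier, verify the real repo name
VLLM_URL = "http://localhost:8000/v1/chat/completions"  # default vllm serve address

def build_chat_request(prompt, max_tokens=256, temperature=0.2):
    """Build an OpenAI-compatible chat payload for a vLLM server."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

# POST this JSON body to VLLM_URL with any HTTP client to get a completion.
payload = build_chat_request("Summarize this changelog in one sentence.")
print(json.dumps(payload, indent=2))
```

Because the wire format matches OpenAI's, existing client libraries can be pointed at the vLLM base URL without code changes beyond the endpoint and model name.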