release | SECONDARY | 2026-03-29
Mistral releases Voxtral TTS with 3-second cloning and 68.4% win rate
Mistral's Voxtral TTS splits speech into semantic and acoustic tokens, uses a low-bitrate codec, and claims a 68.4% win rate over ElevenLabs Flash v2.5 on voice cloning from roughly 3 to 25 seconds of reference audio. The architecture targets multilingual cloning and higher-quality speech without a fully autoregressive audio stack, so voice teams should benchmark it against their current TTS pipelines.
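For teams running that comparison, a pairwise win rate like the 68.4% figure can be computed from side-by-side listener votes. The snippet below is a minimal sketch, not Mistral's evaluation protocol: the function name `win_rate`, the labels `"candidate"`, `"baseline"`, and `"tie"`, the choice to count ties as half a win, and the vote data are all assumptions made for illustration.

```python
from collections import Counter


def win_rate(judgments, candidate="candidate", count_ties_as_half=True):
    """Return the fraction of pairwise comparisons won by `candidate`.

    `judgments` is an iterable of labels, one per listener comparison:
    "candidate", "baseline", or "tie" (hypothetical labels).
    """
    counts = Counter(judgments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no judgments supplied")
    wins = counts[candidate]
    if count_ties_as_half:
        # Assumption: a tie counts as half a win for each system.
        wins += 0.5 * counts["tie"]
    return wins / total


# Hypothetical votes from 20 side-by-side listening comparisons.
votes = ["candidate"] * 13 + ["baseline"] * 6 + ["tie"] * 1
print(f"{win_rate(votes):.1%}")  # 67.5%
```

How ties are scored changes the headline number (here 67.5% with ties as half a win, 65.0% with ties counted as losses), so it is worth fixing that rule before comparing results against a published figure.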