Release · March 26, 2026

KittenTTS releases 25MB nano model for CPU text-to-speech

KittenTTS now offers nano, micro and mini text-to-speech models, with the smallest int8 build under 25MB and built for ONNX CPU inference. Creators can run local voice tools without a cloud round trip.

TL;DR

KittenTTS has released three local text-to-speech models (nano, micro, and mini), with the smallest int8 nano build coming in under 25MB for ONNX-based CPU inference, according to the GitHub page.
The stack is aimed at fully local voice generation: the project page says it runs without a GPU, ships with eight built-in voices, supports speed control, and outputs 24 kHz audio.
For creators, the practical hook is lightweight offline voice work for prototypes, tools, and embedded experiences; the HN launch post frames the release around compact multi-voice, expressive speech synthesis.
The main caveat is that small model size does not automatically mean friction-free deployment: the discussion roundup highlights questions about dependency bloat, latency, streaming, and expressive control.

What shipped

Hacker News: KittenML/KittenTTS: State-of-the-art TTS model under 25MB (560 upvotes · 182 comments)

KittenTTS v0.8.1 packages three model sizes: nano at 15M parameters, micro at 40M, and mini at 80M, with the nano model quantized to roughly 25MB in int8 form, according to the project page. The library is open source, built on ONNX, installable as a Python package, and positioned for CPU-first use rather than a cloud API round trip, per the project details.
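
The link between parameter count, quantization, and file size can be sanity-checked with simple arithmetic. The sketch below is an illustration, not the project's tooling: it multiplies the quoted parameter counts by an assumed bit width to get raw weight storage, which is a floor for file size rather than a prediction. The helper name `weight_megabytes` and the assumption that every weight is stored at the same bit width are ours.

```python
# Back-of-envelope weight storage for the three quoted model sizes.
# Assumption: every parameter is stored at the same bit width, and nothing
# else in the file (graph structure, voice data, metadata) is counted.

PARAMS = {"nano": 15_000_000, "micro": 40_000_000, "mini": 80_000_000}


def weight_megabytes(param_count: int, bits_per_weight: int) -> float:
    """Raw weight storage in decimal megabytes (1 MB = 1,000,000 bytes)."""
    return param_count * bits_per_weight / 8 / 1_000_000


if __name__ == "__main__":
    for name, count in PARAMS.items():
        int8 = weight_megabytes(count, 8)
        fp32 = weight_megabytes(count, 32)
        print(f"{name}: {int8:.0f} MB at int8, {fp32:.0f} MB at float32")
```

Under these assumptions the nano weights alone come to about 15 MB at int8, so the quoted figure of roughly 25MB leaves room for whatever else the published file contains.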

For creative workflows, the concrete features are simple but useful: eight built-in voices, adjustable speed, text preprocessing, and 24 kHz output, according to the launch thread. That makes it more relevant for local narration, character placeholders, interactive installations, and quick voice mockups than for fully directed studio voice performance.
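
For anyone wiring 24 kHz output into a local tool, the sample rate has to travel with the audio when it is saved. The sketch below uses only the Python standard library to encode float samples as a mono 16-bit WAV at 24 kHz. It does not call KittenTTS: `placeholder_tone` is a hypothetical stand-in for whatever array a synthesizer would return, and `to_wav_bytes` is our own helper name.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 24_000  # Hz, the output rate quoted for KittenTTS


def placeholder_tone(seconds: float, freq: float = 220.0) -> list:
    """Stand-in for synthesized speech: a quiet sine tone in [-1.0, 1.0]."""
    n = int(seconds * SAMPLE_RATE)
    return [0.2 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


def to_wav_bytes(samples, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Encode float samples as a mono 16-bit PCM WAV file held in memory."""
    pcm = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        pcm += struct.pack("<h", int(clipped * 32767))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)           # mono
        w.setsampwidth(2)           # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(bytes(pcm))
    return buf.getvalue()
```

Writing the returned bytes to a file opened in binary mode gives a WAV that players will read at the intended rate. Declaring a different rate in the header would change both pitch and duration on playback.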

Where the creative limits are

Hacker News: Discussion around Show HN: Three new Kitten TTS models – smallest less than 25MB (560 upvotes · 182 comments)

The early discussion is less about whether 25MB is impressive and more about what happens after install. In the thread summary, commenters say dependency chains can pull in far larger packages than the headline model size suggests, which undercuts the appeal for edge setups.
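
One way to check the dependency concern on a given machine is to total the on-disk size of each installed package, not only the model file. The sketch below uses the standard library's `importlib.metadata` to sum the files recorded for a distribution. The package names in the example list are assumptions for illustration and should be replaced with whatever `pip list` shows in the environment. The function does not follow transitive dependencies, so each one has to be named explicitly.

```python
import os
from importlib import metadata
from typing import Optional


def installed_size_bytes(dist_name: str) -> Optional[int]:
    """Sum the on-disk size of files recorded for an installed distribution.

    Returns None when the distribution is not installed.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    total = 0
    for entry in dist.files or []:  # files is None when no file record exists
        try:
            total += os.path.getsize(entry.locate())
        except OSError:
            pass  # recorded file is missing on disk; skip it
    return total


def report(dist_names):
    """Return one human-readable line per distribution name."""
    lines = []
    for name in dist_names:
        size = installed_size_bytes(name)
        label = "not installed" if size is None else f"{size / 1_000_000:.1f} MB"
        lines.append(f"{name}: {label}")
    return lines


if __name__ == "__main__":
    # Hypothetical list: substitute the names from your own environment.
    for line in report(["kittentts", "onnxruntime", "numpy"]):
        print(line)
```

Comparing the totals against the model file size gives a rough picture of how much of the install footprint is the model and how much is the surrounding stack.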

The other open questions are real-time behavior and control. Commenters ask about first-chunk latency, streaming output, Raspberry Pi performance, and whether creators get finer expressive controls such as pitch, volume, or explicit style tags.
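
The latency questions can be made concrete with two numbers: time to first chunk, and real-time factor (processing time divided by audio duration, where a value below 1 means faster than real time). The sketch below measures both for any iterable of audio chunks. Whether KittenTTS exposes a streaming interface is exactly what commenters are asking, so `fake_stream` here is a hypothetical stand-in that sleeps and yields silence. It is not a library call.

```python
import time
from typing import Dict, Iterable, Iterator, List, Optional, Sequence

SAMPLE_RATE = 24_000  # Hz, matching the quoted output rate


def fake_stream(chunks: int = 3, chunk_samples: int = 2_400,
                delay_s: float = 0.01) -> Iterator[List[float]]:
    """Hypothetical stand-in for a streaming synthesizer."""
    for _ in range(chunks):
        time.sleep(delay_s)          # pretend to run inference
        yield [0.0] * chunk_samples  # 0.1 s of silence at 24 kHz


def measure_stream(stream: Iterable[Sequence[float]],
                   sample_rate: int = SAMPLE_RATE) -> Dict[str, Optional[float]]:
    """Time a chunked audio source from first request to last chunk."""
    start = time.perf_counter()
    first_chunk_s = None
    total_samples = 0
    for chunk in stream:
        if first_chunk_s is None:
            first_chunk_s = time.perf_counter() - start
        total_samples += len(chunk)
    total_s = time.perf_counter() - start
    audio_s = total_samples / sample_rate
    return {
        "first_chunk_s": first_chunk_s,
        "total_s": total_s,
        "audio_s": audio_s,
        "real_time_factor": total_s / audio_s if audio_s > 0 else None,
    }


if __name__ == "__main__":
    print(measure_stream(fake_stream()))
```

A non-streaming synthesizer can be measured the same way by wrapping its single result in a one-element list, in which case time to first chunk equals total time.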
