The most complete AI hub: fresh stories, workflows, prompts, and deals. Updated daily.
KittenTTS now offers nano, micro and mini text-to-speech models, with the smallest int8 build under 25MB and built for ONNX CPU inference. Creators can run local voice tools without a cloud round trip.


Google says its new realtime voice model improves noisy-environment understanding, long conversations and function calling, and it's rolling out to Gemini Live, Search Live and AI Studio. Voice creators can test it for lower-latency spoken interactions.

Smallest says Lightning V3.1 can clone a voice from about 10 seconds of audio with 44.1kHz output, sub-100ms latency and 50-plus languages on Waves. Test it for multilingual narration and dubbing, but get explicit permission before cloning any voice.

Google is rolling out Lyria 3 Pro for full songs and Lyria 3 Clip for 30-second generations in the Gemini API and AI Studio. Musicians can now map intros, verses, choruses and bridges instead of stitching short music clips together.

KittenTTS 0.8 ships new 15M, 40M and 80M models, including an int8 nano model around 25MB that runs on CPU without GPU. It is a fit for narration, character voices and lightweight assistants that need offline or edge-friendly speech.
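The size figure is consistent with simple weight arithmetic: an int8 model stores roughly one byte per parameter, so a 15M-parameter nano model is about 15 MB of raw weights before tokenizer and graph metadata. A minimal sketch (the parameter counts come from the post; the overhead term is an assumption for illustration):

```python
def approx_model_size_mb(params_millions: float, bytes_per_weight: int, overhead_mb: float = 0.0) -> float:
    """Rough on-disk size: parameters * bytes per weight, plus format overhead (hypothetical)."""
    return params_millions * 1e6 * bytes_per_weight / 1e6 + overhead_mb

# int8 = 1 byte per weight; float32 would be 4 bytes per weight.
print(approx_model_size_mb(15, 1))  # 15.0 MB of raw int8 weights for the 15M model
print(approx_model_size_mb(80, 1))  # 80.0 MB for the 80M model
```

The same 15M model in float32 would be around 60 MB, which is why int8 quantization is what keeps the nano build small enough for edge and offline use.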

A new shared Space shows how to build a music video inside Freepik using Nano Banana shot grids, OmniHuman or Veed Fabric for lipsync, and Kling 3.0 for motion. The whole pipeline lives in one reusable workflow instead of being scattered across separate tutorials and tools.
A Freepik Spaces workflow now uses Nano Banana 2 for stills, Veed Fabric for closeup lipsync, OmniHuman for directed performance, and Kling 3.0 for motion clips. Split one music video into model-specific stages instead of forcing a single tool to handle everything.