Voice AI
Stories, products, and related signals connected to this tag in Explore.
Stories
A sponsored explainer thread described Speech Engine as a WebSocket layer that adds speech-to-text, turn detection, interruption handling, and text-to-speech to existing LLM agents. The pitch is that teams can keep their current model stack and add voice without rebuilding the whole agent.
Supertone open-sourced Supertonic, a local TTS engine that runs faster than real time on phone CPUs with ONNX models and cross-language runtimes. Voice apps and audiobook workflows can use it to avoid per-character API billing and keep audio generation private.
xAI rolled out Grok Voice Think Fast 1.0 with ready-made tool schemas for medical offices, restaurants, help desks, real estate, appointments, and hotel concierge tasks. The release lowers setup work because common service actions arrive pre-wired as callable tools.
OpenBMB released VoxCPM on GitHub with text-described voice design, 3-second cloning, 48kHz audio, and 30-language support. Because the release is under Apache 2.0, which permits commercial use, multilingual voice work and local self-hosting become cheaper.