An AI audio platform for generating, cloning, and transforming speech and other voice content.

Recent stories
A sponsored explainer thread described Speech Engine as a WebSocket layer that adds speech-to-text, turn detection, interruption handling, and text-to-speech to existing LLM agents. The pitch is that teams can keep their current model stack and add voice without rebuilding the whole agent.
A community post spotlights OmniVoice Studio, an open-source local dubbing pipeline that transcribes, translates, clones a voice from a 3-second sample, and remixes the dubbed audio back into the video. Running locally keeps voice data on device and removes subscription costs, so it may fit privacy-sensitive dubbing workflows.
Posts show an open-source toolkit that turns one reference image into an interactive 3D scene with generated meshes, lighting, physics, and sound. The demo stack chains World Labs, Hunyuan 3D, ElevenLabs, and fal rather than a single native model.
Apocalypse Drone added 128 AI players, squad leader reassignment, and ElevenLabs radio chatter with location callouts in weekend dev updates. It matters for solo game builders because the project simulates large-team coordination and voice comms on a lightweight stack rather than a larger live-ops setup.
OpenBMB released VoxCPM on GitHub with text-described voice design, 3-second cloning, 48 kHz audio, and 30-language support. The Apache 2.0 license lowers the cost of multilingual voice work and local self-hosting.
ElevenLabs launched Flows, a node-based canvas inside ElevenCreative that chains image, video, voice, music, SFX, lip sync, and voice changing in one workspace. Use it to keep context across the pipeline instead of re-exporting between apps.