Together AI launched a single-cloud stack for real-time voice agents that hosts Deepgram, Cartesia, MiniMax, and other voice components on one platform. Use it to cut latency and deployment overhead if you want one billing surface for production voice apps.

Together's launch is a unified runtime for real-time voice agents: speech-to-text, LLM inference, and text-to-speech run on one cloud instead of hopping across separate vendors. In the announcement thread, the company says the practical change for builders is co-location, model swapping across the stack, and one surface for billing, deployment, and access.
The first-party and partner lineup is broader than a single STT/TTS pair. Together's [img:1|Voice stack diagram] shows Cartesia, MiniMax, Rime, Deepgram, Whisper, Voxtral, Kokoro, and Orpheus connected to the same "AI native cloud for voice," while Cartesia's post says Cartesia is now a dedicated model partner and the Deepgram note confirms Deepgram STT is hosted natively on Together infrastructure.
The engineering pitch is fewer network boundaries. Together's blog post says most current voice systems are "stitched together across vendors," which adds latency and operational overhead as audio and tokens move over the internet between STT, LLM, and TTS services. Its replacement is a modular but co-located stack, and the company says that gets end-to-end latency below 700 ms for live conversations.
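To see why co-location moves the needle, here is a rough latency-budget sketch. All per-stage compute times and round-trip numbers below are illustrative assumptions, not Together's published figures; the point is only that each cross-vendor hop pays WAN round-trip time, while co-located services pay datacenter-network time.

```python
# Illustrative latency budget for one voice-agent turn (STT -> LLM -> TTS).
# Every number here is an assumption for illustration, not a measured figure.

MODEL_COMPUTE_MS = {"stt": 150, "llm": 250, "tts": 150}  # assumed per-stage compute

WAN_RTT_MS = 60        # assumed round trip to a separate vendor over the internet
DATACENTER_RTT_MS = 1  # assumed round trip between co-located services

def turn_latency(rtt_ms: float) -> float:
    """Total turn latency: each stage pays its compute plus one network hop."""
    return sum(MODEL_COMPUTE_MS.values()) + rtt_ms * len(MODEL_COMPUTE_MS)

cross_vendor = turn_latency(WAN_RTT_MS)       # 550 + 3 * 60 = 730 ms
co_located = turn_latency(DATACENTER_RTT_MS)  # 550 + 3 * 1  = 553 ms

print(f"cross-vendor: {cross_vendor:.0f} ms, co-located: {co_located:.0f} ms")
```

Under these assumed numbers, the same models cross the 700 ms line when stitched across vendors but land under it when co-located, which is the shape of the trade-off the blog post is describing.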
That matters operationally as much as interactively. The same product post says the platform exposes unified API access, security controls including zero data retention and SOC 2 Type II support, and deployment options aimed at enterprise voice workloads. Meanwhile, MiniMax's update shows Together is treating the stack as a multi-model platform rather than a fixed pipeline: MiniMax Speech 2.6 Turbo has already been added alongside Deepgram and Cartesia, which makes the "swap models" claim more concrete.
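The "swap models" claim is architectural: if STT, LLM, and TTS all live behind one platform, replacing a component is a configuration change rather than a new vendor integration. A minimal sketch of that idea follows; the config shape, function names, and model identifiers are hypothetical, not Together's actual SDK.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class VoicePipelineConfig:
    """Hypothetical pipeline config: each stage is just a model identifier."""
    stt_model: str
    llm_model: str
    tts_model: str

def describe(cfg: VoicePipelineConfig) -> str:
    """Render the pipeline as a readable stage chain."""
    return f"{cfg.stt_model} -> {cfg.llm_model} -> {cfg.tts_model}"

# A starting pipeline with Deepgram STT and Cartesia TTS (IDs illustrative).
base = VoicePipelineConfig(
    stt_model="deepgram-nova",
    llm_model="llama-3.3-70b",
    tts_model="cartesia-sonic",
)

# Swapping TTS to MiniMax Speech 2.6 Turbo is a one-field change; the
# orchestration code around the config does not change at all.
swapped = replace(base, tts_model="minimax-speech-2.6-turbo")

print(describe(base))
print(describe(swapped))
```

In a multi-vendor setup, the same swap would mean new credentials, a new client library, and a new billing relationship; collapsing it to a field edit is the concrete meaning of "one surface for billing, deployment, and access."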
Today, Together AI is launching a unified solution for building real-time voice agents with the entire pipeline running on one cloud. AI natives can now deploy voice apps for every use case at production scale.
The world’s leading AI infrastructure platforms are converging on the same voice model 🔥 Excited to announce that Cartesia is now a dedicated model partner on @togethercompute's Voice Platform for the 450K+ teams and developers building on Together.
Most voice stacks today are stitched together across vendors. Together puts the whole pipeline in one place for natural, real-time conversation. Here is how it works: together.ai/blog/build-rea…
Real-time voice agents are getting fast enough to feel conversational🎙️ MiniMax Speech 2.6 Turbo is now part of the voice stack on @togethercompute