Vercel adds useRealtime, generateSpeech, and transcribe to AI Gateway
Vercel shipped realtime speech and transcription support in AI Gateway and AI SDK 7, then added Grok voice models through the same interface. The update puts voice agents on the same gateway, WebSocket, and AI SDK stack Vercel already uses for text models.

TL;DR
- Vercel shipped realtime voice primitives into the same stack as its text gateway, with vercel's launch post saying AI Gateway now supports useRealtime, generateSpeech, and transcribe in AI SDK 7.
- The first-party workflow is already wired into a starter path, because vercel's build-your-first-voice-agent post points straight to a voice agent guide.
- According to cramforce's note on platform timing, last week's WebSocket support on Vercel directly enabled this week's realtime model rollout in AI Gateway.
- Vercel did not stop at generic realtime support, because vercel_dev's Grok model post added xAI voice, TTS, and STT model slugs through the same interface a few hours later.
- The gateway had already been widening its model roster before the voice ship, with Sakana AI Labs' Fugu-Ultra post showing Fugu-Ultra on AI Gateway earlier in the month.
You can build your first voice agent through Vercel's linked guide, watch vercel's launch demo card, and see Grok voice model IDs land on the same gateway surface. rauchg's repost compressed the pitch to four words, while cramforce's follow-up exposed the more interesting implementation detail: this shipped immediately after platform WebSocket support.
Voice agents
The notable part is not just that Vercel added speech features. It exposed three separate primitives (realtime session handling, text-to-speech, and speech-to-text) under AI Gateway and AI SDK 7 in one move, according to vercel's launch post.
That gives voice agents a cleaner shape than a single monolithic API:
- useRealtime for live sessions, per vercel's launch post
- generateSpeech for TTS, per vercel's launch post
- transcribe for STT, per vercel's launch post
WebSockets
The most concrete architecture clue came from cramforce's platform note, which said Vercel shipped platform WebSocket support last week and the AI Gateway team used it this week to ship realtime AI models.
Two short replies from the same thread, cramforce's "Indeed" reply and cramforce's "See" reply, reinforce that the dependency chain was intentional, not a coincidence.
Grok voice models
The follow-on ship mattered because it showed this was a gateway surface, not a one-model demo. vercel_dev's Grok voice model post listed three xAI slugs that map onto the same three functions:
- xai/grok-voice-think-fast-1.0 for useRealtime
- xai/grok-tts for generateSpeech
- xai/grok-stt for transcribe
That is a more useful signal than a generic "voice support" announcement, because the model naming makes clear that provider routing is already part of the product surface.
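To make the provider-routing point concrete, here is a minimal TypeScript sketch that treats each slug as a provider/model pair. Only the three slugs and the three function names come from the posts. The helper names (GROK_VOICE_MODELS, parseSlug, modelFor) are hypothetical, and nothing here calls the AI SDK or reflects how AI Gateway actually parses slugs.

```typescript
// The three primitive names as reported in the launch post.
type VoicePrimitive = "useRealtime" | "generateSpeech" | "transcribe";

// Slugs as listed in the Grok voice model post.
// The table itself is a hypothetical illustration, not an SDK export.
const GROK_VOICE_MODELS: Record<VoicePrimitive, string> = {
  useRealtime: "xai/grok-voice-think-fast-1.0",
  generateSpeech: "xai/grok-tts",
  transcribe: "xai/grok-stt",
};

interface ParsedSlug {
  provider: string;
  model: string;
}

// Hypothetical helper: split a "provider/model" slug at the first slash.
function parseSlug(slug: string): ParsedSlug {
  const slash = slug.indexOf("/");
  if (slash <= 0 || slash === slug.length - 1) {
    throw new Error(`Expected "provider/model", got "${slug}"`);
  }
  return { provider: slug.slice(0, slash), model: slug.slice(slash + 1) };
}

// Hypothetical helper: look up the Grok model listed for a given primitive.
function modelFor(primitive: VoicePrimitive): ParsedSlug {
  return parseSlug(GROK_VOICE_MODELS[primitive]);
}
```

In this sketch, all three primitives resolve to the same provider prefix (xai) with a different model name per function, which is the shape the slugs suggest: one gateway surface, with the provider encoded in the model ID.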
Fugu-Ultra
Voice arrived on top of an already expanding gateway catalog. Earlier Sakana AI posts, the original Fugu-Ultra announcement and a later link back to it, show Fugu-Ultra joining AI Gateway before the realtime rollout.
That earlier addition adds no voice detail, but it does add one more fact to the story: AI Gateway's June updates were not a single feature drop. They were a steady expansion of both transport capabilities and model inventory.