Hao AI Lab launches Dreamverse: 30s 1080p video in 4.5s on one GPU
Dreamverse pairs Hao AI Lab's FastVideo stack with an interface for editing video scenes in a faster-than-playback loop, using quantization and fused kernels to keep generation latency below viewing time. The stack is worth a look if you are building real-time multimodal generation or multi-user video serving.

TL;DR
- Dreamverse can generate a full 30-second 1080p sequence in about 4.5 seconds on a single GPU, turning AI video from a minutes-long batch job into a faster-than-playback loop, according to the launch thread.
- The product pitch is an edit-in-place workflow: generate a clip, watch it, then issue natural-language changes like “make it darker” or “change the background,” with revised versions arriving “within 5 seconds,” as shown in the workflow post.
- Under the hood, Dreamverse rides Hao’s FastVideo stack, which the team says uses “4-bit quantization,” “fused kernels,” fast attention backends, and optimized multi-user serving, per the stack details.
- For engineers, the interesting part is less the interface than the latency target: Hao is explicitly optimizing video inference until generation is faster than viewing, a threshold demonstrated in the demo thread and expanded in the blog post.
What shipped
Dreamverse is a prototype interface on top of FastVideo that aims to make video generation interactive instead of asynchronous. Hao AI Lab’s launch thread frames the change against current systems that “take minutes” for a 5-second 1080p clip, while Dreamverse is presented as a live loop where users can keep steering the same scene as outputs come back.
The workflow is deliberately short: “Generate a clip → watch it → edit,” and the workflow post gives concrete examples such as “Slow the camera” and “Change the background.” That matters because the system is not described as one-shot prompt generation; it is positioned as scene iteration with continuity across revisions. The public demo is available via the Dreamverse app, and Hao’s blog post describes this as “vibe directing” rather than prompt-and-wait generation.
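
The launch materials do not document a public API, so purely to make the loop concrete, here is a hypothetical client sketch; `DreamverseClient`, `generate`, and `edit` are invented names for illustration, not the product's actual interface.

```python
# Hypothetical sketch of the generate -> watch -> edit loop described above.
# DreamverseClient and its methods are invented for illustration; the launch
# materials do not document a public API.
class DreamverseClient:
    def generate(self, prompt: str) -> str:
        """Kick off a new scene and return an identifier for it."""
        return "scene-001"  # placeholder: a real client would call the hosted service

    def edit(self, scene_id: str, instruction: str) -> str:
        """Return a revised version of the same scene, keeping continuity."""
        return scene_id  # placeholder: revisions are pitched at <5 s turnaround

client = DreamverseClient()
scene = client.generate("a rainy street at night, slow dolly shot, 1080p")
for instruction in ["make it darker", "slow the camera", "change the background"]:
    scene = client.edit(scene, instruction)
```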
How the latency target is being hit
Hao attributes the speed to a new real-time inference stack inside FastVideo. In the team’s technical thread, the named ingredients are fast attention backends, 4-bit quantization, fused kernels, and “optimized multi-user serving.” The serving piece is the most deployment-relevant detail in the announcement, because it suggests the work is not only about a single offline benchmark run.
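
The announcement does not publish the kernels themselves, but the quantization ingredient is a standard one. Below is a minimal sketch of group-wise 4-bit weight quantization as the technique is generally practiced, not FastVideo’s actual implementation; the group size and symmetric-scaling choice are assumptions.

```python
# Minimal sketch of symmetric, group-wise 4-bit weight quantization (the generic
# technique; not FastVideo's implementation, which the announcement does not detail).
import torch

def quantize_int4(weight: torch.Tensor, group_size: int = 64):
    """Quantize a 2-D weight matrix to 4-bit integers with one scale per group."""
    out_features, in_features = weight.shape
    w = weight.reshape(out_features, in_features // group_size, group_size)
    scale = (w.abs().amax(dim=-1, keepdim=True) / 7.0).clamp_min(1e-8)  # int4 range [-8, 7]
    q = torch.clamp(torch.round(w / scale), -8, 7).to(torch.int8)       # stored in int8 here
    return q, scale

def dequantize_int4(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover approximate fp weights; real inference kernels fuse this into the matmul."""
    return (q.float() * scale).reshape(q.shape[0], -1)

w = torch.randn(256, 512)
q, s = quantize_int4(w)
print("mean abs error:", (w - dequantize_int4(q, s)).abs().mean().item())
```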
The practical bar here is unusual: generation has to stay below playback time so the “creative loop stays alive,” as the technical thread puts it. That makes Dreamverse interesting beyond video UX. If the claim holds under load, the same stack design points toward real-time multimodal apps where responsiveness matters more than maximizing per-clip quality, especially for serving setups that need iterative edits instead of long queued renders.
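
For a sense of the margin implied by the quoted figures, here is a back-of-envelope check of the faster-than-playback bar, using only the numbers stated in the announcement.

```python
# Back-of-envelope check of the faster-than-playback bar, using only the numbers
# quoted in the announcement (30 s of 1080p generated in ~4.5 s; edits "within 5 seconds").
clip_seconds = 30.0       # claimed clip length
generation_seconds = 4.5  # claimed wall-clock generation time on one GPU
edit_seconds = 5.0        # claimed edit turnaround

real_time_factor = clip_seconds / generation_seconds
print(f"real-time factor: {real_time_factor:.1f}x")        # ~6.7x faster than playback
print("generation inside playback window:", generation_seconds < clip_seconds)
print("edit inside playback window:", edit_seconds < clip_seconds)
```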