Dreamverse pairs Hao AI Lab's FastVideo stack with an interface for editing video scenes in a faster-than-playback loop, using quantization and fused kernels to keep generation latency below viewing time. The stack is worth studying if you are building real-time multimodal generation or multi-user video serving.

Dreamverse is a prototype interface on top of FastVideo that aims to make video generation interactive instead of asynchronous. Hao AI Lab’s launch thread frames the change against current systems that “take minutes” for a 5-second 1080p clip, while Dreamverse is presented as a live loop where users can keep steering the same scene as outputs come back.
The loop is deliberately short: “Generate a clip → watch it → edit,” and the workflow post gives concrete examples such as “Slow the camera” and “Change the background.” That matters because the system is not described as one-shot prompt generation; it is positioned as scene iteration with continuity across revisions. The public demo is available via the Dreamverse app, and Hao’s blog post describes this as “vibe directing” rather than prompt-and-wait generation.
Hao attributes the speed to a new real-time inference stack inside FastVideo. In the team’s technical thread, the named ingredients are fast attention backends, 4-bit quantization, fused kernels, and “optimized multi-user serving,” which is the most deployment-relevant detail in the announcement because it suggests the work is not only about a single offline benchmark run.
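The 4-bit quantization ingredient can be illustrated with a minimal symmetric per-tensor int4 weight quantizer. This is a generic sketch of the technique, not FastVideo's actual implementation; the function names and the per-tensor granularity are assumptions for illustration.

```python
import numpy as np

def quantize_int4(w: np.ndarray):
    """Symmetric per-tensor int4 quantization: map floats to integers in [-8, 7]."""
    scale = np.abs(w).max() / 7.0  # 7 = largest magnitude a signed 4-bit value covers symmetrically
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover a float approximation of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize_int4(w)
w_hat = dequantize_int4(q, scale)
# Rounding error is bounded by half a quantization step.
print(np.abs(w - w_hat).max() <= scale / 2 + 1e-6)  # → True
```

In practice, production stacks quantize per-channel or per-group and often keep activations in higher precision; the point here is only the memory-bandwidth win of storing weights at 4 bits.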
The practical bar here is unusual: generation has to stay below playback time so the “creative loop stays alive,” in Hao’s technical thread phrasing. That makes Dreamverse interesting beyond video UX. If the claim holds under load, the same stack design points toward real-time multimodal apps where responsiveness matters more than maximizing per-clip quality, especially for serving setups that need iterative edits instead of long queued renders.
(1/N) We're launching Dreamverse. Most AI video models take minutes to generate a 5 s 1080p clip. In 4.5 seconds, we can generate 30 s 1080p clips on a single GPU. Our videos generate faster than you can watch them: stop waiting on prompts and start directing scenes live.
(3/N) Under the hood, this runs on our new real-time inference stack in FastVideo (our open-source video model post-training/inference framework): • fast attention backends • 4-bit quantization • fused kernels • optimized multi-user serving • and much more 🤫 Fast enough