AI Primer

Grok Imagine 1.5 drops on fal: creator posts compare its video motion against Seedance 2.0

fal added Grok Imagine Video 1.5, and creators immediately tested it against Seedance 2.0 and Gemini Omni on fight scenes, lip-sync, and reference-driven clips. The early comparisons place it among the models creators take seriously, but not clearly ahead of Seedance in real-world use.


TL;DR

You can check the fal model page, skim xAI's preview docs, and compare that positioning against Arena's public leaderboard. xAI's broader video workflow docs also show where the model sits inside a bigger stack that now covers text-to-video, image-to-video, editing, reference-driven generation, and extension.

fal's Grok endpoint

fal's listing frames Grok Imagine 1.5 as an image-to-video endpoint that generates audio too, with the API exposed at xai/grok-imagine-video/v1.5/image-to-video on fal. The page tags it for stylized, transform, and lipsync workflows, which matches the kinds of clips creators immediately started posting.
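As a rough illustration of what an image-to-video call to that endpoint might carry, here is a minimal sketch that only builds and validates a request payload. The endpoint ID comes from fal's listing; the field names `prompt` and `image_url`, the `ImageToVideoRequest` class, and the example URL are assumptions for illustration, not fal's confirmed schema.

```python
from dataclasses import dataclass, asdict

# Endpoint ID as quoted from fal's listing.
ENDPOINT = "xai/grok-imagine-video/v1.5/image-to-video"


@dataclass
class ImageToVideoRequest:
    """Hypothetical request shape; field names are assumptions, not fal's schema."""

    prompt: str
    image_url: str

    def to_arguments(self) -> dict:
        # Basic sanity checks before anything is sent over the network.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not self.image_url.startswith(("http://", "https://")):
            raise ValueError("image_url must be an http(s) URL")
        return asdict(self)


# Placeholder values for illustration only.
request = ImageToVideoRequest(
    prompt="A barista slides a coffee across the counter and smiles",
    image_url="https://example.com/start-frame.png",
)
arguments = request.to_arguments()
```

If you use the `fal_client` Python package, a dictionary like this would typically be passed as `fal_client.subscribe(ENDPOINT, arguments=arguments)`. Check the model page for the actual parameter names before relying on this shape.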

The official xAI preview docs list the model as grok-imagine-video-1.5-preview, with text-to-video and image-to-video modalities, the alias grok-imagine-video-1.5-2026-05-30, and availability in us-east-1 and eu-west-1.

Fight-scene motion

Ozan Sihay's side-by-side is the most useful same-prompt comparison among the early posts because it keeps the scene fixed and lets motion quality do the talking. He tested back-and-forth action across four models: Grok Imagine 1.5, Gemini Omni, Seedance 2.0 Fast, and Seedance 2.0 Pro.

Grok Imagine 1.5 fight-scene output
Seedance 2.0 Fast fight-scene output

That test landed alongside Ozan Sihay's reaction to the rankings, where he said Grok is progressing quickly but still does not beat Seedance 2.0 in real-world use. That is a sharper read than the leaderboard alone, because Arena's ranking page aggregates preference votes rather than reflecting any one creator's workflow priorities.

Lip-sync and everyday character clips

The early user examples are less about cinematic benchmarking and more about whether the model can sell a quick social post. Carolletta's two clips lean into character presence and light performance, which is where bad video models usually fall apart first.

Carolletta's Grok Imagine coffee clip

The fal page's lipsync tag is doing real work here. In his broader May roundup of model categories, Ozan Sihay put Grok Imagine and Gemini Omni on his Turkish lip-sync shortlist, while still giving general video generation to Seedance 2.0 and Kling 3.0.

Reference workflows and pricing

xAI's current video generation docs show five separate workflows in the same stack:

  • Video Generation, text prompt only.
  • Image-to-Video, prompt plus a starting image.
  • Video Editing, modify an existing video.
  • Reference-to-Video, guide generation with one or more reference images.
  • Video Extension, continue from a video's last frame.

The reference-to-video docs make the most creator-relevant distinction: reference images guide who or what appears in the clip without locking the first frame, which is useful for character consistency, product placement, and virtual try-on.

On cost, xAI's preview page lists output at $0.08 per second, while fal's hosted page breaks that out to $0.08 per second at 480p and $0.14 per second at 720p, adds $0.01 per input image, and says audio is included. The same pages set a limit of 60 requests per minute and a default clip length of 6 seconds, with a 1 to 15 second range.
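To make those numbers concrete, here is a small cost estimator using the fal rates quoted above. It assumes billing is linear in clip length, that the $0.01 fee applies once per input image, and that totals are rounded to the cent; none of that is confirmed by the pricing pages, and the function name is hypothetical.

```python
# Per-second rates in USD, as quoted from fal's hosted page in this article.
RATE_PER_SECOND = {"480p": 0.08, "720p": 0.14}
INPUT_IMAGE_FEE = 0.01  # USD per input image
MIN_SECONDS, MAX_SECONDS, DEFAULT_SECONDS = 1, 15, 6


def estimate_clip_cost(seconds=DEFAULT_SECONDS, resolution="480p", input_images=1):
    """Estimate the cost of one clip in USD.

    Assumes linear per-second billing and a flat per-image fee (both assumptions).
    """
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        raise ValueError(f"seconds must be between {MIN_SECONDS} and {MAX_SECONDS}")
    if resolution not in RATE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if input_images < 0:
        raise ValueError("input_images must not be negative")
    total = RATE_PER_SECOND[resolution] * seconds + INPUT_IMAGE_FEE * input_images
    return round(total, 2)


default_480p = estimate_clip_cost()                      # 6 s at 480p, one image
default_720p = estimate_clip_cost(resolution="720p")     # 6 s at 720p, one image
longest_720p = estimate_clip_cost(seconds=15, resolution="720p")
```

Under these assumptions, a default 6-second clip from one input image comes to $0.49 at 480p and $0.85 at 720p, and the longest 15-second clip at 720p comes to $2.11.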
