AI Primer

Gemini, Nano Banana Pro, Kling, and Veo power a 7-step historical short-film workflow

A filmmaker shared a seven-step pipeline that uses Gemini for research, Nano Banana Pro for consistent scenes, Kling for image-to-video, Veo for speaking shots, and CapCut for the final edit. The sequence is useful if you want research, references, motion, and sound separated into controllable stages.


TL;DR

  • A filmmaker broke down a seven-step short-film workflow that starts with Gemini Deep Research for historical detail, then moves through scripting, image generation, motion, sound, and the final edit.
  • The key visual tactic is using real reference photos with Nano Banana Pro to build a consistent character across angles before generating scene images.
  • For motion, the thread says Kling Element handled character consistency in image-to-video shots, while Veo was reserved for shots that needed spoken dialogue.
  • The thread and a follow-up post show the process ending with Epidemic Sound Labs for music and SFX, then CapCut for rhythm, color, and assembly.

How the pipeline works

The sequence is deliberately split by task. First, Gemini Deep Research gathered period details, atmosphere, and historical facts; then those materials were turned into an original script. From there, real photos of Seyit Onbaşı were used as references in Nano Banana Pro to create a stable character and a dusty, realistic cinematic look across still scenes.

Motion came only after the stills were locked. The creator says Kling Element was used to introduce the character and keep face consistency between scenes, while Veo handled the shots that required speech. The research document is linked in the shared guide.
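The staged structure above can be sketched as data. This is a minimal illustrative model, not the creator's actual tooling: the stage/tool pairings come from the thread as described, but the `Stage` class and the `stages_to_redo` helper are assumptions added here to show why task-separated stages are easy to revise.

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str   # what the stage produces conceptually
    tool: str   # tool used, per the creator's breakdown
    output: str # artifact handed to the next stage

# The seven stages, in order. Each stage's output is frozen before the
# next begins, which is what makes individual stages revisable.
PIPELINE = [
    Stage("research", "Gemini Deep Research", "period details and facts"),
    Stage("script", "original writing", "script"),
    Stage("character", "Nano Banana Pro", "consistent character from reference photos"),
    Stage("scenes", "Nano Banana Pro", "still scene images"),
    Stage("motion", "Kling Element / Veo", "image-to-video and dialogue shots"),
    Stage("audio", "Epidemic Sound Labs", "music and sound effects"),
    Stage("edit", "CapCut", "final assembly"),
]

def stages_to_redo(changed: str) -> list[str]:
    """Return the changed stage plus everything downstream of it.

    Upstream stages are untouched: redoing the stills does not
    require redoing the research or the script.
    """
    names = [s.name for s in PIPELINE]
    return names[names.index(changed):]
```

For example, `stages_to_redo("motion")` returns only `["motion", "audio", "edit"]`, while the research, script, and still images survive unchanged; that locality is the practical payoff of splitting the film across tools.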

Why this matters for AI filmmakers

What makes the workflow useful is the separation of control points. Research, script, character references, scene generation, motion, sound, and edit are all handled in different tools, which makes it easier to revise one stage without rebuilding the whole film. That is a more practical answer to “what prompt did you use?” than a single magic prompt.

The audio stage is also treated as a creative layer rather than cleanup. In the creator's breakdown, Epidemic Sound's AI-assisted Labs tools were used to find period-appropriate music and battle sound effects before the project was finished in CapCut; the music service is linked in the creator's follow-up post.
