OmniVoice Studio opens local dubbing for 600 languages from one MP4
A community post spotlights OmniVoice Studio, an open-source local dubbing pipeline that transcribes, translates, clones a voice from 3 seconds of audio, and remixes the dubbed audio back into the video. Running locally keeps voice data on device and removes subscription costs, so it may fit privacy-sensitive dubbing workflows.

TL;DR
- OmniVoice Studio packages a full local dubbing chain into one desktop app, according to hasantoxr's OmniVoice Studio thread: transcription with Whisper, translation, voice cloning from three seconds of audio, music separation with Demucs, and remixing the dubbed voice back into the source video.
- The project's pitch is privacy and cost, because hasantoxr's thread says voice data stays on-device and the app runs without subscriptions or API keys, while the GitHub repo frames it as a desktop alternative to hosted voice tools.
- Platform coverage is unusually broad for an early creative tool, with hasantoxr's screenshot showing installers for macOS, Windows, Linux AppImage, and Debian, and the same post claiming support for Mac, NVIDIA, AMD, or CPU setups.
- Language count is the eye-catcher: hasantoxr's post says 600 languages, while the repo screenshot advertises 646, which at minimum signals a very wide multilingual target even if the exact count is still moving in beta.
You can jump straight to the GitHub repo or follow the Quickstart link on the project page. The screenshot in hasantoxr's post already shows the shape of the workflow: video preview, transcript table, glossary controls, and voice selection in one desktop UI.
The one-file dubbing pipeline
The useful part here is how much of the dubbing stack is collapsed into a single ingest step. Drop in an MP4, then the tool chains together:
- Speech transcription with Whisper.
- Translation into the target language.
- Voice cloning from about three seconds of source audio.
- Background music separation with Demucs.
- Remixing the dubbed vocal back onto the original soundtrack.
That bundle is what makes it feel closer to a small local studio than a single-model demo. For creators cutting trailers, explainers, or social clips, the interesting shift is not any one component but the fact that the boring handoffs are already wired together.
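To make the handoffs concrete, here is a minimal Python sketch of how the five stages listed above could be chained. It is not OmniVoice Studio's code: `Segment`, `Stages`, `dub_video`, and `demo_stages` are hypothetical names, the stage order (separating stems before cloning) is an assumption, and the demo stages are stand-ins that return labeled strings instead of calling Whisper, Demucs, or a voice model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    start: float           # segment start, in seconds
    end: float             # segment end, in seconds
    text: str              # source-language transcript
    translated: str = ""   # target-language text, filled in by the pipeline
    dubbed_clip: str = ""  # path or label of the synthesized clip


@dataclass
class Stages:
    """Hypothetical stage callables; a real app would wrap actual models here."""
    transcribe: Callable[[str], List[Segment]]        # video path -> segments
    translate: Callable[[str, str], str]              # (text, target lang) -> text
    separate: Callable[[str], Tuple[str, str]]        # video path -> (vocals, music)
    clone_voice: Callable[[str, str], str]            # (text, reference audio) -> clip
    remix: Callable[[str, str, List[Segment]], str]   # (video, music, segments) -> output


def dub_video(video_path: str, target_lang: str, stages: Stages):
    """Run every stage in order and return (output_path, segments)."""
    segments = stages.transcribe(video_path)
    for seg in segments:
        seg.translated = stages.translate(seg.text, target_lang)
    # Assumption: the isolated vocal stem doubles as the cloning reference.
    vocals, music = stages.separate(video_path)
    for seg in segments:
        seg.dubbed_clip = stages.clone_voice(seg.translated, vocals)
    return stages.remix(video_path, music, segments), segments


def demo_stages() -> Stages:
    """Stand-in stages so the control flow can be exercised without any models."""
    return Stages(
        transcribe=lambda path: [Segment(0.0, 2.5, "Hello"),
                                 Segment(2.5, 5.0, "Welcome back")],
        translate=lambda text, lang: f"[{lang}] {text}",
        separate=lambda path: ("vocals.wav", "music.wav"),
        clone_voice=lambda text, ref: f"tts({text})|ref={ref}",
        remix=lambda path, music, segs: path.rsplit(".", 1)[0] + ".dubbed.mp4",
    )


if __name__ == "__main__":
    output, segs = dub_video("talk.mp4", "es", demo_stages())
    print(output)
    for s in segs:
        print(s.start, s.end, s.translated, s.dubbed_clip)
```

Passing the stages in as callables keeps the orchestration separate from whichever models do the work, which is one plausible way to wire a chain like this; the post does not describe how the app itself is structured internally.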
The desktop UI already looks production-shaped
The screenshot shows a transcript grid with timecodes, speaker labels, language, and voice columns, plus a file list that mixes ordinary video clips with music assets. There is also a glossary panel, which is the kind of detail that matters once you move from toy dubbing into brand names, character terms, and repeated phrases.
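The post does not say how the glossary is applied. One common approach, sketched here in Python purely as an illustration, is to swap glossary terms for placeholder tokens before translation and restore the approved rendering afterward. `translate_with_glossary` and the `translate` callable are hypothetical, and the sketch assumes the translator passes placeholder tokens through unchanged, which real translation models do not always do.

```python
import re
from typing import Callable, Dict


def translate_with_glossary(
    text: str,
    glossary: Dict[str, str],          # source term -> required target rendering
    translate: Callable[[str], str],   # hypothetical translator callable
) -> str:
    """Protect glossary terms with placeholders, translate, then restore them."""
    # Longest terms first so "OmniVoice Studio" is matched before "OmniVoice".
    terms = sorted(glossary, key=len, reverse=True)
    placeholders: Dict[str, str] = {}
    protected = text
    for i, term in enumerate(terms):
        token = f"__TERM{i}__"
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(protected):
            protected = pattern.sub(lambda _m, t=token: t, protected)
            placeholders[token] = glossary[term]
    translated = translate(protected)
    # Swap each placeholder back for its approved rendering.
    for token, rendering in placeholders.items():
        translated = translated.replace(token, rendering)
    return translated


if __name__ == "__main__":
    # Stand-in "translator" that just uppercases, to show what is protected.
    glossary = {"OmniVoice Studio": "OmniVoice Studio", "Demucs": "Demucs"}
    print(translate_with_glossary("OmniVoice Studio uses Demucs", glossary, str.upper))
```

With the uppercase stand-in, everything except the protected terms is altered, which mirrors the goal of keeping brand names and character terms fixed across repeated lines.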
The GitHub repo also surfaces release badges, install targets, and community links on the landing page. That is a stronger first impression than most open-source media tools make, even with the beta warning still visible in the same screenshot.
Local-first is the actual pitch
The headline feature is not just dubbing but where the dubbing happens. hasantoxr's thread says the app runs fully local, with no API keys and no voice data leaving the machine, and the screenshot repeats that same positioning on the project page.
That makes OmniVoice Studio interesting for teams working with unreleased ads, client footage, internal training video, or any project where sending voice stems to a hosted service is the dealbreaker.
Install footprint and beta status
The screenshotted project page lists installers for:
- macOS
- Windows x64 MSI
- Linux AppImage x64
- Debian .deb
It also carries a blunt warning that the app is in active beta and that the latest fixes may land in source before they reach the prebuilt installers. That caveat matters, because it shifts the story from "an open-source alternative exists" to "the easiest way in may not be the most current one yet."