
Meta TRIBE v2 claims 130% average viewed and 81.5% retention in creator tests

Youraipulse and Amir Mushich posted side-by-side TRIBE v2 analyses against YouTube and TikTok data, including one YouTube example with 130% average viewed and 81.5% retention. The posts push the tool from cut scoring toward prediction-versus-performance claims, but the evidence is still creator- and vendor-reported.


TL;DR

  • youraipulse's YouTube example paired a TRIBE v2 analysis with YouTube Studio metrics showing 130% average viewed and 81.5% view retention, making it one of the first posts to push the tool from edit scoring toward prediction-versus-performance claims.
  • AmirMushich's TikTok test did the same with a 2.4-million-view TikTok, but Mushich said in a follow-up reply that the team still needs more real tests before drawing confident conclusions.
  • The community wrapper around Meta's model is not just a single-video analyzer. AmirMushich's product thread and the GitHub repo both describe side-by-side comparison for 2 to 4 cuts, timestamp-level weak-spot hunting, and exportable reports.
  • Meta's own framing is narrower than the virality discourse. In Meta's launch post, TRIBE v2 is pitched as a multimodal neuroscience model for predicting fMRI responses to video, audio, and text, not as a native creator analytics product.

You can read Meta's launch post, browse the model card on Hugging Face, and inspect the community TRIBE Review MVP repo. The weirdly useful part is that the wrapper has already settled on two concrete creator workflows: compare multiple cuts, or attack one dip at a time. Meanwhile, the newer TikTok test and the YouTube post are trying to connect those graphs to real platform retention data.

Correlation tests

The strongest performance claim so far comes from youraipulse's post, which matched a TRIBE v2 analysis to a top YouTube video on one of the author's channels. The paired metrics in the follow-up tweet were 130% average viewed and 81.5% retention.

AmirMushich's earlier test described a similar one-video comparison between a TRIBE v2 prediction and real YouTube performance. Later, his TikTok example said a 2.4 million view video produced a high predicted brain-response rate, with strong early activity in the model's visual and recognition areas.

The caveat is in the same evidence stream. In his reply, Mushich said the beginning looked wild but more real tests were still needed. That makes these posts interesting creator-side experiments, not a settled benchmark.
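
Neither post shares the underlying curves, but if the wrapper's per-second predicted response and YouTube Studio's retention curve were both exported, a basic correlation check would look something like the sketch below. The sample rate, input layout, and numbers are assumptions for illustration, not data from either post.

```python
# Hypothetical check: how well does a TRIBE v2 response curve track real retention?
# The per-second sampling and the input format are assumptions for illustration,
# not the wrapper's actual export.
import numpy as np

def resample_to_seconds(times_s, values, duration_s):
    """Linearly interpolate an irregular series onto a one-second grid."""
    grid = np.arange(duration_s + 1)
    return np.interp(grid, times_s, values)

def curve_correlation(pred_times, pred_values, ret_times, ret_values, duration_s):
    """Pearson correlation between a predicted response curve and audience retention."""
    pred = resample_to_seconds(np.asarray(pred_times), np.asarray(pred_values), duration_s)
    ret = resample_to_seconds(np.asarray(ret_times), np.asarray(ret_values), duration_s)
    return float(np.corrcoef(pred, ret)[0, 1])

# Made-up example: a 60-second video with both curves sampled every 5 seconds.
pred_t = list(range(0, 61, 5))
pred_v = [0.90, 0.85, 0.80, 0.82, 0.70, 0.65, 0.60, 0.62, 0.58, 0.55, 0.50, 0.48, 0.45]
ret_t = list(range(0, 61, 5))
ret_v = [1.00, 0.92, 0.88, 0.86, 0.75, 0.70, 0.66, 0.64, 0.60, 0.58, 0.54, 0.50, 0.47]
print(round(curve_correlation(pred_t, pred_v, ret_t, ret_v, 60), 3))
```

Even with both curves in hand, one Pearson coefficient on one video is still an anecdote, which is roughly the point Mushich's caveat makes.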

Comparison workflow

The wrapper is built around two editing modes, and the repo explains them more clearly than the tweets do.

  • Compare 2 to 4 versions: The repo README says the goal is not to pick one whole winner, but to identify the strongest hook, middle section, transitions, and payoff across several cuts, then build a new edit from those sections.
  • Improve one video: The same README describes solo mode as dip hunting. Find a drop in the response curve, inspect nearby frames, change one thing, then re-test (a rough sketch of that loop appears below).
  • A/B pre-testing: AmirMushich's launch thread and youraipulse's reply both frame multi-video comparison as a way to compare several edits before posting.
  • Edit-map language: The README screenshots literally call the graph an editing map, which is a much more grounded pitch than full-on virality prediction.

That distinction matters because the product already looks more mature as an editing instrument than as a forecasting instrument.
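
To make the dip-hunting loop concrete: assuming the wrapper exposes a per-second predicted response curve, a naive pass for sharp drops could look like the sketch below. The window size and drop threshold are illustrative choices, not the repo's actual algorithm.

```python
# Illustrative dip hunting over a predicted response curve.
# The curve format and the window/threshold choices are assumptions,
# not the TRIBE Review MVP repo's actual implementation.
import numpy as np

def find_dips(curve, window_s=5, drop_threshold=0.15):
    """Return (second, depth) pairs where the curve falls sharply.

    A dip is flagged when the value drops by more than `drop_threshold`
    relative to the maximum of the previous `window_s` seconds.
    """
    curve = np.asarray(curve, dtype=float)
    dips = []
    for t in range(window_s, len(curve)):
        recent_peak = curve[t - window_s:t].max()
        depth = recent_peak - curve[t]
        if depth > drop_threshold:
            dips.append((t, round(float(depth), 3)))
    return dips

# Example: a curve that sags around the 7-to-9-second mark.
curve = [0.9, 0.88, 0.87, 0.86, 0.85, 0.84, 0.83, 0.6, 0.55, 0.58, 0.7, 0.75]
for second, depth in find_dips(curve):
    print(f"dip at ~{second}s, depth {depth}")
```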

Interface signals

The screenshots and demo posts keep returning to the same set of signals.

The practical shift is that the UI turns a neuroscience model into something closer to a shot-by-shot review surface. AmirMushich's feature list says it suggests edits and fixes, while the repo README adds JSON and PDF export, local Whisper-based timing hints, and optional Ollama-based copy rewriting.
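
The export and timing features suggest one obvious downstream use: lining a flagged weak spot up with what was being said at that moment. The JSON layout and field names in this sketch are invented for illustration; the wrapper's real report format may differ.

```python
# Hypothetical use of an exported report: pair flagged weak spots with the words
# spoken around them. The JSON structure and field names below are invented for
# illustration; the actual TRIBE Review MVP export may look different.
import json

def words_near(words, t, radius_s=3.0):
    """Return Whisper-style word entries spoken within `radius_s` seconds of time t."""
    return [w["word"] for w in words if abs(w["start"] - t) <= radius_s]

report = json.loads("""
{
  "weak_spots": [{"time_s": 42.0, "score_drop": 0.22}],
  "transcript": [
    {"word": "so", "start": 40.8},
    {"word": "anyway", "start": 41.5},
    {"word": "the", "start": 42.1},
    {"word": "point", "start": 42.4}
  ]
}
""")

for spot in report["weak_spots"]:
    context = " ".join(words_near(report["transcript"], spot["time_s"]))
    print(f'{spot["time_s"]}s drop {spot["score_drop"]}: "{context}"')
```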

Meta's actual model scope

The creator discourse has centered on viral potential, but the official sources are broader and more cautious. In Meta's announcement, TRIBE v2 is a "digital twin" for neural activity that predicts how the brain responds to sight and sound. In Meta's research publication, the system is described as a tri-modal foundation model for video, audio, and language, trained on more than 1,000 hours of fMRI from 720 subjects.

The Hugging Face model card fills in the stack: Llama 3.2 for text, V-JEPA2 for video, and Wav2Vec-BERT for audio, all mapped into a unified transformer that predicts cortical responses. Meta's launch post says the point is to help neuroscientists and clinical researchers test theories without requiring human subjects.
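
Meta's sources describe that stack only at a high level. As a schematic, the "encode each modality, project into a shared space, fuse, predict cortical responses" shape looks roughly like the sketch below; the dimensions, layer counts, and pooling step are placeholders, not the released TRIBE v2 architecture.

```python
# Schematic of a tri-modal "encode, fuse, predict" design, loosely following the
# model card's description (text, video, and audio features feeding a shared
# transformer that predicts cortical responses). Every number here is a placeholder.
import torch
import torch.nn as nn

class TriModalResponsePredictor(nn.Module):
    def __init__(self, d_model=512, n_parcels=1000):
        super().__init__()
        # Stand-ins for frozen Llama 3.2 / V-JEPA2 / Wav2Vec-BERT features,
        # each projected into a shared embedding space.
        self.text_proj = nn.Linear(2048, d_model)
        self.video_proj = nn.Linear(1024, d_model)
        self.audio_proj = nn.Linear(768, d_model)
        encoder_layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(encoder_layer, num_layers=4)
        # One regression output per cortical parcel.
        self.head = nn.Linear(d_model, n_parcels)

    def forward(self, text_feats, video_feats, audio_feats):
        # Concatenate the three modality streams along the time axis, then fuse.
        tokens = torch.cat([
            self.text_proj(text_feats),
            self.video_proj(video_feats),
            self.audio_proj(audio_feats),
        ], dim=1)
        fused = self.fusion(tokens)
        # Pool over time and predict one response value per parcel.
        return self.head(fused.mean(dim=1))

model = TriModalResponsePredictor()
pred = model(torch.randn(1, 16, 2048), torch.randn(1, 16, 1024), torch.randn(1, 16, 768))
print(pred.shape)  # torch.Size([1, 1000])
```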

That is a bigger and stranger origin story than "AI tool scores your Reel." The creator wrapper is essentially a remix layer on top of a research model.

License and intended use

The repo is explicit about what this project is and is not. AmirMushich's link post calls it an independent, non-commercial community prototype built on Meta's official model, and the GitHub README says all inference runs through the released TRIBE v2 weights under a CC BY-NC 4.0 license.

That licensing detail is easy to miss, but it sets the boundary for the whole experiment. The tool is open source and local-first, according to AmirMushich's launch thread, yet it is still sitting on top of a non-commercial research release rather than a creator product Meta shipped for ad teams or editors.

The result is a very 2026 artifact: a neuroscience model published for research, a community wrapper built in two weeks, and creators immediately trying to see whether the curve lines up with YouTube and TikTok retention data.
