AI Primer

X publishes Phoenix ranking code and monthly GitHub release notes

Elon Musk says X will publish the Phoenix feed system and future changes on GitHub with monthly release notes. Creators are already analyzing reply, dwell, native media, and negative-signal weighting in the Grok-assisted stack.


TL;DR

You can browse the repo, watch the commit history, and check whether X follows through on release notes. The useful bit landed fast: Everlier's diagrams surfaced concrete weights and component names, minchoi's thread mapped the candidate pipeline, and LinusEkenstam's follow-up showed how creator frustration quickly turned into a public request for better scoring transparency.

Phoenix replaced the old hand-tuned story

X is now framing feed ranking as a published software system, not a black box explained by folklore.

The repo link in techhalla's post points to xai-org/x-algorithm. Musk's promise of monthly drops matters because it turns ranking changes into something creators can diff, not just infer from reach swings.

Both minchoi's meta summary and Everlier's notes describe the new center of gravity the same way: a learned, multimodal, Grok-assisted ranking stack with fewer visible manual heuristics.

The feed pipeline now has named stages

The most useful high-level map is the pipeline itself.

Across minchoi's thread and a second linked source, the feed flow breaks into:

  1. User understanding and query hydration.
  2. Candidate sourcing from in-network and out-of-network pools.
  3. Candidate hydration with media, author, and safety context.
  4. Pre-scoring filters for duplicates, blocks, age, and seen content.
  5. ML scoring and ranking.
  6. Selection of top posts.
  7. Post-selection visibility filters.

That out-of-network step is where Phoenix shows up most clearly.

Phoenix is described as both a retrieval system for global-corpus similarity search and a transformer ranker that predicts engagement probabilities.
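To make the stage order concrete, here is a minimal Python sketch of the flow listed above. It is an illustration only, not code from the repo: the `Post` fields, the function names, and the `scorer` callback are all hypothetical, and the hydration stages (1 and 3) are left out because the threads do not describe them in enough detail to model.

```python
from dataclasses import dataclass


@dataclass
class Post:
    # All field names here are hypothetical stand-ins, not repo identifiers.
    post_id: int
    author: str
    in_network: bool
    score: float = 0.0
    seen: bool = False
    blocked_author: bool = False
    visible: bool = True  # stand-in for a post-selection visibility check


def source_candidates(in_network_pool, out_of_network_pool):
    # Stage 2: merge in-network and out-of-network candidates.
    return list(in_network_pool) + list(out_of_network_pool)


def pre_filter(candidates):
    # Stage 4: drop seen posts, blocked authors, and duplicate IDs.
    unique = {}
    for post in candidates:
        if post.seen or post.blocked_author:
            continue
        unique.setdefault(post.post_id, post)
    return list(unique.values())


def rank(candidates, scorer):
    # Stage 5: score each candidate, then sort best-first.
    for post in candidates:
        post.score = scorer(post)
    return sorted(candidates, key=lambda p: p.score, reverse=True)


def build_feed(in_network_pool, out_of_network_pool, scorer, top_k):
    candidates = source_candidates(in_network_pool, out_of_network_pool)
    candidates = pre_filter(candidates)
    ranked = rank(candidates, scorer)
    selected = ranked[:top_k]                  # Stage 6: keep the top posts.
    return [p for p in selected if p.visible]  # Stage 7: visibility filter.
```

The point of the sketch is the ordering: filtering happens both before scoring (cheap exclusions) and after selection (visibility), so a post can rank well and still never be shown.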

The score is a weighted bundle of actions

The interesting shift is not that X predicts likes. It predicts a larger menu of actions and then sums them into a final score.

According to techhalla's Phoenix summary, the final score is a weighted sum over predicted outcomes.

The accompanying screenshot lists positive weights for favorite, repost, reply, quote, share, click, dwell, video view, profile click, photo expand, and follow-author.

The same screenshot shows favorites and reposts at +1.0, replies and quotes at +0.8, shares at +0.7, and clicks and follows at +0.5. That is more concrete than the old creator lore about "engagement" as one fuzzy bucket.
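A weighted sum of this kind is easy to sketch. The snippet below uses only the weights quoted above; the dictionary keys and the function name are hypothetical labels, and actions whose weights are not quoted here (dwell, video view, profile click, photo expand) are deliberately left out rather than guessed.

```python
# Weights as reported from the screenshot; key names are illustrative labels.
WEIGHTS = {
    "favorite": 1.0,
    "repost": 1.0,
    "reply": 0.8,
    "quote": 0.8,
    "share": 0.7,
    "click": 0.5,
    "follow_author": 0.5,
}


def weighted_score(predicted, weights=WEIGHTS):
    """Sum weight * predicted probability over the known actions.

    `predicted` maps action name -> predicted probability in [0, 1].
    Actions missing from `predicted` contribute nothing.
    """
    return sum(w * predicted.get(action, 0.0) for action, w in weights.items())


# A post with a 10% favorite, 5% reply, and 20% click probability:
example = weighted_score({"favorite": 0.10, "reply": 0.05, "click": 0.20})
# 1.0 * 0.10 + 0.8 * 0.05 + 0.5 * 0.20, i.e. roughly 0.24
```

Under these numbers a predicted reply is worth 80 percent of a predicted favorite, so the ranking depends on the mix of predicted actions rather than on a single engagement total.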

Grok shows up in moderation and understanding

The repo analysis threads converge on one point: Grok is not just a chatbot glued onto X, it is threaded through content interpretation.

Between minchoi's Grok list and Everlier's notes, Grok or Grox is attached to:

  • content understanding
  • spam detection
  • safety screening
  • post-category classification
  • multimodal embeddings
  • topic matching
  • policy enforcement

That helps explain why several creator takes moved past follower count and hashtags. If the system is classifying text, image, video, and reply context directly, semantic fit matters more than manual packaging tricks.

Negative signals and native media are first-class inputs

The repo readers did not just focus on upside. They immediately zeroed in on the things that suppress reach.

From minchoi's list and a second linked source, the main negative actions are:

  • not interested
  • mute author
  • block author
  • report

minchoi's native-content post adds another practical read from the code path: media hydration, conversation context, safety, and dwell are all explicit enough that external-link posts likely start with a handicap compared with native video, images, and on-platform threads.
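One way to picture how the negative actions could suppress reach is as penalties subtracted from the positive score. The sketch below assumes that form, which the threads imply but do not spell out here. The penalty magnitudes are not quoted anywhere in this article, so the function takes them as a caller-supplied argument, and the values in the usage lines are placeholders, not repo values.

```python
# Action names are illustrative labels for the four negative signals above.
NEGATIVE_ACTIONS = ("not_interested", "mute_author", "block_author", "report")


def apply_negative_signals(positive_score, predicted, penalty_weights):
    """Subtract weighted negative-action probabilities from a positive score.

    `predicted` maps action name -> predicted probability.
    `penalty_weights` maps each negative action -> a non-negative magnitude.
    """
    penalty = sum(
        penalty_weights[action] * predicted.get(action, 0.0)
        for action in NEGATIVE_ACTIONS
    )
    return positive_score - penalty


# Placeholder magnitudes for illustration only.
placeholder_weights = {action: 1.0 for action in NEGATIVE_ACTIONS}
adjusted = apply_negative_signals(
    0.5, {"mute_author": 0.2, "report": 0.1}, placeholder_weights
)
```

Even with placeholder values the shape is informative: a post with a solid positive score can be pulled down sharply if the model expects a meaningful share of viewers to mute or report.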

Replies, recency, and diversity each have their own lever

Three creator-facing details kept popping up in the thread analysis, and they are more specific than generic advice about posting better.

According to minchoi's reply-weight post, replies appear to carry their own dedicated positive weight. According to minchoi's recency post, recency and post age are visible once a post enters retrieval, even if the repo does not expose the upstream "test audience" logic.

A third lever came from minchoi's author-diversity post, which says feed slices reduce later posts from the same author when too many appear together. minchoi's reply-network post adds that who you reply to also feeds future personalization.
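The author-diversity behavior can be sketched as a repeat-author discount. This is an assumed mechanism for illustration, not the repo's implementation: the function name, the `decay` parameter, and its default of 0.5 are all hypothetical.

```python
from collections import defaultdict


def attenuate_repeat_authors(ranked, decay=0.5):
    """Discount later posts from an author already present in the list.

    `ranked` is a list of (author, score) pairs sorted best-first.
    The n-th repeat from the same author is multiplied by decay ** n,
    then the list is re-sorted by the adjusted scores.
    """
    seen = defaultdict(int)
    adjusted = []
    for author, score in ranked:
        adjusted.append((author, score * decay ** seen[author]))
        seen[author] += 1
    return sorted(adjusted, key=lambda item: item[1], reverse=True)


ranked = [("a", 1.0), ("a", 0.9), ("b", 0.6), ("a", 0.8)]
reordered = attenuate_repeat_authors(ranked)
# Author "b" now outranks the second and third posts from author "a".
```

A multiplicative discount like this leaves an author's best post untouched while making each additional post in the same slice progressively harder to place.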

Open code did not end creator paranoia

The public code drop and the creator complaints landed almost simultaneously.

In LinusEkenstam's analytics post, Linus Ekenstam claimed an 85 to 95 percent drop across metrics after a viral hit, including impressions down 91 percent and engagements down 96 percent in a two-week comparison card. LinusEkenstam's longer complaint says he has seen the pattern three times in under a year and tied it to being "miss labeled by grok" in the past.

The complaint got more pointed when LinusEkenstam's Waymo comparison contrasted his 61,000-view post with a later upload of similar content by a larger account, which passed 3 million views in seven hours. Open-sourcing the ranker did not answer the question he kept asking: why an account enters reach limbo in the first place.

The missing feature is creator-side diagnostics

The final reveal from the reaction cycle is not about Phoenix at all. It is about the tooling creators still do not have.

Ekenstam's ask in his original complaint was simple: surface account-level insights when something has gone wrong. After Musk's public engagement created a new spike, his follow-up asked whether the account would flatline again or return to normal.

That left two truths sitting next to each other. techhalla's reaction called X unusually transparent because the code is public, while LinusEkenstam's follow-up shows that code transparency still does not tell a creator why their own account was scored the way it was on a given day.
