Arena now shows input and output pricing and the maximum context window directly on its text leaderboards, and has published material on how votes become research-grade data. Use the new columns to weigh rank against cost and context limits when choosing models.

Arena has added price and context columns directly to its text leaderboard. According to the announcement, price is shown as input and output cost per 1M tokens, while context shows the maximum context window.
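Because prices are quoted per 1M tokens, estimating the cost of a single request is simple arithmetic. The sketch below is illustrative only: `request_cost` is a hypothetical helper, and the prices and token counts are made-up placeholders, not values taken from the leaderboard.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request, given prices per 1M tokens."""
    total = (input_tokens * input_price_per_m
             + output_tokens * output_price_per_m)
    return total / 1_000_000


# Placeholder prices: $3.00 per 1M input tokens, $15.00 per 1M output tokens.
cost = request_cost(input_tokens=20_000, output_tokens=1_000,
                    input_price_per_m=3.00, output_price_per_m=15.00)
print(f"${cost:.4f}")  # prints $0.0750
```

Input and output prices often differ, so a long-prompt, short-answer workload can cost very differently from a short-prompt, long-answer one at the same listed prices.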
That matters because the leaderboard now does more than rank-order models by Arena score. The leaderboard page shows the new fields alongside model score, vote count, and license, so teams can compare quality against hard deployment constraints such as token budget and long-context support in one place. In the launch post, Arena frames it as a way to compare models "based on what matters for your use case."
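One way to use the combined columns is to treat price and context as hard filters and then rank whatever survives by score. The following is a minimal sketch under that assumption. The `ModelRow` type, the `shortlist` function, and every row are hypothetical; Arena does not publish this code, and the numbers are placeholders rather than real leaderboard entries.

```python
from dataclasses import dataclass


@dataclass
class ModelRow:
    name: str
    score: float         # Arena score
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    context_window: int  # maximum context window, in tokens


def shortlist(rows, max_output_price, min_context):
    """Keep models that satisfy both constraints, best score first."""
    fits = [r for r in rows
            if r.output_price <= max_output_price
            and r.context_window >= min_context]
    return sorted(fits, key=lambda r: r.score, reverse=True)


# Placeholder rows, not real leaderboard data.
rows = [
    ModelRow("model-a", 1450, 5.00, 25.00, 200_000),
    ModelRow("model-b", 1420, 1.00, 4.00, 1_000_000),
    ModelRow("model-c", 1390, 0.25, 1.00, 128_000),
]

picks = shortlist(rows, max_output_price=10.00, min_context=150_000)
print([r.name for r in picks])  # prints ['model-b']
```

Here the top-scoring placeholder model is excluded on price and the cheapest one on context length, which is the kind of trade-off the new columns make visible.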
Arena paired the UI change with public material on its evaluation pipeline. In the linked explainer, it says user prompts are tagged by category, low-quality or suspicious activity is filtered out, and duplicate or manipulative votes are removed before they affect rankings.
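The explainer describes these steps only at a high level. As a purely illustrative sketch of the general idea, and not Arena's actual pipeline, the code below drops votes already flagged as suspicious and keeps only the first vote per user, prompt, and model pair. The field names and the `clean_votes` function are assumptions made for this example.

```python
def clean_votes(votes):
    """Drop flagged votes and repeat votes on the same prompt and model pair."""
    seen = set()
    kept = []
    for v in votes:
        if v["suspicious"]:
            continue  # assumed to be flagged by an upstream check
        # The model pair is unordered, so A-vs-B and B-vs-A count as the same.
        key = (v["user"], v["prompt"], frozenset((v["model_a"], v["model_b"])))
        if key in seen:
            continue  # repeat vote from the same user
        seen.add(key)
        kept.append(v)
    return kept


# Hypothetical vote records.
votes = [
    {"user": "u1", "prompt": "p1", "model_a": "x", "model_b": "y", "suspicious": False},
    {"user": "u1", "prompt": "p1", "model_a": "y", "model_b": "x", "suspicious": False},
    {"user": "u2", "prompt": "p1", "model_a": "x", "model_b": "y", "suspicious": True},
    {"user": "u3", "prompt": "p2", "model_a": "x", "model_b": "z", "suspicious": False},
]

cleaned = clean_votes(votes)
print(len(cleaned))  # prints 2
```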
The follow-up discussion thread adds a little more operational detail. Arena says it tracks many categories beyond the overall score, including "Creative Writing," "Instruction Following," occupational domains, and coding. The same thread says a provider once "switched an endpoint against the policy," and that validation against abuse is "a lot better now since that incident." That does not settle broader benchmark skepticism, but it clarifies that Arena is trying to position the leaderboard as a fresh, multi-category, user-driven benchmark rather than a static test set.
Arena leaderboards now include Price and Context. Price is shown as input / output cost per 1M tokens, and context shows the maximum context window. Compare Arena scores based on what matters for your use case.
This doesn't cover every single thing we do, but to give you an idea youtube.com/watch?v=omT1oh… - very nice video by @cthorrez in our ML team