Artificial Analysis is an independent benchmarking and comparison site for AI models, API providers, and hardware. It publishes the Artificial Analysis Intelligence Index (currently v4.0, combining 10 evaluations: GDPval-AA, τ²-Bench Telecom, Terminal-Bench Hard, SciCode, AA-LCR, AA-Omniscience, IFBench, Humanity's Last Exam, GPQA Diamond, and CritPt), plus specialised Coding and Agentic indices, image/video/speech leaderboards, an Openness Index, API Provider Performance ranks, and the new AA-AgentPerf hardware benchmark for real agent workloads. It is used by buyers choosing models and providers, and by the industry as a shared scoreboard. The public site is free, and evaluation updates are documented in a changelog.
