Skip to content
AI Primer

Z.ai's language model family, covering the GLM line of foundation models and releases such as GLM-4.5, GLM-4.6, GLM-4.7, GLM-5, and GLM-5.1.

Pricing

Official site · May 1, 2026, 1:09 PM
Input / 1M
$1.40
Output / 1M
$4.40
Cached input / 1M
$0.26

Prices are shown on Z.AI’s official pricing page as per-1M-token rates. Cached input storage is listed as limited-time free.

Z.AI’s official pricing page lists GLM-5.1 as a text model priced at $1.4 per 1M input tokens, $0.26 per 1M cached input tokens, and $4.4 per 1M output tokens; cached input storage is marked limited-time free.

View source

Model Intelligence

Arena ranking
42
Benchmarkable
No
Model level
family
Intelligence Index
34.2
Coding Index
32
Math Index
48
MMLU Pro
0.79
GPQA
0.66
HLE
0.06
LiveCodeBench
0.56
SciCode
0.35
AIME 2025
0.48
IFBench
0.55
LCR
0.36
TerminalBench Hard
0.3
TAU2
0.94

Recent stories

6 linked stories
newsPRIMARY2026-04-10
GLM-5.1 ranks #3 on Code Arena

Arena ranked GLM-5.1 third on Code Arena and first among open models, putting it on par with Claude Sonnet 4.6 and within about 20 points of the overall lead. The update gives the open model a new frontier coding benchmark after its initial release and hosting wave.

newsPRIMARY2026-04-08
GLM-5.1 lands on Modal, Together AI, Letta Code, and Tembo

Providers and agent platforms added GLM-5.1 endpoints across Modal, Together AI, Letta Code, Tembo, and Tabbit, with free trials, no-key access, and 99.9% SLA options. Use the new hosting options to test the model for coding and long-horizon agent workloads without waiting on self-hosting.

releasePRIMARY2026-04-07
Z.ai releases GLM-5.1, a 744B open model with 58.4 SWE-Bench Pro and 8-hour agent runs

Z.ai released GLM-5.1, a 744B open model built for long-horizon agentic coding and ranked first among open systems on SWE-Bench Pro. Day-0 support in OpenRouter, Ollama, SGLang, vLLM, OpenCode, and local quantization paths makes it ready to test in existing stacks.

releasePRIMARY2026-04-01
Z.ai launches GLM-5V-Turbo for screenshot coding and GUI-agent tasks

Z.ai released GLM-5V-Turbo, a multimodal coding model for screenshots, video, design drafts, and GUI-agent tasks. It keeps text-coding performance steady while adding native vision support, so teams can test visual workflows without swapping models.

releasePRIMARY2026-03-28
Z.ai releases GLM-5.1 to all Coding Plan users with 5am–11am PT switch window

Z.ai said GLM-5.1 is now available to all GLM Coding Plan users and highlighted a 5am to 11am PT switch window. The update broadens access beyond the initial rollout, though early practitioner tests reported weaker Repo bench and tool-calling behavior than 5.0.

releasePRIMARY2026-03-27
Z.ai releases GLM-5.1 to Coding Plan users with `glm-5.1` model switch

Z.ai made GLM-5.1 available to all Coding Plan users and documented how to route coding agents to it by changing the model name in config. Early harness benchmarks place it near Opus 4.6 on coding evals, but BridgeBench users report much slower tokens per second.

AI PrimerAI Primer

Your daily guide to AI tools, workflows, and creative inspiration.

© 2026 AI Primer. All rights reserved.