Z.ai releases GLM-5.1 to Coding Plan users with `glm-5.1` model switch
Z.ai made GLM-5.1 available to all Coding Plan users and documented how to route coding agents to it by changing the model name in config. Early harness benchmarks place it near Opus 4.6 on coding evals, but BridgeBench users report much lower tokens-per-second throughput.

TL;DR
- Z.ai's launch post says GLM-5.1 is now available to all GLM Coding Plan users, turning what had been a more limited model release into a plan-wide switch for coding agents and IDE workflows.
- The practical activation path is simple: Z.ai's config instructions say users can point existing agent setups at the new model by changing the model name to `glm-5.1` in their config file, including `~/.claude/settings.json` for Claude Code.
- Early coding eval screenshots in the launch chart and a follow-up benchmark post put GLM-5.1 at 45.3 on "coding tasks using Claude Code as the harness," versus 47.9 for Claude Opus 4.6 and 35.4 for GLM-5.
- Tooling support is already showing up: Charm's Crush update added GLM-5.1 as a selectable model, while a BridgeBench repost in the speed report says the model ran at 44.3 tokens/sec and was "the slowest frontier model we've ever benchmarked."
What shipped, and how do you use it?
Z.ai's launch post says GLM-5.1 is available to "ALL GLM Coding Plan users" and links to the plan via its subscription page. The same thread gives the implementation detail engineers actually need: find your agent config file and replace the model name with `glm-5.1`. Z.ai's example points Claude Code users to `~/.claude/settings.json`, which implies the release is meant to slot into existing agent harnesses rather than require a new client.
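As a minimal sketch of what that edit looks like, the fragment below assumes a Claude Code setup already routed to Z.ai's Anthropic-compatible endpoint via environment variables in `settings.json`; the source only says to change the model name to `glm-5.1`, so the surrounding key names and the endpoint URL shown here are assumptions, not part of Z.ai's quoted instructions:

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_MODEL": "glm-5.1"
  }
}
```

If your existing config already names an older GLM model, the switch described in the launch post amounts to replacing that value with `glm-5.1` and restarting the agent; no other client changes are described.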
Charm's Crush post shows that adoption started immediately in downstream tooling. Its screenshot of Crush v0.52.0 exposes GLM-5.1 in a "Switch Model" menu, already configured, suggesting the new model is usable as a drop-in option inside at least one terminal coding assistant.
A separate post from TestingCatalog says GLM-5.1 is also "expected to be open-sourced in the first half of April." Z.ai's launch post does not state that timing, so for now it sits as a third-party expectation rather than a confirmed launch commitment.
How strong is it on coding, and what is the tradeoff?
The headline number from Z.ai's evaluation chart is 45.3 on a coding evaluation "using Claude Code as the harness." In the same chart, Claude Opus 4.6 scores 47.9 and GLM-5 scores 35.4, so GLM-5.1 lands 2.6 points behind Opus 4.6 and nearly 10 points ahead of its predecessor. BridgeMind's summary post framed that as "within striking distance" of the top closed model on this specific harness.
The catch is throughput. A BridgeBench repost in the speed report says GLM-5.1 delivered 44.3 tokens per second and called it "the slowest frontier model we've ever benchmarked." That does not invalidate the coding score, but it does split the current story: promising harness-level coding performance, paired with an early community report that latency may be much worse than engineers expect from other frontier coding models.