Z.ai released GLM-5-Turbo as a faster GLM-5 variant for OpenClaw-style tool use, with 202K context, OpenRouter access, and higher off-peak limits. Try it as a cheaper speed tier for agent workflows, but benchmark completion quality on your own tasks before wider use.

GLM-5-Turbo is not positioned as a generic chat refresh. In Z.ai's developer guide, the company says the model was optimized for OpenClaw-style work, including "reliable external tool invocation," complex instruction decomposition, scheduled and persistent tasks, and "real-time responses." That matters because those are the failure modes that usually show up first in long-running coding and automation agents.
A community read of the docs in the feature summary adds implementation details engineers will care about: 200K-class context, a 128K max-output cap, and support for a thinking mode. Separately, the OpenRouter listing pegs the exact context window at 202,752 tokens and describes the model as optimized for long execution chains and persistent execution rather than short-turn chat.
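One practical consequence of those figures: with a 202,752-token window and a 128K output cap, long prompts and full-length outputs compete for the same budget. A minimal sketch of that arithmetic, assuming 128K means 131,072 tokens (the helper itself is illustrative, not part of any SDK):

```python
# Input-budget check using the context figures reported above:
# 202,752-token window (OpenRouter listing), 128K max output
# (community summary, assumed to mean 131,072 tokens).

CONTEXT_WINDOW = 202_752  # total tokens per request
MAX_OUTPUT = 131_072      # 128K output cap

def max_input_tokens(reserved_output: int) -> int:
    """Tokens left for the prompt after reserving room for the reply."""
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be in [0, {MAX_OUTPUT}]")
    return CONTEXT_WINDOW - reserved_output

# Reserving the full 128K output leaves ~71.7K tokens of input:
print(max_input_tokens(131_072))  # 71680
```

In other words, an agent that wants maximum-length completions gets roughly 71K tokens of working context, not the full 202K.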
Z.ai launched GLM-5-Turbo through its own API and via OpenRouter on day one, with the official note linking both surfaces directly. The OpenRouter listing reports token pricing at $0.96 per million input tokens and $3.20 per million output tokens, alongside routing support rather than a single fixed provider endpoint.
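Those listed rates make per-request costs easy to sanity-check before an eval run. A back-of-envelope estimator at the quoted OpenRouter prices (pure arithmetic; actual billing depends on which provider OpenRouter routes to):

```python
# Cost estimate at the OpenRouter rates quoted above:
# $0.96 per million input tokens, $3.20 per million output tokens.

INPUT_PER_M = 0.96   # USD per 1M input tokens
OUTPUT_PER_M = 3.20  # USD per 1M output tokens

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD at the listed per-million rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# A 50K-token agent context producing a 4K-token reply:
print(round(estimate_cost_usd(50_000, 4_000), 4))  # 0.0608
```

At those rates, output tokens dominate: a completion costs over 3x as much per token as the context that produced it, which matters for agents that emit long tool transcripts.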
The model is also already showing up in downstream tooling. In Charm's Crush post, the Crush client says GLM-5-Turbo is available immediately and that users need "no update required," which suggests model selection can happen at the provider layer instead of through a new desktop release. That shortens evaluation cycles for teams already using routed model backends.
The rollout is staggered. Z.ai's schedule post says Pro users get GLM-5-Turbo this month, while Lite users wait until April for Turbo; the early-access post offers separate application forms for teams that want access sooner. For engineers planning evals, that means API availability and coding-plan availability are not the same thing.
There are two practical caveats in the first-day messaging. First, Z.ai says in its experimental note that GLM-5-Turbo is "currently closed-source," though it also says the capabilities and findings will feed the next open-source release. Second, the usage-limits update says GLM Coding Plan limits are tripled for GLM-5-Turbo during non-peak hours, with the same high-volume capacity as GLM-4.7 available anytime except 2-6 AM ET through April 30. That makes the launch more usable for batchier agent workloads, but only inside a time-boxed promotional window.
Introducing GLM-5-Turbo: A high-speed variant of GLM-5, excellent in agent-driven environments such as OpenClaw. Coding Plan Max: z.ai/subscribe OpenRouter: openrouter.ai/z-ai/glm-5-tur… API: docs.z.ai/guides/llm/glm…
Note: As an experimental version, GLM-5-Turbo is currently closed-source. All capabilities and findings will be incorporated into our next open-source model release.
Rollout Schedule - Pro Users: GLM-5-Turbo arrives this March. - Lite Users: GLM-5 arrives this March. GLM-5-Turbo arrives in April.
Usage limits tripled for GLM-5-Turbo in GLM Coding Plan! Enjoy the same high-volume capacity as GLM-4.7 during non-peak hours. Availability: Anytime except 2–6 AM ET. Ends: April 30.