Z.ai released GLM-5-Turbo as a faster GLM-5 variant for OpenClaw-style tool use, with 202K context, OpenRouter access, and higher off-peak limits. Try it as a cheaper speed tier for agent workflows, but benchmark completion quality on your own tasks before wider use.

GLM-5-Turbo is not positioned as a generic chat refresh. In Z.ai's developer guide, the company says the model was optimized for OpenClaw-style work, including "reliable external tool invocation," complex instruction decomposition, scheduled and persistent tasks, and "real-time responses." That matters because those are the failure modes that usually show up first in long-running coding and automation agents.
A community read of the docs in the feature summary adds implementation details engineers will care about: 200K-class context, a 128K max-output cap, and support for a thinking mode. Separately, the OpenRouter listing pegs the exact context window at 202,752 tokens and describes the model as optimized for long execution chains and persistent execution rather than short-turn chat.
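One practical consequence of those figures: with a 202,752-token window and a 128K output cap, long prompts and full-length outputs compete for the same budget. A minimal sketch of that arithmetic, assuming 128K means 131,072 tokens (the helper itself is illustrative, not part of any SDK):

```python
# Input-budget check using the context figures reported above:
# 202,752-token window (OpenRouter listing), 128K max output
# (community summary, assumed to mean 131,072 tokens).

CONTEXT_WINDOW = 202_752  # total tokens per request
MAX_OUTPUT = 131_072      # 128K output cap

def max_input_tokens(reserved_output: int) -> int:
    """Tokens left for the prompt after reserving room for the reply."""
    if not 0 <= reserved_output <= MAX_OUTPUT:
        raise ValueError(f"reserved_output must be in [0, {MAX_OUTPUT}]")
    return CONTEXT_WINDOW - reserved_output

# Reserving the full 128K output leaves ~71.7K tokens of input:
print(max_input_tokens(131_072))  # 71680
```

In other words, an agent that wants maximum-length completions gets roughly 71K tokens of working context, not the full 202K.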
Z.ai launched GLM-5-Turbo through its own API and via OpenRouter on day one, with the official note linking both surfaces directly. The OpenRouter listing reports token pricing at $0.96 per million input tokens and $3.20 per million output tokens, alongside routing support rather than a single fixed provider endpoint.
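Those listed rates make per-request costs easy to sanity-check before an eval run. A back-of-envelope estimator at the quoted OpenRouter prices (pure arithmetic; actual billing depends on which provider OpenRouter routes to):

```python
# Cost estimate at the OpenRouter rates quoted above:
# $0.96 per million input tokens, $3.20 per million output tokens.

INPUT_PER_M = 0.96   # USD per 1M input tokens
OUTPUT_PER_M = 3.20  # USD per 1M output tokens

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost in USD at the listed per-million rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

# A 50K-token agent context producing a 4K-token reply:
print(round(estimate_cost_usd(50_000, 4_000), 4))  # 0.0608
```

At those rates, output tokens dominate: a completion costs over 3x as much per token as the context that produced it, which matters for agents that emit long tool transcripts.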
The model is also already showing up in downstream tooling. In Charm's Crush post, the Crush client says GLM-5-Turbo is available immediately and that users need "no update required," which suggests model selection can happen at the provider layer instead of through a new desktop release. That shortens evaluation cycles for teams already using routed model backends.
The rollout is staggered. Z.ai's schedule post says Pro users get GLM-5-Turbo this month, while Lite users wait until April for Turbo; the early-access post offers separate application forms for teams that want access sooner. For engineers planning evals, that means API availability and coding-plan availability are not the same thing.
There are two practical caveats in the first-day messaging. First, Z.ai says in its experimental note that GLM-5-Turbo is "currently closed-source," though it also says the capabilities and findings will feed the next open-source release. Second, the usage-limits update says GLM Coding Plan limits are tripled for GLM-5-Turbo during non-peak hours, with the same high-volume capacity as GLM-4.7 available anytime except 2-6 AM ET through April 30. That makes the launch more usable for batchier agent workloads, but only inside a time-boxed promotional window.
Introducing GLM-5-Turbo: A high-speed variant of GLM-5, excellent in agent-driven environments such as OpenClaw. Coding Plan Max: z.ai/subscribe OpenRouter: openrouter.ai/z-ai/glm-5-tur… API: docs.z.ai/guides/llm/glm…
Note: As an experimental version, GLM-5-Turbo is currently closed-source. All capabilities and findings will be incorporated into our next open-source model release.
Rollout Schedule - Pro Users: GLM-5-Turbo arrives this March. - Lite Users: GLM-5 arrives this March. GLM-5-Turbo arrives in April.
Usage limits tripled for GLM-5-Turbo in GLM Coding Plan! Enjoy the same high-volume capacity as GLM-4.7 during non-peak hours. Availability: Anytime except 2–6 AM ET. Ends: April 30.