Providers and agent platforms added GLM-5.1 endpoints across Modal, Together AI, Letta Code, Tembo, and Tabbit, with free trials, no-key access, and 99.9% SLA options. Use the new hosting options to test the model for coding and long-horizon agent workloads without waiting on self-hosting.

You can try the model on Modal's free endpoint, pull the exact API string and pricing from Together's serverless catalog, and browse Z.ai's migration guide for details like tool_stream=true, 200K context, and 128K max output. Letta's docs show the one-command letta server flow for always-on remote agents, while Tembo's docs frame GLM-5.1 as another model option inside existing coding-agent workflows.
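Those catalog details can be folded into a request sketch. This is a minimal illustration, not official sample code: the model string zai-org/GLM-5.1 and the tool_stream, context, and output figures come from the docs cited above, while the message content and everything else in the payload are illustrative assumptions.

```python
import json

# Sketch of an OpenAI-style chat-completions payload for GLM-5.1.
# Model ID is Together's listed API string; tool_stream and the 128K
# output cap are from Z.ai's migration guide. The message is a placeholder.
payload = {
    "model": "zai-org/GLM-5.1",
    "messages": [{"role": "user", "content": "Summarize this repo."}],
    "stream": True,
    "tool_stream": True,   # stream tool-call deltas, per the migration guide
    "max_tokens": 131072,  # 128K max output; prompt + output fit in ~200K context
}

print(json.dumps(payload, indent=2))
```

Any OpenAI-compatible client should accept a payload shaped like this, though whether tool_stream passes through unmodified depends on the host.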
The fastest way GLM-5.1 spread was plain old hosting. Modal's launch post makes the endpoint free to try for a month, and its blog notes that the existing free GLM endpoint was upgraded to 5.1 on April 7.
Together's model page is more specific about what buyers actually get. Its serverless models doc lists zai-org/GLM-5.1 with a 202,752-token context window at $1.40 per 1M input tokens and $4.40 per 1M output tokens, and its highlight thread adds the production packaging: a 99.9% SLA plus serverless and dedicated deployment options.
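At those rates, per-request cost is simple arithmetic. A quick sketch using Together's listed prices; the token counts in the example are made up for illustration:

```python
INPUT_PER_M = 1.40   # $ per 1M input tokens (Together's listed rate)
OUTPUT_PER_M = 4.40  # $ per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at Together's listed GLM-5.1 rates."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# Example: one long-context coding-agent turn, 150K tokens in, 8K out.
# 0.15 * 1.40 + 0.008 * 4.40 = 0.21 + 0.0352
print(f"${request_cost(150_000, 8_000):.4f}")  # → $0.2452
```

The asymmetry matters for agent workloads: long prompts are cheap relative to long generations, so context-heavy, short-output turns stay inexpensive.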
Letta turned the model drop into a workflow pitch. In the thread, Letta says you can switch to GLM-5.1 with /model, then run agents in remote environments so they keep memory and state wherever they execute.
Letta's product post says the same agent can move between machines inside one conversation, carrying conversation history and context repositories with it. The remote environments docs reduce setup to letta server, which registers a local machine so the agent stays reachable from chat.letta.com, another computer, or a phone while still editing files and running shell commands locally.
Tembo's angle was access. Tembo says GLM-5.1 is live only inside its OpenCode and Pi agents and does not require an API key. Tembo's models page explains the split: workspaces can use Tembo-hosted models or bring their own provider keys, so no-key access is a product choice, not a separate public API.
Tabbit showed up the same day via Z.ai's repost, a small but telling detail: GLM-5.1 is already being packaged less like a single endpoint and more like a default model option inside browsers, hosted inference platforms, and stateful coding agents.