GLM-5.2 ships to BrowserCode, Hyper, OpenCode, and Together in 3 days
BrowserCode, Hyper, OpenCode, Together, and other vendors added GLM-5.2 soon after release. That turns the open model into a deployable option across coding, browser automation, and hosted chat.

TL;DR
- Within three days of release, GLM-5.2 had already landed in Hyper, OpenCode, BrowserCode, Together, Ollama Cloud, and Droid, according to charmcli's Hyper post, opencode's leaderboard post, browser_use's BrowserCode post, togethercompute's Together AI post, ollama's Ollama launch post, and RayFernando1337's Droid post.
- The interesting shift is not just availability. browser_use's BrowserCode post put GLM-5.2 into a browser agent harness for a $0.18 task, while browser_use's website-design post paired the text-only model with multimodal QA subagents and kept build plus review under $0.75.
- Providers immediately turned distribution into a speed contest: wafer_ai's provider benchmark claimed 222 output tok/s and 12.6s end-to-end latency, while togethercompute's OpenRouter post highlighted throughput gains on OpenRouter.
- The model's biggest caveat stayed visible through the rollout. browser_use's reply about image support said GLM-5.2 does not support images, while teortaxesTex's ZCode note and cedric_chee's MCP reply pointed to products patching that gap with separate vision tools over MCP.
- The ecosystem moved fast enough that GLM-5.2 was usable across hosted chat, coding agents, local quantized runs, and Claude Code shims almost immediately, per vipulved's hosted chat post, UnslothAI's local run guide, and multimodalart's Claude Code setup.
You can browse Hyper's model page, check wafer's Hermes setup docs, wire it into Claude Code via Hugging Face's router, and even run a 2-bit local build from Unsloth's GGUF release. The weird bit is that a nominally blind coding model still showed up in visual workflows because BrowserCode and ZCode wrapped it with separate vision loops, according to browser_use's website-design post and cedric_chee's ZCode screenshot.
Rollout speed
The headline is distribution. GLM-5.2 shipped as an open-weight coding model, then immediately spread across the usual agent surfaces instead of sitting in a model card waiting for tooling to catch up.
By the end of the window, the rollout already covered a few distinct surfaces:
- Coding terminal and hosted model access in Hyper, with 1M context, zero data retention, and an MIT license called out by charmcli's Hyper post
- OpenCode leaderboard placement, where opencode's leaderboard post said the model reached sixth in three days
- Hosted chat on Together infrastructure, via vipulved's chat app post
- Cloud distribution through Ollama, which ollama's Ollama launch post wired into Claude Code, Codex App, Hermes Agent, and chat
- Agent products like Droid, where RayFernando1337's Droid post and FactoryAI's availability reply confirmed support
That pace matters because open models usually arrive in pieces. GLM-5.2 showed up as something you could actually route through existing harnesses.
BrowserCode
Browser Use made the clearest case for GLM-5.2 as a cheap agent component instead of a prestige benchmark entry.
Three points stood out in those posts:
- Cheap browser-agent runs: browser_use's BrowserCode post reported a near Opus-level score for a BrowserCode task at $0.18.
- Text model plus visual QA: browser_use's website-design post said GLM-5.2 beat Claude Fable 5 at website design by pairing the model with Browser Use v2 multimodal QA subagents.
- Division of labor: per browser_use's website-design post, the subagents reviewed the site, judged aesthetics, and sent back targeted fixes.
The catch is right in the thread. browser_use's reply about image support said GLM-5.2 does not support images, so the impressive visual results are really harness results: text generation on one side, vision-based verification on the other.
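A minimal sketch of that split, assuming a simple generate-review-fix loop. The posts do not publish Browser Use's actual harness, so every name here (build_with_visual_qa, Review, the fake_* stand-ins) is hypothetical and only illustrates the division of labor between a text-only generator and a vision-capable reviewer.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    """Verdict from a vision-capable reviewer (hypothetical shape)."""
    approved: bool
    fixes: List[str] = field(default_factory=list)


def build_with_visual_qa(
    generate: Callable[[str, List[str]], str],
    review: Callable[[str], Review],
    task: str,
    max_rounds: int = 3,
) -> str:
    """Alternate text-only generation with vision-based review."""
    fixes: List[str] = []
    html = ""
    for _ in range(max_rounds):
        html = generate(task, fixes)  # text-only model writes the page
        verdict = review(html)        # multimodal reviewer inspects the render
        if verdict.approved:
            break
        fixes = verdict.fixes         # targeted fixes feed the next round
    return html


# Hypothetical stand-ins so the sketch runs without a model or a browser.
def fake_generate(task: str, fixes: List[str]) -> str:
    return f"<main data-fixes='{len(fixes)}'>{task}</main>"


def fake_review(html: str) -> Review:
    if "data-fixes='0'" in html:
        return Review(approved=False, fixes=["increase heading contrast"])
    return Review(approved=True)


result = build_with_visual_qa(fake_generate, fake_review, "landing page")
```

In a real harness, generate would call the text model and review would screenshot the rendered page for a multimodal model. The generator never sees pixels, only the reviewer's written fixes.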
Providers
Inference vendors treated GLM-5.2 as a serving benchmark on day one.
The concrete numbers in the evidence were unusually specific:
- wafer_ai's provider benchmark claimed 222 output tok/s, versus 173 for the next best provider
- The same benchmark put Wafer at 12.6 seconds end-to-end, versus 16.9 for the runner-up
- togethercompute's OpenRouter post said Together's serving path was tuned for long-context coding and agent workloads, and attached an OpenRouter throughput chart
- Demand spiked fast enough that wafer_ai's capacity reply, wafer_ai's second capacity reply, and ollama's capacity reply were already talking about adding compute
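To put the claimed gap in relative terms, here is a quick calculation using only the four figures quoted from wafer_ai's provider benchmark. The helper name pct_change is illustrative, and the inputs are vendor-reported numbers, not independent measurements.

```python
def pct_change(new: float, old: float) -> float:
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0


# Figures as claimed in wafer_ai's provider benchmark.
throughput_gain = pct_change(222, 173)    # output tok/s, Wafer vs next best
latency_change = pct_change(12.6, 16.9)   # end-to-end seconds, Wafer vs runner-up

print(f"throughput: {throughput_gain:+.1f}%")  # +28.3%
print(f"latency:    {latency_change:+.1f}%")   # -25.4%
```

On those numbers, the lead works out to roughly 28 percent more output throughput and about 25 percent less end-to-end latency than the second-place provider.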
Privacy also became part of the packaging. charmcli's Hyper post advertised zero data retention, wafer_ai's ZDR reply said ZDR could be enabled, and vipulved's logging reply said his Together-backed chat app logged requests unless users switched to anonymous mode.
Vision gap
The rollout also made GLM-5.2's main product gap easier to see.
The posts named the gap and one recurring workaround:
- jeremyphoward's vision caveat called blindness the one big gap
- cedric_chee's MCP reply said ZCode's built-in analyze_image tool connects to a Z.ai vision MCP server that provides GLM-4.6V
- teortaxesTex's ZCode note described the same pattern from the product side, saying the subscription experience "has vision" by calling GLM-4.5V via MCP
- cedric_chee's ZCode screenshot showed the trajectory view exposing that tool call inside a long-running task
So the integration story is slightly messier than the hype cycle implied. GLM-5.2 spread everywhere, but some of the nicest demos were composite systems, not pure model capability.
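For a sense of what "vision over MCP" means on the wire: MCP tool invocations are JSON-RPC 2.0 requests using the tools/call method. The sketch below builds such a request for the analyze_image tool named in cedric_chee's reply. The argument names (image_path, prompt) are assumptions, since the posts do not show the tool's input schema.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a caller-chosen id


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Argument names below are hypothetical; the real schema is not in the posts.
request = tool_call_request(
    "analyze_image",
    {"image_path": "screenshot.png", "prompt": "Describe any layout problems."},
)
```

The text model only ever emits and reads messages like this one. The image itself is handled by whatever vision model sits behind the MCP server.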
Harnesses
A lot of the early excitement came from how easily people were slotting GLM-5.2 into existing harnesses.
A quick inventory from the evidence:
- ollama's Ollama launch post exposed launch commands for Claude Code, Codex App, Hermes Agent, and chat
- multimodalart's Claude Code setup showed a minimal Hugging Face router shim for running claude --model "zai-org/GLM-5.2"
- _lewtun's ML Intern feature list said ML Intern moved to GLM-5.2 and added a YOLO mode for autonomous AI R&D
- dabit3's Devin post listed GLM-5.2 alongside frontier models inside Devin's subscription surfaces
- MaximeRivest's Pi + OpenRouter reply showed Pi plus OpenRouter as another path
The model was getting adopted less like a single app and more like a part you can snap into whatever harness already owns your tools, filesystem, and approval flow.
Local and Claude Code
The last wrinkle is that GLM-5.2 did not stay cloud-only for long.
Three access paths stood out:
- UnslothAI's local run guide said a 2-bit GGUF compressed the model from 1.51 TB to 238 GB while retaining roughly 82 percent accuracy, which is still huge but no longer absurd for 256 GB RAM or mixed RAM/VRAM setups
- aibuilderclub_'s Claude Code env vars published a full environment-variable recipe for swapping Claude Code onto glm-5.2[1m], including auto-compaction and subagent settings
- _akhaliq's Hugging Face promo post noted that Hugging Face Inference Providers briefly made GLM-5.2 free across Zai, Together AI, Novita, Fireworks, and DeepInfra
That combination (local weights if you want control, API shims if you want convenience) is why this rollout felt bigger than a normal provider pickup.
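The size figures from UnslothAI's local run guide are easier to judge as ratios. This is back-of-the-envelope arithmetic on the quoted numbers only, assuming decimal units (1 TB = 1000 GB) and treating "256 GB RAM" as 256 decimal GB. Real memory needs also include context cache and runtime overhead, which are not modeled here.

```python
# Sizes as quoted from UnslothAI's local run guide.
full_gb = 1.51 * 1000   # original weights, 1.51 TB in decimal GB (assumption)
quant_gb = 238          # 2-bit GGUF size
ram_gb = 256            # the RAM tier mentioned in the post

compression_ratio = full_gb / quant_gb         # how many times smaller
remaining_fraction = quant_gb / full_gb * 100  # percent of the original size
headroom_gb = ram_gb - quant_gb                # left over for everything else

print(f"{compression_ratio:.1f}x smaller")           # 6.3x smaller
print(f"{remaining_fraction:.1f}% of original")      # 15.8% of original
print(f"{headroom_gb} GB headroom on a 256 GB box")  # 18 GB headroom
```

The thin headroom is why the guide's mention of mixed RAM/VRAM setups matters: the weights alone nearly fill a 256 GB machine.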