Fresh stories
Briefs for June 25
Top stories this week
Vercel AI Gateway adds GLM-5.2 Fast at 150-250 tok/s
Vercel and Wafer launched a serverless GLM-5.2 endpoint on AI Gateway with 1M context and published pricing. Teams get a high-throughput open-model option inside an existing gateway instead of managing GLM inference directly.


GLM-5.2 adds Perplexity Agent API and Droid support; Baseten reports >280 TPS
GLM-5.2 gained support in Perplexity Agent API and Droid, along with more hosting options, while Baseten reported over 280 TPS and sub-0.8s TTFT. Builders should watch the cost and benchmark data as the model moves into production agent stacks.

GLM-5.2 ranks #1 on DeepSWE with 44% pass@1
Independent results put GLM-5.2 at the top of the open-model DeepSWE board and near the top on debate and post-train evals. Watch token use and long reasoning traces, which can offset its headline price advantage.

Wafer claims GLM-5.2 hits 222 tok/s and 12.6s end-to-end
Wafer said its GLM-5.2 deployment leads Artificial Analysis on throughput and latency, and priced usage at $1.20 input and $4.10 output per million tokens. Compare serverless and dedicated endpoints if you need speed at scale.

ComputeSDK releases 2026 100k Scale Invitational results across 6 sandbox providers
ComputeSDK published results from its 2026 100k Scale Invitational after weeks of reruns and infra tuning across Modal, Tensorlake, Northflank, Declaw AI, E2B, and Isorun. It matters because sandbox and agent infra claims now have a shared public concurrency target instead of vendor-specific load demos.

Skills Spotlight: top by stars
creative-ideation
Generate ideas via named methods from creative practice.
baoyu-comic
Knowledge comics (知识漫画): educational, biography, tutorial.
comfyui
Generate images, video, and audio with ComfyUI — install, launch, manage nodes/models, run workflows with parameter injection. Uses the official comfy-cli for lifecycle and direct REST/WebSocket API for execution.


