OpenRouter said Qwen3.6-Plus became its first model to process roughly 1.4 trillion tokens in a single day, and Qwen said the model also moved to No. 1 on the service. The milestone adds a concrete deployment signal beyond benchmark scores and preview availability, so track usage data alongside evals.

Alibaba's launch post quietly documents a preserve_thinking API flag for multi-step agent runs. The OpenRouter model page lists the production model as free with 1M context and a 65.5K max output. Arena's Code leaderboard changelog added the preview build on April 2, then Code Arena opened public head-to-head testing on agentic web app tasks a day later.
The cleanest new datapoint here is usage, not another synthetic eval. OpenRouter said Qwen3.6-Plus processed about 1.4 trillion tokens in one day, the first model on the service to cross that mark, while Qwen's own account said the model had already climbed to the top spot on the router.
That makes the story more interesting than a benchmark drop. OpenRouter's model page shows the release went live on April 2 with a free tier, 1M context, and routing across providers that can handle long prompts, which helps explain how a brand-new model could ramp so fast.
Alibaba's official technical launch post and Wes Roth's summary line up on the main specs, including the 1M context window and the 65.5K max output.
The launch post adds a detail that did not make it into the social victory lap: Alibaba says its SWE-bench runs used an internal scaffold with bash and file-edit tools, while Terminal-Bench 2.0 used the Harbor/Terminus-2 harness. That makes the headline numbers more like system results than raw model scores.
The rollout moved fast. Arena's leaderboard changelog says qwen3.6-plus-preview hit the Code leaderboard on April 2. By April 3, Arena was pushing Qwen3.6-Plus into live pairwise battles on real-world web development tasks, including shareable HTML and React app generation.
That matters because it gives the model two very different public proving grounds at once: router traffic on one side, human preference voting on agentic web-dev tasks on the other. The Code Arena page is the public surface for that second test.
Alibaba tucked one of the more useful implementation details into the API section of its launch post: preserve_thinking can retain reasoning content from earlier turns, and the company says that can improve consistency on agent tasks while reducing redundant reasoning tokens.
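As a rough sketch of how that flag might be used from an OpenAI-compatible client, the snippet below builds a chat-completions payload that carries preserve_thinking across turns. The flag name comes from the launch post, but the exact placement in the request body and the model id are assumptions for illustration, not confirmed API details.

```python
import json

def build_agent_turn(messages, preserve_thinking=True):
    """Build a chat-completions payload that asks the service to keep
    reasoning content from earlier turns available to later ones.
    The payload shape here is a hypothetical sketch."""
    return {
        "model": "qwen3.6-plus",  # assumed model id, not confirmed
        "messages": messages,
        # Per the launch post, retaining prior-turn reasoning can improve
        # consistency on agent tasks while cutting redundant reasoning tokens.
        "preserve_thinking": preserve_thinking,
    }

payload = build_agent_turn(
    [{"role": "user", "content": "Refactor utils.py, then run the tests."}]
)
print(json.dumps(payload, indent=2))
```

In a real multi-step agent loop, the same flag would presumably be sent on each follow-up turn so earlier reasoning stays in scope rather than being regenerated.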
The same post says Qwen3.6-Plus is available through Model Studio with both OpenAI-compatible and Anthropic-compatible APIs, and names OpenClaw, Claude Code, Qwen Code, Kilo Code, Cline, and OpenCode as supported coding assistants. Even the stray OpenClaw repost in the evidence set shows how quickly that tool had become part of the surrounding conversation, although in that case it was being used for prompt-injection testing rather than benchmarking Qwen itself.
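For the Anthropic-compatible path, wiring up an assistant like Claude Code would presumably look like pointing its standard environment overrides at the Model Studio endpoint. The variable names below are Claude Code's documented overrides, but the endpoint URL and model id are placeholders, not values taken from the launch post.

```shell
# Hypothetical wiring for Claude Code against an Anthropic-compatible endpoint.
# The base URL and model id are placeholders; substitute the real
# Model Studio values from Alibaba's documentation.
export ANTHROPIC_BASE_URL="https://your-model-studio-endpoint.example"  # placeholder
export ANTHROPIC_API_KEY="sk-..."        # your Model Studio key
export ANTHROPIC_MODEL="qwen3.6-plus"    # assumed model id
claude                                   # launch Claude Code
```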