updateApril 3, 2026

OpenRouter says Qwen3.6-Plus hits 1.4T tokens in a day

OpenRouter said Qwen3.6-Plus became its first model to exceed about 1.4 trillion tokens in a day, and Qwen said the model also moved to No. 1 on the service. The milestone adds a concrete deployment signal beyond benchmark scores and preview availability, so track usage data alongside evals.

4 min read

OpenRouter says Qwen3.6-Plus hits 1.4T tokens in a day

TL;DR

OpenRouter said Qwen3.6-Plus became the first model on its platform to clear 1 trillion tokens in a single day, landing around 1.4 trillion on April 3.
Alibaba Qwen said the same release also moved to No. 1 on OpenRouter, giving the launch a deployment signal that sits alongside benchmark claims.
According to Wes Roth's benchmark roundup and Alibaba's official launch post, Qwen3.6-Plus ships with a 1 million token context window, a hybrid linear-attention plus sparse-MoE architecture, and a reported 78.8 on SWE-bench Verified.
Arena's launch tweet and the Arena leaderboard changelog show Qwen3.6-Plus moved into live coding comparisons almost immediately after release.

Alibaba's launch post quietly includes a preserve_thinking API flag for multi-step agent runs. The OpenRouter model page lists the production model as free with 1M context and a 65.5K max output. Arena's Code leaderboard changelog added the preview build on April 2, then Code Arena opened public head-to-head testing on agentic web app tasks a day later.

OpenRouter's first trillion-token day

The cleanest new datapoint here is usage, not another synthetic eval. OpenRouter said Qwen3.6-Plus processed about 1.4 trillion tokens in one day, the first model on the service to cross that mark, while Qwen's own account said the model had already climbed to the top spot on the router.

That makes the story more interesting than a benchmark drop. OpenRouter's model page shows the release went live on April 2 with a free tier, 1M context, and routing across providers that can handle long prompts, which helps explain how a brand new model could ramp so fast.

Benchmarks and architecture

Alibaba's official technical launch post and Wes Roth's summary line up on the main specs:

1 million tokens of context by default
hybrid architecture combining linear attention with sparse mixture-of-experts routing
stronger agentic coding, including repository-scale tasks
upgraded multimodal work on GUIs, documents, video, and visual coding
reported 78.8 on SWE-bench Verified

The launch post adds a detail that did not make it into the social victory lap: Alibaba says its SWE-bench runs used an internal scaffold with bash and file-edit tools, while Terminal-Bench 2.0 used the Harbor/Terminus-2 harness. That makes the headline numbers more like system results than raw model scores.

Preview to public arena in 72 hours

The rollout moved fast. Arena's leaderboard changelog says qwen3.6-plus-preview hit the Code leaderboard on April 2. By April 3, Arena was pushing Qwen3.6-Plus into live pairwise battles on real-world web development tasks, including shareable HTML and React app generation.

That matters because it gives the model two very different public proving grounds at once: router traffic on one side, human preference voting on agentic web-dev tasks on the other. The Code Arena page is the public surface for that second test.

preserve_thinking and assistant hooks

Alibaba tucked one of the more useful implementation details into the API section of its launch post: preserve_thinking can retain reasoning content from earlier turns, and the company says that can improve consistency on agent tasks while reducing redundant reasoning tokens.

The same post says Qwen3.6-Plus is available through Model Studio with both OpenAI-compatible and Anthropic-compatible APIs, and names OpenClaw, Claude Code, Qwen Code, Kilo Code, Cline, and OpenCode as supported coding assistants. Even the stray OpenClaw repost in the evidence set shows how quickly that tool had become part of the surrounding conversation, although in that case it was being used for prompt-injection testing rather than benchmarking Qwen itself.

TL;DR

OpenRouter's first trillion-token day

Benchmarks and architecture

Preview to public arena in 72 hours

preserve_thinking and assistant hooks

Discussion across the web