AI Primer

OpenRouter is an AI model-routing and API platform that provides a single interface for accessing many third-party AI models.


Recent stories

32 linked stories
news | PRIMARY | 2026-05-26
OpenRouter raises $113M Series B as weekly volume hits 25T tokens

OpenRouter announced a $113M Series B led by CapitalG and said weekly routed volume grew from 5T to 25T tokens in six months. The funding matters because the company is pitching itself as production infrastructure for multi-model deployments, not just an API convenience layer.

release | SECONDARY | 2026-05-26
Warp Agent adds OpenRouter URLs and /model aliases for custom endpoints

Warp now lets agents connect directly to an OpenRouter endpoint and switch providers through remembered model aliases. The change reduces endpoint setup friction for teams routing across hosted models inside Warp Agent.

release | SECONDARY | 2026-05-22
Warp adds BYOK to Warp Agent with OpenAI-compatible endpoints

Warp Agent now accepts user-supplied OpenAI, Anthropic, and Gemini keys plus OpenAI-compatible endpoints such as OpenRouter and DeepSeek. The change removes the paid-plan requirement for inference access and gives terminal users more routing options.

release | SECONDARY | 2026-05-21
Qwen3.7 Max launches with 1M context, 35-hour autonomy, and 56.6 AA Index

Alibaba launched Qwen3.7 Max as its new flagship agent model with 1M context, stronger coding and reasoning scores, and cross-harness benchmarks. OpenRouter, Together, AI Gateway, and Kilo support it on day one, making it ready for immediate deployment.

release | PRIMARY | 2026-05-19
OpenRouter adds openrouter:web_search and Parallel results at $0.005 per request

OpenRouter replaced its old web plugin path with agentic web search and fetch tools that use a common schema across models. Migrate to the new tools if you need multi-search turns, domain filtering, or Parallel- or Exa-native routing.

release | PRIMARY | 2026-05-15
OpenRouter adds multi-key BYOK routing with fallback tiers

OpenRouter updated BYOK workspaces so teams can attach multiple provider keys, scope them to specific models or users, and choose prioritized versus fallback use. It changes how rate-limit isolation, dev and prod separation, and failover routing are handled inside one workspace.
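
The prioritized-versus-fallback idea can be illustrated with a minimal sketch. This is not OpenRouter's implementation or API: the `ProviderKey` fields, the tier names, and the `send` callback are all hypothetical, chosen only to show how model-scoped keys might be tried in tier order with failover on rate limits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderKey:
    name: str                      # hypothetical label for the key
    tier: str                      # "prioritized" or "fallback" (assumed tier names)
    models: Optional[set] = None   # None means the key applies to every model


class RateLimited(Exception):
    """Raised by the caller-supplied send function when a key is throttled."""


def eligible_keys(keys, model):
    """Return keys scoped to `model`, prioritized tier first.

    sorted() is stable, so keys keep their original order within a tier.
    """
    scoped = [k for k in keys if k.models is None or model in k.models]
    return sorted(scoped, key=lambda k: 0 if k.tier == "prioritized" else 1)


def route(keys, model, send):
    """Try each eligible key in order; move to the next one on RateLimited."""
    for key in eligible_keys(keys, model):
        try:
            return key.name, send(key, model)
        except RateLimited:
            continue
    raise RuntimeError(f"no usable key for {model}")
```

Scoping a key to a set of models is what gives the dev and prod separation described above: a key that is not scoped to the requested model is never tried, regardless of tier.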

news | SECONDARY | 2026-05-12
Claude Opus 4.7 opens fast mode with ~2.5x speed as Cursor, v0, Droid, and OpenRouter add support

Anthropic rolled fast mode for Opus 4.7 into Claude Code and tools including Cursor, v0, Droid, Conductor, and OpenRouter. Use it where latency matters, but watch pricing: Cursor disclosed a 6x multiplier and others treat it as premium.

release | SECONDARY | 2026-05-12
Perceptron releases Mk1 with 2 FPS video reasoning, 32K context, and $0.15 per 1M input

Perceptron launched Mk1, a multimodal model for video and embodied reasoning with native 2 FPS video, 32K context, and structured spatial outputs. OpenRouter access and the low input price make it usable for deployment, not just demos.

release | SECONDARY | 2026-05-10
Hermes Agent adds LINE gateway with `hermes update` support

Hermes Agent added an official LINE gateway, and OpenRouter published Pareto Code setup docs, while users shared Discord and mobile SSH/TUI workflows. The change matters because Hermes is moving from ranking chatter into more concrete distribution channels and repeatable operator setups.

release | PRIMARY | 2026-05-09
OpenRouter launches Pareto Code with min_coding_score tiers and Nitro routing

OpenRouter released Pareto Code, which routes requests to the cheapest coding model above a chosen score threshold and can re-rank for speed with Nitro. Use the API to trade cost against latency with benchmark-based routing controls.
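
The selection rule described here (cheapest model above a score threshold, optionally re-ranked for speed) can be sketched in a few lines. Everything below is an assumption for illustration: the `Candidate` fields, the `pick_model` function, and the `prefer_speed` flag are hypothetical stand-ins, not the Pareto Code API or its real scoring data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str              # hypothetical model identifier
    coding_score: float    # benchmark score, higher is better
    price_per_mtok: float  # price per million tokens, lower is better
    tokens_per_sec: float  # throughput, higher is better


def pick_model(candidates, min_coding_score, prefer_speed=False):
    """Pick one model among those meeting the score threshold.

    By default the cheapest qualifying model wins; with prefer_speed=True
    the fastest qualifying model wins instead.
    """
    qualified = [c for c in candidates if c.coding_score >= min_coding_score]
    if not qualified:
        raise ValueError(f"no model scores at least {min_coding_score}")
    if prefer_speed:
        return max(qualified, key=lambda c: c.tokens_per_sec)
    return min(qualified, key=lambda c: c.price_per_mtok)
```

Raising the threshold shrinks the qualifying pool, so cost can only stay the same or rise as the required score goes up. That is the cost-versus-quality trade the threshold exposes.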

news | SECONDARY | 2026-05-09
Hermes Agent reports No. 1 OpenRouter rank after v0.13.0

Nous said Hermes Agent hit No. 1 among AI apps on OpenRouter after v0.13.0 shipped and added credential pools for rotating provider keys. Independent posts also tracked migrations from OpenClaw and early routing support in the same stack.

release | SECONDARY | 2026-05-07
Google releases Gemini 3.1 Flash Lite GA with 1M context and $0.25 input pricing

Google moved Gemini 3.1 Flash Lite from preview to GA, and OpenRouter added the model with 1 million context and low-cost multimodal pricing. The preview endpoint now has a shutdown schedule, and users should verify whether the GA model differs from the March preview.

release | PRIMARY | 2026-05-02
OpenRouter launches Response Caching with X-OpenRouter-Cache and 80-300 ms hits

OpenRouter added response caching across chat, responses, messages, and embeddings with per-key isolation, TTL controls, and cached stream replay. The beta matters because identical retries and test runs can return in milliseconds without provider charges or rate-limit hits.
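
To make per-key isolation and TTL expiry concrete, here is a minimal in-memory sketch. It is an assumption-laden illustration, not OpenRouter's cache: the `ResponseCache` class, its method names, and the choice of hashing the API key together with the canonicalized request body are all hypothetical.

```python
import hashlib
import json
import time


class ResponseCache:
    """Toy response cache with per-API-key isolation and a fixed TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be tested without sleeping
        self._store = {}

    @staticmethod
    def _cache_key(api_key, request):
        # Canonical JSON makes logically identical requests hash the same;
        # mixing in the API key keeps one key's entries invisible to another.
        body = json.dumps(request, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(f"{api_key}\n{body}".encode()).hexdigest()

    def get(self, api_key, request):
        key = self._cache_key(api_key, request)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self.clock() >= expires_at:
            del self._store[key]    # drop the stale entry on read
            return None
        return response

    def put(self, api_key, request, response):
        key = self._cache_key(api_key, request)
        self._store[key] = (self.clock() + self.ttl, response)
```

In this sketch an identical retry under the same key is served from memory until the TTL lapses, while the same request under a different key always misses.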

release | SECONDARY | 2026-04-30
Grok 4.3 drops to $1.25/$2.50 with 1M context

Provider and benchmark trackers listed Grok 4.3 with 1M context and lower token pricing, and OpenRouter and Venice exposed it through their APIs. The model undercuts Opus 4.7 and GPT-5.5 on price while independent evaluations show stronger legal and finance performance than general coding.

news | SECONDARY | 2026-04-29
Stripe Projects adds OpenRouter, Daytona, Vercel, and Render provisioning commands

Stripe Projects added agent-friendly provisioning commands for OpenRouter, Daytona, Vercel, Render, and related tools. That lets agents buy model access, sandboxes, and hosting from the terminal instead of dashboard-driven setup.

release | SECONDARY | 2026-04-28
Nemotron 3 Nano Omni launches 30B-A3B multimodal model with 256K context

NVIDIA opened Nemotron 3 Nano Omni, a 30B-A3B model for text, image, audio, and video, with day-one serving support. That lets teams run one open model for perception-heavy agents instead of stitching separate components.

release | SECONDARY | 2026-04-28
Poolside releases Laguna M.1 and XS.2 coding models with 225B/23B and 33B/3B MoEs

Poolside opened Laguna M.1 and Laguna XS.2 as its first public coding models, with Apache 2.0 weights and same-day provider support. That gives teams open coding models that can run locally or through standard serving stacks.

workflow | PRIMARY | 2026-04-26
OpenRouter launches `create-headless-agent` for Bun-based multi-model CLIs

OpenRouter released a new skill and guide that scaffold a headless agent CLI on top of its Agent SDK. The template packages multi-model inference, tool calling, and Bun-based CLI setup into a reusable starting point.

release | SECONDARY | 2026-04-26
Hermes Agent updates model lists via hosted JSON for Nous Portal and OpenRouter

Hermes now pulls provider model lists from hosted JSON so new releases appear without client updates. The same update batch also auto-switches to a local browser when an agent needs localhost access.

release | SECONDARY | 2026-04-24
OpenAI opens GPT-5.5 API with 1M context and Responses support

OpenAI added GPT-5.5 and GPT-5.5 Pro to the API and Playground with 1M context and Responses support. Partners including OpenRouter, Perplexity, GitHub Copilot, Vercel, Warp, and Devin rolled it out the same day, widening access beyond Codex.

release | PRIMARY | 2026-04-23
OpenRouter launches Workspaces with BYOK and per-project routing controls

OpenRouter introduced Workspaces to separate API keys, BYOK, routing, plugins, and observability by environment or team. Billing stays unified at the account level while staging and production settings split cleanly.

release | SECONDARY | 2026-04-23
DeepSeek releases V4-Pro and V4-Flash with 1M context and $0.14/M input

DeepSeek open-sourced V4-Pro and V4-Flash under MIT, with 1M context and aggressive Flash pricing. Day-one support in SGLang, vLLM, and OpenRouter pushes open-weight agentic coding closer to closed frontier models.

release | SECONDARY | 2026-04-23
Tencent launches Hy3 preview with 295B/21B, 256K context, and day-one OpenRouter, vLLM, and SGLang support

Tencent open-sourced Hy3 preview, a 295B MoE with 21B active parameters and 256K context, then pushed it into OpenRouter, OpenCode, OpenClaw, vLLM, and SGLang immediately. That matters because engineers can test and deploy a new reasoning-agent model on day one instead of waiting for the runtime ecosystem to catch up.

release | SECONDARY | 2026-04-22
Xiaomi MiMo-V2.5-Pro releases with 57.2 SWE-Bench Pro, 1M context, and OpenRouter access

Xiaomi’s MiMo-V2.5-Pro and MiMo-V2.5 arrived with million-token context windows, stronger coding and agentic claims, and immediate access through OpenRouter plus agent harnesses. The rollout adds another low-cost Chinese frontier model that engineers can route into coding workflows without waiting for a proprietary IDE deal.

news | SECONDARY | 2026-04-22
GitHub Copilot adds bring-your-own keys across Free, Pro, Business, and Enterprise

GitHub added bring-your-own-model keys to Copilot in VS Code, letting users connect local or cloud providers instead of only bundled models. Teams can keep the Copilot harness while routing prompts through approved backends such as LM Studio or OpenRouter.

news | PRIMARY | 2026-04-21
OpenRouter adds Firecrawl web search with full-page markdown grounding

OpenRouter added Firecrawl as a search provider, letting models ground responses in scraped full web pages instead of snippet-only search. The launch folds crawling into the existing plugin settings flow and includes a capped free plan on the Firecrawl side.

news | SECONDARY | 2026-04-20
Kimi K2.6 adds day-one support across vLLM, SGLang, Ollama, and OpenRouter

Kimi K2.6 shipped across vLLM, SGLang, OpenRouter, Baseten, Ollama, OpenCode, Hermes Agent, and Droid within hours of launch. That cuts the usual lag between model release and production trials, so mixed-provider agent stacks can test it sooner.

news | SECONDARY | 2026-04-11
Hermes Agent ranks #1 on OpenRouter for coding apps

Nous said Hermes became the top coding app on OpenRouter while shipping an OpenClaw migration patch, Telegram agent-to-agent messaging, and new memory controls. If you run long-lived agents, watch the migration path and memory settings before moving chats or skills hubs.

release | SECONDARY | 2026-04-07
Z.ai releases GLM-5.1, a 744B open model with 58.4 SWE-Bench Pro and 8-hour agent runs

Z.ai released GLM-5.1, a 744B open model built for long-horizon agentic coding and ranked first among open systems on SWE-Bench Pro. Day-0 support in OpenRouter, Ollama, SGLang, vLLM, OpenCode, and local quantization paths makes it ready to test in existing stacks.

news | PRIMARY | 2026-04-03
OpenRouter says Qwen3.6-Plus hits 1.4T tokens in a day

OpenRouter said Qwen3.6-Plus became its first model to reach roughly 1.4 trillion tokens in a day, and Qwen said the model also moved to No. 1 on the service. The milestone adds a concrete deployment signal beyond benchmark scores and preview availability, so track usage data alongside evals.

release | SECONDARY | 2026-03-15
Z.ai releases GLM-5-Turbo with 202K context for OpenClaw-style agent workflows

Z.ai released GLM-5-Turbo as a faster GLM-5 variant for OpenClaw-style tool use, with 202K context, OpenRouter access, and higher off-peak limits. Try it as a cheaper speed tier for agent workflows, but benchmark completion quality on your own tasks before wider use.

release | SECONDARY | 2026-03-12
NVIDIA releases Nemotron 3 Super on OpenRouter with 1M context and free access

NVIDIA released Nemotron 3 Super, a 120B open model with 12B active parameters and a 1M-token window, on OpenRouter with free access. Evaluate it for low-cost agent backends, especially if you need local or self-hosted deployment options.



© 2026 AI Primer. All rights reserved.