AI Primer

MiniMax M3 adds OpenCode, Hermes Agent, Atomic Chat, and Vercel AI Gateway support

A day after MiniMax M3 launched, OpenCode, Hermes Agent, Flowith, Atomic Chat, Kilo Code, Cloudflare AI Gateway, and Vercel AI Gateway shipped support. That breadth shows M3 plugged into agent harnesses and routing layers immediately, not just its own API.


TL;DR

You can read MiniMax's official announcement, inspect Vercel's AI Gateway changelog, and check the live Vercel model page. The weirder reveal is that the launch story was not just "new model, new API": OpenCode had free access first, Hermes auto-added it to model pickers, and Context Arena immediately found that the advertised 1M window was not fully exposed in practice.

OpenCode and Hermes Agent

OpenCode and Hermes made the quickest point about M3's launch: this model showed up inside existing agent harnesses, not just on MiniMax's own surface.

OpenCode's post went up on May 31 and said users could try M3 "right now" for free. Hermes followed by wiring M3 into Nous Portal, OpenRouter, and direct MiniMax providers, with Teknium's Hermes Agent post noting that it appeared in the model picker automatically.

That matches the product framing in MiniMax's launch post, which pushes M3 as a coding and agent model first, then pairs it with MiniMax Code as the preferred agent product. The external integrations made that claim feel real faster than a benchmark chart would.

Atomic Chat and Flowith

Atomic Chat and Flowith turned the launch into something you could poke at from two different angles: a multimodal build demo and a side-by-side comparison UI.

According to testingcatalog's Atomic Chat demo, Atomic tested M3 on a task that started with a hand-drawn napkin sketch and ended with game logic, UI, and a playable HTML platformer in one pass for $0.028. That is promotional material, but it is still a concrete demo of the multimodal coding pitch.

Testingcatalog's companion post described M3 as MiniMax's latest model with 1M-token context, native multimodality, agentic reasoning, and Sparse Attention, while flowith's support post stressed 1M context and compare mode. Together they show where M3 was being positioned on day one: not as a chatbot replacement, but as a model for agent runs and multimodal coding workflows.

AI Gateway and routing layers

The gateway rollout was unusually broad for day one, and it matters because gateways decide where a model becomes one fetch away instead of one vendor integration away.

Vercel's changelog says M3 is available through AI Gateway with the slug minimax/minimax-m3, and the official announcement describes support for image input, software engineering tasks, terminal tool use, and agentic web browsing. The Vercel model page also lists 1M context and pricing that resolves to $0.60 per million input tokens and $2.40 per million output tokens before the launch discount.
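To make the list pricing concrete, here is a minimal sketch of the arithmetic in Python. The two rates and the slug come from the Vercel pages quoted above; the helper name estimate_cost_usd is hypothetical, and the sketch ignores the launch discount and any caching or tiering the gateway may apply.

```python
from decimal import Decimal

# Slug and list prices as quoted above (USD per million tokens,
# before the launch discount). Everything else here is illustrative.
MODEL_SLUG = "minimax/minimax-m3"
INPUT_USD_PER_MTOK = Decimal("0.60")
OUTPUT_USD_PER_MTOK = Decimal("2.40")


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: list-price cost of one request.

    Assumes flat per-token pricing with no discount, caching, or tiers.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    million = Decimal(1_000_000)
    return (
        input_tokens * INPUT_USD_PER_MTOK
        + output_tokens * OUTPUT_USD_PER_MTOK
    ) / million


if __name__ == "__main__":
    # An agent run with a large prompt and a modest completion.
    print(MODEL_SLUG, estimate_cost_usd(200_000, 20_000))
```

Under those assumptions, a run with 200,000 input tokens and 20,000 output tokens comes to $0.168 at list price, and a prompt that filled the full advertised 1M window would cost $0.60 in input tokens alone.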

The surrounding posts fill in the rest of the distribution map: Kilo Code and Cloudflare AI Gateway also shipped support within the same day.

This is Christmas-come-early behavior for agent infrastructure people. M3 did not wait for one canonical endpoint to win distribution. It hit harnesses, gateways, and comparison tools all at once.

Benchmarks and early hands-on reports

MiniMax's official blog anchored the launch around coding and agent benchmarks, and the ecosystem repeated those numbers almost verbatim.

The official benchmark set from MiniMax's launch post and WesRoth's benchmark summary breaks down like this:

  • SWE-Bench Pro: 59.0%
  • Terminal-Bench 2.1: 66.0%
  • SWE-fficiency: 34.8%
  • KernelBench Hard: 28.8%
  • MCP Atlas: 74.2%

Vercel added a different kind of eval signal. According to vercel_dev's Next.js evals post, M3 was the highest-ranking open-source model on Next.js evals, at #6 overall, with a 75% base score that rose to 96% with AGENTS.md.

The early user tone was more skeptical than celebratory. bridgemindai's hands-on post said benchmarks get "benchmaxed" and started a live test on a real codebase with real TODOs and no cherry-picking. That reaction matters because it is exactly how coding model launches get stress-tested now: not by asking whether the chart is high, but by dropping the model into an existing harness and seeing whether it survives real repo work.

The 1M context caveat

The sharpest new finding on day one was not another integration. It was that the advertised 1M context window was not fully reachable everywhere M3 was already being routed.

Context Arena tested M3 on 8-needle GDM-MRCRv2 and reported a big jump over M2.7 through the 8k to 64k range, with AUC at 128k rising from 25.2% to 39.2%. The same post said OpenRouter was exposing only about 524k max context, which blocked 512k and 1M runs in that setup.

The follow-up was blunter. According to DillonUzar's follow-up repost, pingToven said the MiniMax API itself did not currently serve the full 1M context and would be updated once available. That leaves an awkward but useful split in the launch record: MiniMax's official announcement sells 1M context as a defining property, while the first public eval thread found that the full window was not yet exposed end to end.

That does not erase the rollout story. It sharpens it. M3 arrived everywhere at once, but one of its flagship specs was still catching up to the distribution footprint.
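The practical consequence of a capped window can be sketched as a preflight check that filters planned context sizes against whatever a provider actually exposes. This is an illustration only: the value 524,288 is an assumption that reads "about 524k" as 512 × 1024, the intermediate bin sizes (16k, 32k, 256k) are assumed doublings, and the reserved-token margin is a hypothetical explanation for why a 512k run would not fit. The source does not say why that run was blocked.

```python
K = 1024

# Context sizes for a long-context eval sweep. 8k-64k, 128k, 512k and 1M
# are mentioned in the article; the other bins are assumed doublings.
BINS = [8 * K, 16 * K, 32 * K, 64 * K, 128 * K, 256 * K, 512 * K, 1024 * K]

# Assumption: "about 524k" interpreted as 512 * 1024 tokens.
ASSUMED_PROVIDER_MAX = 524_288


def runnable_bins(bins, provider_max_context, reserved_tokens=0):
    """Return the context sizes that fit under the exposed window.

    reserved_tokens is a hypothetical margin for the model's output
    and any prompt scaffolding added on top of the test context.
    """
    if reserved_tokens < 0:
        raise ValueError("reserved_tokens must be non-negative")
    return [b for b in bins if b + reserved_tokens <= provider_max_context]


if __name__ == "__main__":
    ok = runnable_bins(BINS, ASSUMED_PROVIDER_MAX, reserved_tokens=4096)
    print([f"{b // K}k" for b in ok])
```

With any nonzero reserve, the 512k bin no longer fits under a 524,288-token cap, and the 1M bin does not fit regardless, which is consistent with the blocked runs described above.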
