NVIDIA released Nemotron 3 Super, a 120B open model with 12B active parameters and a 1M-token window, on OpenRouter with free access. Evaluate it for low-cost agent backends, especially if you need local or self-hosted deployment options.

The practical news is simple: Nemotron 3 Super is already callable through OpenRouter, and Teknium's Hermes setup shows one immediate path into agent workflows by pasting nvidia/nemotron-3-super-120b-a12b:free into Hermes Agent's custom model field. That makes this less of a research release and more of a drop engineers can test today.
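Beyond Hermes Agent, the same slug is callable from code through OpenRouter's OpenAI-compatible chat completions endpoint. Below is a minimal sketch, assuming an `OPENROUTER_API_KEY` environment variable; the model slug comes from the listing above, while the endpoint, headers, and payload shape are standard OpenRouter conventions, not anything specific to this release.

```python
import json
import os
import urllib.request

# Free-tier slug from the OpenRouter listing.
MODEL = "nvidia/nemotron-3-super-120b-a12b:free"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for OpenRouter."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Outline a three-step plan to triage a failing CI job.",
                    os.environ.get("OPENROUTER_API_KEY", "sk-or-..."))
print(req.full_url)
print(json.loads(req.data)["model"])
# To actually send it (needs a real key and network access):
#   with urllib.request.urlopen(req) as resp:
#       reply = json.loads(resp.read())["choices"][0]["message"]["content"]
```

Using only the standard library keeps the sketch dependency-free; swapping in the `openai` client with `base_url="https://openrouter.ai/api/v1"` is the more common production choice.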
According to the OpenRouter page, the model is a 120B open hybrid MoE system with only 12B parameters active at inference, a 1M-token context window, and multi-token prediction aimed at long-context reasoning and multi-step planning. The same listing says it ships with weights, datasets, and training recipes under the NVIDIA Open License, and reports roughly 28 tokens/sec average throughput alongside benchmark strength on AIME 2025, TerminalBench, and SWE-Bench (benchmark summary).
The first concrete implementation signal comes from OpenHands: its team says it had early access, that the model "works well," and that it is "excited to have a great new locally deployable LLM" (early access note). That lines up with the release's strongest engineering angle: a big-context open model positioned for agent backends that teams may want to run outside closed hosted APIs.
The performance case is still mostly benchmark-driven, but it is specific enough to watch. Wes Roth's AA chart cites an Artificial Analysis score of 36 for Nemotron 3 Super versus 33 for gpt-oss-120B, and claims it is "roughly 10% faster per GPU," while OpenRouter amplified a separate report that it is, on average, the best model on PinchBench for openclaw (PinchBench post). Nathan Lambert's interview post also framed this release as "a LONG time coming," pointing to NVIDIA's broader open-model push rather than a one-off model drop.
Run Nemotron as your agent driver in Hermes Agent for free with OpenRouter: openrouter.ai/nvidia/nemotro… Just type `hermes model`, select OpenRouter, click "custom model name," and paste: nvidia/nemotron-3-super-120b-a12b:free
NVIDIA releases Nemotron-3-Super, a new 120B open hybrid MoE model. Nemotron-3-Super-120B-A12B has a 1M-token context window and achieves competitive agentic coding and chat performance. Run on ~64GB RAM. GGUF: huggingface.co/unsloth/NVIDIA… Guide: unsloth.ai/docs/models/ne…
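The ~64GB RAM figure is consistent with back-of-the-envelope weight math: 120B parameters at roughly 4 bits each is about 60 GB before KV cache and runtime overhead. A quick sketch (the bit-widths are illustrative assumptions, not numbers from the release):

```python
# Rough weight-memory estimates for a 120B-parameter model at common
# quantization widths. Ignores KV cache, activations, and runtime overhead.
PARAMS = 120e9

def weights_gb(bits_per_param: float) -> float:
    """Decimal gigabytes needed just to hold the weights."""
    return PARAMS * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {weights_gb(bits):6.0f} GB")
# 16-bit: 240 GB, 8-bit: 120 GB, 4-bit: 60 GB -- the 4-bit figure lines
# up with the "~64GB RAM" GGUF claim once overhead is added.
```

Note that only 12B parameters are active per token, which helps throughput, not resident memory: all 120B weights still need to be loaded.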
Nemotron 3 Super is the new "Gold Standard" for open-weight intelligence, hitting a 36 on the Intelligence Index while remaining highly efficient. It is smarter than GPT-OSS-120B while being roughly 10% faster per GPU.
NVIDIA has released Nemotron 3 Super, a 120B (12B active) open-weights reasoning model that scores 36 on the Artificial Analysis Intelligence Index with a hybrid Mamba-Transformer MoE architecture. We were given access to this model ahead of launch and evaluated it across…