NVIDIA released Nemotron 3 Super, a 120B open model with 12B active parameters and a 1M-token window, on OpenRouter with free access. Evaluate it for low-cost agent backends, especially if you need local or self-hosted deployment options.

The practical news is simple: Nemotron 3 Super is already callable through OpenRouter, and Teknium's Hermes setup shows one immediate path into agent workflows by pasting nvidia/nemotron-3-super-120b-a12b:free into Hermes Agent's custom model field. That makes this less of a research release and more of a drop engineers can test today.
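Beyond Hermes Agent, the same slug is callable from code through OpenRouter's OpenAI-compatible chat completions endpoint. Below is a minimal sketch, assuming an `OPENROUTER_API_KEY` environment variable; the model slug comes from the listing above, while the endpoint, headers, and payload shape are standard OpenRouter conventions, not anything specific to this release.

```python
import json
import os
import urllib.request

# Free-tier slug from the OpenRouter listing.
MODEL = "nvidia/nemotron-3-super-120b-a12b:free"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for OpenRouter."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Outline a three-step plan to triage a failing CI job.",
                    os.environ.get("OPENROUTER_API_KEY", "sk-or-..."))
print(req.full_url)
print(json.loads(req.data)["model"])
# To actually send it (needs a real key and network access):
#   with urllib.request.urlopen(req) as resp:
#       reply = json.loads(resp.read())["choices"][0]["message"]["content"]
```

Using only the standard library keeps the sketch dependency-free; swapping in the `openai` client with `base_url="https://openrouter.ai/api/v1"` is the more common production choice.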
According to the OpenRouter page, the model is a 120B open hybrid MoE system with only 12B parameters active at inference, a 1M-token context window, and multi-token prediction aimed at long-context reasoning and multi-step planning. The same listing says it ships with weights, datasets, and training recipes under the NVIDIA Open License, and reports roughly 28 tokens/sec average throughput alongside benchmark strength on AIME 2025, TerminalBench, and SWE-Bench (benchmark summary).
The first concrete implementation signal comes from OpenHands: its team says it had early access, that the model "works well," and that it is "excited to have a great new locally deployable LLM" (early access note). That lines up with the release's strongest engineering angle: a big-context open model positioned for agent backends that teams may want to run outside closed hosted APIs.
The performance case is still mostly benchmark-driven, but it is specific enough to watch. Wes Roth's AA chart cites an Artificial Analysis score of 36 for Nemotron 3 Super versus 33 for gpt-oss-120B, and claims it is "roughly 10% faster per GPU," while OpenRouter amplified a separate report that it is, on average, the best model on PinchBench for openclaw (PinchBench post). Nathan Lambert's interview post also framed this release as "a LONG time coming," pointing to NVIDIA's broader open-model push rather than a one-off model drop.
Run Nemotron as your agent driver in Hermes Agent for free with OpenRouter: openrouter.ai/nvidia/nemotro… Just type `hermes model`, select OpenRouter, click "custom model name," and paste: nvidia/nemotron-3-super-120b-a12b:free
NVIDIA releases Nemotron-3-Super, a new 120B open hybrid MoE model. Nemotron-3-Super-120B-A12B has a 1M-token context window and achieves competitive agentic coding and chat performance. Run on ~64GB RAM. GGUF: huggingface.co/unsloth/NVIDIA… Guide: unsloth.ai/docs/models/ne…
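The ~64GB RAM figure is consistent with back-of-the-envelope weight math: 120B parameters at roughly 4 bits each is about 60 GB before KV cache and runtime overhead. A quick sketch (the bit-widths are illustrative assumptions, not numbers from the release):

```python
# Rough weight-memory estimates for a 120B-parameter model at common
# quantization widths. Ignores KV cache, activations, and runtime overhead.
PARAMS = 120e9

def weights_gb(bits_per_param: float) -> float:
    """Decimal gigabytes needed just to hold the weights."""
    return PARAMS * bits_per_param / 8 / 1e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {weights_gb(bits):6.0f} GB")
# 16-bit: 240 GB, 8-bit: 120 GB, 4-bit: 60 GB -- the 4-bit figure lines
# up with the "~64GB RAM" GGUF claim once overhead is added.
```

Note that only 12B parameters are active per token, which helps throughput, not resident memory: all 120B weights still need to be loaded.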
Nemotron 3 Super is the new "Gold Standard" for open-weight intelligence, hitting a 36 on the Intelligence Index while remaining highly efficient. It is smarter than GPT-OSS-120B while being roughly 10% faster per GPU.
NVIDIA has released Nemotron 3 Super, a 120B (12B active) open-weights reasoning model that scores 36 on the Artificial Analysis Intelligence Index with a hybrid Mamba-Transformer MoE architecture. We were given access to this model ahead of launch and evaluated it across…