NVIDIA NIM
Deploy AI models with optimized inference microservices
An NVIDIA software product for deploying optimized AI inference as containerized microservices with standard APIs across clouds, data centers, and workstations.
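NIM containers expose an OpenAI-compatible HTTP API, so a deployed model can be queried with standard chat-completions requests. The sketch below builds such a request; the endpoint URL and model id are illustrative assumptions, not values from this page — substitute your running container's address and model.

```python
import json

# Assumed local NIM deployment; replace with your container's address.
NIM_URL = "http://localhost:8000/v1/chat/completions"
# Example model id; a real deployment reports its own id at /v1/models.
MODEL = "meta/llama-3.1-8b-instruct"

def build_chat_request(prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style chat-completions payload for a NIM endpoint."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Say hello in one sentence.")
print(json.dumps(payload, indent=2))

# To actually send it (requires a running NIM container):
#   import urllib.request
#   req = urllib.request.Request(
#       NIM_URL,
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(urllib.request.urlopen(req).read().decode())
```

Because the API mirrors the OpenAI schema, existing OpenAI client libraries can usually be pointed at a NIM endpoint by changing only the base URL.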

Recent stories
MiniMax open-sourced M2.7 and published coding and agent benchmark claims, including 56.22% on SWE-Pro and 57.0% on Terminal Bench 2. Day-zero support from SGLang, vLLM, Ollama Cloud, Together AI, and NVIDIA NIM makes it easy to try on common serving stacks.
NVIDIA introduced a coalition of labs and platform vendors to co-develop open frontier models, including Mistral, LangChain, Perplexity, Cursor, Reflection, Sarvam, and Black Forest Labs. Watch this if you want open-model efforts tied to DGX Cloud, NIM, and production tooling rather than weights alone.