MiniMax
MiniMax's multimodal AI model family.
Recent stories
Kilo's Product Week bundle added Agent Manager for isolated git worktrees, Kilo Console beta, REVIEWS.md memory hooks, and a balance-based MiniMax M3 plan. The bundle puts parallel agent runs, browser control, and plan provisioning into one shipped release.
MiniMax published M3 weights on Hugging Face with 428B total parameters, 23B active parameters, 1M context, and multimodal support. Unsloth quickly added local GGUF builds, so teams can try 2-bit runs with 138 GB of RAM or VRAM and 3-bit runs with 165 GB.
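As a rough sanity check on those figures, the sketch below converts each quoted memory requirement into an effective number of bits per parameter. It assumes that 1 GB means 10^9 bytes and that the whole footprint is model weights, which ignores KV cache and runtime overhead; the function name and constants are illustrative, not part of any Unsloth or MiniMax tooling.

```python
def effective_bits_per_weight(footprint_gb: float, total_params_billion: float) -> float:
    """Approximate bits per parameter, assuming the entire footprint is weights.

    Assumption: 1 GB = 1e9 bytes. Real deployments also need memory for the
    KV cache and runtime buffers, so this is only a rough estimate.
    """
    total_bits = footprint_gb * 1e9 * 8
    total_params = total_params_billion * 1e9
    return total_bits / total_params


TOTAL_PARAMS_B = 428   # total parameters reported for M3, in billions
ACTIVE_PARAMS_B = 23   # active parameters per token, in billions

# Memory figures quoted for the local GGUF builds.
for label, footprint_gb in [("2-bit", 138), ("3-bit", 165)]:
    bits = effective_bits_per_weight(footprint_gb, TOTAL_PARAMS_B)
    print(f"{label} build: ~{bits:.2f} bits per weight")

# Share of parameters that are active for any single token.
active_share = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active share: ~{active_share:.1%}")
```

Under these assumptions the quoted sizes work out to roughly 2.6 and 3.1 bits per weight, which is consistent with mixed-precision quantization where some tensors are kept above the nominal bit width.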
A seeded code-audit benchmark found MiniMax M3 and the cheapest Claude Opus 4.8 run each caught 13 of 17 planted bugs, but at sharply different cost. The results also showed models found different bugs, and higher reasoning settings did not reliably improve cost efficiency.
OpenClaw 2026.6.1 added a native Windows node host, a Skill Workshop for reviewable agent-learned skills, and Workboard orchestration. The update extends OpenClaw beyond Unix-heavy setups and moves more agent management into built-in tools.
A day after MiniMax M3 launched, OpenCode, Hermes Agent, Flowith, Atomic Chat, Kilo Code, Cloudflare AI Gateway, and Vercel AI Gateway shipped support. That breadth shows M3 plugged into agent harnesses and routing layers immediately, not just its own API.
A day after MiniMax M3 launched, independent testers posted mixed results: cheap demos and design tasks worked, but several coding runs stalled, broke features, or used more tokens than expected. New external numbers added nuance, with Context Arena results falling sharply beyond 64k context and one DeepSWE run passing 15 of 113 tasks.
MiniMax shipped M3 with a 1M-token context window, native multimodal input, and frontier coding claims across SWE-Bench Pro, Terminal Bench, and MCP Atlas. It also appeared on OpenRouter, Ollama Cloud, Venice, Hermes, Cline, Together, and Arena on day one.
MiniMax started winding down its M2 series while previewing M3 and a new sparse-attention design with large long-context speedup claims. The teaser points to a fresh open-model race around block selection, GQA, and million-token serving efficiency.
MiniMax M2.7 moved from announcement to deployment, with GGUF guidance for 128 GB local systems and same-day availability on Together, Fireworks, Hugging Face, and ModelScope. Use the local and managed serving options now, but check the non-commercial license before adopting the 230B model.
MiniMax open-sourced M2.7 and published coding and agent benchmark claims including 56.22% SWE-Pro and 57.0% Terminal Bench 2. Day-zero support from SGLang, vLLM, Ollama Cloud, Together AI, and NVIDIA NIM makes it easy to try on common serving stacks.
Nous Research added MiniMax M2.7, Xiaomi’s MiMo V2 Pro, a SuperMemory plugin, and expanded Manim support to Hermes through partner integrations. The additions give users new hosted model options, a shared memory backend, and more complete technical-animation tooling to try in workflows.
MiniMax introduced a flat-rate Token Plan that covers text, speech, music, video, and image APIs under one subscription. It gives teams one predictable bill across modalities and can be used in third-party harnesses, not just MiniMax apps.
Skyler Miao said MiniMax M2.7 open weights are due in roughly two weeks, with updates tuned for agent tasks. Separate replies also confirm a multimodal M3, so local-stack builders should watch both the release and the benchmark setup.