
Mistral releases Medium 3.5 with 128B weights, 256K context, and Work Mode

Mistral shipped Medium 3.5 as a 128B dense model with 256K context, configurable reasoning, remote agents in Vibe, and Work Mode in Le Chat. The release broadens Mistral’s agent stack, though early comparisons question its price-performance against newer open rivals.


TL;DR

  • Mistral's launch post says the company shipped Mistral Medium 3.5 in public preview as a dense 128B model with a 256K context window, configurable reasoning effort, and open weights under a modified MIT license; the model card lists API pricing at $1.50 per million input tokens and $7.50 per million output tokens.
  • According to Mistral's launch post, Medium 3.5 is now the default model in Le Chat and Mistral Vibe, while the Hugging Face model card says it replaces Mistral Medium 3.1 and Magistral in Le Chat, plus Devstral 2 in Vibe.
  • MistralAI's post and the official launch details both frame the bigger product story as remote coding agents in Vibe, including cloud runs launched from the CLI and a "teleport" path that moves a live local session into a remote runtime without losing state.
  • testingcatalog's screenshot and WesRoth's post show Le Chat adding a Work mode preview, where Mistral says the agent can use multiple tools in parallel across connected apps and keep long multi-step tasks running beyond a normal chat reply.
  • Early reaction split fast: kimmonismus argued the dense 128B design looks like an enterprise reliability play, while eliebakouch's pricing chart and maximelabonne questioned the price-performance story and the benchmark framing against newer open rivals.

You can read Mistral's full launch post, skim the official model card, and check the Hugging Face page where Mistral spells out the unified-model story. The release also surfaced some useful edges quickly: lmsysorg's SGLang post showed day-zero serving support with Mistral-specific tool and reasoning parsers, winglian's repost of Axolotl support claimed single-GPU QLoRA fine-tuning landed immediately, and AiBattle_'s prelaunch spot caught the model in a vLLM pull request before the announcement.

Mistral Medium 3.5

Mistral is calling this its first "flagship merged model": one dense 128B set of weights for instruction following, reasoning, and coding, with text and image input, text output, and a 256K context window. In the official announcement, the company says reasoning effort is configurable per request, so the same model can run in quick-response or longer test-time-compute modes.
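
Mistral hasn't published the exact request shape for that knob, but if it rides on the standard chat-completions endpoint, a call might look like the sketch below. The endpoint and auth header follow Mistral's existing API; the model identifier, the `reasoning_effort` field, and its values are assumptions based on the launch post's description, not confirmed API.

```python
# Hedged sketch: calling Medium 3.5 with a per-request reasoning-effort knob.
# The endpoint and auth follow Mistral's standard chat-completions API; the
# "reasoning_effort" field and its values are assumptions, not documented API.
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-medium-3.5",  # assumed API identifier
        "messages": [
            {"role": "user", "content": "Plan a migration from REST to gRPC."}
        ],
        "reasoning_effort": "high",     # hypothetical: e.g. "low" | "high"
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```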

The Hugging Face model card adds two concrete product changes that were easy to miss in the social posts: Medium 3.5 replaces Mistral Medium 3.1 and Magistral in Le Chat, and it replaces Devstral 2 in Vibe. lmsysorg also highlighted the serving hooks Mistral shipped around it, including native function calling, JSON output, and a reasoning parser already wired up in SGLang.
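
lmsysorg's post doesn't include client code, but SGLang exposes an OpenAI-compatible endpoint, so exercising the function-calling hook could look roughly like this. The local URL and port (SGLang's default), the tool schema, and the ticket example are illustrative; the model name is the one from the vLLM PR diff and must match whatever the server actually loaded.

```python
# Illustrative sketch: hitting a local SGLang server (OpenAI-compatible API)
# to exercise native function calling. The base_url/port and the tool schema
# are placeholders invented for this demo.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_ticket",  # hypothetical tool for the demo
        "description": "Fetch a Jira ticket by key.",
        "parameters": {
            "type": "object",
            "properties": {"key": {"type": "string"}},
            "required": ["key"],
        },
    },
}]

resp = client.chat.completions.create(
    model="mistralai/Mistral-Medium-3.5-128B",  # name seen in the vLLM PR
    messages=[{"role": "user", "content": "Summarize ticket ENG-142."}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```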

Remote agents in Vibe

The Vibe update is the part that feels like Christmas come early for coding agent nerds. Mistral's launch post says coding sessions can now run in cloud sandboxes, in parallel, from either the Vibe CLI or Le Chat.

The mechanics are concrete:

  • Remote sessions surface file diffs, tool calls, progress states, and questions while they run, per the official post.
  • A local CLI session can be "teleported" to the cloud with history, task state, and approvals preserved, according to WesRoth's post and Mistral's announcement.
  • Each run gets an isolated sandbox, and finished work can come back as a GitHub pull request, per the official post.
  • Mistral says Vibe plugs into GitHub, Linear, Jira, Sentry, Slack, and Teams, turning the coding agent into a harness around the rest of the engineering stack.

Work mode in Le Chat

Le Chat now has four visible operating modes in the UI, per testingcatalog's screenshot: Fast, Think, Work, and Research. Work is in preview, and Mistral says in the launch post that it runs on a new harness powered by Medium 3.5.

Mistral's description of Work mode breaks into three buckets:

  1. Cross-tool workflows across email, messages, calendars, and connected apps.
  2. Research and synthesis across the web, internal docs, and workplace tools.
  3. Operational tasks like inbox triage, Jira issue creation, and Slack summaries.

The more interesting implementation detail is that connectors are on by default in Work mode, not added one by one, and the agent can call several tools in parallel. Mistral also says sensitive actions still require explicit approval based on user permissions.
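
Mistral hasn't described the harness internals, but the behavior it claims, several tool calls in flight at once with a human gate on sensitive actions, maps onto a familiar pattern. The sketch below is a generic illustration of that pattern, not Mistral's implementation, and every name in it is invented.

```python
# Generic illustration of the behavior Mistral describes: run independent
# tool calls concurrently, but pause any call flagged as sensitive until a
# human approves it. All names here are invented for the sketch.
import asyncio

SENSITIVE = {"send_email", "create_jira_issue"}  # actions needing approval

async def ask_user_approval(name: str, args: dict) -> bool:
    # A real harness would surface this in the UI; auto-approve for the demo.
    print(f"approve {name}({args})? [auto-yes]")
    return True

async def run_tool(name: str, args: dict) -> str:
    if name in SENSITIVE and not await ask_user_approval(name, args):
        return f"{name}: skipped (approval denied)"
    await asyncio.sleep(0.1)  # stand-in for the real tool call
    return f"{name}: done"

async def main():
    calls = [
        ("search_inbox", {"query": "invoices"}),
        ("read_calendar", {"day": "today"}),
        ("send_email", {"to": "team@example.com"}),
    ]
    results = await asyncio.gather(*(run_tool(n, a) for n, a in calls))
    print(results)

asyncio.run(main())
```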

Pricing and benchmark pushback

Mistral's own chart, shown in scaling01's benchmark screenshot, puts Medium 3.5 at 77.6 on SWE-Bench Verified, 91.4 on τ³ Telecom, and 95.8 on Collie, while the official post says the model was built for long-horizon tasks, multi-tool reliability, and structured output.

Reaction focused less on the raw scores than on what Mistral chose to optimize. kimmonismus read the dense 128B design and the standout Collie score as a reliability-first enterprise pitch, not a race for the biggest reasoning number.

The criticism landed on three fronts:

  • eliebakouch's chart argued the API price, $1.50 in and $7.50 out per million tokens, looks expensive against newer open competitors with larger context windows (see the cost sketch after this list).
  • maximelabonne questioned benchmark selection, noting that several of the compared models shipped before τ³-Bench existed.
  • suchenzang separately pointed out that Mistral's chart suddenly made the Collie benchmark much more visible than it usually is.
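
For a sense of scale on the pricing complaint, here is the arithmetic at the listed rates. The token counts below are invented, roughly shaped like a long agent run, not measured usage.

```python
# Back-of-envelope cost at the listed rates ($1.50 in, $7.50 out per 1M tokens).
# Token counts below are invented, roughly shaped like a long agent run.
PRICE_IN, PRICE_OUT = 1.50, 7.50  # USD per million tokens

def run_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * PRICE_IN + output_tokens * PRICE_OUT) / 1_000_000

# e.g. a long agentic coding session: 400K tokens in, 40K out
print(f"${run_cost(400_000, 40_000):.2f}")  # -> $0.90
```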

A small but useful correction: the official docs and launch post both list a 256K context window, even though eliebakouch's chart labeled it as 128K.

vLLM and ecosystem support

Medium 3.5 started leaking into toolchains before the formal launch. AiBattle_ linked a vLLM pull request that named both mistralai/Mistral-Medium-3.5-128B and the -EAGLE speculative draft model a day early, and scaling01's screenshot captured those identifiers in the diff.
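
The pairing in the PR suggests EAGLE speculative decoding is the intended serving path. Under that assumption, a vLLM setup might look like the sketch below; the `speculative_config` keys follow vLLM's documented EAGLE interface, but the draft-model identifier is inferred from the PR naming and the speculative-token count is a guess, not a recommendation.

```python
# Hedged sketch: serving Medium 3.5 with its EAGLE draft model for
# speculative decoding, assuming vLLM's documented speculative_config
# interface. The draft-model name is inferred from the PR diff; the
# num_speculative_tokens value is a guess.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-Medium-3.5-128B",
    speculative_config={
        "method": "eagle",
        "model": "mistralai/Mistral-Medium-3.5-128B-EAGLE",  # inferred name
        "num_speculative_tokens": 3,
    },
)
out = llm.generate(["Write a haiku about context windows."],
                   SamplingParams(max_tokens=64))
print(out[0].outputs[0].text)
```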

Day-one integrations piled up right after. lmsysorg announced SGLang support with Mistral-specific tool-call and reasoning parsers, winglian's repost said Axolotl fine-tuning support had already landed, and the pi-mono issue link pointed to a GitHub issue where maintainers were already discussing Medium 3.5 support and API quirks around reasoning mode.
