Release · June 23, 2026

Mistral releases OCR 4 with bounding boxes and 85.20 OlmOCRBench

Mistral OCR 4 adds layout-aware extraction with bounding boxes, block typing, and inline confidence across 170 languages. Use it through the API or self-hosted deployments when document pipelines need structure, citations, redaction, and chunking.

TL;DR

Mistral shipped OCR 4 as a layout-aware document model that returns extracted text plus bounding boxes, block labels, and inline confidence scores across 170 languages, according to MistralAI's launch thread.
On Mistral's own reported evals, MistralAI's head-to-head claim says blind human annotators preferred OCR 4 over every tested system, with an average win rate of 72%, and MistralAI's benchmark post puts it at 85.20 on OlmOCRBench.
The extra structure is the product: MistralAI's structure explainer ties region-level boxes, type labels, and confidence scores to citation grounding, redaction, chunking, and human review workflows.
Availability is broad on day one. MistralAI's availability post lists the API, Mistral AI Studio, Amazon SageMaker, Microsoft Foundry, upcoming Snowflake support, and a self-hosted single-container option.
Community reaction immediately focused on the benchmark framing. bclavie's critique argued that Mistral's comparison image omitted stronger OlmOCR entries, and bclavie's follow-up noted that Mistral may have failed to reproduce those models' results and simply communicated that badly.

You can jump from Mistral's launch page to the availability page. The most useful product detail is not the headline score but the structured output schema that MistralAI frames around citations, redactions, and chunking. The handwritten-math demo in a calculus-to-LaTeX clip is also a good tell for where Mistral wants this used: ingestion pipelines that need machine-readable structure, not just raw text.

Structured extraction

OCR 4 is aimed at document understanding, not plain OCR. The model returns three extra layers alongside text:

Bounding boxes for each localized region, per MistralAI's launch thread
Block classification for elements like titles, tables, equations, and signatures, per MistralAI's structure explainer
Inline confidence scores by region, also from MistralAI's structure explainer

That output shape matters because it preserves where the text came from. In MistralAI's structure explainer, Mistral explicitly connects it to source-grounded citations, selective redaction, RAG chunking, and human-in-the-loop review.
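A minimal sketch of how those three layers could feed review, redaction, and chunking. The `Region` record and its field names are hypothetical, chosen for illustration and not taken from Mistral's actual response schema. The 0.8 review threshold and the assumption that confidence lies in [0, 1] are also illustrative.

```python
from dataclasses import dataclass


# Hypothetical region record: field names are illustrative and are NOT
# taken from Mistral's actual response schema.
@dataclass
class Region:
    text: str
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed page coordinates
    block_type: str    # e.g. "title", "table", "equation", "signature"
    confidence: float  # assumed to lie in [0, 1]


def needs_review(regions, threshold=0.8):
    """Return regions whose confidence falls below the threshold."""
    return [r for r in regions if r.confidence < threshold]


def redact(regions, blocked_types=frozenset({"signature"})):
    """Replace the text of blocked block types while keeping their boxes."""
    return [
        Region("[REDACTED]", r.bbox, r.block_type, r.confidence)
        if r.block_type in blocked_types else r
        for r in regions
    ]


def chunk_by_title(regions):
    """Start a new chunk at every title so chunks follow the page layout."""
    chunks = []
    for r in regions:
        if r.block_type == "title" or not chunks:
            chunks.append([])
        chunks[-1].append(r)
    return chunks


# Made-up sample page used only to exercise the helpers.
page = [
    Region("Invoice 2026-014", (50, 40, 400, 70), "title", 0.99),
    Region("Total due: 1,250 EUR", (50, 100, 300, 120), "table", 0.62),
    Region("Terms", (50, 400, 200, 430), "title", 0.97),
    Region("J. Doe", (50, 700, 200, 740), "signature", 0.91),
]

low_confidence = needs_review(page)   # regions to route to a human reviewer
safe_page = redact(page)              # signature text masked, boxes preserved
chunks = chunk_by_title(page)         # layout-aligned chunks for retrieval
```

Because every region keeps its bounding box through each step, a downstream citation or review UI could still point back to the exact location on the page.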

The handwritten math example pushes the same point in a more specific direction. This demo clip shows OCR 4 turning handwritten calculus into LaTeX, which is a much narrower and more useful claim than generic "reads documents" marketing.

Benchmarks and eval caveats

Mistral's public performance claims come in three parts:

Blind human preference on 600-plus real-world documents across 12-plus languages, with average win rates of 72%, according to MistralAI's head-to-head claim
Public benchmark leadership on OlmOCRBench at 85.20, according to MistralAI's benchmark post
The biggest gains showing up on rare and low-resource languages, according to MistralAI's multilingual claim

The benchmark graphic shows a tight top cluster on OlmOCRBench: Mistral OCR 4 at 85, Chandra OCR 2 at 83, Mineru Pro at 82, and PaddleOCR VL at 80.

The caveat arrived almost immediately. bclavie's critique argued that Mistral included nine OlmOCR comparison points while leaving out two stronger models, then bclavie's follow-up said Mistral appears to have evaluated those systems but could not reproduce their published results. That leaves the launch with a real score and a real communication problem, both visible in the same 24-hour window.

Deployment and pricing

Mistral is distributing OCR 4 across both managed and private surfaces. MistralAI's availability post lists these paths:

API
Document AI in Mistral AI Studio
Amazon SageMaker
Microsoft Foundry
Snowflake Parse Document, marked coming soon
Self-hosted deployment in a single container

The self-hosted angle is the most enterprise-shaped part of the launch. In MistralAI's availability post, Mistral says documents can stay inside the customer's environment, which turns OCR 4 into a drop-in ingestion component for regulated document stacks.

Pricing surfaced through ai_for_success's pricing summary at $4 per 1,000 pages, with Batch API pricing at $2 per 1,000 pages. testingcatalog's recap also notes that OCR 4 is part of Mistral's open-source Search Toolkit, which makes the release look less like a standalone OCR endpoint and more like a document-ingestion block inside a larger retrieval stack.
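As a rough illustration of what those reported rates mean for a pipeline budget, here is a small cost estimate. It assumes pricing scales linearly per page with no minimums, tiers, or rounding, which the source does not confirm. The 250,000-page monthly workload is a hypothetical figure.

```python
# Per-1,000-page prices as reported in the pricing summary cited above.
STANDARD_USD_PER_1K = 4.0
BATCH_USD_PER_1K = 2.0


def estimate_cost(pages: int, batch: bool = False) -> float:
    """Estimated USD cost for a page count, assuming linear per-page pricing."""
    rate = BATCH_USD_PER_1K if batch else STANDARD_USD_PER_1K
    return pages * rate / 1000


monthly_pages = 250_000  # hypothetical workload, not from the source
print(estimate_cost(monthly_pages))              # 1000.0
print(estimate_cost(monthly_pages, batch=True))  # 500.0
```

Under these assumptions, moving a workload that tolerates delay onto the Batch API halves the bill.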
