AI Primer

DeepSeek V4 Preview opens 1M context with Flash and Pro variants

DeepSeek V4 Preview surfaced as an open-source 1M-context model family, with early docs and community testing pointing to Flash and Pro variants. The release matters for creators and vibe coders looking at self-hosted options, but most performance claims are still coming from first-wave community benchmarks.


TL;DR

You can already check the pricing docs, browse the Hugging Face collection, and see a model card screenshot that tags the weights as MIT-licensed FP8 releases. The unusual part of this launch is that the Flash/Pro family split surfaced in docs and hosting pages almost as fast as it did in launch chatter.

Flash and Pro

The first useful split is simple: Flash and Pro are different products, not just marketing labels. stevibe's docs link post pointed to DeepSeek's pricing page, while chrisfirst's model listing screenshot showed four related Hugging Face entries: V4 Pro, V4 Flash, and base versions for each.

chrisfirst's screenshot also suggested a much bigger Pro base checkpoint than the serving model, with a 1.6T Pro-Base listing alongside an 862B Pro listing. That lines up with aakashgupta's post, which claimed V4 Pro activates 49B parameters out of 1.6T per token, though that specific sparsity figure had not yet been corroborated in the official docs surfaced here.
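
If the claimed figures hold up, the back-of-envelope sparsity math is simple. Both numbers below come from aakashgupta's post and are not yet confirmed by official DeepSeek docs, so treat this as illustrative only:

```python
# MoE sparsity math for the CLAIMED V4 Pro figures (uncorroborated).
TOTAL_PARAMS = 1.6e12   # claimed total parameters (1.6T)
ACTIVE_PARAMS = 49e9    # claimed parameters activated per token (49B)

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"Active fraction per token: {active_fraction:.2%}")  # → 3.06%
```

In other words, each token would touch roughly the compute of a dense 49B model while drawing on a 1.6T parameter pool, which is the usual MoE trade: dense-model-sized inference cost, much larger total capacity.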

Tool calling and throughput

The earliest community numbers made Flash look like the practical workhorse.

  • BenchLocal ToolCall-15: V4 Flash 100, V4 Pro 97, V3.2 93, R1 77, per stevibe's BenchLocal run.
  • Official API speed test: V4 Flash 80.63 tok/s, V4 Pro 36.72 tok/s, per stevibe's speed test.
  • The visible BenchLocal scenario in stevibe's screenshot checked whether the model chose get_weather instead of falling back to web search, so the perfect Flash score came from tool selection, not just text quality.

For creative and coding workflows, that is a concrete split: Pro may be the richer model, but the first public tests made Flash look faster and more decisive when a tool has to fire.
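
The visible BenchLocal scenario reduces to a tool-selection check. The sketch below shows the kind of pass/fail logic that implies, assuming an OpenAI-style tool-call response shape; the response format, tool names, and scoring here are illustrative assumptions, not BenchLocal's actual harness:

```python
# Hypothetical tool-selection check, modeled on the visible BenchLocal
# scenario: did the model call get_weather, or fall back to web search?
# Response shape and tool names are assumptions for illustration.

def picked_weather_tool(response: dict) -> bool:
    """Return True iff the first tool call in the response is get_weather."""
    calls = response.get("tool_calls") or []
    if not calls:
        return False  # model answered in plain text; no tool fired
    return calls[0]["function"]["name"] == "get_weather"

# A response that would score as a correct tool selection:
good = {"tool_calls": [{"function": {"name": "get_weather",
                                     "arguments": '{"city": "Hangzhou"}'}}]}
# A response that fell back to search instead:
bad = {"tool_calls": [{"function": {"name": "web_search",
                                    "arguments": '{"query": "weather"}'}}]}

print(picked_weather_tool(good))  # → True
print(picked_weather_tool(bad))   # → False
```

A perfect score on a check like this measures decisiveness in routing, which is consistent with the framing of Flash as the workhorse for workflows where a tool has to fire reliably.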

Canvas tests

One of the more useful early checks was not a benchmark at all. stevibe's canvas test thread ran four browser-coding tasks side by side and broke the results into discrete wins:

  • Tree scene: Flash won on canopy density and leaf distribution.
  • Night sky: Pro added shooting stars, constellations, and a Milky Way band.
  • Fish boids: Pro produced stronger caustics and tighter schooling.
  • House builder: Pro was, in stevibe's words, the first tested model that actually built a rooftop.

That is a more interesting launch signal than generic "good at coding" claims. The split already looks task-shaped: Flash for snappy execution, Pro for scenes where composition and detail matter.

Open weights on Hugging Face

The rollout landed fast on Hugging Face. _akhaliq's Hugging Face thread linked the public collection, and the attached model card screenshot labeled DeepSeek-V4-Pro as a Transformers and Safetensors release with FP8 tags and an MIT license.

That matters because the launch framing in stevibe's retweet of DeepSeek emphasized cost-effective 1M context, but the Hugging Face surface showed the other half of the story: this was not just an API announcement. It was a weights release aimed at people who want to run the stack themselves.

aakashgupta's post pushed that point further by arguing V4 Pro creates a cheaper self-hosted option against Opus-tier and GPT-tier reasoning. That cost comparison was still an attributed market read, not an official DeepSeek claim, but it fit the way the launch immediately got interpreted by self-hosting users.
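
For anyone sizing hardware against that self-hosting pitch, a rough weights-only estimate follows directly from the FP8 tags, assuming they mean one byte per parameter. Parameter counts are from chrisfirst's screenshot; KV cache, activations, and any non-FP8 layers would add to these totals:

```python
# Weights-only memory estimate for self-hosting, ASSUMING the FP8 tags
# mean one byte per parameter. Parameter counts are from chrisfirst's
# screenshot; runtime memory (KV cache, activations) is extra.

BYTES_PER_PARAM_FP8 = 1

def weights_gb(params: float) -> float:
    """Approximate on-disk / in-VRAM size of the raw weights in GB."""
    return params * BYTES_PER_PARAM_FP8 / 1e9

print(f"V4 Pro (862B serving model): ~{weights_gb(862e9):.0f} GB")
print(f"V4 Pro-Base (1.6T checkpoint): ~{weights_gb(1.6e12):.0f} GB")
```

Even at FP8, the 862B serving model implies a multi-node or heavily offloaded setup, which is worth keeping in mind when weighing the "cheaper than Opus-tier" framing: cheap per token is not the same as cheap to stand up.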
