DeepSeek releases DeepSpec and DSpark for speculative decoding on V4 checkpoints
DeepSeek open-sourced DeepSpec, a codebase for training and evaluating draft models for speculative decoding, alongside the DSpark decoding module for V4 checkpoints. It matters because inference teams get a new open stack for improving draft-model quality and decode throughput beyond earlier MTP-style baselines.

TL;DR
- scaling01's post says DeepSeek open-sourced DeepSpec, a full-stack codebase for training and evaluating draft models for speculative decoding.
- teortaxesTex's release note says DeepSeek also shipped DSpark for V4 checkpoints, framing it as a decoding module that beats MTP-1, Eagle-3, and DFlash.
- The attached DeepSpec screenshot in scaling01's post says the repo includes data preparation utilities, draft model implementations, training code, and evaluation scripts.
- teortaxesTex's follow-up argues the interesting part is not just another MTP-style baseline, but DeepSeek pushing "semi-AR drafting" as a stronger default for speculative decoding.
You can jump straight to the DeepSpec paper PDF or inspect the DSpark V4 Pro model card. The first community signal was a LocalLLaMA thread that linked both within hours.
DeepSpec
DeepSeek is not just dropping a paper. The repo description visible in the attached screenshot calls DeepSpec a full-stack codebase for training and evaluating draft models, which makes it more useful than a narrow inference demo.
The screenshot lists four concrete pieces:
- data preparation utilities
- draft model implementations
- training code
- evaluation scripts
That package matters because speculative decoding work usually fragments across separate repos, homemade harnesses, and unpublished eval code. This launch bundles the whole path from draft-model training to measurement in one open repo.
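To make "measurement" concrete, here is a minimal sketch of a greedy draft-and-verify loop that also records how many drafted tokens are accepted per step, a common way to compare draft models. It is not taken from DeepSpec: the function names, the toy target and draft models, and the draft length `k` are all hypothetical, and a real system would verify the whole draft in one batched forward pass of the target model rather than token by token.

```python
from typing import Callable, List, Tuple

Token = int
NextTokenFn = Callable[[List[Token]], Token]  # greedy next-token function (hypothetical interface)


def greedy_decode(target: NextTokenFn, prompt: List[Token], n: int) -> List[Token]:
    """Plain greedy decoding with the target model only (reference output)."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(target(seq))
    return seq


def speculative_decode(
    target: NextTokenFn,
    draft: NextTokenFn,
    prompt: List[Token],
    max_new_tokens: int,
    k: int = 4,
) -> Tuple[List[Token], List[int]]:
    """Greedy speculative decoding: the draft proposes k tokens, the target checks them."""
    seq = list(prompt)
    accepted_per_step: List[int] = []
    while len(seq) - len(prompt) < max_new_tokens:
        # 1) The draft model proposes k tokens autoregressively.
        ctx = list(seq)
        proposal: List[Token] = []
        for _ in range(k):
            tok = draft(ctx)
            proposal.append(tok)
            ctx.append(tok)

        # 2) The target verifies the proposal. Done sequentially here for
        #    clarity; real systems score all k positions in one forward pass.
        accepted = 0
        for tok in proposal:
            expected = target(seq)
            if tok == expected:
                seq.append(tok)
                accepted += 1
            else:
                seq.append(expected)  # target's correction replaces the bad draft token
                break
        else:
            seq.append(target(seq))  # every draft token accepted: take one bonus token
        accepted_per_step.append(accepted)

    # The last step may overshoot the budget, so trim to the requested length.
    return seq[: len(prompt) + max_new_tokens], accepted_per_step


# Toy stand-ins for real models (assumptions for illustration only).
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] * 3 + len(ctx)) % 17


def toy_draft(ctx: List[Token]) -> Token:
    # Agrees with the target except at every fourth context length.
    tok = toy_target(ctx)
    return tok if len(ctx) % 4 else (tok + 1) % 17


if __name__ == "__main__":
    out, accepted = speculative_decode(toy_target, toy_draft, [1, 2], 20, k=4)
    print("matches greedy target:", out == greedy_decode(toy_target, [1, 2], 20))
    print("mean accepted draft tokens per step:", sum(accepted) / len(accepted))
```

Two properties are worth noticing in this sketch: with greedy verification the output is identical to decoding with the target alone, and a better draft model shows up only as a higher mean accepted length, which is what translates into fewer target passes per generated token.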
DSpark
The paired DSpark release targets DeepSeek V4 checkpoints specifically rather than serving as a generic benchmark artifact. teortaxesTex describes it as a decoding module that improves on MTP-1, Eagle-3, and DFlash, while the DSpark model card gives teams a concrete checkpoint to inspect.
That creates a cleaner split between stack and artifact:
- DeepSpec is the training and evaluation codebase.
- DSpark is the released decoding module for V4 checkpoints.
For engineers following open inference stacks, that separation is the useful reveal. DeepSeek shipped both the machinery for building draft models and a model-side example of the approach on its own V4 line.
Semi-AR drafting
The most opinionated reaction so far is teortaxesTex's follow-up, which says the field has been too slow to make strong speculative decoding a baseline and singles out "semi-AR drafting" as the notable shift here.
That is a different claim from "DeepSeek released code." It suggests DeepSeek is trying to move the baseline away from older MTP-centered assumptions and toward a more aggressive drafting setup. The LocalLLaMA thread was thin on commentary, but it immediately centered the same two artifacts, the DSpark checkpoint and the DeepSpec paper, which is a good sign of what practitioners found worth opening first.