AI Primer
release

Devin launches Security Swarm with Agentic MapReduce and 36/50 GHSA hits

Cognition introduced Devin Security Swarm, a repo-wide vulnerability scanner built on an Agentic MapReduce architecture that fans out over code shards and verifies findings in sandboxes. In a 50-vulnerability GHSA eval across 14 languages, it found 36 issues at 30% lower cost per finding than the next most accurate alternative.

TL;DR

  • The launch thread introduced Devin Security Swarm as a new Devin for Security product built on what Cognition calls Agentic MapReduce.
  • In the eval claim post, Cognition said Security Swarm found 36 of 50 real-world GHSA vulnerabilities and did it at 30% lower cost per finding than the next most accurate alternative.
  • The architecture thread describes a four-step pipeline: map repo signals, fan out agents over bounded shards, reduce findings into one report, then verify serious issues in isolated sandboxes.
  • According to the eval detail post, the test set covered 14 languages and surfaced misses from other tools, including a PHP sandbox bypass, an argument injection path, and a broad deserialization surface.
  • The product post positions Security Swarm as one part of a broader workflow that finds bugs, validates exploitability at runtime, and opens remediation PRs.

You can jump from the announcement to the Agentic MapReduce docs and then into the eval deep-dive. A short product demo shows the flow from scan to fix PR, while a DocETL comparison post points out that the same map-reduce pattern is already familiar in other multi-agent pipelines.

Agentic MapReduce

Cognition framed security scanning as a whole-codebase reasoning problem. The core claim in the architecture thread is that narrow agent loops miss repo-wide context, so the system first builds a map of relevant signals and only then shards work across focused agents.

The pipeline breaks down into four stages:

  1. Map relevant signals across the repository.
  2. Fan out focused agents over bounded shards.
  3. Reduce the shard-level findings into one report.
  4. Re-run serious findings in isolated sandboxes before marking them confirmed.

That last verification step is the interesting part. The product post says the suite is meant not just to flag vulnerabilities but to validate exploitability at runtime, which is a higher bar than static pattern matching.
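The four stages above can be sketched as a toy pipeline. This is a minimal illustration, not Cognition's implementation: every name here (`Finding`, `map_signals`, `scan_shard`, `reduce_findings`, `verify`) is hypothetical, the "agents" are plain functions that flag `eval(` calls, and the "sandbox" is a string check standing in for a real runtime re-run.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Finding:
    path: str
    rule: str
    severity: str          # "low" or "high" in this toy model
    confirmed: bool = False


def map_signals(repo):
    """Stage 1: build a repo-wide signal map (here: which files call eval)."""
    return {path: "eval(" in source for path, source in repo.items()}


def make_shards(paths, size):
    """Split the file list into bounded shards of at most `size` files."""
    ordered = sorted(paths)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def scan_shard(shard, signals):
    """Stage 2: stand-in for one focused agent working on one shard."""
    return [Finding(p, "dynamic-eval", "high") for p in shard if signals[p]]


def reduce_findings(per_shard):
    """Stage 3: merge shard-level findings into one de-duplicated report."""
    merged = {(f.path, f.rule): f for findings in per_shard for f in findings}
    return [merged[key] for key in sorted(merged)]


def verify(finding, repo):
    """Stage 4: stand-in for a sandbox re-run (assumption: a finding counts
    as exploitable only if the file also reads untrusted input)."""
    return replace(finding, confirmed="input(" in repo[finding.path])


def run_pipeline(repo, shard_size=2):
    signals = map_signals(repo)
    shards = make_shards(repo, shard_size)
    with ThreadPoolExecutor() as pool:  # fan out one worker per shard
        per_shard = list(pool.map(lambda s: scan_shard(s, signals), shards))
    report = reduce_findings(per_shard)
    # Only serious findings go through the verification step.
    return [verify(f, repo) if f.severity == "high" else f for f in report]


# Hypothetical three-file repository used only for this illustration.
REPO = {
    "a.py": "x = eval(input())",
    "b.py": "print('hello')",
    "c.py": "y = eval('1 + 1')",
}
report = run_pipeline(REPO)
for finding in report:
    print(finding.path, finding.rule, "confirmed" if finding.confirmed else "unconfirmed")
```

In this sketch both `a.py` and `c.py` are flagged, but only `a.py` survives verification. That mirrors the distinction the product post draws between flagging a pattern and validating exploitability.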

The eval numbers

Cognition's headline number is 36 hits out of 50 real-world GHSAs, or 72% recall, per the eval claim post. The company also said the system achieved that result at 30% lower cost per finding than the next most accurate alternative.
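The arithmetic behind those two figures is simple enough to check by hand. The sketch below recomputes recall from the reported 36 of 50, and shows how a "30% lower cost per finding" comparison would be calculated. The per-finding costs of 7.0 and 10.0 are hypothetical placeholders chosen only to produce a 30% gap. Cognition's actual cost figures are not given in the posts summarized here.

```python
def recall(found, total):
    """Fraction of known vulnerabilities that the scanner detected."""
    return found / total


def relative_saving(cost_per_finding, baseline_cost_per_finding):
    """How much cheaper one tool is per finding, as a fraction of the baseline."""
    return 1 - cost_per_finding / baseline_cost_per_finding


headline_recall = recall(36, 50)
print(f"recall: {headline_recall:.0%}")

# Hypothetical unit costs, for illustration only (not from the eval).
saving = relative_saving(7.0, 10.0)
print(f"cost per finding saving: {saving:.0%}")
```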

The eval detail post adds the broader shape of the benchmark:

  • 50 GHSA vulnerabilities
  • 14 languages
  • Repositories from multiple software categories
  • Comparisons against multiple security scanning tools

The same post says Security Swarm found issues other tools missed, including a template-injection PHP sandbox bypass, an argument injection route through metadata parsing, and an overly broad deserialization surface.

Find, verify, fix

Security Swarm is being sold as a new pillar inside Devin for Security, according to the product post. The workflow has three distinct outputs:

  • Find vulnerabilities across a codebase
  • Validate exploitability at runtime
  • Ship remediation PRs

The demo clip shows that full loop in the product UI, ending with an auto-generated fix PR. A hands-on reaction post describes setup as taking roughly 10 seconds, with nightly or weekly automation available for any repository.

Documentation and adjacent rollout

The launch shipped with more than a marketing page. The docs and materials post links out to three separate artifacts: the announcement, dedicated Agentic MapReduce documentation, and a deeper eval write-up.

A separate same-day rollout also put Claude Fable 5 into Devin Cloud Ultra, Devin Desktop, and Devin CLI, according to the model availability post. That does not change the Security Swarm architecture, but it does mean the wider Devin surface shipped new model access at the same time as the security product push.
