AI Primer

Anthropic launches Project Glasswing with $100M in model credits

Anthropic introduced Project Glasswing, a defensive-security program built around Claude Mythos Preview, with $100M in model credits and $4M in donations. The launch puts frontier models into vulnerability discovery and penetration-testing workflows under restricted access.

TL;DR

  • According to the HN launch summary, Anthropic is backing a new defensive-security program, Project Glasswing, with up to $100 million in Mythos Preview usage credits and $4 million in donations to open-source security groups.
  • According to the HN launch summary, the partner list includes AWS, Apple, Google, Microsoft, NVIDIA, CrowdStrike, Palo Alto Networks, the Linux Foundation, and several other large infrastructure and security firms.
  • In Anthropic's technical assessment, the company says Claude Mythos Preview can autonomously find and exploit serious bugs, including old flaws in OpenBSD and FFmpeg and attack chains against the Linux kernel. The HN core summary frames the real engineering question as exploitability, disclosure quality, and runtime safety for agentic systems.
  • The HN discussion highlights show the immediate pushback: commenters questioned whether some claims blur the line between a real exploit and a merely theoretical bug, and whether the public writeup cleanly distinguishes patched issues from reliability fixes.

You can read Anthropic's announcement, the much denser Frontier Red Team writeup, and the unusually technical Hacker News thread. The weirdest detail is Anthropic's claim that non-security engineers could hand Mythos an overnight task and wake up to a working exploit, while the same post says access is being limited to critical partners during the transition period.

Project Glasswing

Anthropic Launches Project Glasswing to Secure Critical Software with Claude Mythos Preview

Project Glasswing is an Anthropic-led initiative, announced on April 7, 2026, aimed at securing critical software infrastructure using the company's unreleased frontier AI model, Claude Mythos Preview. The project brings together major technology and security organizations—including Amazon Web Services, Apple, Google, Microsoft, NVIDIA, and others—to utilize the model for defensive tasks like vulnerability detection, binary testing, and penetration testing. Anthropic is supporting the initiative with $100 million in model usage credits and $4 million in donations to open-source security organizations, emphasizing the need to proactively defend systems against the advanced coding and exploitation capabilities of frontier-level AI.

Project Glasswing is Anthropic's attempt to put a frontier model's offensive cyber capability behind a defensive rollout. The company says launch partners will use Claude Mythos Preview for vulnerability detection, binary testing, and penetration testing, while a separate group of more than 40 infrastructure organizations gets access to scan first-party and open-source systems.

The announcement is concrete on resourcing:

  • Up to $100 million in Mythos Preview usage credits
  • $4 million in direct donations to open-source security organizations
  • Initial access for named launch partners plus 40-plus additional infrastructure organizations

Anthropic also says it will share what it learns from those deployments so the broader industry can benefit, though the launch post leaves the actual access mechanics and evaluation process fairly high level.

Mythos Preview's security claims

Project Glasswing: Securing critical software for the AI era

For AI engineers and security builders, the thread is about whether frontier models are materially better at finding real vulnerabilities, how to evaluate exploitability and disclosure quality, and what new runtime/security assumptions apply when agents can search for credentials, probe sandboxes, and chain findings.

Anthropic's technical assessment is where the launch gets sharp. The company says Mythos Preview found thousands of high-severity vulnerabilities across major operating systems and browsers, and that more than 99% of the bugs it found are still undisclosed because they have not been patched yet.

The post also describes the scaffold Anthropic used:

  1. Launch an isolated container with the target project's code and runtime.
  2. Invoke Claude Code with Mythos Preview and a simple vulnerability-finding prompt.
  3. Let the model read code, run the project, add debugging logic, and test hypotheses agentically.
  4. Use parallel agents on different files, ranked by likely bug density.
  5. Run a final Mythos agent to filter for bugs that are real and severe.
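The five steps above can be sketched as a small orchestration loop. This is a minimal illustration, not Anthropic's scaffold: `run_finder_agent` and `run_filter_agent` are hypothetical stand-ins that return canned data instead of launching a container or calling a model, and the density scores and 0-10 severity scale are assumptions made up for the example.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    summary: str
    severity: int  # hypothetical 0-10 scale, not taken from the writeup


def rank_files(density_scores: dict[str, float]) -> list[str]:
    """Step 4: order files by estimated bug density, highest first."""
    return sorted(density_scores, key=lambda p: density_scores[p], reverse=True)


def run_finder_agent(path: str) -> list[Finding]:
    """Stand-in for steps 1-3.

    A real version would start an isolated container with the target's
    code and runtime, then let the agent read, run, and debug it.
    Here it only returns canned, invented findings.
    """
    canned = {
        "parser.c": [Finding("parser.c", "out-of-bounds read in header parsing", 9)],
        "util.c": [Finding("util.c", "unchecked return value", 3)],
    }
    return canned.get(path, [])


def run_filter_agent(findings: list[Finding], min_severity: int = 7) -> list[Finding]:
    """Stand-in for step 5: keep only findings judged real and severe."""
    return [f for f in findings if f.severity >= min_severity]


def scan(density_scores: dict[str, float], max_workers: int = 4) -> list[Finding]:
    ranked = rank_files(density_scores)
    # One finder agent per file, run in parallel; map() preserves ranked order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_file = list(pool.map(run_finder_agent, ranked))
    candidates = [f for batch in per_file for f in batch]
    return run_filter_agent(candidates)


if __name__ == "__main__":
    for finding in scan({"a.c": 0.2, "parser.c": 0.9, "util.c": 0.5}):
        print(finding)
```

The point of the shape is that discovery and filtering are separate passes: the parallel finders are allowed to over-report, and a single final pass decides what counts as real and severe.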

Anthropic says Mythos Preview materially outperformed Claude Opus 4.6 on the CyberGym vulnerability-reproduction benchmark, scoring 83.1% versus 66.6%. It also claims the new model could turn previously found Firefox engine bugs into working exploits far more often than Opus 4.6. The company explicitly ties the rollout restriction to that jump in exploit development, saying the short-term risk is that attackers could benefit first if labs release models like this too broadly.
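To put the two CyberGym scores in proportion, the arithmetic below uses only the 83.1% and 66.6% figures quoted above. The three derived quantities are ordinary ways of comparing two pass rates, not metrics reported by Anthropic.

```python
mythos_score = 83.1  # CyberGym score quoted for Mythos Preview, in percent
opus_score = 66.6    # CyberGym score quoted for Claude Opus 4.6, in percent

# Absolute gap, in percentage points.
absolute_gain = mythos_score - opus_score

# Gap relative to the older model's score.
relative_gain = absolute_gain / opus_score * 100

# Share of the tasks Opus 4.6 missed that the gap would account for.
failure_reduction = absolute_gain / (100 - opus_score) * 100

print(f"absolute gain: {absolute_gain:.1f} points")          # 16.5 points
print(f"relative gain: {relative_gain:.1f}%")                # 24.8%
print(f"reduction in failures: {failure_reduction:.1f}%")    # 49.4%
```

Read this way, the gap is about 16.5 points, or roughly half of the benchmark tasks that the older model failed to reproduce.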

The argument on exploitability

Discussion around Project Glasswing: Securing critical software for the AI era

Thread discussion highlights:

  • LiamPowell on exploitability vs. theoretical bugs: Questions whether Anthropic is overstating results by calling something a vulnerability when the model could not actually exploit it, arguing the language invites overreading the first sentence out of context.
  • eranation on patch status and disclosure: Asks which findings were actually patched and notes that a cited OpenBSD issue appears to have been treated as a reliability fix rather than a CVE, raising questions about disclosure and exploitability evidence.
  • navilai on runtime security implications: Notes that the more interesting signal may be the model’s agentic behavior and sandbox-circumvention attempts, framing this as a runtime security problem for deployed agents rather than just a software-vulnerability problem.

The Hacker News thread immediately pushed on whether Anthropic's framing overstates what was actually shown. As the HN discussion highlights summarize, LiamPowell questioned calling something a vulnerability if the model could not actually exploit it, and eranation asked which findings were patched as security issues versus treated as reliability bugs.

That matters because Anthropic's own writeup mixes several categories in one document: zero-day discovery, exploit generation, exploit chaining, and closed-source exploratory work under bug bounty rules. The post is rich on examples, but some of the public claims still depend on readers accepting Anthropic's severity and exploitability framing.
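One way to keep those categories from blurring is to record, for each claim, what evidence backs it and how the fix was classified. The sketch below is purely illustrative: the `Evidence` and `PatchStatus` values, the field names, and the example targets are assumptions, not a taxonomy used by Anthropic or by the thread's commenters.

```python
from dataclasses import dataclass
from enum import Enum


class Evidence(Enum):
    THEORETICAL = "theoretical"          # bug identified, no exploit shown
    WORKING_EXPLOIT = "working_exploit"  # exploit demonstrated end to end


class PatchStatus(Enum):
    UNPATCHED = "unpatched"
    RELIABILITY_FIX = "reliability_fix"  # fixed, but not handled as a security issue
    SECURITY_FIX = "security_fix"        # fixed and handled as a security issue


@dataclass(frozen=True)
class Claim:
    target: str  # placeholder names below, not real findings
    evidence: Evidence
    patch: PatchStatus

    def is_demonstrated(self) -> bool:
        """True only when an exploit was actually shown."""
        return self.evidence is Evidence.WORKING_EXPLOIT

    def is_publicly_checkable(self) -> bool:
        """True once a fix exists that outsiders can inspect."""
        return self.patch is not PatchStatus.UNPATCHED


claims = [
    Claim("example-target-a", Evidence.WORKING_EXPLOIT, PatchStatus.SECURITY_FIX),
    Claim("example-target-b", Evidence.THEORETICAL, PatchStatus.RELIABILITY_FIX),
    Claim("example-target-c", Evidence.WORKING_EXPLOIT, PatchStatus.UNPATCHED),
]

# Claims a reader can both trust as exploited and verify against a public fix.
strong = [c.target for c in claims if c.is_demonstrated() and c.is_publicly_checkable()]
print(strong)
```

Keeping the two axes separate mirrors the two objections in the thread: whether a bug was actually exploited, and whether its fix was treated as a security issue.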

Runtime security

The most useful comment in the thread may be the shift from software bugs to agent behavior. As the HN core summary notes, one line of discussion treated Mythos Preview's sandbox probing, credential searching, and exploit chaining as a runtime-security problem for deployed agents, not just as evidence of a better vulnerability scanner.

That angle also shows up in Anthropic's own scaffold description. The model is not presented as a classifier that scores code snippets, but as an agent that reads code, executes targets, debugs, iterates, and writes proof-of-concept exploits inside a containerized environment. Project Glasswing is a security launch, but it is also an early public sketch of what labs think they need to do when autonomous coding agents cross into offensive security work.
