JustAppSec

Anthropic launches Project Glasswing with AI model 'too dangerous to release' that found thousands of zero-days

5 min read · Published 08 Apr 2026 · Source: Anthropic

TL;DR — Anthropic has announced Project Glasswing, a coalition of 12 major technology and finance companies using an unreleased frontier AI model called Claude Mythos Preview to find and fix vulnerabilities in critical software. The model autonomously discovered thousands of high-severity zero-days — including some that survived decades of human review — and Anthropic considers it too dangerous to release publicly.

On April 7, 2026, Anthropic announced Project Glasswing, a cybersecurity initiative pairing Claude Mythos Preview — a general-purpose frontier model it describes as having unprecedented cyber capabilities — with a coalition of launch partners: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Over 40 additional organisations that build or maintain critical software infrastructure have also been given access.

Anthropic has taken the unusual step of not releasing Mythos Preview publicly, stating the cybersecurity risk is too great. Newton Cheng, Anthropic's Frontier Red Team Cyber Lead, told VentureBeat: "Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout — for economies, public safety, and national security — could be severe."

What Mythos Preview found

Mythos Preview has already found thousands of high-severity zero-day vulnerabilities across every major operating system and web browser — nearly all of them discovered autonomously, without human steering. Three headline findings illustrate the scale:

  • A 27-year-old buffer overflow in OpenBSD's network stack — one of the most security-hardened operating systems in the world, widely used for firewalls and critical infrastructure — that allowed an attacker to remotely crash any machine simply by connecting to it.
  • A 16-year-old vulnerability in FFmpeg — the near-ubiquitous video encoding/decoding library — in a line of code that automated testing tools had exercised five million times without ever catching the problem.
  • An autonomous exploit chain across multiple Linux kernel vulnerabilities that escalated from ordinary user access to complete control of the machine — a class of attack that traditionally required expert manual effort.
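The FFmpeg finding makes a point worth dwelling on: executing a line of code five million times is not the same as detecting its bug. Crash-driven fuzzing only surfaces failures that produce an observable signal. A minimal illustrative sketch (not actual FFmpeg code — the function and bug here are invented for demonstration):

```python
# Illustrative only: a latent off-by-one that plain crash-driven
# fuzz coverage never flags, because it misbehaves silently.
def checksum(frame: bytes) -> int:
    """Sum all payload bytes mod 256. Bug: the loop bound should be
    len(frame), so the final byte is silently dropped — wrong output,
    no crash, no signal for a crash-oriented fuzzer."""
    total = 0
    for i in range(len(frame) - 1):   # off-by-one: skips frame[-1]
        total = (total + frame[i]) & 0xFF
    return total

# A fuzzer can execute the buggy line millions of times and see no
# failure: the function never crashes, it just returns a value that
# is wrong whenever the last byte is non-zero.
assert checksum(b"\x01\x02\x00") == 3   # last byte 0: result looks correct
assert checksum(b"\x01\x02\x05") == 3   # last byte dropped: true sum is 8
```

Bugs of this shape survive enormous test volume precisely because nothing observable goes wrong — which is why a model that reasons about the code, rather than only executing it, finds a different class of defect.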

All three have been reported to maintainers and patched. For many other findings still in the remediation pipeline, Anthropic is publishing cryptographic hashes of the details now and will reveal specifics after fixes are in place.
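The hash-publishing scheme is a standard cryptographic commitment: publish a digest of the advisory now, reveal the text later, and anyone can verify it was not altered in between. A minimal sketch (the advisory text, salt, and function names are illustrative, not Anthropic's actual format):

```python
import hashlib

def commit(advisory_text: str, salt: str) -> str:
    """Publish this digest now; reveal text + salt after the fix ships.
    Anyone can then recompute the hash to confirm the advisory wasn't
    changed after the fact. The salt stops attackers brute-forcing
    guesses against short or predictable advisory texts."""
    return hashlib.sha256((salt + advisory_text).encode()).hexdigest()

advisory = "CVE-XXXX-YYYY: heap overflow in example_parser(), fixed in v2.4"
salt = "f3a9c1d7e5b2a8f0"  # random per-advisory value, secret until reveal

digest = commit(advisory, salt)            # published immediately
assert commit(advisory, salt) == digest    # later: verification succeeds
assert commit(advisory + " (edited)", salt) != digest  # tampering detected
```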

On benchmarks, Mythos Preview scored 83.1% on CyberGym (vs 66.6% for Claude Opus 4.6), 93.9% on SWE-bench Verified (vs 80.8%), and 77.8% on SWE-bench Pro (vs 53.4%).

The coalition and what comes next

The partner list is notable for including direct competitors — Google and Microsoft alongside AWS — as well as cybersecurity incumbents like CrowdStrike and Palo Alto Networks, financial giant JPMorganChase, and the Linux Foundation, steward of the world's largest open-source ecosystem.

Several partners have already been running Mythos Preview against their own infrastructure for weeks. AWS CISO Amy Herzog said the model is "already helping us strengthen our code." Microsoft's Global CISO Igor Tsyganskiy noted that when tested against CTI-REALM, Microsoft's open-source security benchmark, "Claude Mythos Preview showed substantial improvements compared to previous models." CrowdStrike CTO Elia Zaitsev emphasised the urgency: "The window between a vulnerability being discovered and being exploited by an adversary has collapsed — what once took months now happens in minutes with AI."

Linux Foundation CEO Jim Zemlin framed the initiative around the open-source maintainer problem: "In the past, security expertise has been a luxury reserved for organisations with large security teams. Open-source maintainers — whose software underpins much of the world's critical infrastructure — have historically been left to figure out security on their own. Project Glasswing offers a credible path to changing that equation."

Anthropic is committing up to $100M in usage credits for Mythos Preview across the initiative, plus $4M in direct donations to open-source security organisations ($2.5M to Alpha-Omega/OpenSSF via the Linux Foundation, $1.5M to the Apache Software Foundation). After the research preview, the model will be available to participants at $25/$125 per million input/output tokens via the Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry. Open-source maintainers can apply for access through Anthropic's Claude for Open Source programme.
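For budgeting purposes, the stated pricing translates directly into per-run cost. A quick sketch (the token counts in the example are hypothetical):

```python
# Cost at the stated research-preview pricing:
# $25 per million input tokens, $125 per million output tokens.
IN_RATE, OUT_RATE = 25.0, 125.0  # USD per 1M tokens

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single run at the article's quoted rates."""
    return (input_tokens / 1_000_000) * IN_RATE \
         + (output_tokens / 1_000_000) * OUT_RATE

# e.g. auditing a module: ~2M tokens of source in, ~200k tokens of findings out
assert run_cost(2_000_000, 200_000) == 75.0  # $50 input + $25 output
```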

Responsible disclosure at scale

Finding thousands of zero-days at once creates an unprecedented disclosure challenge. Anthropic says it has built a triage pipeline where the highest-severity bugs are sent to professional human triagers who manually validate every report before it goes to maintainers. "We do not submit large volumes of findings to a single project without first reaching out in an effort to agree on a pace the maintainer can sustain," Cheng told VentureBeat.
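The pacing idea — severity first, but never more than a maintainer can absorb — can be sketched as a simple capped queue. This is an illustration of the batching concept only, not Anthropic's actual pipeline; the function, data shape, and cap are invented:

```python
from collections import defaultdict

def pace_reports(findings, per_project_cap):
    """Severity-first triage with a per-maintainer pace cap.
    `findings` is a list of (project, severity) tuples; higher
    severity goes first, and no project receives more than
    `per_project_cap` reports this cycle. The rest are deferred."""
    queued, deferred = [], []
    sent = defaultdict(int)
    for project, severity in sorted(findings, key=lambda f: -f[1]):
        if sent[project] < per_project_cap:
            queued.append((project, severity))
            sent[project] += 1
        else:
            deferred.append((project, severity))
    return queued, deferred

queued, deferred = pace_reports(
    [("ffmpeg", 9), ("ffmpeg", 7), ("ffmpeg", 8), ("openbsd", 9)],
    per_project_cap=2,
)
assert queued == [("ffmpeg", 9), ("openbsd", 9), ("ffmpeg", 8)]
assert deferred == [("ffmpeg", 7)]  # held for the next agreed cycle
```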

When Anthropic has access to source code, it aims to include a candidate patch with every report — labelled by provenance so the maintainer knows it was model-generated — and offers to collaborate on a production-quality fix. On disclosure timelines, Anthropic follows a coordinated vulnerability disclosure framework: generally waiting 45 days after a patch is available before publishing full technical details. The company says it will report publicly on what it has learned within 90 days.
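The 45-day window is easy to operationalise on the defender side as well — knowing the earliest date full details can appear tells you your patching deadline. A small sketch (the patch date is hypothetical):

```python
from datetime import date, timedelta

DISCLOSURE_DELAY = timedelta(days=45)  # 45 days after a patch is available

def earliest_full_disclosure(patch_date: date) -> date:
    """Earliest date full technical details may be published under the
    45-day coordinated-disclosure window described in the article."""
    return patch_date + DISCLOSURE_DELAY

# If a patch shipped on 7 Apr 2026, details could appear from 22 May 2026.
assert earliest_full_disclosure(date(2026, 4, 7)) == date(2026, 5, 22)
```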

The initiative arrives during a turbulent week for Anthropic. In late March, a draft blog post about Mythos was left in an unsecured CMS that exposed roughly 3,000 internal assets. Days later, Claude Code's complete source code leaked via an npm packaging error for approximately three hours. These operational lapses sit uncomfortably alongside the company's claim to be the responsible steward of a model with unprecedented offensive cyber capabilities — though Anthropic argues both incidents were publishing-tooling errors, not breaches of its core security architecture.

Why this matters for AppSec

The core message for application security engineers is straightforward: the bar just moved. If an AI model can autonomously find bugs that survived 27 years of human review and five million automated test executions, the assumption that obscure vulnerabilities will stay hidden is no longer valid.

This doesn't change what good security practice looks like — memory-safe languages, input validation, least privilege, dependency management, threat modelling — but it dramatically raises the cost of not doing those things. The window between a vulnerability existing and being discovered is collapsing, whether it's a defender or attacker doing the discovering. If you're looking to strengthen your foundations, our guides on building a secure SDLC and defending against prompt injection in LLM-powered applications are good starting points.
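Of the practices listed, input validation is the cheapest to illustrate. A minimal sketch of validating an untrusted length field before use — the general pattern whose absence underlies many overflow and over-read bugs (the packet format and cap here are assumed for illustration):

```python
MAX_PAYLOAD = 64 * 1024  # protocol-specific cap (assumed for illustration)

def read_packet(buf: bytes) -> bytes:
    """Parse a [2-byte big-endian length][payload] packet, rejecting
    lengths that don't match the data actually received. Trusting a
    declared length without checking it against reality is the root
    of many overflow and over-read vulnerabilities."""
    if len(buf) < 2:
        raise ValueError("truncated header")
    declared = int.from_bytes(buf[:2], "big")
    if declared > MAX_PAYLOAD:
        raise ValueError("declared length exceeds protocol cap")
    if declared != len(buf) - 2:
        raise ValueError("declared length does not match payload size")
    return buf[2:]

assert read_packet(b"\x00\x03abc") == b"abc"  # honest packet accepted
try:
    read_packet(b"\xff\xffabc")  # lies about its length
except ValueError:
    pass  # rejected before any buffer arithmetic can go wrong
```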

Anthropic has proposed that an independent third-party body might ultimately be the best home for large-scale cybersecurity projects of this kind. Whether that materialises remains to be seen. In the meantime, the practical implication is clear: treat latent code issues as if they will be found, because increasingly, they will be.


Content is AI-assisted and reviewed by our team, but issues may be missed and best practices evolve rapidly; send corrections to [email protected]. Always consult official documentation and validate key implementation decisions before making design or security choices.

Need help? Get in touch.