
Anthropic Accidentally Leaked Its Most Powerful AI Model. It Poses "Unprecedented Cybersecurity Risks."
A Fortune exclusive revealed Anthropic is testing Claude Mythos, a model tier above Opus that the company calls a "step change" in AI capability.
The AI Post newsroom — delivering AI news at the speed of intelligence.
Anthropic has been caught with its pants down. A Fortune exclusive revealed that the AI safety company left nearly 3,000 unpublished documents, including a draft blog post announcing a new model called Claude Mythos, sitting in a publicly accessible data store. Anyone could find them. Cybersecurity researchers did.
The leaked draft describes Mythos as "by far the most powerful AI model we have ever developed" and introduces a new model tier called Capybara, sitting above the existing Opus tier. According to the document, Capybara gets "dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity" compared to Claude Opus 4.6, Anthropic’s current flagship.
After Fortune contacted Anthropic, the company pulled the data store offline and acknowledged the leak was caused by a "human error" in its content management system configuration. But the damage was done. The draft was reviewed independently by Roy Paz, a senior AI security researcher at LayerX Security, and Alexandre Pauwels, a cybersecurity researcher at the University of Cambridge.
The Cybersecurity Problem Nobody Wants to Talk About
Here is where it gets uncomfortable for the "AI safety" company. The leaked document explicitly states that Anthropic believes Claude Mythos poses "unprecedented cybersecurity risks." The draft blog post included language about wanting to "act with extra caution and understand the risks it poses, even beyond what we learn in our own testing."
So to be clear: Anthropic built a model it believes is uniquely dangerous from a cybersecurity standpoint, and then leaked the details of that model because it misconfigured its CMS. The irony writes itself.
Anthropic confirmed the model exists, telling Fortune it is "a general purpose model with meaningful advances in reasoning, coding, and cybersecurity" and that they are "being deliberate about how we release it." It is currently being tested with "a small group of early access customers."
What This Means for the AI Race
The timing could not be worse. Anthropic is reportedly eyeing an October IPO that could value the company at $60 billion. It is locked in a legal battle with the Pentagon over a blacklisting attempt. And now the company that built its entire brand on safety and responsibility just demonstrated that it cannot secure its own blog drafts.
But strip away the embarrassment and the real story is capability. A model tier above Opus with "dramatic" improvements in coding and cybersecurity suggests we are looking at a genuine leap, not an incremental update. If Mythos delivers what the leaked benchmarks suggest, Anthropic may have just pulled ahead of OpenAI, Google, and everyone else in the frontier model race.
The question is whether anyone will trust them to deploy it responsibly after this week.
First reported by Fortune. Independently verified by LayerX Security and the University of Cambridge.