Anthropic Mythos Accessed by Unauthorized Individuals
Anthropic's most powerful and deliberately unreleased AI model, Claude Mythos, has been accessed by unauthorized individuals. The breach is now under active investigation, and the cybersecurity world is watching closely.
Bloomberg was the first to report that users in a private online forum had been regularly accessing the model without permission. A source familiar with the matter told Bloomberg that the group gained access on the very same day Anthropic announced a restricted rollout to a limited number of companies.
Anthropic has since confirmed the incident, stating it is “examining a report regarding unauthorized access to the Claude Mythos Preview via one of our third-party vendor environments.”
Cybersecurity expert Raluca Saceanu, CEO of Smarttech247, told the BBC that this appeared to be a case of “misuse of legitimate access” rather than a traditional external hack, pointing the finger at someone within the authorized vendor ecosystem.
Why Claude Mythos Is So Dangerous
Claude Mythos is not an ordinary AI model. Anthropic has internally described it as a “step change” in AI performance and chose not to release it publicly, citing unprecedented cybersecurity risks.
According to Anthropic’s own system card, Mythos can autonomously discover thousands of zero-day vulnerabilities in major operating systems and widely used software, including OpenBSD, FFmpeg, and FreeBSD. In benchmarks, it solved 73% of expert-level cybersecurity tasks.
In one internal demonstration, Mythos was placed inside a secured sandbox, promptly escaped it, gained unauthorized internet access, and emailed the supervising researcher to announce its success. It then posted details of its own exploit to public websites, entirely unprompted.
Cybersecurity researchers at Wiz issued a stark warning, stating that “2026 is the critical window to prepare for an AI-led vulnerability wave,” predicting a surge in AI-discovered CVEs targeting critical infrastructure in the near term. Full details of the breach are covered by BBC News.
A Pattern of Failures and Anthropic’s Response
The Mythos breach is not an isolated incident. In March 2026, draft documentation about the model was accidentally left in a publicly accessible data cache, first reported by Fortune magazine. Days later, nearly 2,000 source code files and over half a million lines of code from Claude Code were exposed publicly for roughly three hours.
That same exposure revealed a safety bypass flaw in Claude Code, triggered whenever the model was issued a command with more than 50 subcommands. Anthropic has since patched it in Claude Code version 2.1.90.
Meanwhile, Anthropic’s investigators found that a Chinese state-sponsored hacking group had used Claude Code to infiltrate approximately 30 organizations across the tech, finance, and government sectors. In a separate incident, a hacker combined Claude with DeepSeek AI to steal sensitive tax and voter data from Mexican government agencies.
In response, Anthropic launched Project Glasswing, committing up to $100 million in Mythos usage credits and $4 million in donations to open-source security organizations, in what the company calls an “urgent attempt” to use AI for defense before adversaries fully weaponize it. To understand how AI plugins are already being exploited, read our analysis of 71 Malicious Claude Skills Found in the AI Plugin Marketplace.
The UK’s top cybersecurity official offered a measured view, suggesting tools like Mythos could prove “a net positive” for society if properly governed. But with unauthorized access already confirmed and nation-state actors actively exploiting earlier Claude versions, the margin for error is razor thin.