Anthropic announced on April 22, 2026, that it had launched a formal investigation into reports of unauthorized access to its restricted Claude Mythos Preview model. The incident involves a small group of external users who reportedly gained access to the model the same day its existence was publicly disclosed. Anthropic stated that the breach appears to have originated in a third-party vendor environment rather than in the company’s core infrastructure. Preliminary findings indicate that the unauthorized parties bypassed access controls using a combination of leaked contractor credentials and predictive URL guessing.
Claude Mythos, positioned as the first model in Anthropic’s new Capybara capability tier, is a specialized frontier AI designed for advanced cybersecurity research. Unlike the general-purpose Claude 4 series, Mythos is engineered for autonomous vulnerability discovery and exploit development. According to the model’s official system card, Mythos has identified thousands of high-severity vulnerabilities across major operating systems and web browsers. In internal testing, the model achieved tier 5 control-flow hijacks on ten separate fully patched targets, a task that typically requires months of expert human effort.
The unauthorized access was first reported by Bloomberg, which corroborated the breach through screenshots and a live demonstration provided by a group communicating via a private Discord channel. The group reportedly exploited a gap between Anthropic’s internal security architecture and the environment of a third-party contractor, identified in reports as the AI training startup Mercor. While the group had reportedly been using the model since April 7, Anthropic officials stated there is currently no evidence that the access affected core production systems or customer data associated with the company’s public-facing Claude 4.5 or Claude 5 models.
At the time of the breach, Claude Mythos was only available to a limited consortium of partners under a program known as Project Glasswing. This group includes major technology and financial institutions such as Apple, Amazon, Nvidia, and Goldman Sachs. Anthropic had classified Mythos as an AI Safety Level 3 (ASL-3) model, citing its potential to provide a meaningful uplift to offensive cyber operations. The UK’s AI Security Institute (AISI) recently warned that the model represents a significant step up in autonomous cyber-threat capabilities, noting its ability to chain multiple vulnerabilities to escape software sandboxes.
Anthropic has suspended the affected vendor’s access and is working with third-party forensic investigators to determine the full scope of the exposure. The company confirmed it is reviewing API gateway logs and has notified the Cybersecurity and Infrastructure Security Agency (CISA) of the incident. Despite the breach, Anthropic maintains that its Responsible Scaling Policy remains in effect and that the investigation’s findings will inform future hardening of its model deployment pipelines.