Anthropic PBC confirmed on April 22, 2026, that it has launched a formal investigation into reports of unauthorized access to Claude Mythos Preview, its most advanced and restricted artificial intelligence model to date. The incident involves a small group of users from a private online forum who reportedly gained entry to the model’s infrastructure through a third-party vendor environment. While Anthropic stated that its internal systems remain secure, the breach has intensified global scrutiny over the safety protocols surrounding frontier AI models specifically designed for cybersecurity applications.

Claude Mythos was introduced earlier this month as the centerpiece of Project Glasswing, an invite-only initiative designed to allow a select group of industry partners—including Apple, Goldman Sachs, and Nvidia—to test the model's ability to identify and remediate complex software vulnerabilities. According to technical evaluations released by the UK AI Security Institute, Mythos represents a structural shift in offensive AI capabilities. The model demonstrated a 72% success rate in autonomously identifying and exploiting zero-day vulnerabilities in major web browsers and operating systems, a significant increase from the near-zero success rates of previous generations. This capability led Anthropic to withhold the model from public release, citing its potential to reshape the global cybersecurity landscape.

The unauthorized access was first reported by Bloomberg and corroborated through live demonstrations and screenshots provided by the group. Investigators believe the unauthorized users relied not on a sophisticated technical exploit but on a combination of factors: the credentials of an individual working for a third-party contractor, and naming-convention patterns exposed in a separate data breach at the AI startup Mercor earlier in April. By inferring the online location of the preview environment from those patterns, the users were able to interact with the model for several days, reportedly using it for non-offensive tasks such as web development to avoid triggering security alerts.

The breach has prompted immediate reactions from government officials and enterprise security leaders. Kanishka Narayan, the United Kingdom’s Minister for AI, said businesses should be deeply concerned about the implications of such a powerful tool falling into unvetted hands. Within the Salesforce ecosystem, security architects have warned about the agentic attack surface, noting that integrating high-capability models like Mythos into enterprise workflows creates new vectors for data exposure through third-party grants and API connections. Experts added that if Mythos-class capabilities reach malicious actors, the risk to Salesforce and MuleSoft environments could escalate rapidly.

Anthropic maintains that there is currently no evidence the unauthorized activity extended beyond the specific vendor environment or that any customer data was exfiltrated. The company has nonetheless temporarily suspended access for several preview participants while it conducts a full forensic audit of its external access points. The event underscores the growing challenge of securing the generative AI supply chain, particularly as models gain the autonomy to discover and exploit the very systems designed to house them.