Anthropic confirmed on April 22, 2026, that it has launched an investigation into reports of unauthorized access to Claude Mythos Preview, its specialized AI model designed for high-level cybersecurity and vulnerability research. The company issued an official statement following a report from Bloomberg which revealed that a small group of individuals had gained access to the model through a third-party vendor environment. Claude Mythos is the primary component of Anthropic’s Project Glasswing, a restricted initiative intended to provide elite enterprise partners with tools to identify and remediate software vulnerabilities before they can be exploited by bad actors.

Technical details surrounding the incident indicate that the unauthorized users did not breach Anthropic’s internal servers directly. Instead, the group—reportedly operating through a private Discord channel—exploited a combination of factors to gain entry: inferring the model’s online hosting location from naming conventions leaked in a prior breach at the AI startup Mercor, and using credentials belonging to an individual working for a third-party contractor tasked with evaluating Anthropic’s software. The group reportedly demonstrated its access to journalists via live sessions and screenshots, though the members claimed they limited their use of the model to non-cybersecurity tasks such as web development to avoid triggering internal alarms.

Claude Mythos represents a significant leap in AI-driven security, with Anthropic claiming the model can identify and exploit zero-day vulnerabilities in every major operating system and web browser. It recently became the first AI system to pass a 32-step simulated cyber-attack challenge developed by the AI Safety Institute (AISI), completing the full objective in 3 of 10 attempts. Because of these capabilities, Anthropic had limited the tool’s distribution to a small cohort of major organizations, including Salesforce, Apple, Amazon, Cisco, and Goldman Sachs.

The breach has prompted immediate warnings for Salesforce customers and architects. Security experts noted that the model’s ability to automate the discovery of IT infrastructure flaws could facilitate sophisticated data theft campaigns should the tool fall into the hands of malicious actors. While Anthropic stated that there is no evidence the unauthorized activity moved beyond the third-party vendor’s environment or compromised any customer data, the company has suspended the preview for several partners pending a full security audit.

In its April 22 statement, Anthropic emphasized that its standard commercial models, including the Claude 3 and Claude 4 series, are hosted on separate infrastructure and were not affected by this incident. The company is currently working with external forensic investigators and law enforcement to determine the full scope of the exposure and to harden its third-party development pipelines. Salesforce has not reported any direct impact on its core CRM services but continues to monitor the situation.