The Model Anthropic Wouldn't Release Was Accessed Without Permission
On April 22, 2026, Anthropic confirmed to reporters that an unauthorized group has been accessing Claude Mythos Preview since April 7, the exact day Anthropic publicly announced the model alongside Project Glasswing. The group obtained access by combining data from a breach at Mercor (an AI training vendor Anthropic uses) with credentials from someone employed by an Anthropic contractor.
This is the model Anthropic explicitly said it would not release publicly because of its cyber-offensive capabilities. The model that found a 17-year-old remote code execution vulnerability in FreeBSD, fully autonomously, during internal testing. The model that Project Glasswing was supposed to use to quietly secure critical software before its capabilities leaked into adversary hands.
And an unnamed Discord-coordinated group got access to it through a vendor breach, 15 days after launch. Here's what we know, what the implications are, and what it tells us about the gap between "we won't release dangerous models" and "dangerous models won't leak."
TL;DR
- An unnamed group has had unauthorized access to Claude Mythos Preview since April 7, 2026.
- They accessed it by combining a data breach at Mercor (Anthropic's training vendor) with contractor-level credentials.
- Anthropic confirmed the unauthorized access on April 22, 2026, and is investigating.
- The group is reportedly Discord-coordinated, monitored GitHub for early signals about new models, and targeted Mythos specifically.
- The confirmation comes 15 days after Anthropic publicly announced Mythos and Project Glasswing as a demonstration of responsible frontier model release.
- No public evidence yet that the group has exfiltrated model weights, only that they have inference access.
- The bigger story: restricted-release only works if the restriction actually holds. This one didn't, and the mechanism was a vendor breach, not a nation-state actor.
What Actually Happened
According to reporting from Gizmodo and confirmed by an Anthropic spokesperson to multiple outlets, the sequence of events looks like this.
Step 1: The Discord group was watching. A group of people coordinating in a private Discord server monitored public GitHub activity for early signals of unreleased AI models: commit messages, branch names, internal tool references, anything that hinted at capabilities before a formal announcement. This is not a new technique. It's how the GPT-4 system prompt leaked in 2023 and how several Claude model details surfaced before official launches.
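The core of that monitoring technique is simple to sketch. A real monitor would poll the public GitHub search API; the snippet below shows only the signal-matching step, with invented commit messages and watchlist keywords (nothing here is real Anthropic data):

```python
import re

def find_signals(commits, keywords):
    """Return commits whose message mentions any watched keyword.

    `commits` is a list of dicts with a "message" key; `keywords` are
    case-insensitive substrings to watch for (model codenames, internal
    tool names, etc.). Both are illustrative placeholders.
    """
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [c for c in commits if pattern.search(c["message"])]

# Hypothetical example data, purely for illustration
commits = [
    {"sha": "a1b2c3", "message": "bump eval harness for mythos-preview endpoint"},
    {"sha": "d4e5f6", "message": "fix typo in README"},
]
hits = find_signals(commits, ["mythos", "glasswing"])
```

The uncomfortable part is how little this takes: a cron job, a keyword list, and patience.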
Step 2: The Mercor breach opened a path. Mercor is a company that contracts human experts for AI training and evaluation work, including for Anthropic. Sometime before April 7, Mercor suffered a data breach that exposed internal information about what customers were working on. This breach surfaced on underground forums and was noticed by the Discord group.
Step 3: A contractor insider closed the loop. One member of the group reportedly worked at a firm that had contractor access to Anthropic's infrastructure. The combination of the Mercor breach data (which identified specific Anthropic projects and endpoints) and the contractor's insider credentials (which had access to those endpoints) gave the group a path to Mythos Preview.
Step 4: Access began April 7. Coincidentally or not, that was the same day Anthropic publicly announced Mythos Preview and Project Glasswing. The group has reportedly been running inference against the model since then.
Step 5: Discovery and response. Anthropic received reports of the unauthorized access. The exact timing of discovery isn't public, but Anthropic confirmed it on April 22 in response to reporter inquiries. An Anthropic spokesperson stated they are "investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments."
Why This Specifically Matters
For a normal model, a vendor breach is a bad incident that costs the company trust and a quarter's worth of remediation spend. For Mythos, it's a direct contradiction of the safety argument Anthropic made when announcing the model.
Anthropic's position on Mythos was that the model's cyber-offensive capabilities (autonomous zero-day discovery, exploit generation, cross-stack reasoning) were too dangerous for public release. The alternative to public release was Project Glasswing: vetted partners only (AWS, Apple, Google, JPMorgan Chase, Microsoft, Nvidia), with structured vulnerability disclosure, and Anthropic retaining operational control.
The implicit argument was that Anthropic's internal security plus Glasswing's controlled access would prevent the wrong hands from getting the model. That argument required two things to be true:
- Anthropic's own systems and contractors are hardened enough to prevent unauthorized access.
- Other frontier labs don't ship equivalent capabilities publicly.
As of April 22, 2026, both of those have taken hits. The vendor breach shows that supply-chain security around model access is weaker than the safety case assumed. And several open-source labs (DeepSeek, Qwen, Meta's Llama 3.5) are rapidly closing the capability gap, meaning Mythos-class offensive capability could plausibly be reached by purely public models within a year.
The Uncomfortable Irony
Project Glasswing was framed as a proactive security initiative. Anthropic would use Mythos to find zero-day vulnerabilities in critical infrastructure software before those capabilities reached adversaries. The goal was defender-first: fix the world's bugs before attackers could weaponize them.
The irony is that the first adversary to get Mythos-class capabilities against Anthropic's will didn't need to build a competing model from scratch. They just needed to wait for a vendor breach and a contractor with the right credentials. Fifteen days after launch. The very capability Anthropic was trying to meter out to defenders now sits in an unnamed Discord server's hands too.
This doesn't make Project Glasswing a bad idea. It makes the execution harder than the announcement suggested. Defense-first model deployment requires not just good intent but contractor vetting, vendor security audits, inference logging, rate limiting, and threat detection at every layer, all of which takes longer than 15 days to set up correctly.
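Rate limiting at the inference layer is the most mechanical of those controls. A minimal per-credential token bucket looks like the sketch below; the parameters (`capacity`, `refill_rate`) are illustrative, not anything Anthropic has described:

```python
import time

class TokenBucket:
    """Per-credential rate limiter: allow up to `capacity` burst
    requests, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per credential; a burst past capacity gets denied
bucket = TokenBucket(capacity=5, refill_rate=1.0)
results = [bucket.allow() for _ in range(7)]
```

A limiter like this doesn't stop a patient adversary, but it caps the damage rate and, paired with logging, turns abuse into a detectable signal rather than a silent one.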
What We Still Don't Know
Several important facts are not yet public.
Have model weights been exfiltrated, or just inference access? Weights are the crown jewels. If the group only has inference access through a contractor's API key, Anthropic can revoke that key and the incident ends. If the group has exfiltrated the model weights, the model is effectively public forever. Anthropic's statement implies inference access, but it's not definitive.
What has the group done with the access? Possibilities range from benign (curiosity, capability testing) to catastrophic (using Mythos to find zero-days in critical software before Glasswing partners get to them, then selling those zero-days to the highest bidder). No public evidence either way yet.
Which contractor firm was involved? Anthropic hasn't named it. Expect the name to become public within a week as reporters chase the story. The contractor firm's other customers will face questions about whether their own sensitive access was also compromised.
What's Anthropic's remediation? Revoking the specific access is the obvious first step. Harder questions follow: what about other contractors with similar access? Does Anthropic audit their access logs? Does Anthropic apply inference rate-limiting to detect future unauthorized use patterns?
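One concrete shape that detection could take: baseline each credential's daily request volume and flag large deviations. This is a generic anomaly-detection sketch under assumed data, not Anthropic's actual tooling; the credential names and counts are invented:

```python
from statistics import mean, stdev

def flag_anomalies(daily_counts, today, k=3.0):
    """Flag credentials whose request count today exceeds their
    historical mean by more than k standard deviations.

    `daily_counts` maps credential -> list of past daily counts;
    `today` maps credential -> today's count. All data is hypothetical.
    """
    flagged = []
    for cred, history in daily_counts.items():
        mu, sigma = mean(history), stdev(history)
        if today.get(cred, 0) > mu + k * sigma:
            flagged.append(cred)
    return flagged

history = {
    "contractor-key-17": [40, 55, 48, 52, 45],  # steady baseline
    "contractor-key-23": [50, 47, 53, 49, 51],
}
today = {"contractor-key-17": 4200, "contractor-key-23": 50}
flagged = flag_anomalies(history, today)
```

A contractor key quietly running 80x its normal volume is exactly the pattern that should page someone, and it only pages someone if the logs exist and something is watching them.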
Will Glasswing partners reassess their participation? Apple, AWS, Google, Microsoft, JPMorgan, and Nvidia all signed up for Glasswing expecting that access would be tightly controlled. An unauthorized group with Mythos access is exactly what those partners were hoping to stay ahead of. Expect at least one of them to quietly ask for stronger operational guarantees.
What This Doesn't Affect
Let's be clear about what hasn't changed.
Your Claude.ai, Claude Pro, Claude Code, and API usage are unaffected. Mythos is a separate model lineage from Opus 4.7, Sonnet 4.5, and Haiku 4.5. The breach affects Mythos Preview specifically, not the consumer or developer surface you use daily.
Your account credentials haven't been compromised. The breach was at Mercor (training vendor) and through a contractor, not at the Anthropic customer authentication layer. No one has your Claude subscription credentials as a result of this.
Existing Glasswing-discovered vulnerabilities still matter. The CVEs Apple, AWS, Google, and Microsoft are patching based on Mythos findings are still being patched. Those fixes are still getting shipped. The breach doesn't undo the defensive work; it just changes the race conditions.
What Anthropic Should Do Next
Three things, in rough priority order.
Publish a public incident report within 30 days. Developer and security audiences expect post-incident transparency. A detailed report on how the access happened, what logs showed, what the remediation is, and what's changing in contractor vetting and vendor audits is the minimum acceptable response. Not publishing makes this story fester.
Audit and tighten contractor access controls. Mercor is one vendor. Anthropic works with many. If one contractor credential plus one vendor breach gave access to Mythos, that pattern exists across the rest of the supply chain too. A top-down audit of which contractors have access to which models, with what guardrails, is overdue.
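At its core, that audit is a set difference over access grants: what each contractor firm holds versus what its work actually requires. The firm names and model IDs below are hypothetical, and a real audit would pull from IAM records rather than literals:

```python
def excess_grants(granted, required):
    """Report model access a contractor firm holds but does not need.

    `granted` and `required` map firm -> set of model IDs. Any firm
    whose grants exceed its requirements appears in the report.
    """
    report = {}
    for firm, models in granted.items():
        excess = models - required.get(firm, set())
        if excess:
            report[firm] = sorted(excess)
    return report

# Hypothetical grant data for illustration
granted = {
    "vendor-a": {"haiku-4.5", "mythos-preview"},
    "vendor-b": {"haiku-4.5"},
}
required = {
    "vendor-a": {"haiku-4.5"},  # eval work never needed Mythos access
    "vendor-b": {"haiku-4.5"},
}
report = excess_grants(granted, required)
```

The principle is just least privilege: any entry in the report is standing risk with no corresponding business need, and Mythos-class models should arguably never appear in a contractor's grant set at all.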
Rethink the "we won't release but will use internally" pattern. Mythos-class capabilities inside Anthropic are obviously valuable for research. But if the model can be accessed through a contractor breach, the safety case for internal-only access weakens. Either the internal deployment needs to be much more restricted (fewer people, more logging, more monitoring), or the capability gap between what Anthropic uses internally and what's publicly available needs to be smaller. The current gap is what creates the attractive target.
What This Means for the Broader AI Safety Debate
Anthropic has been the industry's loudest advocate for restricted release of frontier models. Responsible Scaling Policy (RSP), the dangerous-capabilities evaluation framework, and now Project Glasswing are all bets on the premise that frontier labs can hold capability secrets while using them for good.
The Mythos breach is a case study against that premise. Not a fatal one, but a serious one. The failure mode wasn't a nation-state APT or a disgruntled employee with root. It was a Discord group with enough patience to wait for a vendor breach and a contractor insider to close the loop. That's a relatively low-resource adversary, and they succeeded in 15 days.
The direct implication: restricted-release strategies require security investment proportional to the capabilities being restricted. If Mythos is worth billions of dollars in national-security value, the security around it needs to match. For a first preview 15 days in, it clearly did not.
The indirect implication: other frontier labs with similar "we can hold this" positions should be watching carefully. DeepMind, OpenAI, Meta, and smaller labs all face the same threat model. The Anthropic incident is the first public example of how that threat model actually plays out.
Honest Take
Anthropic made a reasonable decision with Mythos: some capabilities deserve restricted release. That was probably the right call when it was announced on April 7, and it's still the right call now. What this breach reveals is that getting the execution right is much harder than the announcement suggested, and the window between "we've announced this" and "someone without permission has this" is shorter than the industry's safety conversations have assumed.
Expect Anthropic to respond competently. They have good security engineering and a strong incentive to fix this publicly. Expect the industry response to be slower and less transparent. Every other lab that's been reassuring customers about their own restricted-release postures just got a preview of a scenario they'd rather not acknowledge.
As a regular Claude user, none of this changes your day-to-day. As someone who cares about whether "we're being responsible with frontier AI" is a credible claim or a marketing line, this is an important datapoint. The answer to which it is depends on what Anthropic publishes in the next 30 days.
Related Reading
- Claude Mythos and Project Glasswing Explained - the original announcement and capability breakdown
- Amazon Anthropic $25B Investment - the capital now backing Anthropic's security budget
- Claude Opus 4.7 vs 4.6 Benchmarks - the current GA flagship (unaffected by this breach)
Sources
- Some Unknown Group Is Reportedly Using Claude Mythos Without Permission (Gizmodo)
- Claude Mythos Preview (red.anthropic.com, the original announcement)
- Project Glasswing: Securing critical software for the AI era (Anthropic)
- Mercor breach coverage (TechCrunch)
Questions about what this means for your AI supply-chain security posture, or want help thinking through which models to use in sensitive contexts? Reply on the newsletter and I answer every email.
