Industry

Anthropic Built an AI So Good at Hacking, They Won't Let You Use It — And It Just Cracked Apple's M5

Claude Mythos Preview found the first public macOS kernel exploit on Apple's newest chip, giving any unprivileged local user root access. The company that preaches AI safety is quietly building the most dangerous hacking tool in history.

2026-05-17 · By AgentBear Editorial · Source: Mashable / Calif Research / WSJ

Anthropic's Claude Mythos is an AI so powerful at finding security vulnerabilities that the company refused to release it to the public. Instead, they locked it behind a velvet rope, granting access only to select security researchers and large organizations with the resources to handle what it can do. That decision, announced just weeks ago, felt like typical AI safety theater — another tech company making grand gestures about responsible development while quietly building something extraordinary.

We now know exactly how extraordinary. Security researchers from a Palo Alto-based company called Calif used Anthropic's Claude Mythos Preview to discover what they describe as the "first public macOS kernel memory corruption exploit on Apple M5" — a flaw that gives an unprivileged local user complete, unfettered access to the device. Root access. Full control. On Apple's newest, most secure chip architecture, running the world's most closely guarded consumer operating system.

The kicker? Mythos didn't just find the bugs. It assisted with the entire exploit development process. According to Calif's blog post, the AI "discovered the bugs quickly because they belong to known bug classes," and once it learned how to attack that class of problems, it generalized to nearly any problem in that category. This isn't a tool that finds vulnerabilities. It's a tool that learns how to break things, then applies that knowledge systematically.

Anthropic, the company that built this digital lockpick, is the same organization that publishes papers on AI safety, alignment, and the responsible development of frontier models. The same company whose executives testify before Congress about the dangers of unrestrained AI. The same company that charges other AI labs with recklessness while positioning itself as the careful, considered alternative.

And now they're handing out keys to an AI that can crack Apple's security in days.

The Exploit

The technical details, as published by Calif in their Thursday blog post, describe an attack involving "two vulnerabilities and several techniques" that together allow an unprivileged local user to escalate to complete kernel access. In practical terms, this means anyone with physical access to a Mac running Apple's M5 chip — or anyone who can trick a user into running malicious code — can own the entire machine.

Calif's researchers were careful to note that they disclosed the vulnerabilities to Apple before publishing their findings. They even met with Apple "early this week," suggesting the company is actively working on patches. Apple's macOS Tahoe 26.5, released on Monday, includes fixes for a bug submitted by Calif in collaboration with Claude and Anthropic Research, with Calif mentioned in at least two other vulnerability reports in the same release notes.

But the full technical details won't be published until Apple completes its fixes. "Full technical details will be shared after Apple fixes the vulnerabilities and attack path," Calif wrote. This is standard responsible disclosure practice, but it also means the security community is waiting to understand the full scope of what Mythos uncovered.

What's already clear is that this isn't a theoretical exercise. The blog post is part of a series called the "Month of AI-Discovered Bugs" — a deliberate campaign to demonstrate what happens when frontier AI models are pointed at security research. And Anthropic's model, specifically, is doing things that human researchers weren't finding.

The Irony of AI Safety

Anthropic's decision to withhold Mythos from general release was framed as a safety measure. The model, according to internal and external evaluations, could autonomously identify and exploit software vulnerabilities at a level beyond any previous public AI system. Rather than make it widely available, Anthropic chose to distribute it through a controlled preview program to select partners.

This is, on its face, a reasonable decision. An AI that can find zero-day exploits in commercial operating systems probably shouldn't be available to anyone with an API key. The potential for misuse — by criminal organizations, nation-state actors, or simply reckless researchers — is obvious and enormous.

But the controlled release raises its own questions. Who decides which organizations are trustworthy enough to wield this capability? What criteria determine whether a security startup in Palo Alto gets access but a researcher in a developing country doesn't? And if the model is already finding serious vulnerabilities in one of the world's most security-conscious companies, what happens when it's pointed at infrastructure, financial systems, or government networks?

Anthropic's Project Glasswing, launched in April 2026, is the formal umbrella for this work — an initiative explicitly designed to use AI to prevent AI-driven cyberattacks. The logic is that the best defense against AI-powered hacking is AI-powered defense. It's a cybersecurity arms race logic that has governed the industry for decades, now accelerated by systems that can learn and generalize.

But there's a fundamental asymmetry here. Defensive security requires finding and fixing every vulnerability. Offensive security only needs to find one. Mythos, by all available evidence, is dramatically better at the latter than any tool that exists for the former. Anthropic is building a weapon and calling it a shield.

Apple's Response

Apple's reaction to the disclosure was predictable and, perhaps, telling. A spokesperson offered the standard corporate response: "Security is our top priority, and we take reports of potential vulnerabilities very seriously." The company patched at least one related issue in macOS Tahoe 26.5, released Monday, but the full scope of the vulnerabilities remains unclear.

That Apple, specifically, is the subject of Calif's disclosure is significant. Apple's M-series chips, built on ARM architecture with Apple's own security extensions, have been marketed as the most secure consumer computing platform available. The M5, Apple's latest generation, represents billions of dollars in silicon design focused specifically on preventing exactly this kind of attack.

If Mythos can crack it in days — with AI assistance, rather than months of painstaking human research — what does that say about the security of everything else? Windows systems running on Intel and AMD chips have far larger attack surfaces and less rigorous security architectures. The Linux servers powering the internet's infrastructure run on software foundations designed decades ago, when security was an afterthought. Industrial control systems, medical devices, financial networks — none of these were built to withstand an AI that learns bug classes and generalizes across systems.

Apple's M5 may have been the first major target because it's the hardest. The implication is unsettling.

What Mythos Actually Does

Calif's description of Mythos's capabilities is worth parsing carefully. The AI didn't just scan code for known patterns. It "learned how to attack a class of problems" and then "generalized to nearly any problem in that class." This is qualitatively different from traditional static analysis or fuzzing tools, which look for specific signatures or feed in random data hoping to trigger crashes.
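To see how blunt that traditional approach is, here is what a blind fuzzer amounts to: generate random bytes, run the target, and keep anything that crashes. This is a hypothetical illustration — the `parse` function and its length-field bug are invented for this sketch, not drawn from Calif's findings:

```python
import random

def parse(data: bytes) -> int:
    # Toy target with a classic bug class: it trusts a length
    # field taken from the input without bounds-checking it.
    if len(data) >= 4 and data[0] == 0xFF:
        length = data[1]
        return data[2 + length]  # IndexError when the length field lies
    return 0

def fuzz(target, trials: int = 10_000, seed: int = 0) -> list:
    """Blind fuzzing: hurl random inputs at the target and record
    every input that triggers a crash (here, an uncaught exception)."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 16)))
        try:
            target(data)
        except Exception:
            crashes.append(data)
    return crashes

crashes = fuzz(parse)
print(f"{len(crashes)} crashing inputs found in 10,000 trials")
```

The fuzzer stumbles onto the bug only when random bytes happen to hit the vulnerable path; it has no idea *why* the crash happened, and nothing it learns transfers to the next target. What Calif describes is the opposite: understanding "unchecked length field" as a class and hunting for it deliberately in new code.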

Mythos appears to be doing something closer to reasoning about code. It understands what constitutes a vulnerability class — perhaps memory corruption, privilege escalation, or information disclosure — and then applies that understanding to new codebases. It can assist with exploit development, not just bug discovery. It can chain multiple vulnerabilities together into a functional attack.

This is the kind of capability that security researchers spend years developing. The best human bug hunters combine deep technical knowledge with intuition, pattern recognition, and creative thinking about how systems can be made to misbehave. Mythos, according to its users, is doing this at scale and speed that no human team can match.

The "Month of AI-Discovered Bugs" campaign suggests this Apple exploit is not an isolated success. Calif and other partners are systematically applying Mythos to real software and finding real vulnerabilities. We don't know how many bugs have been found, how severe they are, or which other companies have been notified. The campaign name implies a steady drumbeat of disclosures.

The Business of AI Security

Anthropic's controlled release of Mythos is also a business strategy. By making the model available only to select partners, Anthropic creates exclusivity and scarcity around one of the most powerful capabilities in cybersecurity. Organizations that get access gain a significant advantage over those that don't. Organizations that don't get access are left wondering what vulnerabilities Mythos might find in their systems that they're missing.

This dynamic creates a market. Security companies will pay premium prices for access. Organizations with critical infrastructure will feel compelled to participate in Anthropic's program, if only to understand what the AI might find. The model itself becomes a kind of security audit that you can either buy into or ignore at your peril.

Anthropic has also signaled its willingness to cooperate with governments. An executive recently vowed to cooperate with Japan on Mythos-related responses, suggesting the company is positioning the model as a resource for national cybersecurity efforts. This is smart politics — aligning with government interests reduces regulatory pressure and creates defensive use cases that offset the obvious offensive potential.

But it also means Anthropic is becoming an arms supplier. The governments and organizations that get access to Mythos gain offensive capabilities that others lack. The cybersecurity community has long debated whether vulnerability discovery should be openly shared or closely held. Mythos forces that debate into a new realm, where the discoverer isn't a human researcher making individual disclosure decisions but an AI system whose outputs are controlled by a single company.

The Path Forward

The Apple M5 exploit is a warning, not an endpoint. It's the first public demonstration of what happens when a frontier AI model is systematically applied to security research at scale. The results are impressive, concerning, and inevitable.

Other AI labs are building similar capabilities. Google's Project Zero has used machine learning for vulnerability research for years. Microsoft's security teams employ AI for threat detection and response. OpenAI, despite its public focus on general intelligence, has the technical capacity to build equivalent systems. The race to AI-powered offensive security is already underway; Mythos is just the first to produce public, verifiable results against a high-value target.

The deeper question is whether any organization — even one as committed to safety as Anthropic claims to be — can responsibly steward a capability this powerful. The history of technology suggests that powerful tools eventually escape control. Nuclear weapons proliferated despite the best efforts of the original possessors. Cyberweapons developed by nation-states have leaked into criminal hands repeatedly. Zero-day vulnerabilities sold to governments have been rediscovered and exploited by others.

Mythos is different in degree, not kind. An AI that learns to find vulnerabilities is more scalable than a human researcher, more consistent than a team of consultants, and potentially more accessible than a classified government program. If the model or its capabilities leak — through theft, misconfiguration, or simply the march of open-source replication — the cybersecurity landscape changes overnight.

Anthropic's decision to withhold the model from public release delays this scenario but doesn't prevent it. Competitors will build equivalent systems. Open-source projects will approximate the capabilities. Nation-states with the resources to train frontier models will develop their own versions, without Anthropic's stated safety constraints.

The Apple M5 exploit is impressive. It's also the beginning of something that the cybersecurity industry, the technology sector, and society at large are not prepared for. Anthropic has built an AI that can crack the world's most secure consumer platform. They're sharing it with select partners and calling it safety. Whether that's true depends on whether you trust the company holding the keys — and whether you believe those keys won't eventually be copied.
