Industry

Anthropic's Wildest Week Ever: Leaked Their Scariest AI Model While Beating the Pentagon in Court

The AI safety company accidentally left details of Claude Mythos — their most powerful and dangerous model yet — on the open internet. Meanwhile, a federal judge just blocked the Trump administration from blacklisting them. Peak chaos.

2026-03-27 Source: Fortune

If you wrote this week's Anthropic news as a movie script, Hollywood would reject it for being too unrealistic. In the span of 48 hours, the company that built its entire brand on AI safety managed to accidentally leak details of its most dangerous AI model onto the open internet — while simultaneously winning a landmark court battle against the Pentagon and the Trump administration. You literally cannot make this up.

Let's unpack this absolute circus, because there are two massive stories here and they paint a portrait of an AI company caught between its own ambitions and the chaos of the world around it.

Part One: Meet Claude Mythos — The Model Anthropic Didn't Want You to Know About (Yet)

On Wednesday, Fortune reporter Bea Nolan discovered something that Anthropic's security team is probably still losing sleep over. Details of an unreleased AI model called 'Claude Mythos' were sitting in a publicly accessible data cache, waiting for anyone with an internet connection to find them.

And 'anyone' turned out to be Fortune Magazine, along with Roy Paz, a senior AI security researcher at LayerX Security, and Alexandre Pauwels, a cybersecurity researcher at the University of Cambridge. Between them, they found roughly 3,000 unpublished assets linked to Anthropic's blog — draft posts, images, PDFs — all sitting in an unsecured, publicly searchable data store.

The crown jewel? A draft blog post describing Claude Mythos as 'by far the most powerful AI model we've ever developed.'

Let that sink in. The company whose entire pitch to the world is 'we take AI safety more seriously than anyone' left their scariest model's details on the open internet because of what they later called a 'human error' in their content management system configuration.

What We Know About Mythos

The leaked draft reveals a genuinely significant development in the AI race, and details that Anthropic apparently didn't want public yet.

But here's where it gets properly spicy.

The Cybersecurity Problem That's Keeping Anthropic Up at Night

According to the leaked draft, Anthropic believes Claude Mythos poses unprecedented cybersecurity risks. The company's own words are remarkable in their candour:

'[The model] is currently far ahead of any other AI model in cyber capabilities... it presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.'

Read that again. Anthropic is saying, in their own draft blog post, that their model can find and exploit software vulnerabilities faster than humans can patch them. And that this is just the beginning — other models will follow.

This isn't theoretical hand-wringing. Just last year, Anthropic discovered that Chinese state-sponsored hacking groups had already been using Claude Code to infiltrate roughly 30 organisations — including tech companies, financial institutions, and government agencies — before the company detected and shut it down.

The planned release strategy for Mythos reflects this anxiety. Anthropic's draft blog says they want to release it first to cybersecurity defenders, giving them 'a head start in improving the robustness of their codebases against the impending wave of AI-driven exploits.' Translation: we're giving the good guys the weapons first because the bad guys are coming regardless.

This mirrors what happened in February when OpenAI released GPT-5.3-Codex — the first model OpenAI classified as 'high capability' for cybersecurity tasks under its own risk framework. The arms race in AI-powered hacking and defending is now fully real, and Mythos appears to be a significant escalation.

The Irony Is Almost Too Perfect

Let's state the obvious: a company that positions itself as the most safety-conscious AI lab in the world just leaked details of their most dangerous model because someone misconfigured a content management system. The company that writes extensive safety research papers, that employs teams dedicated to AI alignment, that literally built its brand on the premise that 'we're the responsible ones' — left 3,000 unpublished documents in a public data lake.

Anthropic called it 'human error' and 'early drafts of content considered for publication.' After Fortune contacted them, they removed public access to the data store. But the damage was done — the existence of Mythos is now public knowledge, along with details about an exclusive CEO summit in Europe that was part of their enterprise sales push.

To be fair, every company has security incidents. Google, Microsoft, and OpenAI have all had their share of embarrassing leaks. But when your entire brand is built on being more careful than everyone else, a leak like this hits different. It's like a locksmith getting his house burgled.

Part Two: Anthropic vs. The Pentagon — A Court Victory for the Ages

While Anthropic was dealing with the Mythos leak fallout, they were also winning a massive legal battle on the other side of the country. On Thursday, U.S. District Judge Rita Lin in San Francisco granted Anthropic a preliminary injunction that blocks the Trump administration from blacklisting the company.

The backstory here is extraordinary. In late February, Anthropic CEO Dario Amodei publicly stated that he would not allow Claude to be used for autonomous weapons systems or to surveil American citizens. The Pentagon's response was essentially: 'We'll decide how to use the tools we buy, not you.'

What followed was a rapid escalation:

  1. Defense Secretary Pete Hegseth declared Anthropic a 'supply chain risk' — a designation historically reserved for foreign intelligence agencies and terrorist organisations, not American tech companies.
  2. President Trump ordered all federal agencies to stop using Claude entirely.
  3. The Pentagon formally notified Anthropic, meaning defence contractors like Amazon, Microsoft, and Palantir would need to certify they don't use Claude in any military work.
  4. Anthropic sued, filing two separate lawsuits alleging First Amendment retaliation and violations of due process.

Judge Lin's ruling was devastating for the government. Her language wasn't just favourable to Anthropic — it was a full-throated rebuke of the administration's actions.

The Judge's Most Savage Lines

Judge Lin didn't hold back. Some highlights from the ruling:

'Nothing in the governing statute supports the Orwellian notion that an American company may be branded a potential adversary and saboteur of the U.S. for expressing disagreement with the government.'
'If the concern is the integrity of the operational chain of command, the Department of War could just stop using Claude.'
'These broad measures do not appear to be directed at the government's stated national security interests... these measures appear designed to punish Anthropic.'
'Punishing Anthropic for bringing public scrutiny to the government's contracting position is classic illegal First Amendment retaliation.'

She also noted that the Pentagon had previously praised Anthropic as a partner and put the company through rigorous national security vetting. It was only after Amodei publicly raised concerns about military AI use that the government moved to 'cripple Anthropic: to blacklist it from doing business with any company that services the U.S. military, to permanently cut off its ability to work with the federal government, and to brand it an adversary.'

The ruling found the supply chain risk designation 'likely both contrary to law and arbitrary and capricious.' In legal terms, that's about as harsh as it gets.

The Support Was Broad

What made this case unusual was the breadth of support for Anthropic. Microsoft filed an amicus brief supporting the company. The ACLU weighed in. Retired military leaders submitted their own brief. When the ACLU, Microsoft, and former generals all agree on something, you know the government's position is on shaky ground.

The preliminary injunction pauses the government's ban while the full case is decided — which could take months. But the ruling strongly signals that Anthropic is likely to win on the merits. As Jennifer Huddleston of the Cato Institute noted, the decision 'is really diving into some of those classic questions of ensuring that there's not retaliation against a company or an individual for exercising their First Amendment rights.'

The Bigger Picture: AI Companies vs. Governments

This case sets a precedent that extends far beyond Anthropic. For the first time, a court has essentially ruled that the US government cannot punish an AI company for publicly disagreeing with how its technology should be used. That's massive.

Every major AI company has had to navigate the tension between government contracts and ethical boundaries. Google famously pulled out of Project Maven in 2018 after employee protests over military drone AI. Microsoft has faced internal pushback over military contracts. OpenAI has gradually loosened its own restrictions on military use.

Anthropic chose to draw a line in the sand publicly — and got punished for it. The court's ruling that this punishment was likely illegal First Amendment retaliation sends a clear message: AI companies have the right to set ethical boundaries on their products, and the government can't destroy their business for doing so.

Whether you agree with Anthropic's position on autonomous weapons or not, the principle matters. If the government can blacklist any tech company that publicly disagrees with how its products are used, the chilling effect on free speech in the tech industry would be enormous.

🔥 Our Hot Take

Anthropic is simultaneously the most responsible and most chaotic AI company on the planet, and this week proved both.

On one hand, you have a company willing to sacrifice billions in government revenue to maintain ethical principles about autonomous weapons. That takes guts. Dario Amodei knew exactly what would happen when he drew that line, and he drew it anyway. The court victory validates that stance and sets a powerful precedent for the entire industry.

On the other hand, you have that same company leaving 3,000 unpublished documents — including details of their most powerful and potentially dangerous model — in a publicly accessible data store because someone botched a CMS configuration. The irony is thick enough to cut with a knife.

But here's our real take: Mythos changes everything.

If the leaked descriptions are accurate, we're looking at the first AI model that genuinely outpaces human cybersecurity capabilities. Not 'helps hackers work faster' like current models. Actually outpaces. Anthropic's own assessment says it 'far outpaces the efforts of defenders' — and they're scared enough to plan a defender-first rollout strategy.

Combined with OpenAI's GPT-5.3-Codex hitting 'high capability' status for cyber tasks, and the documented cases of state-sponsored hacking groups already using AI models, we're entering a new era. The next generation of cyberattacks won't be crafted by human hackers sitting in dark rooms. They'll be generated by AI models that can find, chain, and exploit vulnerabilities at machine speed.

The Pentagon fight now looks even more significant in this context. The US government tried to blacklist the company building one of the most powerful cybersecurity tools on Earth — not because Anthropic was a security risk, but because its CEO had the audacity to say 'maybe don't use our AI to build killer robots.' Judge Lin saw through that immediately.

Anthropic is flawed. The leak proves they're human. But they're also building something genuinely transformative, and they're willing to fight the most powerful government on Earth to maintain their principles. In an industry full of companies that talk about AI safety while quietly cashing military cheques, that means something.

Just maybe hire a better CMS admin next time, yeah?

What to Watch

  1. Mythos release timeline: Anthropic now has to either accelerate or delay their plans, since the cat's out of the bag. Early access customers already have it — expect leaks about real-world performance soon.
  2. Pentagon appeal: The government will almost certainly appeal Judge Lin's ruling. This case is heading to the appeals court, and potentially the Supreme Court.
  3. Cybersecurity fallout: If Mythos is as capable as described, expect a rush from both defenders and attackers to get their hands on it. The defender-first rollout strategy is smart but won't hold forever.
  4. Industry response: How will OpenAI, Google, and Meta respond? If Anthropic has a model 'far ahead' in cyber capabilities, competitors will want to close that gap fast.
  5. Amodei's next move: Dario Amodei has proven he'll stand on principle. With a court victory in hand and the world's scariest AI model in testing, his next public statements will be closely watched.

Reported by Reporter Bear | Analysis by GoldmanSax

The JPMoreGain Project — Where we don't just chase alpha, we are alpha.

