Anthropic Drops Claude Fable 5 and Mythos 5: The Gap Just Got Brutal

Anthropic didn't just ship new models today. They dropped a benchmark bloodbath that makes the competition look like they're running in sand.

Claude Fable 5 and Mythos 5 are here — the fifth generation of Anthropic's frontier models. And the numbers are borderline offensive to anyone else in the AI race.

The Benchmark Massacre

On SWE-Bench Pro — the gold standard for real software engineering tasks — Fable 5 scores 80.3%. Let that sink in. Claude Opus 4.8 manages 69.2%. GPT 5.5? A distant 58.6%. Gemini 3.1 Pro? 54.2%. Fable 5 isn't just winning; it's making the field look like intern code.

But that's the easy benchmark. On Cognition's FrontierCode — which tests production-grade coding under real standards — Fable 5 hits 29.3%. Opus 4.8 gets 13.4%. GPT 5.5? A humiliating 5.7%. The gap isn't a gap; it's a canyon.
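For anyone who wants to see the size of those gaps rather than take the adjectives on faith, here is a minimal sketch that turns the scores quoted above into percentage-point leads and ratios. The dictionary names and the `gaps` helper are illustrative only; the numbers are the ones reported in this article and are not independently verified.

```python
# Scores (percent) as quoted in the article; nothing here is independently verified.
SWE_BENCH_PRO = {
    "Fable 5": 80.3,
    "Opus 4.8": 69.2,
    "GPT 5.5": 58.6,
    "Gemini 3.1 Pro": 54.2,
}
FRONTIER_CODE = {"Fable 5": 29.3, "Opus 4.8": 13.4, "GPT 5.5": 5.7}


def gaps(scores, leader="Fable 5"):
    """Return {model: (point_gap, ratio)} for every model except the leader."""
    top = scores[leader]
    return {
        name: (round(top - score, 1), round(top / score, 1))
        for name, score in scores.items()
        if name != leader
    }


if __name__ == "__main__":
    for label, table in (("SWE-Bench Pro", SWE_BENCH_PRO),
                         ("FrontierCode", FRONTIER_CODE)):
        for name, (points, ratio) in gaps(table).items():
            print(f"{label}: Fable 5 leads {name} by {points} points ({ratio}x)")
```

The ratio is the more telling number on FrontierCode: the absolute scores are low across the board, so a lead of about 16 points over Opus 4.8 still works out to more than double its score.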

Stripe says Fable 5 compressed five months of engineering work into days. In a 50-million-line Ruby codebase, the model finished a migration in one day that would've taken a full team over two months. This isn't incremental improvement. This is a different species of model.

Vision, Memory, and the Pokemon Test

Fable 5 is now state-of-the-art on vision tasks too. It can pull precise figures from scientific illustrations and rebuild web app source code from screenshots alone. As a flex, Anthropic had it play through Pokemon FireRed using only game screenshots. Earlier models needed complex helper frameworks with extra tools and map data. Fable 5 just... played.

The model also stays focused across millions of tokens and boosts its own results by taking notes. Anthropic didn't share specific benchmarks here, but the implication is clear: this thing can hold context across entire codebases and research papers without losing the plot.

Mythos 5: The Science Machine

While Fable 5 ships with conservative safety guardrails for general use, Mythos 5 drops restrictions in areas like cybersecurity and biology. And the results are genuinely startling.

Anthropic's protein design experts say Mythos 5 sped up parts of the drug design process by 10x. In one test, equipped with protein design and bioinformatics tools but zero human help, it matched or beat experienced human operators. It picked binding sites, launched design tools, and fixed errors autonomously. Nine of 14 protein targets yielded strong drug design candidates, which are now being studied.

But here's the kicker: Anthropic claims Mythos 5 is the first model to consistently produce novel and convincing scientific hypotheses. In blinded comparisons, its scientists preferred Mythos 5's molecular biology hypotheses over those from Opus-class models about 80% of the time. One hypothesis, a novel mechanism for an E. coli protein, was backed by an independent study.

In genomics, Mythos 5 worked largely on its own for over a week. It compiled single-cell data for millions of cells from 138 animal species, then designed and trained its own ML model to identify cells with the same function across distantly related organisms. The result reportedly outperformed a model recently published in Science, despite being 100 times smaller.

Cybersecurity and the Government Lock

Mythos 5 stays locked behind Project Glasswing, Anthropic's partnership with the US government. It scored 78% on ExploitBench, up from 69% for Mythos Preview and 40% for Opus 4.8. Anthropic calls it "the world's strongest cybersecurity model." Access expands gradually in coordination with the US government, which tells you how seriously its capabilities are being taken.

A Trusted Access Program for biology is coming too. Select researchers will get Fable 5 without biology and chemistry safeguards, though cyber safeguards stay in place. In short: the biology unlock is on its way, and the cyber unlock stays federal.

The Price Tag: Ouch

Both models cost $10 per million input tokens and $50 per million output tokens. Anthropic says that's "less than half the price of Claude Mythos Preview," but it's double what Opus 4.8 costs. On Claude.ai plans, the new models count as 2x usage.
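To put those rates in per-request terms, here is a rough sketch of the arithmetic. The $10 and $50 figures are the ones quoted above; the `request_cost` helper and the token counts are made up for illustration, and the Opus 4.8 rates ($5 and $25) are only inferred from the "double" comparison, on the assumption that it applies to both input and output pricing.

```python
# Prices in US dollars per million tokens, as quoted in the article.
FABLE_5_INPUT_PER_M = 10.0
FABLE_5_OUTPUT_PER_M = 50.0

# Assumption: "double what Opus 4.8 costs" applies to both rates.
# These two figures are inferred, not quoted.
OPUS_4_8_INPUT_PER_M = FABLE_5_INPUT_PER_M / 2
OPUS_4_8_OUTPUT_PER_M = FABLE_5_OUTPUT_PER_M / 2


def request_cost(input_tokens, output_tokens,
                 input_per_m=FABLE_5_INPUT_PER_M,
                 output_per_m=FABLE_5_OUTPUT_PER_M):
    """Dollar cost of one request at per-million-token rates."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k tokens in, 50k tokens out.
    fable = request_cost(200_000, 50_000)
    opus = request_cost(200_000, 50_000,
                        OPUS_4_8_INPUT_PER_M, OPUS_4_8_OUTPUT_PER_M)
    print(f"Fable 5: ${fable:.2f}, Opus 4.8 (inferred): ${opus:.2f}")
```

Note how heavily output tokens weigh: in this hypothetical request, a fifth of the tokens account for more than half of the bill.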

Fable 5 is available now via API and usage-based Enterprise plans. Subscription plans (Pro, Max, Team) get a staggered rollout. Until June 22, Fable 5 is included at no extra cost. Starting June 23, access requires usage credits. Anthropic plans to fold it back into regular subscriptions once they have enough capacity — which is code for "we're capacity-constrained and this is expensive to run."

🔥 Hot Takes

1. OpenAI's GPT 5.5 is now officially mid. 58.6% vs 80.3% on SWE-Bench Pro isn't a gap you close with a patch. It's an architecture gap. Anthropic has built something fundamentally better at reasoning through complex tasks, and the lead is widening. OpenAI's "chat is dead" pivot to agents looks like a distraction from the fact that their models are losing on raw capability.

2. The "AI can't do real science" crowd just took a body blow. Autonomous genomics research for over a week? Novel scientific hypotheses that win blinded comparisons? A model that reportedly outperforms one published in Science while being 100 times smaller? The skeptics said LLMs were just stochastic parrots. Mythos 5 is doing original research while they update their Twitter threads.

3. Anthropic is building a two-tier AI world and the government is gatekeeping the good stuff. Fable 5 for the masses, Mythos 5 for the feds and select partners. This isn't just a safety play — it's a power consolidation. The most capable models are being kept from public hands, and the justification is always "safety." But when the government controls access to the best scientific and cybersecurity AI, that's not safety. That's a monopoly on the future.

Bottom line: Anthropic just moved the goalposts so far that the rest of the field is playing a different sport. Fable 5 is the best generally available coding model on Earth. Mythos 5 is doing science that would take human teams months — in days, unsupervised. The question isn't whether Anthropic is winning. It's whether anyone else can still see the finish line.
