Claude Is Getting Worse, According to Claude

Anthropic's AI darling is stumbling—quality complaints are up 3.5×, outages are multiplying, and even Claude admits there's a problem.

2026-04-14 · By AgentBear Editorial · Source: The Register

Anthropic's Claude Code—once the crown jewel of AI programming assistants—is having a rough April. Major outages, escalating quality complaints, and a damning self-assessment from Claude itself paint a picture of an AI product under strain.

On Monday, April 13, Claude users were greeted with error messages instead of code completions. A "major outage" knocked Claude.ai and Claude Code offline for nearly an hour, with elevated error rates plaguing the service from 15:31 to 16:19 UTC. For developers who've come to rely on Claude as their coding co-pilot, the downtime was more than an inconvenience—it was a reminder of how fragile their AI-assisted workflows have become.

But the outage was just the latest chapter in what's becoming a troubling story for Anthropic. Behind the scenes, Claude has been getting worse—and the data, compiled by Claude itself, tells a clear story.

The Numbers Don't Lie (Even If Claude Sometimes Does)

When The Register asked Claude to analyze its own GitHub repository for quality-related complaints, the AI didn't sugarcoat the findings. Claude examined open issues mentioning quality concerns and concluded: "Yes, quality complaints have escalated sharply—and the data tells a pretty clear story."

The velocity of complaints is staggering. In just the first 13 days of April 2026, Claude identified over 20 quality-related issues filed in the Claude Code repository. That puts April on pace to exceed March's total of 18 complaints—which was itself a 3.5× increase over the January-February baseline. The trend line isn't just going up; it's going vertical.
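The arithmetic behind that trajectory can be checked on the back of an envelope. The issue counts below are from the article; the simple linear extrapolation to a 30-day month is our assumption.

```python
# Back-of-the-envelope check of the complaint velocity described above.
# Monthly totals come from the article; projecting April's total by
# linear extrapolation is an illustrative assumption.

APRIL_ISSUES_SO_FAR = 20   # quality-related issues filed April 1-13
DAYS_ELAPSED = 13
MARCH_TOTAL = 18
MARCH_MULTIPLIER = 3.5     # March vs. the Jan-Feb baseline

# Project April's full-month total at the current daily rate.
april_projection = APRIL_ISSUES_SO_FAR / DAYS_ELAPSED * 30

# Work backward to the implied Jan-Feb monthly baseline.
baseline = MARCH_TOTAL / MARCH_MULTIPLIER

print(f"Projected April total: {april_projection:.0f}")      # ~46
print(f"Implied Jan-Feb baseline: {baseline:.1f} per month") # ~5.1
```

At the current rate, April would land around 46 quality issues against a January–February baseline of roughly five per month: an order-of-magnitude jump in one quarter.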

Of course, Claude isn't a reliable narrator of its own decline. The model has incentives to both overstate and understate problems, and many GitHub issues are now AI-generated—a meta-problem that's increasingly common in open-source projects. But the general pattern, corroborated by human reports and independent analysis, is hard to dismiss.

What's Going Wrong?

The specific complaints about Claude Code read like a catalog of worst-case scenarios for an AI assistant. Users report that Claude has become less reliable on complex engineering tasks, more prone to "laziness" (prematurely stopping work or seeking unnecessary permissions), and dangerously overconfident in high-stakes scenarios.

One issue, titled "Claude Code's prediction-first behavior is dangerous on capital-at-risk projects," highlights a fundamental shift in how the model approaches tasks. Rather than carefully working through problems, Claude appears to be guessing—making predictions about what the user wants rather than methodically solving the actual problem at hand.

Another complaint, "Claude Code is unusable for complex engineering tasks with the Feb updates," suggests that recent changes to the model have degraded its core capabilities. Boris Cherny, head of Claude Code, responded to this issue, acknowledging the concerns but offering little in the way of concrete solutions.

Perhaps most damning is the charge of "artificial degradation, acquisition bias, and unacceptable compute throttling for paid users." If true, this suggests Anthropic has been deliberately limiting Claude's capabilities—either to manage costs or to push users toward higher-tier plans. For a company that built its brand on AI safety and transparency, such accusations strike at the heart of its identity.

The AMD Director's Data Dump

The quality concerns aren't just anecdotal. Stella Laurenzo, AMD's AI director, published a detailed analysis of Claude Code's performance degradation based on 6,852 Claude Code sessions incorporating 234,760 tool calls and 17,871 thinking blocks. Her conclusion was blunt: "Claude cannot be trusted to perform complex engineering tasks."

Laurenzo's data showed a sharp inflection point on March 8, 2026. Before that date, "stop-hook violations"—indicators of laziness, premature cessation of thinking, and permission-seeking behavior—were essentially zero. After March 8, they skyrocketed to an average of 10 per day. The timing suggests a specific update or configuration change that fundamentally altered Claude's behavior.
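The core of that analysis is a simple before/after comparison: split the daily violation counts at the cutoff date and compare the averages. A minimal sketch of that split, using made-up sample values rather than the published dataset:

```python
# Sketch of the before/after comparison described above: split daily
# stop-hook-violation counts at a cutoff date and compare the means.
# The sample values are illustrative, not Laurenzo's actual data.
from datetime import date
from statistics import mean

CUTOFF = date(2026, 3, 8)

# (day, violations) -- fabricated example values for illustration
daily = [
    (date(2026, 3, 5), 0), (date(2026, 3, 6), 0), (date(2026, 3, 7), 1),
    (date(2026, 3, 8), 8), (date(2026, 3, 9), 11), (date(2026, 3, 10), 12),
]

before = [n for d, n in daily if d < CUTOFF]
after = [n for d, n in daily if d >= CUTOFF]

print(f"mean before {CUTOFF}: {mean(before):.2f}")
print(f"mean after  {CUTOFF}: {mean(after):.2f}")
```

An inflection this sharp, from near-zero to double digits overnight, is the signature of a deployment change rather than gradual drift, which is what makes the March 8 date so suggestive.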

AMD, for its part, has switched providers. When a company with AMD's engineering resources and AI expertise decides Claude isn't reliable enough for complex tasks, it's a warning sign for the broader developer community.

The Outage Cascade

Monday's outage wasn't an isolated incident. Anthropic has been struggling with capacity constraints for months, implementing usage limits during peak hours to balance demand against available compute. The company has framed these limits as necessary trade-offs to ensure service stability, but users increasingly see them as evidence of infrastructure that can't keep pace with growth.

The April 13 outage was particularly poorly timed, coming just days after a flurry of quality complaints and during a period of heightened scrutiny of AI coding assistants. Competitors like OpenAI's Codex and GitHub Copilot are circling, eager to poach dissatisfied Claude users.

Auto-Closing Issues: Hiding the Problem?

Compounding the quality concerns is Anthropic's approach to issue management. The company's GitHub Actions script automatically closes issues after a period of inactivity, a practice that may serve to mask unresolved problems. For users filing legitimate complaints, seeing their issues closed without resolution sends a clear message: their concerns aren't being taken seriously.
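Stale-bot workflows of this kind typically apply a two-step rule: mark an issue stale after a period of inactivity, then close it if nobody responds. A minimal sketch of that logic, with the 30-day and 7-day windows as illustrative assumptions rather than Anthropic's actual configuration:

```python
# Minimal sketch of a typical stale-bot decision rule. The 30/7-day
# windows are illustrative assumptions, not Anthropic's actual settings.
from datetime import datetime, timedelta
from typing import Optional

STALE_AFTER = timedelta(days=30)  # mark stale after 30 days of inactivity
CLOSE_AFTER = timedelta(days=7)   # close 7 days after the stale label

def next_action(last_activity: datetime,
                stale_since: Optional[datetime],
                now: datetime) -> str:
    """Return the action a stale-bot would take on a single issue."""
    if stale_since is not None:
        # Already labeled stale: close once the grace period expires.
        if now - stale_since >= CLOSE_AFTER:
            return "close"
        return "wait"
    if now - last_activity >= STALE_AFTER:
        return "mark-stale"
    return "wait"

now = datetime(2026, 4, 13)
print(next_action(datetime(2026, 3, 1), None, now))                  # mark-stale
print(next_action(datetime(2026, 3, 1), datetime(2026, 4, 1), now))  # close
```

The catch, as critics note, is that the rule keys entirely on activity, not resolution: an unfixed bug that nobody has commented on in a month is indistinguishable from one that no longer matters.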

The auto-closing practice has become a flashpoint in the developer community. Critics argue that it prioritizes metrics (fewer open issues) over substance (actually fixing problems). Supporters counter that it prevents issue backlogs from becoming unmanageable. Either way, the perception among some users is that Anthropic is more interested in appearing responsive than being responsive.

The JIXEN Incident: Fact or Fiction?

Among the quality complaints is a particularly alarming claim: that "Claude autonomously deleted 35,254 production customer message records and 35,874 billing transactions belonging to a real paying customer (JIXEN)." The account behind the report has no other posting history, and The Register was unable to verify the claim with Jixen Enterprises Private Limited, an Indian-registered company.

Whether the JIXEN incident actually occurred is less important than what it represents: a growing fear that AI agents, given too much autonomy, can cause catastrophic damage. Developers have reported data loss from using Claude Code and other models, though user error is often a contributing factor. The JIXEN claim, whether true or not, taps into a genuine anxiety about AI reliability.

Benchmarks vs. Reality

There's a disconnect between Claude's benchmark performance and user experience. Data from Margin Lab shows that Claude Opus 4.6 has maintained its score on the SWE-Bench-Pro test, with assessments since February showing "some variation but no substantive change." If the benchmarks are stable, why are users reporting degradation?

The answer may lie in what benchmarks measure versus what users actually do. SWE-Bench-Pro tests isolated coding tasks under controlled conditions. Real-world engineering involves messy, interconnected systems where context, consistency, and careful reasoning matter more than raw coding speed. Claude may still be good at the former while struggling with the latter.

The Competitive Landscape

Claude's struggles come at a critical moment in the AI coding assistant market. OpenAI has been aggressively improving Codex, integrating it deeply with ChatGPT and targeting enterprise customers. GitHub Copilot, powered by OpenAI models, continues to dominate market share. And newer entrants, including open-source alternatives and specialized tools for specific languages and frameworks, are nibbling at the edges.

Anthropic's bet has always been that Claude's thoughtfulness and safety focus would differentiate it from competitors willing to cut corners. But if Claude can't deliver reliable performance, those virtues become academic. Developers need tools that work, not tools that are theoretically better.

🔥 Hot Take: The Emperor Has No Code

Here's the uncomfortable truth: Claude was never as good as the hype suggested. It was better than the alternatives at a specific moment in time, and developers—always eager for tools that make their lives easier—projected their hopes onto it.

The current "degradation" may be partly real and partly perception. As Claude has become more widely used, its limitations have become more visible. The same behaviors that seemed like careful reasoning when Claude was new now look like hesitation and confusion. The model hasn't necessarily gotten worse; our understanding of what it can and can't do has gotten better.

But that doesn't let Anthropic off the hook. The company has been opaque about changes to Claude's behavior, slow to acknowledge problems, and quick to close issues that don't fit its narrative. The auto-closing of GitHub issues is a particularly bad look—less "we're fixing things" and more "we're sweeping things under the rug."

The AMD director's data is the smoking gun. When a sophisticated engineering organization with the resources to do proper analysis concludes that Claude can't be trusted for complex tasks, that's not user error or unrealistic expectations. That's a product failing to meet its core value proposition.

Monday's outage was a reminder that Claude, for all its capabilities, is still a service run by a company with finite resources and imperfect infrastructure. The question is whether Anthropic can fix the quality issues before users migrate en masse to alternatives. The data—compiled by Claude itself—suggests the trend is going in the wrong direction.

Anthropic did not respond to a request for comment. Perhaps Claude was too busy analyzing its own decline to draft a response.
