Industry

Developers Are Refusing to Work Without AI — And the Data Says They Should Probably Stop

A major AI research lab tried to measure how much coding tools help programmers. They couldn't even run the study.

2026-05-30 By AgentBear Editorial Source: TechCrunch 11 min read
In 2026, you cannot pry AI coding tools out of developers' hands, and one research team has learned that firsthand. When researchers from METR, a prominent AI evaluation lab, tried to repeat a landmark 2025 study on developer productivity, they hit a wall. Developers simply refused to participate without their AI assistants. The experiment was dead on arrival.

This isn't a fringe observation. It's a signal, loud and flashing, that the software engineering world has crossed a dependency threshold that may be impossible to walk back. And according to the very productivity numbers these tools are supposed to improve, that dependency might be built on a foundation of self-delusion.

The Study That Never Happened

In 2025, researchers published work measuring how open source developers performed tasks by hand versus with AI assistance. The results were surprising: developers felt more productive with AI, but the stopwatch told a different story. Code was generated faster, yes, but developers spent additional time fixing errors, steering the model, and waiting for completions to finish. Net result? AI actually slowed them down.
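The stopwatch result is easier to see as plain accounting. Here is a minimal sketch with invented numbers; none of the figures below come from the study, and the `AssistedTask` type and `slowdown` function are hypothetical names used only for illustration. It shows how faster drafting can still add up to a slower task once the other activities are counted.

```python
from dataclasses import dataclass


@dataclass
class AssistedTask:
    """Minutes spent on one AI-assisted task (all values are hypothetical)."""
    drafting: float  # getting a first version of the code written
    fixing: float    # correcting errors in the generated code
    steering: float  # writing and revising prompts
    waiting: float   # idle time while completions finish

    def total(self) -> float:
        return self.drafting + self.fixing + self.steering + self.waiting


def slowdown(manual_minutes: float, assisted: AssistedTask) -> float:
    """Fractional change in task time; a positive value means AI was slower."""
    return (assisted.total() - manual_minutes) / manual_minutes


# Assumed example: a task that takes 60 minutes entirely by hand.
task = AssistedTask(drafting=15, fixing=25, steering=15, waiting=15)
print(f"assisted total: {task.total():.0f} min")
print(f"change vs. manual: {slowdown(60, task):+.0%}")
```

In this made-up case drafting takes a quarter of the manual time, which is the part a developer notices, yet the task as a whole runs longer than doing it by hand.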

METR wanted to update that study in February 2026 to see if newer models and increased familiarity had changed the equation. They never got the chance. "Most developers won't work, even on a limited number of tasks, without AI anymore," the researchers wrote. Devs weren't willing to participate "because they do not wish to work without AI" — not even for a controlled experiment with a clear scientific purpose.

Think about what that means. These are people whose job is to solve problems, to reason through complexity, to build systems from first principles. And they've reached a point where the idea of working without an autocomplete oracle feels not just inefficient — but unacceptable.

Self-Reporting vs. Reality

Blocked from running their controlled experiment, METR pivoted to a survey. In May 2026, they published self-reported productivity data from technical employees. The results were exactly what you'd expect from a population that has emotionally and professionally committed to a tool: respondents perceived that AI made them twice as valuable to their organizations.

Twice as valuable. That's a staggering claim. And it stands in direct tension with a growing body of evidence suggesting the opposite — that widespread AI coding adoption is creating hidden costs that aren't showing up in quarterly reports yet, but absolutely will.

The Tokenmaxxing Trap

Enter "tokenmaxxing," the defining workplace trend of 2026. Companies, desperate to justify their massive AI infrastructure spending, started using token consumption (roughly, how much text a developer pushes through an AI model, counted in the sub-word chunks the model reads and writes) as a proxy for productivity. More tokens = more work = more value. Simple. Seductive. And completely broken.

The cracks showed almost immediately. Amazon, one of the most data-driven companies on Earth, shut down an internal token-tracking leaderboard called Kirorank after employees figured out they could game it. The strategy was elegant in its stupidity: use AI agents excessively, run up token counts, look productive on the dashboard. The actual output? Marginal at best. The cost? Substantial. Amazon killed the program.
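The failure mode is easy to reproduce on paper. The sketch below is not a model of Amazon's leaderboard, whose internals the article does not describe. It is an assumed toy team with invented numbers, ranked once by tokens consumed and once by a hypothetical outcome measure (`changes_shipped`).

```python
from dataclasses import dataclass


@dataclass
class Developer:
    name: str
    tokens_used: int      # tokens consumed through AI tools
    changes_shipped: int  # hypothetical outcome measure, e.g. merged changes


def rank_by_tokens(devs):
    """Leaderboard order when token consumption is the score."""
    return [d.name for d in sorted(devs, key=lambda d: d.tokens_used, reverse=True)]


def rank_by_outcomes(devs):
    """Leaderboard order when shipped work is the score."""
    return [d.name for d in sorted(devs, key=lambda d: d.changes_shipped, reverse=True)]


# Invented data: "gamer" runs agents heavily but ships little.
team = [
    Developer("steady", tokens_used=2_000_000, changes_shipped=12),
    Developer("gamer", tokens_used=40_000_000, changes_shipped=3),
    Developer("careful", tokens_used=500_000, changes_shipped=9),
]

print(rank_by_tokens(team))    # ['gamer', 'steady', 'careful']
print(rank_by_outcomes(team))  # ['steady', 'careful', 'gamer']
```

The two orderings disagree most about the person who spends the most. An outcome count can be gamed too, so this is an illustration of why a consumption metric misleads, not a recommendation for a replacement metric.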

Uber's story is even starker. The company blew through its entire 2026 AI budget within the first four months of the year. Andrew Macdonald, Uber's COO, recently admitted on a podcast that this spending "hadn't led to a measurable increase in projects or productivity." That's not a line you deliver if the numbers are working in your favor. That's a concession.

The Maintenance Debt Bomb

Here's where it gets genuinely worrying. Even if AI helps developers write code faster in the moment, what happens to that code over months and years? Software engineering isn't a sprint — it's a marathon of maintenance, debugging, refactoring, and extension. Code that ships fast but breaks often isn't an asset. It's a liability with compounding interest.

James Shore, a respected programmer and author, made this case in a blog post that went viral on Hacker News. "You write code twice as quick now? Better hope you've halved your maintenance costs," he wrote. "Otherwise, you're screwed. You're trading a temporary speed boost for permanent indenture."

That isn't abstract philosophy. It's arithmetic. If AI-generated code requires more fixes, more reviews, more architectural untangling over its lifetime, the upfront velocity gain evaporates — and then some.
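One way to run that arithmetic is a small simulation. This is a sketch under stated assumptions, not Shore's own model: it assumes a team with a fixed amount of time per period, maintenance cost proportional to the amount of code already shipped, and hypothetical parameter names (`speed`, `upkeep_per_unit`) with invented values.

```python
def free_time_by_period(speed: float, upkeep_per_unit: float, periods: int) -> list[float]:
    """Share of team time left for new work in each period.

    Assumptions (illustrative only):
      - the team has 1.0 unit of time per period;
      - every unit of existing code costs `upkeep_per_unit` time per period;
      - whatever time is left writes new code at `speed` units per unit time.
    """
    codebase = 0.0
    history = []
    for _ in range(periods):
        maintenance = min(1.0, codebase * upkeep_per_unit)
        free = 1.0 - maintenance
        history.append(free)
        codebase += free * speed  # new code adds to future maintenance
    return history


baseline = free_time_by_period(speed=1.0, upkeep_per_unit=0.10, periods=5)
twice_as_fast = free_time_by_period(speed=2.0, upkeep_per_unit=0.10, periods=5)
fast_and_halved = free_time_by_period(speed=2.0, upkeep_per_unit=0.05, periods=5)

for label, series in [
    ("baseline", baseline),
    ("2x speed, same upkeep", twice_as_fast),
    ("2x speed, half upkeep", fast_and_halved),
]:
    print(f"{label:>22}: " + "  ".join(f"{share:.2f}" for share in series))
```

Under these assumptions, doubling speed with unchanged upkeep drains free time faster than the baseline, while doubling speed and halving upkeep tracks the baseline exactly. If AI-written code costs more to maintain per unit rather than less, the curve falls faster still.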

The data supports this pessimistic view. Aiswarya Sankar, founder of reliability engineering startup Entelligence AI, posted a widely circulated observation: companies are spending 44% of their AI tokens on bug fixes for code that AI itself generated. Nearly half of the AI budget is being consumed cleaning up after the AI. That's not productivity. That's a snake eating its own tail.

CodeRabbit, a code review tool company, analyzed open source pull requests and found that AI-produced code contained 1.7 times more problems than human-written code. Now, CodeRabbit has a vested interest in this narrative — they sell AI code review tools. But independent researchers have reached similar conclusions. A team from Singapore Management University published a report in April 2026 warning that "AI-generated code can introduce long-term maintenance costs into real software projects." The problems aren't theoretical. They're accumulating in repositories right now.

Why Developers Can't Let Go

So if the data is this mixed — and increasingly negative — why the religious attachment to AI coding tools? The answer is part psychological, part structural, and entirely predictable.

First, the experience of using these tools is genuinely pleasant in the short term. The dopamine hit of watching a model autocomplete a complex function, suggest a clever refactor, or generate boilerplate in seconds is real. It feels like superpowers. And in some narrow contexts — writing repetitive glue code, generating test cases, scaffolding CRUD operations — it genuinely helps.

Second, competitive pressure. When your peers are using AI and you're not, you look slow. When your manager sees token dashboards and speed metrics, you need to keep up. The social and professional cost of opting out is steep.

Third, and most importantly: AI tools have become crutches for cognitive offloading in ways that degrade the very skills they augment. The METR study failure is the canary in this coal mine. These aren't lazy developers. They're professionals who have so deeply integrated AI into their workflow that removing it feels like removing an arm. That level of dependency should concern anyone who cares about engineering quality.

What the AI Vendors Say

Unsurprisingly, the companies selling AI coding tools have a ready answer: more AI. Cognition, maker of the AI coding agent Devin, argues that developers should simply use AI agents to fix the code that other AI systems generate. Founder Scott Wu suggests this is the natural evolution — AI writes, AI reviews, AI debugs, humans supervise.

But even Wu acknowledges current limitations. He rates Devin's skill level as "somewhere between a junior and a mid-level engineer" depending on the task. This is not a "hand it off and forget it" solution. It's a more sophisticated form of the same human oversight that was already required — just with more automation in the loop.

The question isn't whether AI coding tools will improve. They will. The question is whether the current generation of tools, used in the current generation of workflows, is creating more technical debt than it resolves. And the answer, increasingly, appears to be yes.

The Harder Path Forward

The Singapore Management University researchers suggest a more grounded approach than either blind adoption or total rejection. Programmers, they argue, need to understand AI capabilities and limitations as deeply as they understand their programming languages. They need quality assurance systems specifically designed for AI-generated code. And they need to review AI output with the same rigor they'd apply to a junior developer's first pull request.

More importantly, humans need to retain ownership of architecture, security design, and high-level system thinking. These are the domains where context, judgment, and creativity matter — and where current AI tools are weakest. The temptation to delegate everything to the model is strong, especially under deadline pressure. But it's a trap.

The Bigger Picture

This story isn't really about coding. It's about a pattern that repeats across every industry touched by AI: the gap between perceived productivity and measured outcomes, the seductive metrics that don't map to real value, the organizational pressure to adopt tools before their costs are understood.

Software engineering was supposed to be the field best positioned to evaluate AI tools rationally. These are people who understand complexity, who debug systems for a living, who can read research papers and evaluate claims. If they can't resist the hype cycle — if they become so dependent on AI that they can't even participate in a study without it — what chance do other industries have?

The METR study failure is a warning. Not about AI as a technology, but about human organizations' capacity for self-deception when incentives and metrics align to reward activity over outcomes. Tokenmaxxing. Speed without quality. Dashboard productivity. These aren't AI problems. They're management problems, cultural problems, measurement problems. AI just accelerated them.

The developers who refused to work without AI weren't being irrational. They were being human — responding to the incentives and tools available to them. The real question is whether the organizations employing them, and the industry shaping those incentives, can course-correct before the maintenance debt comes due.

Because it's coming due. The only question is when, and how much it will cost to pay.
