When AI Makes Technical Debt Worse (And How to Prevent It)

There's something counterintuitive about AI coding tools and technical debt. The promise is that they speed things up. And they do! But speed without direction is how debt compounds, not how it gets paid down.
Without guardrails, AI assistants learn context from the code around them, which means they inherit the debt too. A deprecated API that appears in enough files gets used again. An inconsistent pattern gets replicated with confidence, at the speed of autocomplete. The AI doesn't introduce new debt out of nowhere; it faithfully copies and amplifies the old debt, everywhere, consistently.
The good news: this is a solvable problem. Specs are what solve it.
What Spec-Driven Development Actually Means
Before an AI assistant touches your codebase, you give it the context it needs to operate within the standards you care about. Conventions, constraints, what to avoid: written down somewhere the agent can actually read. A persistent, human-written spec that gets included as context with every interaction.
There are a few common formats. An AGENTS.md file is a popular choice, and most tools have their own variation (CLAUDE.md, GEMINI.md, .cursorrules). The format matters less than the habit. The point is that the AI has a definition of "right" and "wrong" before it starts, rather than inferring it from whatever's already there.
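As a rough sketch, a minimal spec file for debt reduction might look like this. The rules, file paths, and names below are illustrative assumptions, not a standard format:

```markdown
# Project conventions

- All HTTP calls go through the shared API client; never call `fetch` directly.
- Do not use the deprecated `legacyFetch` wrapper, even where existing files still do.
- Prefer the new composable patterns; do not copy patterns from files marked `@legacy`.
- When in doubt, match the newest code in the module, not the oldest.
```

Note the last rule: it tells the agent which part of an inconsistent codebase represents the target, which is exactly the inference it can't make on its own.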
Without this, the AI mirrors the codebase as it is. With it, the AI works toward the codebase as it should be. That shift in framing changes everything.
Specs Belong in the Same Category as Linters
Linters, formatters, and CI checks all manage code quality, but they act after the fact. They catch problems that have already been written. Specs change what gets written in the first place.
That's a significant shift. It moves quality enforcement left in the development workflow, which means fewer iterations, less context switching, and a smaller gap between intent and output. At scale, specs become a fundamental unit of programming. Not documentation, but infrastructure.
This also means specs should be treated like infrastructure. They need ownership, versioning, and a review rhythm. A spec that drifts from the codebase it governs is worse than no spec at all. It actively steers the AI in the wrong direction, and the AI will apply it confidently, every time.
Where to Start with Debt Reduction
The temptation is to spec out everything at once. Resist that. Start with the debt that actually hurts.
Think about which patterns show up most often, which add the most friction to daily work, and which slow down code review the most. The debt that scores high on all three is the best starting point. It's also the most measurable: you've already described both the current state and the target, so you can track the distance between them.
When writing specs for debt reduction, aim at patterns rather than individual files or functions. A spec that targets "replace all fetch calls using the old API wrapper with the new one" will outlast a spec that targets src/utils/api.ts.
Write at the level of intent, not implementation.
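To make that concrete, here's a hedged sketch of the before-and-after such a spec might describe. `legacyApiUrl` and `ApiClient` are hypothetical names, not anything from a real codebase:

```typescript
// Old pattern the spec wants gone: ad-hoc URL building, duplicated
// across files, easy for an AI assistant to copy (hypothetical).
function legacyApiUrl(path: string): string {
  return "https://api.example.com/v1" + path; // hard-coded base
}

// Target state: one client, one place to change the base URL (hypothetical).
class ApiClient {
  constructor(private readonly baseUrl: string) {}
  url(path: string): string {
    return new URL(path, this.baseUrl).toString();
  }
}

const client = new ApiClient("https://api.example.com/v2/");
```

A spec phrased at this level ("route all requests through the shared client") survives the file moves and renames that would break a spec pinned to a single path.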
And version them alongside the code they govern. If a refactor changes the standard, the spec should be updated in the same PR. The spec is part of the code.
Fitting Specs Into the Workflow
The goal isn't to overhaul how a team works. It's to introduce specs at the points where they add value with the least friction.
In practice, that usually means three places: local development, pull requests, and CI/CD pipelines. Locally, specs shape what the agent suggests before anything lands in a shared branch. On PRs, they define the standard that contributions are reviewed against. In CI/CD, they validate that changes still conform to the target state. The process itself doesn't change. The standards just become explicit.
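As a sketch of the CI/CD piece, a conformance check can be as small as a scan that fails the build when a banned pattern reappears. `legacyFetch` here is a hypothetical deprecated wrapper, and the whole function is an assumption about how a team might wire this up:

```typescript
// Flag files that still contain a pattern the spec has banned.
// Pass a non-global regex: a g-flagged one carries lastIndex state
// between .test() calls and can silently skip matches.
function findViolations(files: Record<string, string>, banned: RegExp): string[] {
  return Object.entries(files)
    .filter(([, source]) => banned.test(source))
    .map(([path]) => path);
}

// In a real pipeline this would read sources from disk and gate the build:
//   if (violations.length > 0) process.exit(1);
```

The point is less the mechanism than the placement: the same rule the agent reads locally is enforced automatically where it can't be forgotten.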
Individual ownership matters here too. Local specs (like an AGENTS.md file) are owned by the developer or team that uses them. They're not org-wide standards, just focused, local context. Shared specs that apply across teams need a different ownership model: someone responsible for keeping them accurate, versioned, and actually enforced. A shared repository or runbook-style structure works well for these. The mechanism matters less than the habits: one place to collaborate and iterate, clear versioning, and an audit trail of decisions.
A spec targeting a specific migration (say, moving off a deprecated package to an LTS alternative) authored once and applied consistently across every affected repository is where the investment really pays off.
Measuring Whether It's Working
Tech debt has always been hard to measure. It tends to show up indirectly: slower velocity, longer reviews, a growing reluctance to touch certain parts of the codebase (which is a red flag, by the way — that reluctance is the debt speaking).
Specs help make this concrete. If a spec targets a specific anti-pattern, you can count how often that pattern appears and watch the number drop. If it defines a target state, you can measure how much of the codebase has reached it. Even fuzzy patterns become trackable when an AI-assisted tool is scanning for them.
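One way to turn that into a trackable number, sketched under the assumption that both the old and the new pattern are greppable (the regexes are illustrative):

```typescript
// Track migration progress as the share of matches using the new pattern.
// Pass global (g-flagged) regexes so every occurrence is counted.
function migrationProgress(
  sources: string[],
  oldPattern: RegExp,
  newPattern: RegExp,
): { oldCount: number; newCount: number; percentMigrated: number } {
  let oldCount = 0;
  let newCount = 0;
  for (const src of sources) {
    oldCount += (src.match(oldPattern) ?? []).length;
    newCount += (src.match(newPattern) ?? []).length;
  }
  const total = oldCount + newCount;
  return {
    oldCount,
    newCount,
    percentMigrated: total === 0 ? 100 : Math.round((newCount / total) * 100),
  };
}
```

Run on every build, this becomes a trend line: the spec defines the target state, and this number is the distance left to cover.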
Beyond pattern tracking, the signals that tell the clearest story are time-to-refactor per module, PR review cycle length, and overall velocity — ideally comparing areas with specs applied against areas without. Let the measurements guide you.
Specs improve through usage. Don't expect them to be perfect at first write.
If results vary too much, the specs are probably too broad. If they produce consistent but very narrow refactors, they may already be outdated. Measuring is the feedback loop that keeps specs useful.
The Common Pitfalls
Most teams that struggle with spec-driven development run into the same problems.
Too broad or too narrow. A broad spec gives the AI plenty of room for interpretation — so the output is less predictable. A narrow spec is precise but breaks as soon as something changes. The sweet spot is intent plus constraints: clear enough to produce consistent results, flexible enough that it doesn't need a rewrite with every contribution.
Stale specs. An outdated spec is worse than no spec. The AI applies it confidently regardless. If a refactor makes a rule obsolete, the spec should be updated or retired in the same PR.
Skipping review. Spec-driven development doesn't mean taking your hands off the wheel. All generated code should still pass a human review. The goal isn't to remove humans from the loop — it's to have them spend their time where it actually matters.
Treating specs as one-time documents. This one is the most common. A spec gets written, then never revisited. It drifts from the codebase, but the AI keeps applying it faithfully. Specs need an ownership model, a review rhythm, and regular assessment of whether they're still delivering value.
The Mindset Shift
Spec-driven development isn't just a technical practice. It's a change in how you think about technical debt.
Debt stops being an invisible accumulation of bad decisions and becomes a measurable gap between the current state and a defined target. You can let AI close that gap consistently, one file at a time, as long as the target is written down and the agent knows to work toward it.
That's the actual value. Not just faster code, not just less repetitive work: a codebase that stays coherent as it grows because the standards, the tooling, and the contributors are all pointed in the same direction.
The foundation doesn't need to be big. Start by writing down what already exists. Capture the context that only lives in people's heads. Define the quality bar. That's enough to improve output meaningfully from day one.
From there, it compounds and amplifies good practices across your code.
Enjoyed this article? I’m available for part-time remote work helping teams with architecture, web development (Vue/Nuxt preferred), performance and advocacy.