The Craft of AI

The AI Coding Maturity Curve

Addressing AI Slop Debt: to coding harnesses and beyond.

By Luke Lin · May 20, 2026 · 7 min read

I joined SoFi's data team in 2020, inheriting a Monday morning PagerDuty report with 70 disruptions on it from the week prior. Most of them traced back to the same root cause: critical user-facing features and operations running on SageMaker notebooks that nobody owned. We spent the next three years migrating that work into proper, observable, owned pipelines, with plenty of growing pains in between.

Two weeks ago a founder told me his team's exuberance about agentic coding had flipped into paranoia for the same reason. They'd unleashed agentic coding across their team and gotten real initial gains, with micro apps spinning up to run the business better.

The problem was that nobody knew how many of these vibe-coded apps existed, who was maintaining them, whether they followed any sound engineering or product practices, or how they were handling data and security. Different decade, different tool, same movie.

This is what the next three years are about to look like for every team that handed Claude Code and Lovable to its engineers without making taste and rigor first-class citizens.

If you're shipping software on raw Claude Code or Lovable for anything beyond a prototype, you've already lost the next twelve months. You just haven't seen the bill yet.

It's like giving a child as much candy as they want, then facing the dentist telling you there are seven cavities to fill*.

*This actually happened to me when I was 8. It was not fun.

You're Not Preventing AI Slop Debt. You're Industrializing It.

Every team using agentic coding is accruing AI Slop Debt: the future bugs and product rework that come due because you let an agent's default reasoning drive your build.

According to Carnegie Mellon, AI-generated code accumulates technical debt 3× faster than human-written code, because LLMs embed unstated assumptions invisible to code review, compounded by 3–4× higher code volume.

The slop-as-debt thesis isn't new. TechTarget, Pragmatic Engineer, Addy Osmani, and a dozen Medium essays a week have been making versions of it since Q1.

Debt isn't the only failure mode. Vibe coding on platforms like Lovable means you're also relying on their governance, and their governance has already been breached.

In April 2026, CVE-2025-48757 exposed missing Row Level Security policies in Lovable-generated projects. Any free Lovable account could read another user's source code, database credentials, AI chat history, and customer data across every project created before November 2025.

Two months earlier, 16 vulnerabilities in a single Lovable-hosted app leaked 18,000 users' data.

The Agentic Coding Maturity Curve

None of this means you should keep AI coding off your team. If you want to compete on velocity right now, you have to be building with AI.

On high-AI-adoption teams, developers complete 21% more tasks and merge 98% more pull requests than developers on teams without that adoption.

Falling behind on agentic coding isn't a choice anymore.

But there's a maturity curve your team has to climb if you want the velocity gains without the slop.

The Coding Agent Maturity Curve: Level 1 Replit/Lovable for non-technical builders, Level 2 raw coding agents, Level 3 coding harness, Level 4 autonomous agents with product sense. AI Slop Debt accumulates through Level 2, then drops once a harness manages it.

Level 1: Replit and Lovable for fully non-technical builders

This is where non-engineers prototype new UIs, proofs of concept, and operational tools. It empowers people who don't write code to participate in the build.

Level 1 stops being enough the moment sensitive data starts flowing in, or the moment you have something with real traction that you actually need to maintain.

The recommended pattern from AI-native dev teams is to use Lovable for the prototype, then bring in technical leadership to rebuild before the data and dependencies compound.

Level 2: Raw coding agents (Claude Code, Codex)

This is where most engineers started with agentic coding. Install the agent and start ripping out PRs.

The first few are pure magic, especially when the agent finds a bug that would have taken you two hours to chase down on your own. The trouble starts when you scale past one-off fixes.

Either you write specs so detailed you might as well have written the code yourself, or you let the agent invent design patterns that don't match your codebase and accrue AI Slop Debt at every commit.

Level 3: Coding harness (gstack, superpowers)

In early 2026, coding harnesses became the next standard. The two that have risen to the top are gstack and superpowers.

A harness is a deliberately designed set of skills that puts a real software development process around the agent, with rigor, checks, and reviews at every step. It enforces engineering practices like RLS policy checks and race-condition detection, automates QA with Playwright, and infuses product taste through upfront planning before the agent makes assumptions about what to build.

How a coding harness helps: gstack and superpowers mapped across the build lifecycle from idea and problem through engineering plan, UX and design, code and review, to QA and ship, showing where each enforces product taste and engineering taste.
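To make "RLS policy checks" concrete, here is a minimal sketch of the kind of check a harness could run before a migration ships. It is not how gstack or superpowers implement theirs. The helper name `tables_missing_rls` and the regex approach are assumptions for illustration, and it only understands plain Postgres-style `CREATE TABLE` and `ALTER TABLE ... ENABLE ROW LEVEL SECURITY` statements.

```python
import re

# Hypothetical helper, not part of any harness: flags tables that a migration
# creates without ever enabling Postgres Row Level Security on them.
CREATE_RE = re.compile(
    r"create\s+table\s+(?:if\s+not\s+exists\s+)?([\w.\"]+)",
    re.IGNORECASE,
)
ENABLE_RE = re.compile(
    r"alter\s+table\s+(?:if\s+exists\s+)?(?:only\s+)?([\w.\"]+)"
    r"\s+enable\s+row\s+level\s+security",
    re.IGNORECASE,
)


def _normalize(name: str) -> str:
    # Drop quotes and any schema prefix so "public"."orders" matches orders.
    return name.replace('"', "").split(".")[-1].lower()


def tables_missing_rls(sql: str) -> list[str]:
    """Return created tables that never get RLS enabled in the same SQL text."""
    created = [_normalize(m) for m in CREATE_RE.findall(sql)]
    enabled = {_normalize(m) for m in ENABLE_RE.findall(sql)}
    return sorted({t for t in created if t not in enabled})


migration = """
CREATE TABLE public.profiles (id uuid primary key, email text);
ALTER TABLE public.profiles ENABLE ROW LEVEL SECURITY;
CREATE TABLE public.orders (id uuid primary key, user_id uuid);
"""

print(tables_missing_rls(migration))  # prints ['orders']
```

A check like this only proves RLS is switched on. It says nothing about whether the policies themselves are correct, which still needs review.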

I've been using gstack religiously for the last two months. It's let me ship a production app with confidence. I'm a technical PM, not an engineer. I shared the Baohua codebase with two senior engineers I trust, and both signed off on the code quality.

If you're trying to build production software with AI today and you're not using a coding harness, you're shipping AI slop.

What's Still Missing From Coding Harnesses

Once you've shipped enough code on a harness, the next gap becomes obvious. My cofounder and I are building modastack, a coding harness that pushes past today's best practices, using our own builds and our clients as the guiding light. Two areas are getting most of our attention.

Stronger product sense, baked into the build

The /office-hours skill in gstack does a solid job of applying Brian Balfour's frameworks in spirit, but I've seen too many sloppy PRDs come out of it. I added an adversarial product reviewer, /office-helper, that tests every assumption against first principles. With the same prompt and the same model, the gap in output quality with and without the reviewer is stark.

Adversarial review for better product taste: a vague, overly broad problem statement without review, versus a sharp, prioritized product spec — who, context, struggle, cost — produced with adversarial product review.

I also added /frontdoor, a skill that interprets every incoming request and dispatches it to the right place, so the agent doesn't burn cycles solving the wrong problem.

Finally, I added /brand-identity. It pushes back on uninspired SaaS vanilla by forcing the builder to articulate the product's visual point of view before UI gets generated.
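The dispatch idea behind /frontdoor can be sketched as a small router. This is a rule-based stand-in, not the actual skill, which presumably leans on the model rather than keywords. The routing table, the `bugfix` and `clarify` route names, and the choice of which skill handles which request are all assumptions for illustration.

```python
from dataclasses import dataclass

# Hypothetical routing table: destination -> keyword hints.
# "bugfix" is a placeholder name, and mapping feature ideas to office-hours
# or visual requests to brand-identity is a guess, not documented behavior.
ROUTES = [
    ("bugfix", ("error", "crash", "broken", "exception", "fails")),
    ("office-hours", ("feature", "idea", "should we", "users want")),
    ("brand-identity", ("look", "design", "color", "logo")),
]
DEFAULT_ROUTE = "clarify"  # ask a follow-up question instead of guessing


@dataclass
class Dispatch:
    route: str
    matched: tuple[str, ...]


def frontdoor(request: str) -> Dispatch:
    """Pick the route whose hints appear most often in the request."""
    text = request.lower()
    best = Dispatch(DEFAULT_ROUTE, ())
    for route, hints in ROUTES:
        hits = tuple(h for h in hints if h in text)
        # Keep the route with the most keyword hits; earlier routes win ties.
        if len(hits) > len(best.matched):
            best = Dispatch(route, hits)
    return best


print(frontdoor("Checkout page fails with an exception on submit").route)
print(frontdoor("Can you help?").route)
```

The part worth keeping from this sketch is the default: when nothing matches, the router asks for clarification rather than sending the agent off to solve a guessed problem.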

Better access, asynchronous by default

Today the harness works great when I'm in a terminal or driving Claude through remote control. We want to push past that.

Imagine an incoming bug that automatically creates a ticket in Linear, which spins up an agent following the modastack protocols, figures out the fix, tests it, and opens a PR. You get updated in Slack along the way. You review the PR at the end, or you hit approve and move on with your day.

That's where agentic coding goes once your foundations reinforce both rigor and taste. That's the next move: software factories dispatched through chat.
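The bug-to-PR flow described above can be sketched as a pipeline with a human checkpoint at the end. Nothing here calls Linear, Slack, or a Git host. Every step is an injected function, and the names `run_pipeline`, `Job`, and the stub values are hypothetical, so treat it as an outline of the control flow rather than modastack's design.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Job:
    bug_report: str
    ticket_id: str = ""
    patch: str = ""
    tests_passed: bool = False
    pr_ref: str = ""
    updates: list[str] = field(default_factory=list)


def run_pipeline(
    bug_report: str,
    create_ticket: Callable[[str], str],   # e.g. a ticketing integration
    propose_fix: Callable[[str], str],     # the agent, following harness rules
    run_tests: Callable[[str], bool],
    open_pr: Callable[[str, str], str],
    notify: Callable[[str], None],         # e.g. a chat integration
) -> Job:
    job = Job(bug_report)

    def update(message: str) -> None:
        # Record every status message and push it to the human as it happens.
        job.updates.append(message)
        notify(message)

    job.ticket_id = create_ticket(bug_report)
    update(f"ticket {job.ticket_id} created")

    job.patch = propose_fix(bug_report)
    update("fix proposed")

    job.tests_passed = run_tests(job.patch)
    if not job.tests_passed:
        # Stop before opening a PR; a failing fix goes back to a person.
        update("tests failed, handing back to a human")
        return job

    job.pr_ref = open_pr(job.ticket_id, job.patch)
    update(f"PR ready for review: {job.pr_ref}")
    return job


# Stubbed run: every integration is replaced by a placeholder function.
job = run_pipeline(
    "Cart total is wrong when the cart is empty",
    create_ticket=lambda report: "BUG-1",
    propose_fix=lambda report: "patch: guard against empty cart",
    run_tests=lambda patch: True,
    open_pr=lambda ticket, patch: f"pr-for-{ticket}",
    notify=print,
)
```

The pipeline never merges anything itself. It ends at "PR ready for review", which keeps the approve-or-review decision with a person.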

What This Changes On Monday

Last week I wrote about the practices a human operator needs to add to their build stack to overcome AI's mediocre defaults. That post is still the foundation.

Since then I've watched founders try to scale that discipline across a team. The next problem is brutal. Discipline that lives only in the human's head doesn't transfer. The senior engineer who knows when to push back on the agent leaves. The founder with the product taste to refuse the hallucinated AI feature gets pulled into fundraising. The discipline goes with them.

The only way to build without sacrificing quality is to transfer both your product taste and your engineering taste into a harness. That's the move.

If you're shipping solo today, here's a Monday-morning move that costs nothing. Write down the three product decisions you made this week that your agent would not have made on its own, and use them to instruct your coding harness. That list is the seed of your taste layer. It's what your harness, whether you build one or buy one, needs to learn.

If you're running a team, do the same with your best engineers. Get the taste out of their heads and into the system so the entire team uplevels. That's how you prevent the SoFi PagerDuty Monday three months from now.
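One way to capture that list so it can be handed to a harness is sketched below. The `TasteDecision` structure, the rendered format, and the example decision are all made up for illustration. Where the resulting text goes (a rules file, a skill prompt, something else) depends on the harness you use.

```python
from dataclasses import dataclass


@dataclass
class TasteDecision:
    decision: str       # what you chose
    agent_default: str  # what the agent would have done on its own
    reason: str         # why your choice is right for this product


def render_taste_rules(decisions: list[TasteDecision]) -> str:
    """Turn recorded decisions into plain-text instructions for an agent."""
    lines = ["Product taste rules (follow these over your defaults):"]
    for i, d in enumerate(decisions, start=1):
        lines.append(f"{i}. Do: {d.decision}")
        lines.append(f"   Not: {d.agent_default}")
        lines.append(f"   Why: {d.reason}")
    return "\n".join(lines)


# Hypothetical example entry; replace with your own decisions from this week.
rules = render_taste_rules([
    TasteDecision(
        decision="Ship onboarding as a single screen",
        agent_default="A five-step wizard with a progress bar",
        reason="New users drop off when setup feels like a form",
    ),
])
print(rules)
```

Recording the agent's default next to your decision matters, because the contrast is what tells the harness which instinct to override.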

We're building the harness that does this at Moda Labs. More on that soon.

Your slop debt is accruing whether you've seen the invoice or not. The founders, CTOs, and builders who survive 2026 will be the ones who opened the envelope early.

Luke Lin

Co-founder & CEO, Moda Labs

Originally published on The Craft of AI.
