What AI-built apps actually break on after the demo, and how to tell the holes that compound from the holes that can wait.
There is a moment almost every team building with an AI assistant runs into. The first weeks feel like magic: you describe a feature, the code appears, the demo works, people are impressed. Then you hit a wall. The wall is not the AI getting worse. It is the point where the work stops being snippets and starts being architecture: module boundaries, where state lives, how the data model evolves, what happens when two integrations fail at once.
The holes that sink AI-built apps come from a missing steward, not from a model that cannot code. And the holes that show up after the demo come in two different kinds that usually get lumped together. Tell them apart and you know what to fix first. Confuse them and you either panic over ordinary gaps or ignore real debt.
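When an AI-assisted MVP starts to strain, the symptoms are familiar: everything routed through one file, state scattered everywhere, god services, an ad-hoc schema, thin tests, and happy-path integrations with no retries or idempotency.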
None of this means the AI did something wrong. It is the predictable shape of a codebase whose structure emerged from a sequence of uncoordinated good answers, rather than from a person holding the whole product in their head. The framework became the architecture by default.
Split that list in two, because the items mean opposite things about the work.
The first group is structural: everything through one file, state everywhere, god services, ad-hoc schema, the general big ball of mud. These are the signature of an absent steward. Structural holes compound. Each new feature makes the next one harder, until a rewrite looks cheaper than another change.
The second group is resilience: thin tests, happy-path integrations, missing retries and idempotency. These are the signature of a product that has not met production scale yet. Every early product has them, including ones built by an expensive senior team. They are bounded, known, and schedulable.
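To make "retries and idempotency" concrete, here is a minimal sketch of what closing that gap can look like. Everything in it is an assumption for illustration: `FakePaymentProvider`, `TransientError`, and `call_with_retry` are hypothetical names, not a real provider's API. The point is the pairing: a retry is only safe when the same idempotency key travels with every attempt.

```python
import time
import uuid
from typing import Callable, Dict


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying (timeouts, 5xx)."""


class FakePaymentProvider:
    """Stand-in for a third-party API; remembers results per idempotency key."""

    def __init__(self, failures_before_success: int) -> None:
        self.failures_left = failures_before_success
        self.charges: Dict[str, int] = {}

    def charge(self, idempotency_key: str, amount_cents: int) -> int:
        # A repeated key returns the stored result instead of charging again.
        if idempotency_key in self.charges:
            return self.charges[idempotency_key]
        self.charges[idempotency_key] = amount_cents
        if self.failures_left > 0:
            # Simulate the worst case: the charge landed but the response was lost.
            self.failures_left -= 1
            raise TransientError("response lost")
        return amount_cents


def call_with_retry(
    operation: Callable[[], int], attempts: int = 4, base_delay: float = 0.01
) -> int:
    """Retry a callable on TransientError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("attempts must be at least 1")


provider = FakePaymentProvider(failures_before_success=2)
key = str(uuid.uuid4())  # one key per logical charge, reused on every retry
result = call_with_retry(lambda: provider.charge(key, 1999))
```

Without the key, the retry in this sketch would charge the customer twice. That is why the two items belong on the same line of the list, and why this kind of gap is bounded: it can be found, named, and scheduled per integration.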
The missing ingredient in the broken codebases is not model capability. It is a human mental model of the product and the domain that every AI answer has to respect. Architecture should be driven by the domain (billing, auth, content, the actual nouns of your business), not by whatever the last answer suggested. The framework is the delivery mechanism, not the design.
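Concretely, the structure that prevents the structural holes has four parts: module boundaries set up front, one source of truth for state, services scoped to domains, and migrations versioned from the first one.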
This is the part SDS exists to do. When we build with AI, and we build with it every day, the model writes code inside a structure we own. We set the boundaries first and keep them. We keep state to one source of truth. We scope services to domains. We version migrations from the first one. We write tests on the paths where money moves or a person gets the wrong result. We treat each integration as part of a system that has to survive a retry and a failure, not just a happy-path demo. Every project passes our own internal multi-dimensional review before handoff, with the findings written down so you can see exactly what was checked.
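As one illustration of "version migrations from the first one", here is a minimal sketch using Python's built-in `sqlite3` module. The table names, the `MIGRATIONS` list, and the `schema_migrations` bookkeeping table are assumptions made up for this example, not a description of any particular project's tooling. Most teams would reach for an established migration tool rather than hand-roll this.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical migrations: (version, SQL). A version is assigned once and never reused.
MIGRATIONS: List[Tuple[int, str]] = [
    (1, "CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"),
    (2, "CREATE TABLE invoices (id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL, amount_cents INTEGER NOT NULL)"),
    (3, "ALTER TABLE invoices ADD COLUMN paid_at TEXT"),
]


def migrate(conn: sqlite3.Connection) -> List[int]:
    """Apply pending migrations in version order; return the versions applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version INTEGER PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    applied: List[int] = []
    for version, sql in sorted(MIGRATIONS):
        if version in done:
            continue  # already applied on an earlier run
        conn.execute(sql)
        # Record the version so the next run skips this migration.
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)   # applies every migration in order
second_run = migrate(conn)  # nothing left to apply
```

The useful property is that the schema's history is a single ordered list anyone can read. An ad-hoc schema is the opposite: changes that exist only in whichever database someone last edited by hand.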
Here is the part most firms would never put in writing. We recently ran this exact six-failure-mode lens across our own portfolio. The structural holes came back clean: real module boundaries, coherent state, focused services, ordered migrations. The gaps we found were the resilience kind, the ordinary debt every pre-revenue product carries, and the work to close them is already scheduled. We say that because a firm that hides its own gaps cannot be trusted to find yours.
The compounding payoff is real. Because the structure is deliberate, the second feature is cheaper than the first, and the second product launches faster than the first. That is not a heroic story. It is the predictable outcome of architecture someone is actually stewarding.
This is not an argument that AI coding is bad. The opposite. AI is the fastest set of hands we have ever worked with, and the gains are enormous when there is a steward directing it. Nor is it a claim that we ship zero gaps; we ship known, named, scheduled gaps, which is different from hidden ones. And it is not a one-size-fits-all refactor prescription. The right path out of the holes depends on your domain. The wrong path is another round of snippets.
The holes that sink AI-built apps come from missing a steward, not a smarter model: separate the structural holes that compound from the resilience holes every young product has, and you know what to fix first.
If you have an AI-built app you are worried about, send it to us. In one call we will map it: which holes are structural and compounding, which are the ordinary resilience gaps, and what to fix first. A useful hour whether or not we end up working together.