
AI Code Tsunami Hits the QA Dam: The End of Balanced Velocity

30.3.2026 | 8 minutes reading time

Note upfront: This article is specifically aimed at teams working on the modernization and further development of existing systems, not at greenfield projects where completely different rules apply.

Everyone is talking about the massive productivity boost from Artificial Intelligence in software development. Tools like GitHub Copilot and other AI assistants promise that we can implement features in a fraction of the time it used to take. And indeed, code generation has become incredibly fast.

But if we look at the reality of existing software projects and established teams, a different picture often emerges. Developers are faster, but the business value doesn't reach the end customer any quicker. Why is that? We are falling prey to an "AI Speed Illusion". We have forgotten that writing code is only one part of the equation.

Of course, software development has countless variables. When looking at delivery speed, however, four sectors are essential: Requirements, Development, Quality Assurance (QA), and Delivery. An isolated speed boost in just one of these areas does not make the overall system faster.

Often, the bottleneck is right at the beginning, when clarified requirements merely trickle into development like a small stream. But even if the requirements flow and the AI turns this stream into a code tsunami in record time, the next bottleneck awaits right before delivery. In this article, we want to focus on exactly this part of the equation: QA.

The Decade of Equilibrium

To understand the problem, we need to look at how existing teams work away from the greenfield. Many of these projects have been running smoothly for years or even decades. Over time, a natural equilibrium has established itself within these teams: a balanced velocity across all four sectors – from requirements to delivery.

Developers wrote code at a certain pace. The test infrastructure and the often manual QA stages were tuned to validate this output in a timely manner. It was a rhythm that worked: dev speed and QA approval were in perfect balance.

The Tsunami and the Dam

And then came AI.

Within a very short time, the speed at which code is produced has multiplied. (Just how extreme this speed can be in a greenfield scenario is shown in our experiment: A complete PoC in 5 minutes with AI-Assisted Coding). The historical equilibrium was washed away. Development is now generating a veritable code tsunami.

The problem for existing teams is that the processes, the test infrastructure, and the people in QA have remained exactly the same. The tsunami crashes unhindered against the existing QA dam. If test runs for a feature take hours, or if manual test stages performed by QA testers remain in place, the code piles up. The features might be "Code Complete", but they are not in production. The actual added value for the business does not increase. It remains stuck in the QA bottleneck.

Important to understand: This is not a failure of QA! It is simply a system that was optimized for a completely different speed and for project situations of the past.

The Boomerang Effect: Why AI Code Without Fast Feedback Becomes Instant Legacy Code

This traffic jam in front of QA doesn't just impact time-to-market; it inevitably hits back at the development team. Because eventually, the code gets tested. And when bugs are found, the ticket lands back on the developer's desk. The problem here is that, due to the bottleneck, this feedback often arrives hours or days later.

Now, one might say: "Context switching and waiting for QA have always been exhausting and expensive." That's true. But when we have to fix a bug in AI-generated code, a new, dangerous dimension is added: the lack of mental anchoring.

When a developer types out a complex feature entirely on their own, they build a deep mental model of the logic, line by line. If this code fails in manual QA three days later, they remember their train of thought. However, when we build code with the help of AI, we often switch to the role of a reviewer. We accept blocks of code. The mental model we build in the process is significantly shallower.

If a test run now takes hours or QA feedback takes days, and we have to fix a subtle bug, a tedious reverse-engineering process begins for code that, strictly speaking, we never really wrote ourselves. The cognitive load is enormous. Without immediate feedback, we are essentially creating Instant Legacy Code – code that has just been generated, but which no one on the team understands in detail anymore. We are experiencing software entropy on steroids: the creeping decay of the codebase and the build-up of complexity, which used to take months, suddenly happens in a few weeks with AI-generated code. Fast and automated feedback in minutes is therefore the only way to catch bugs while this shallow AI context is still fresh in the developer's mind.

The Test Pyramid is More Important Than Ever: When is a Test Suite AI-Ready?

To truly practice AI-Powered Development in legacy projects, there is only one way out. We don't have to reinvent the test pyramid, but we have to take it more seriously than ever before. The classic test pyramid never went away, but in the age of AI, it is elevated from an important architectural concept to an absolute survival strategy.

Only teams that work without bottlenecks can put the AI's horsepower on the road. But what does that mean in concrete terms? A test suite is only truly AI-ready if it meets the following criteria:

  • Blazing Fast Base: Unit and component tests finish in under 5 minutes. They are the first and most important filter for AI-generated code.
  • Fast Feature Cycle: Feedback at the feature level arrives within a single focused working block, i.e. in under 30 minutes.
  • Trustful Test Suite: It's not enough that "the code compiles". Developers must be able to trust it blindly. A green test run means the entire feature and application work flawlessly. Without this trust, developers will request manual re-tests at every step as a safety net.
  • Readable Error Messages: Error messages and log messages must be as clear as possible and provide concrete hints about the error. A generic "End-to-End test red, good luck" helps neither the developer nor the AI. The AI can only provide real value in automated debugging if the logs are precise.
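What a "readable error message" looks like in practice is best shown in code. The following is a minimal sketch; the function `calculate_discount` and its business rule are purely illustrative, not taken from a real project:

```python
# Hypothetical example: a unit test whose failure output tells the
# developer (and the AI) exactly what went wrong, instead of a bare
# "assertion failed".

def calculate_discount(order_total: float, is_loyalty_member: bool) -> float:
    """Toy business rule: loyalty members get 10% off orders over 100."""
    if is_loyalty_member and order_total > 100:
        return round(order_total * 0.10, 2)
    return 0.0

def test_loyalty_discount_applied_above_threshold():
    discount = calculate_discount(order_total=150.0, is_loyalty_member=True)
    # The message carries the business context a debugging session needs.
    assert discount == 15.0, (
        f"Loyalty member with order 150.00 should get a 10% discount (15.00), "
        f"but got {discount}"
    )

def test_no_discount_for_non_members():
    discount = calculate_discount(order_total=150.0, is_loyalty_member=False)
    assert discount == 0.0, f"Non-members get no discount, but got {discount}"
```

A test like this runs in milliseconds and, when it fails, its message alone is often enough context for the AI to propose a correct fix without a human reverse-engineering session.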

AI Writing Our Tests? A Reality Check and the TDD Comeback

Now one might argue: "Then let's just have the AI write the tests too!"

Test generation has undoubtedly improved a lot. However, practice in grown systems shows that the more integrative and complex the dependencies become, the more the AI reaches its limits. (If you want to know exactly where the current limits of AI assistants lie in everyday development, I recommend our article Bugs, Refactoring, Tests: Where Chatbots Shine in Coding and Where They Fail).

Here, a surprising phenomenon reveals itself in practice: The return to Test-Driven Development (TDD), but in the new guise of AI. It is much more effective to define well-thought-out, sharp tests upfront. These tests provide the AI with the exact guardrails, business context, and edge cases it needs. The AI generates significantly better production code from pre-written, strong tests than when asked to invent tests retroactively for its own production code. Tests become the most powerful prompt a developer can have.
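To make the "tests as the most powerful prompt" idea concrete, here is a minimal sketch of the workflow. The function `parse_price` and its German price-format rules are hypothetical examples, chosen only to show how pre-written tests carry business context and edge cases:

```python
# A minimal sketch of "tests as the prompt": the developer writes the
# specification first; the AI's only job is to make it green.

# --- Specification, written by the developer BEFORE any production code ---

def spec_parse_price():
    # Edge cases encode the business context the AI would otherwise miss:
    assert parse_price("19,99 EUR") == 1999        # German decimal comma -> cents
    assert parse_price("1.299,00 EUR") == 129900   # thousands separator
    assert parse_price("0,00 EUR") == 0
    try:
        parse_price("-5,00 EUR")
        raise AssertionError("negative amounts must be rejected")
    except ValueError:
        pass

# --- Production code, generated afterwards against the spec above ---

def parse_price(raw: str) -> int:
    """Parse a German-formatted EUR price string into integer cents."""
    amount = raw.removesuffix(" EUR").replace(".", "").replace(",", ".")
    cents = round(float(amount) * 100)
    if cents < 0:
        raise ValueError(f"Negative amounts are not allowed: {raw!r}")
    return cents

spec_parse_price()  # green means the generated code honors the spec
```

Note how much implicit knowledge the four assertions transport: decimal commas, thousands separators, cents as integers, and the rejection of negative amounts. A prose prompt would have to spell all of this out and could still be misread.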

How well this principle works in practice and how to force the AI to write clean code against hard test specifications is shown in the excellent codecentric blog post No Cheating Allowed: Isolated Specification Tests with Claude Code.

Tests as Living Memory and Indispensable Safety Net

Another crucial and often overlooked point is that tests are not just pure quality assurance. They act as an external memory for context that the AI would otherwise lose or simply never know.

This also shows why regression tests are so extremely important in the AI era. When the AI refactors existing code or integrates new features into a grown system in the future, comprehensive regression tests are the only safety net ensuring that historically, painstakingly developed functions do not break unnoticed. If agreed-upon behavior or complex edge cases are encoded in acceptance or unit tests, the AI will have much easier access to them than to textual descriptions gathering dust in some Confluence wiki.
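A sketch of what such a "living memory" regression test can look like. The function `legacy_vat_rate` is a hypothetical stand-in for grown production code; the pinned-down rule itself, the temporary German VAT cut from 19% to 16% in the second half of 2020, is real:

```python
# Sketch: a regression test that pins down a painstakingly clarified
# edge case, so future AI-driven refactorings cannot break it silently.

from datetime import date

def legacy_vat_rate(invoice_date: date) -> float:
    """German VAT was temporarily lowered to 16% from July to December 2020."""
    if date(2020, 7, 1) <= invoice_date <= date(2020, 12, 31):
        return 0.16
    return 0.19

def test_vat_cut_window_2020():
    # This rule may once have lived only in a meeting protocol; as a test,
    # it is machine-readable context for every future change.
    assert legacy_vat_rate(date(2020, 6, 30)) == 0.19
    assert legacy_vat_rate(date(2020, 7, 1)) == 0.16
    assert legacy_vat_rate(date(2020, 12, 31)) == 0.16
    assert legacy_vat_rate(date(2021, 1, 1)) == 0.19
```

An AI asked to refactor this billing code will see the four boundary assertions immediately, while a prose note about the 2020 VAT cut in a wiki would almost certainly never enter its context window.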

A large test codebase is therefore absolutely not ballast. It is the long-term memory of the system, a living and machine-readable documentation, and a massive boost for future AI development.

Conclusion: No AI Success Without Test Excellence

"Cramming AI into old processes only creates traffic jams faster — we see this with customers every day. The real leverage doesn't lie in code generation, but in the question of whether the entire delivery pipeline can keep up. If you don't bring your test pyramid and CI/CD up to AI speed, you're merely shifting the problem from the developer to QA. Investing in test excellence is no longer optional; it is the ticket to AI-native development." – Kai Lichtenberg, Head of Business Unit at codecentric

We are in a time of upheaval. Anyone who wants to benefit from AI in software modernization must not just look at code generation.

Investing in AI development tools only truly pays off at the end of the day if parallel investments are made in the test infrastructure, CI/CD pipelines, and the execution times of test suites. If you neglect your test pyramid, you won't get faster. You merely shift the problem and produce a traffic jam in front of the dam gates.

It has never been as important as it is today to have an excellent, fast test suite. It is the foundation upon which AI speed can happen in the first place.


Recommended reading: If AI takes over writing code and we have to focus more on architecture, tests, and quality assurance – what does that mean for our jobs? Read more in our article: Will AI Replace Software Developers?
