As of today, test-driven development (TDD) is an integral practice in many software projects. However, it is still difficult to master and presents significant challenges and risks if not applied correctly. In this article we discuss the idea of test-driven development in general, present some common pitfalls to avoid, and finally discuss the impact of test-driven development on design. The lessons learned are based on our own experience, acquired over the course of software projects conducted in recent years.
Part 1: Theory
First, let us briefly review what test-driven development is all about. To keep the article short and on point, we will focus primarily on implementation and ignore other activities like requirements engineering, documentation, etc.
In a sequential, predictive process, the simplified ideal life cycle of a software feature looks like this: first you implement it, then you test it and fix the detected bugs, and then you are done.
The problem with this approach is that it is generally very hard to predict how long a complex activity – and much of software development is complex – will take. When the deadline approaches and development is taking longer than planned, there are basically two options: move the deadline, or cut testing activities. Usually, under time pressure, the decision is made to drop some testing, deliver what is available (promising to fix any bugs discovered later) and hope for the best.
We all know how this plays out. Essentially, this fixes the scope and leaves quality variable. In the long term this leads to an accumulation of bugs and technical debt which, left unchecked, degrades team performance and morale. As the interest payments on this technical debt rise, the process eventually breaks down because new features become too difficult to implement. At this stage of the project, life is hell.
How test-driven development helps
To avoid these problems, test-driven development makes the following changes to the process:
- split the process into many short micro-iterations
- in each micro-iteration write test code before writing implementation code, make sure all tests pass, and refactor “mercilessly” to keep the design malleable
Since the test code is written first and the objective is to make and keep the tests green at all times, the development is said to be test-driven. Essentially, this trades variable quality for variable scope: if a feature cannot be delivered under the given time constraints, it is naturally de-scoped. This is usually acceptable, since most of the features will be completed in time, and it is preferable to deliver some of the features with confidence that they are implemented correctly than to try to squeeze in all the features while failing to ensure consistent quality.
These micro-iterations should be really short; in fact, the whole process can be viewed as three continuous sub-processes running in parallel: writing tests, writing implementation code, and refactoring.
These sub-processes form a symbiotic relationship, constantly affecting each other, which ultimately results in fewer defects, better design and higher productivity. This is the main value proposition of test-driven development.
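The rhythm of one such micro-iteration can be sketched as follows; `slugify()` is a hypothetical helper invented purely for illustration:

```python
# One TDD micro-iteration, sketched with a hypothetical slugify() helper.

# Step 1 (red): write the test first. At this point it fails,
# because slugify() does not exist yet.
def test_slugify_replaces_spaces_and_lowercases():
    assert slugify("Hello World") == "hello-world"

# Step 2 (green): write the simplest implementation that makes the test pass.
def slugify(title: str) -> str:
    return title.lower().replace(" ", "-")

# Step 3 (refactor): with the test green, the implementation can now be
# reshaped safely (generalized, cleaned up) while re-running the test.
test_slugify_replaces_spaces_and_lowercases()
```

Each pass through these three steps should take minutes, not hours; the green test suite is what makes the refactoring step safe.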
In its radical form, test-driven development demands that absolutely no production code be written without having a failing test first. Though this will certainly ensure perfect coverage, we believe that it can be relaxed and that it is sufficient to require that:
- eventually all production code is covered by automated tests
- the development of production code and test code proceeds in parallel
- continuous refactoring is part of the routine
- test code receives the same treatment as production code
Failing to adhere to these practices increases the risk of incurring the overhead of test-driven development without reaping its promised benefits. In the next parts we look at more concrete and subtle issues that we encountered adopting this method in our projects.
Part 2: Pitfalls of test-driven development
When we wrote our tests, we observed that despite high coverage, a large number of defects were not discovered by tests and surfaced only after deployment to production. This is obviously a problem, since the ultimate goal of testing is to detect defects early! Furthermore, we discovered that code quality did not increase as we had expected. Drilling down on the actual causes, we identified three patterns.
Unclear specification of behavior
There was no clear understanding and specification of the intended behavior of the units under test. The fixtures were constructed based on assumptions which simply did not hold under real production conditions. For example, it was assumed that the input data to the system would be of higher quality than was actually the case, which caused many unexpected failures in production.
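This kind of mismatch can be made concrete with a small sketch; `parse_age()` is a hypothetical parser whose original tests assumed clean input only:

```python
# Hypothetical parser; the original fixture only ever fed it clean input.
def parse_age(raw: str) -> int:
    text = raw.strip()
    if not text.isdigit():
        raise ValueError(f"not a valid age: {raw!r}")
    return int(text)

# The optimistic fixture that matched our assumptions, not production reality:
assert parse_age("42") == 42

# Fixtures derived from actual production data: padding, empty strings and
# non-numeric garbage all occur, so the intended behavior for them must be
# specified explicitly.
assert parse_age(" 42 ") == 42
for bad in ["", "   ", "forty-two", "-3"]:
    try:
        parse_age(bad)
    except ValueError:
        pass  # expected: dirty input is rejected, not silently accepted
    else:
        raise AssertionError(f"expected ValueError for {bad!r}")
```

The point is not the parser itself but the second half of the fixture: it encodes a specification of behavior for realistic input, which the first assertion alone never did.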
Ineffective test fixtures
During testing, the units under test did in fact exhibit erroneous behavior, but the fixtures were not able to detect it. This shows that a high level of test coverage alone does not imply an effective test suite. It also makes evident that designing effective tests is a challenging discipline in its own right, requiring the same level of care and thought as developing production code. Which leads us to the third issue.
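A minimal sketch of how full coverage and weak assertions can coexist; `apply_discount()` is a hypothetical function with a deliberate boundary bug:

```python
# Hypothetical discount calculation with a bug: the discount should be
# capped at 100%, but no cap is ever applied.
def apply_discount(price: float, percent: float) -> float:
    return price * (1 - percent / 100)

# This test executes every line of the function (100% coverage), but its
# assertion is so weak that the erroneous behavior goes undetected:
result = apply_discount(100.0, 150.0)
assert result <= 100.0      # passes, even though result is -50.0

# An effective fixture would pin down the intended behavior exactly:
# assert apply_discount(100.0, 150.0) == 0.0   # would fail, exposing the bug
```

Coverage measures which code ran, not what was checked about it; only the strength of the assertions determines whether a defect is actually caught.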
Treating test code differently
This is perhaps the biggest problem of all. When we wrote our tests, we observed that the test code was designed differently than the production code. Best practices usually applied to implementation design were not applied as rigorously to the test code. For example, in comparison to production, duplication and coupling in the test code base were much higher. One possible explanation is the idea that, contrary to production code, test code only runs during development and is therefore never exposed to real users. This is short-sighted, because test code is an integral part of the whole code base and needs to be evolved over the life cycle of the project. We will discuss this issue and further implications on design in the next part.
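Treating test code like production code often means refactoring repeated fixture setup into shared helpers. A minimal sketch, using a hypothetical `Order` domain object and a test-data builder:

```python
from dataclasses import dataclass, field

@dataclass
class Order:                      # hypothetical domain object
    customer: str
    items: list = field(default_factory=list)
    express: bool = False

# Duplicated setup, copy-pasted into many tests; any change to Order's
# constructor breaks all of them at once:
#   order = Order(customer="alice", items=["book"], express=False)

# Applying production-code discipline to the tests: extract a builder with
# sensible defaults, so each test states only what it actually cares about.
def make_order(**overrides) -> Order:
    defaults = {"customer": "alice", "items": ["book"], "express": False}
    defaults.update(overrides)
    return Order(**defaults)

assert make_order().customer == "alice"
assert make_order(express=True).express is True
```

When the constructor changes, only the builder needs updating, not every test, which directly reduces the duplication and coupling described above.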
Part 3: Impact on design
In theory, test-driven development should have a positive impact on overall design quality, but contrary to our expectations, we observed that this has not always been the case. In some instances, automated tests even made it more difficult to improve the design, so something interesting is going on. Again, drilling down on the actual issues, we identified two aspects.
This may sound trivial, but effective tests take a lot of time to write. This leads to the following problem: simplistically speaking, since the amount of available time is limited, the more effort a developer puts in creating test fixtures, the less time is left for exploration and evaluation of design options.
Although test-driven development is a design practice, to be effectively applied, it requires thinking of test code not only as a verification mechanism but as a dedicated design tool. Effective design requires exploration, experimentation and iteration. In practice, however, test code is often written just to verify the implementation “as is”, implicitly assuming that it is fixed, instead of being just a transient point in the design space, which can be moved at any time during the process.
Writing test code alone does not magically increase the quality of the implementation. Applied sensibly, it has the potential to facilitate that goal, but applied mechanically, it can even lead to detrimental results. For example, bad test code can inhibit refactoring by increasing the coupling of the whole code base. When this happens, the test code becomes the bottleneck: changing some aspect of the implementation breaks many tests at once. Thus, bad test quality has a direct impact on implementation quality! You cannot consider them separately.
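A small sketch of this coupling problem, using a hypothetical `ShoppingCart`: a test that reaches into the internal representation breaks on every refactoring, while a behavior-based test does not.

```python
# Hypothetical cart whose internal storage we want the freedom to change.
class ShoppingCart:
    def __init__(self):
        self._items = {}          # internal detail: name -> quantity

    def add(self, name: str, quantity: int = 1) -> None:
        self._items[name] = self._items.get(name, 0) + quantity

    def total_quantity(self) -> int:
        return sum(self._items.values())

cart = ShoppingCart()
cart.add("apple", 2)
cart.add("apple")

# Brittle: couples the test to the private dict. Switching _items to, say,
# a list of line-item objects would break this test even though the
# observable behavior is unchanged.
#   assert cart._items == {"apple": 3}

# Robust: asserts only on public behavior, leaving the representation free
# to be refactored.
assert cart.total_quantity() == 3
```

Tests written against the public interface act as a safety net for refactoring; tests written against internals actively resist it.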
On the other hand, focusing on getting the design right might result in an implementation that requires only a fraction of the testing. If this seems implausible, consider strictly typed and functional programming languages, for example. Programs written in such languages often require more effort to get right, but once the compiler is satisfied, they mostly “just work”. Unit testing such programs does not create as much additional value as it does for programs written in less strictly typed languages, because large parts of the requirements and specification are encoded in the types, and a powerful compiler can verify the correctness of the implementation to a large extent by checking whether the types align. In this case, creating the right types is an essential design activity that minimizes the need for testing, and this principle generalizes to other design practices as well.
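Even in Python, part of a specification can be pushed into the types. A hedged sketch with a hypothetical `Connection` type whose invariant is that a handle exists exactly when the connection is open; a stronger type system would reject violations at compile time, while here a runtime check stands in for the compiler:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class State(Enum):
    CLOSED = "closed"
    OPEN = "open"

@dataclass(frozen=True)
class Connection:
    state: State
    handle: Optional[int] = None   # only meaningful while OPEN

    def __post_init__(self):
        # The invariant a stricter type system would encode structurally
        # (e.g. as separate OpenConnection / ClosedConnection types):
        if (self.state is State.OPEN) != (self.handle is not None):
            raise ValueError("handle must be present exactly when OPEN")

assert Connection(State.OPEN, handle=7).handle == 7
assert Connection(State.CLOSED).handle is None
try:
    Connection(State.CLOSED, handle=7)   # invalid state, rejected at creation
except ValueError:
    pass
```

Every invalid state ruled out at construction is one less case any unit test ever needs to cover.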
We believe that in order to actually create the synergy of the test-code-refactor cycle described in the first part, some basic principles must be obeyed, otherwise there is a significant risk of not getting the desired results or even obtaining detrimental results. Additionally, we discovered that paying attention to the following ideas is helpful:
First, assume variability by default. Getting the design right the first time is impossible, so tying the test code too closely to a bad design will only hinder refactoring and severely limit agility. Remember that the requirements and your understanding of the domain will change, so be prepared.
Second, iterate often. Never assume that after implementing, testing and refactoring some feature, you are done. In software development the most important activity is understanding the problem, and the more you iterate, the more you learn about the problem. Obviously there is a trade-off, and you should definitely stop and move on when the current design is “good enough”. However, in practice, we observed that especially under time pressure, we tend to gravitate towards the other side of the spectrum. In the extreme case, we stop with just the first idea that comes to mind, implement it, and never touch that code again.
Having a comprehensive and passing test suite does not indicate architectural integrity of the system – only careful analysis and understanding of the problem domain and proper design can do that.
We discovered that there is sometimes a troubling misconception that having a comprehensive and passing test suite indicates good design. The problematic reasoning goes: “we have a lot of tests and they are all green, therefore everything is fine”. Although in theory any software developer will surely understand and agree that “you cannot test quality into a product”, this truth sometimes gets ignored in practice.
In fact, these two concerns (test coverage and product quality) are independent: one can easily imagine a high-quality system running in production, delivering a lot of value to its users, and not having a single automated test at all – just delete all the tests after deploying and completing extensive validation of the system. On the other hand, it is conceivable to achieve 100% test coverage and have an extremely brittle or unusable system; in this case the tests do not provide any value at all, and are just as useless as the system itself.
Furthermore, the test code itself does not deliver any value to the user of the system; it is the execution of the test suite and the act of writing test code that create value, but once the tests are executed and the system is designed, deployed and put to use, they can be safely deleted. In other words, after developing the system and verifying that all tests pass, running the same test suite again generates no additional information about the system.
Of course, this view of the testing process is very simplistic, as the system usually must be continuously evolved in order to meet the ever-changing user needs, in which case building and maintaining a comprehensive test suite provides a safety net protecting from regression, and continues to assist the design process.
In summary, we believe that the problems described above stem mostly from the fact that we did not treat test and implementation code equally. As mentioned in the first part, this is key to the successful application of test-driven development in practice. However, it turned out to be very difficult to do, because it takes real discipline to follow through.
Test-driven development is an integral technique for achieving high quality and is part of the everyday practice of most software developers today. However, it is still a challenging discipline that takes time and practice to master. You also need to pay attention to best practices and be very disciplined; otherwise you risk wasting time without getting the expected benefits. Additionally, to be applied effectively, writing test code needs to be explicitly treated as a design tool, not just a verification mechanism. Finally, this practice can only complement other design activities, never replace them.