Top Excuses Why Automatic Builds Suddenly Fail

28.7.2010 | 5 minutes reading time

We at codecentric run hundreds of automatic builds every day, and sometimes they … fail. This post is not about lame excuses: “Nah, the build shouldn’t fail, that was a trivial change…” does not count. But there are situations where a build fails because … well, nobody really knows.

Some people say: cosmic rays! But we know that is not true. To use a CI system efficiently without lengthy troubleshooting, here are some common issues we encountered and ideas on how to mitigate them.

A test might do some time calculation in which either the test or the code under test reads the clock twice. Most of the time the two readings are identical, but occasionally they differ by a microsecond. A good indicator for this is a message like: Time was 23:30:00 but expected 23:30:00 (note that the microseconds are not shown).
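A minimal sketch of the double-read problem and one way around it. The Token class and its methods are hypothetical and only serve as an illustration: the idea is to read the clock once and derive every other value from that single reading.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical class under test: a token that is valid for a fixed lifetime.
final class Token {
    final Instant issuedAt;
    final Instant expiresAt;

    private Token(Instant issuedAt, Instant expiresAt) {
        this.issuedAt = issuedAt;
        this.expiresAt = expiresAt;
    }

    // Fragile: the clock is read twice, so the distance between the two
    // fields can be slightly longer than the requested lifetime.
    static Token issueFragile(Duration lifetime) {
        return new Token(Instant.now(), Instant.now().plus(lifetime));
    }

    // Robust: the caller supplies one reading and both fields derive from it.
    static Token issue(Instant now, Duration lifetime) {
        return new Token(now, now.plus(lifetime));
    }
}
```

A test against issue(...) can pass in a fixed instant and compare exact values, while a test against issueFragile(...) can only compare with a tolerance.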
Code, test code, or test runner/CI code might leave files behind. Sometimes these are log files, sometimes files produced as test output. Take the time to track down every file written during a test run and make sure a cleanup is in place. Don’t forget to add disk space monitoring, because the hard disks of build machines tend to fill up. (Hudson can do both.)
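One way to keep test output from piling up is to write it into a scratch directory that is deleted when the test ends. The sketch below assumes a hypothetical ScratchDir helper and the prefix "build-scratch-"; neither comes from the original post.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper: a temporary directory that removes itself on close().
final class ScratchDir implements AutoCloseable {
    final Path root;

    ScratchDir() {
        try {
            root = Files.createTempDirectory("build-scratch-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a file below the scratch directory and returns its path.
    Path write(String name, String content) {
        try {
            return Files.write(root.resolve(name), content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes the directory tree; reverse order removes children before parents.
    @Override
    public void close() {
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Used in a try-with-resources statement, the directory disappears even when the test body throws. Files written elsewhere by the code under test still need their own cleanup.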
Users logging into the CI system might lock resources such as files or ports, or otherwise interfere with the machine. You should not allow user logons. All “analysis” should be read-only. Note that even read access or parallel tests can lock files.
It is not necessarily a bug when your code or tests do not run when the system date is the 1st of January 1970. It could be, but you should make sure the system always uses the current time. Set up an NTP daemon. If you must use specific points in time for testing, you should be able to set a time source for all of your code, like a Spring bean called TimeProvider which normally falls back to the system time.
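A minimal sketch of such a time source. The post only names the bean TimeProvider; the interface shape, the two implementations and the LicenseCheck class are assumptions made for illustration.

```java
import java.time.Instant;

// Hypothetical interface; only the name TimeProvider comes from the post.
interface TimeProvider {
    Instant now();
}

// Default implementation: falls back to the system time.
final class SystemTimeProvider implements TimeProvider {
    @Override
    public Instant now() {
        return Instant.now();
    }
}

// Test implementation: always returns the same instant.
final class FixedTimeProvider implements TimeProvider {
    private final Instant fixed;

    FixedTimeProvider(Instant fixed) {
        this.fixed = fixed;
    }

    @Override
    public Instant now() {
        return fixed;
    }
}

// Hypothetical code under test that asks the provider, never the system clock.
final class LicenseCheck {
    private final TimeProvider time;

    LicenseCheck(TimeProvider time) {
        this.time = time;
    }

    boolean isExpired(Instant validUntil) {
        return time.now().isAfter(validUntil);
    }
}
```

In a Spring context, SystemTimeProvider would presumably be the registered bean and tests would swap in the fixed variant. On Java 8 and later, java.time.Clock with Clock.fixed(...) and Clock.systemUTC() covers the same need without a custom interface.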
If your tests need to apply evil hacks to test your software (which might be required; if not, get rid of the hacks), it is often safer to fork test execution, so tests cannot introduce side effects via the JVM (like setting system properties). Code coverage tools that use bytecode manipulation count as hacks.
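Where forking is not an option, a weaker complement is to undo known side effects inside the JVM. The sketch below is not the forking approach described above; it assumes a hypothetical SystemPropertiesGuard helper that snapshots the system properties and restores them after the test body has run.

```java
import java.util.Properties;

// Hypothetical helper: restores all system properties after a test body ran.
final class SystemPropertiesGuard {
    static void runIsolated(Runnable testBody) {
        // Copy the current properties so later changes do not affect the snapshot.
        Properties snapshot = new Properties();
        snapshot.putAll(System.getProperties());
        try {
            testBody.run();
        } finally {
            // Replace the JVM-wide properties with the state from before the test.
            System.setProperties(snapshot);
        }
    }
}
```

This only covers system properties. Static state, loaded classes and bytecode instrumentation stay untouched, which is why a forked JVM remains the stronger isolation.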
If you have multiple build machines, ensure that they are as similar as possible. If you can afford it, set up a farm of build machines with defined differences, so you can spread testing across varying hardware in case this cannot be simulated. You do not want surprise differences that take hours to track down.
Consider setting up the system under test for integration tests from scratch every night. These systems tend to get messed up by exploratory testing and ad hoc demos. Automating this is easier than one would think, though it takes some time.
If you do automatic deployment, you need to at least stop the server, copy the changed artifacts and then restart the server. Any kind of “hot deployment” is unfortunately just too fragile for reliable results.
After any change to the configuration or infrastructure of your CI system, trigger a build manually. Otherwise the next regular developer check-in will cause a failing build and leave the developer wondering how that change could have broken anything.
If you find your tests hanging in your code, especially if multiple tests were running at the same time, take a heap dump and a thread dump of the JVM before restarting the tests. You might have been lucky enough to stumble on a real concurrency issue in your code. Be grateful for that, because you can hardly test for this deliberately.
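A thread dump can also be taken from inside the JVM, for example from a test timeout handler. The sketch below uses the standard java.lang.management API; the ThreadDumps helper and its output format are assumptions, and the heap dump is not covered here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper that renders a thread dump of the current JVM as text.
final class ThreadDumps {
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Lock details are optional JVM features, so request them only if supported.
        ThreadInfo[] infos = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // defensive: skip entries without information
            }
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            if (info.getLockName() != null) {
                out.append("    waiting on ").append(info.getLockName()).append('\n');
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        // Monitor deadlocks are reported separately; null means none were found.
        long[] deadlocked = threads.findMonitorDeadlockedThreads();
        out.append(deadlocked == null
                ? "no monitor deadlock detected"
                : deadlocked.length + " threads in a monitor deadlock").append('\n');
        return out.toString();
    }
}
```

For a JVM that is already hanging, the JDK command line tools jstack (thread dump) and jmap (heap dump) do the same from the outside, given the process id.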

Yes, it is possible that the build breaks without any issue in your software, but investigating it wastes a lot of time. Know the weaknesses of your system and try to fix them, or at least document them.

We have an issue with one of our integration test suites, which connects to an external service. Sometimes the call just hangs and ends in a connection timeout. Until a while ago this always “broke” the build. As a result, every engineer had a look at the build and the logs and eventually found out that it was a timeout on the external system. We discussed this and decided to stop wasting time investigating it. So we added a mechanism for this special timeout case: the build does not turn red. It stays green, but the test gets a “timeouted” tag. This of course has a drawback. A green build with “timeouted” tests can still hide a problem in how the results of the external call are processed. It might really be red, but we cannot know. Really green builds are not allowed to have “timeouted” tests. The important part, though, is that we want “real red” builds, which turn red only when there is an issue we can fix. In Robot Framework you can define a third state for “noncritical failures”. Decide for yourself whether this is something for you.
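The three-state idea can be sketched as follows. This is not our actual mechanism and not Robot Framework's API; the Outcome enum, the ExternalCallCheck class and the choice of SocketTimeoutException as the timeout signal are assumptions for illustration.

```java
import java.net.SocketTimeoutException;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical result states: TIMED_OUT is the third, non-critical state.
enum Outcome { PASSED, FAILED, TIMED_OUT }

final class ExternalCallCheck {
    // Runs a check against the external service and classifies the result.
    static Outcome run(Callable<Boolean> check) {
        try {
            return check.call() ? Outcome.PASSED : Outcome.FAILED;
        } catch (SocketTimeoutException e) {
            // The external system did not answer: nothing we can fix ourselves.
            return Outcome.TIMED_OUT;
        } catch (Exception e) {
            return Outcome.FAILED;
        }
    }

    // The build turns red only for failures we can act on.
    static boolean isRed(List<Outcome> outcomes) {
        return outcomes.contains(Outcome.FAILED);
    }

    // A really green build has no failed and no timed-out tests.
    static boolean isReallyGreen(List<Outcome> outcomes) {
        return outcomes.stream().allMatch(o -> o == Outcome.PASSED);
    }
}
```

A run with one timed-out test is then neither red nor really green, which matches the tagged green build described above.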

But the most important takeaway is: if a broken build is not caused by test or production code, you must find the reason and address it. You cannot say “cosmic rays”, because that will lead everybody to say “broken build – cosmic rays”, and you will have far fewer successful builds because eventually no one will care. Red should always mean: Team, take action!

Blog author

Fabian Lange
