Count your queries! Repository integration tests with Hibernate Statistics

7.8.2023 | 6 minutes of reading time

If you are using Spring Data JPA as a data access framework, Hibernate is almost certainly hiding under the hood. And although this setup takes a lot of work off your hands, you should still verify what it actually does. In this article we want to show you how to avoid pitfalls such as the N+1 problem by counting executed queries within automated tests using Hibernate Statistics.

Still a thing: The N+1 problem

The N+1 problem describes the tendency of an ORM tool such as Hibernate (or similar data access frameworks; it is not bound to this specific product) to load related entities through “unexpected” additional SQL queries, even though the data was supposed to be loaded together in one go.

Of course, loading data little by little can be intentional in a lazy loading context. For example, when related entities are associated with the fetch type LAZY, the additional information is loaded only when it is needed, which inevitably requires additional queries later on. However, there might be a good reason to load the linked data upfront. In that case the fetch type EAGER is used, and one would expect all data to be loaded together, for example using an SQL join to avoid additional database round trips.

Sadly, this is not the case when Spring Data JPA, its feature to derive queries from repository method names, and Hibernate are used in combination. In this constellation Hibernate has little room for optimization, so a lot of unwanted queries are executed: one query to fetch the initial collection (of size N) and another query for each of the N items to retrieve its related entity. In total, N + 1 queries hit the database although only a single query was expected. In the worst case this can have a huge performance impact.
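To make the counting concrete, here is a minimal plain-Java sketch of the pattern. It does not use Hibernate at all: FakeDatabase and its methods are hypothetical stand-ins that only count round trips, and the sketch assumes Java 17 or newer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of the N+1 pattern. No Hibernate involved:
// FakeDatabase is a hypothetical stand-in that only counts round trips.
class NPlusOneSketch {

    record Primary(long id, String category, long secondaryId) {}

    record Secondary(long id, String additionalData) {}

    static class FakeDatabase {
        private final List<Primary> primaries = List.of(
                new Primary(1, "CAT1", 10),
                new Primary(2, "CAT1", 20),
                new Primary(3, "CAT1", 30));
        private final Map<Long, Secondary> secondaries = Map.of(
                10L, new Secondary(10, "A"),
                20L, new Secondary(20, "B"),
                30L, new Secondary(30, "C"));
        int roundTrips = 0;

        // One "query" returning all primaries of a category.
        List<Primary> selectPrimariesByCategory(String category) {
            roundTrips++;
            return primaries.stream()
                    .filter(p -> p.category().equals(category))
                    .toList();
        }

        // One "query" per call, like a select for a single related row.
        Secondary selectSecondaryById(long id) {
            roundTrips++;
            return secondaries.get(id);
        }

        // One "query" resolving the related rows in the same step, like a join.
        List<Secondary> selectSecondariesJoinedByCategory(String category) {
            roundTrips++;
            return primaries.stream()
                    .filter(p -> p.category().equals(category))
                    .map(p -> secondaries.get(p.secondaryId()))
                    .toList();
        }
    }

    // Loads the collection first, then each related item separately: N + 1 round trips.
    static int roundTripsOneByOne(String category) {
        FakeDatabase db = new FakeDatabase();
        List<Secondary> loaded = new ArrayList<>();
        for (Primary p : db.selectPrimariesByCategory(category)) {
            loaded.add(db.selectSecondaryById(p.secondaryId()));
        }
        return db.roundTrips;
    }

    // Loads everything with a single joined "query": 1 round trip.
    static int roundTripsJoined(String category) {
        FakeDatabase db = new FakeDatabase();
        db.selectSecondariesJoinedByCategory(category);
        return db.roundTrips;
    }
}
```

With three matching items, the first variant needs four round trips and the second only one. The gap grows linearly with N.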

An example

Let's suppose we have a very simple example without any practical relevance, serving only for demonstration purposes. We have a primary entity with a one-to-one relationship (whose default fetch type is EAGER) to a secondary entity.

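import jakarta.persistence.Entity;
import jakarta.persistence.Id;
import jakarta.persistence.OneToOne;

@Entity
public class PrimaryEntity {

    @Id
    private Long id;

    private String category;

    // No fetch attribute given: @OneToOne defaults to FetchType.EAGER
    @OneToOne
    private SecondaryEntity secondaryEntity;

    public SecondaryEntity getSecondaryEntity() {
        return secondaryEntity;
    }

}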

Furthermore, we use a very simple Spring Data repository to retrieve primary entities by their category.

import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;

interface PrimaryEntityRepository extends JpaRepository<PrimaryEntity, Long> {

    // Query derived by Spring Data JPA from the method name
    List<PrimaryEntity> findByCategory(String category);

}

If we now invoke this method – having the eager fetching in mind – we might expect the ORM framework to use a join to query the data as a whole. Surprisingly, Hibernate performs a lot more SQL queries.

Take a look inside!

To check SQL statements generated by Spring Data JPA you can activate the SQL log.

spring.jpa.show-sql=true

When you execute the findByCategory(String category) method of our example repository, which in this example returns a list of three primary entities, you will see the following:

Hibernate: select p1_0.id,p1_0.category,p1_0.secondary_entity_id from primary_entity p1_0 where p1_0.category=?
Hibernate: select s1_0.id,s1_0.additional_data from secondary_entity s1_0 where s1_0.id=?
Hibernate: select s1_0.id,s1_0.additional_data from secondary_entity s1_0 where s1_0.id=?
Hibernate: select s1_0.id,s1_0.additional_data from secondary_entity s1_0 where s1_0.id=?
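The repetition in this log can be made explicit with a few lines of plain Java. The following is only a sketch for eyeballing log output: SqlLogTally is a hypothetical helper, and it assumes that each statement is printed on one line with the "Hibernate: " prefix shown above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: tallies identical statements in show-sql output.
class SqlLogTally {

    private static final String PREFIX = "Hibernate: ";

    // Counts how often each distinct statement appears, in order of first appearance.
    static Map<String, Integer> tally(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : logLines) {
            if (line.startsWith(PREFIX)) {
                counts.merge(line.substring(PREFIX.length()), 1, Integer::sum);
            }
        }
        return counts;
    }

    // The four log lines from the example above.
    static List<String> exampleLog() {
        return List.of(
                "Hibernate: select p1_0.id,p1_0.category,p1_0.secondary_entity_id from primary_entity p1_0 where p1_0.category=?",
                "Hibernate: select s1_0.id,s1_0.additional_data from secondary_entity s1_0 where s1_0.id=?",
                "Hibernate: select s1_0.id,s1_0.additional_data from secondary_entity s1_0 where s1_0.id=?",
                "Hibernate: select s1_0.id,s1_0.additional_data from secondary_entity s1_0 where s1_0.id=?");
    }
}
```

A statement that shows up once per result row, like the select on secondary_entity here, is the typical fingerprint of the N+1 problem.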

But why are there four database queries? We configured an eager fetching strategy, didn’t we?

This happens because Spring Data JPA leverages the JPA Criteria API to access data, and due to its explicit fetch plan Hibernate cannot add the crucial JOIN FETCH to this plan afterwards. So for each of our three primary entities another query is triggered to fetch the related secondary entity.

To fix this, we can declare the query ourselves in JPQL and add the JOIN FETCH explicitly:

@Query("FROM PrimaryEntity p JOIN FETCH p.secondaryEntity s WHERE p.category=:category")
List<PrimaryEntity> findByCategoryJoinFetch(String category);

After that, the data is fetched together in one single query by joining the two tables:

Hibernate: select p1_0.id,p1_0.category,s1_0.id,s1_0.additional_data from primary_entity p1_0 join secondary_entity s1_0 on s1_0.id=p1_0.secondary_entity_id where p1_0.category=?

How to avoid regression?

Checking the queries via log files is a good start. But although this issue was easy to fix, it would be more sustainable to guard the behavior with an automated test, wouldn't it? One of the simplest solutions would be to count the queries generated by the framework and compare them with our expectation. And this is where Hibernate Statistics enters the stage.

The Hibernate Statistics interface

When you're using Hibernate and need insight into the database footprint of your application, there is a nice tool for that. The Hibernate Statistics interface is a very powerful helper you should definitely get familiar with. It offers a lot of features such as second-level cache and concurrency-control metrics, but those go beyond what we need for our simple query count test. Luckily, it also provides 'simpler' information like the number of executed queries and additional fetch operations. That is exactly what we need. And the best part is that, compared to other libraries such as the DataSourceProxy shown in our article about Hibernate caching, it's built in.

You can just activate Hibernate Statistics for your Spring Boot project as follows:

spring.jpa.properties.hibernate.generate_statistics=true

After that, the org.hibernate.stat.Statistics interface – obtained via an instance of the Hibernate SessionFactory – can be used to write some simple integration tests.

Among others, it offers the following extremely useful methods to count the executed queries.

/**
 * The global number of executed queries.
 */
long getQueryExecutionCount();

This method tells us how many queries are executed to load our primary entities.

/**
 * The global number of entity fetches.
 */
long getEntityFetchCount();

This method indicates whether additional queries were executed to fetch related secondary entities.

/**
 * The number of prepared statements that were acquired.
 */
long getPrepareStatementCount();

This method tells us how many statements were prepared in total. In our read-only use case, that is the sum of the two counters above.
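The relationship between the three counters can be written down as a tiny plain-Java sketch. CounterCheck and Counts are hypothetical names and not part of Hibernate. The numbers are the ones from our example: one query plus three entity fetches for the derived query, one query and no fetches for the JOIN FETCH variant.

```java
// Hypothetical value type holding readings of the three counters; not part of Hibernate.
class CounterCheck {

    record Counts(long queryExecutions, long entityFetches, long preparedStatements) {

        // Statements that neither of the two other counters accounts for.
        long unexplainedStatements() {
            return preparedStatements - (queryExecutions + entityFetches);
        }

        // Any entity fetch on top of the query hints at the N+1 problem.
        boolean looksLikeNPlusOne() {
            return entityFetches > 0;
        }
    }

    // Derived query from the example: 1 query + 3 entity fetches = 4 statements.
    static Counts derivedQuery() {
        return new Counts(1, 3, 4);
    }

    // JOIN FETCH query from the example: 1 query + 0 entity fetches = 1 statement.
    static Counts joinFetchQuery() {
        return new Counts(1, 0, 1);
    }
}
```

If unexplainedStatements() is not zero, something else touched the database, for example a write or a collection load, and the test should look at further counters.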

With that tooling, it is very easy to write tests for our repositories. In our use case, getQueryExecutionCount() should always return 1 while getEntityFetchCount() should always return 0. And to rule out anything else happening on the database, getPrepareStatementCount() should equal the sum of both. A corresponding test could therefore look like this.

import static org.assertj.core.api.Assertions.assertThat;

import java.util.List;
import java.util.stream.Collectors;

import org.hibernate.SessionFactory;
import org.hibernate.stat.Statistics;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.autoconfigure.orm.jpa.DataJpaTest;

// Assumes test data with three PrimaryEntity rows in category CAT1
// whose secondary entities carry the additional data A, B and C.
@DataJpaTest(properties = "spring.jpa.properties.hibernate.generate_statistics=true")
class PrimaryEntityRepositoryTest {

    private final PrimaryEntityRepository primaryEntityRepository;

    private final SessionFactory sessionFactory;

    private Statistics stats;

    @Autowired
    public PrimaryEntityRepositoryTest(PrimaryEntityRepository primaryEntityRepository, SessionFactory sessionFactory) {
        this.primaryEntityRepository = primaryEntityRepository;
        this.sessionFactory = sessionFactory;
    }

    @BeforeEach
    void setUp() {
        // Reset the global counters so that each test only sees its own statements
        stats = sessionFactory.getStatistics();
        stats.clear();
    }

    @Test
    void derivedQueryRunsIntoNPlusOneProblem() {
        final List<PrimaryEntity> primaryEntities = primaryEntityRepository.findByCategory("CAT1");

        assertThat(primaryEntities).hasSize(3);
        assertThat(primaryEntities.stream()
                .map(primaryEntity -> primaryEntity.getSecondaryEntity().getAdditionalData())
                .collect(Collectors.toSet())).contains("A", "B", "C");
        // One query for the list, plus one entity fetch per primary entity
        assertThat(stats.getQueryExecutionCount()).isOne();
        assertThat(stats.getEntityFetchCount()).isEqualTo(primaryEntities.size());
        assertThat(stats.getPrepareStatementCount()).isEqualTo(stats.getQueryExecutionCount() + stats.getEntityFetchCount());
    }

    @Test
    void jpqlWithJoinFetchExecutesSingleQuery() {
        final List<PrimaryEntity> primaryEntities = primaryEntityRepository.findByCategoryJoinFetch("CAT1");

        assertThat(primaryEntities).hasSize(3);
        assertThat(primaryEntities.stream()
                .map(primaryEntity -> primaryEntity.getSecondaryEntity().getAdditionalData())
                .collect(Collectors.toSet())).contains("A", "B", "C");
        // A single joined query and no additional entity fetches
        assertThat(stats.getQueryExecutionCount()).isOne();
        assertThat(stats.getEntityFetchCount()).isZero();
        assertThat(stats.getPrepareStatementCount()).isEqualTo(stats.getQueryExecutionCount() + stats.getEntityFetchCount());
    }

}

Implementing such tests for all your critical database operations should help prevent you from falling into performance traps caused by unexpected operations.
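If you would rather not reset the global counters with clear(), one option is to measure how much a counter grows around a single call. The following is a minimal sketch of that idea in plain Java. CounterDelta is a hypothetical helper, not part of Hibernate or Spring, and it assumes that nothing else uses the same counter concurrently.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: measures how much a counter grows while an action runs.
class CounterDelta {

    // Reads the counter before and after the action and returns the difference.
    static long during(LongSupplier counter, Runnable action) {
        long before = counter.getAsLong();
        action.run();
        return counter.getAsLong() - before;
    }
}
```

Because the helper only needs a LongSupplier, a method reference such as stats::getEntityFetchCount can be passed as the counter, with the repository call as the action.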

You can check out the complete example from our GitHub repository.

Takeaway: Do not believe in any magic!

As the saying goes, 'trust is good, control is better'. This is true in many cases, especially when working with JPA. While I love Spring Data, JPA, Hibernate and related ORM stuff, no tool is perfect, in particular not when several abstractions sit on top of each other. In the best case they all do their job well, but unfortunately you get into trouble faster than you think if you do not know what happens behind the scenes. It is really important to validate generated database statements, so take a look at Hibernate Statistics!


Blog author

Kevin Peters

Senior IT Software Engineer / Consultant
