
A/B Testing: Tool support and testing GrowthBook

18.3.2024 | 19 minutes of reading time

In the previous blog post we introduced some general concepts of A/B testing: we explored the main aspects, defined test types and explained the most common statistical methods. 

Now we want to explore the areas in which A/B testing tools can provide assistance with setup, execution and statistical analysis. The general idea is to test and compare tools that are common on the market through practical application, by simulating various A/B testing scenarios.

In this blog post we will start with a general section on tool support and proceed by testing our first tool, GrowthBook. We start with GrowthBook because it is open source, easy to install, and one of our clients chose it as the A/B testing tool for their experiments.

Tool support

Tools can aid in experiment design (and setup), experiment execution and statistical analysis. While all of these steps (except experiment execution) can be performed “manually”, an A/B testing platform can simplify processes, increase efficiency and reduce the likelihood of human error. Furthermore, tool support enables scaling, making it feasible to conduct dozens of experiments in parallel, and it reduces the experimenter's dependency on developers to adjust the software or make data available.

In general, an A/B testing tool will provide the following functionalities:

  • model one or more experiments by specifying the number of buckets and the probability of assignment to a bucket
  • integrate with the application under test to perform bucket assignment and collect data (the metric to optimize for, the total number of users, the bucket assignment). This is usually done with a Software Development Kit (SDK)
  • start and stop experiments
  • examine running experiments
  • support in the statistical evaluation of the results

Evaluating A/B testing tools

In this section we will introduce our approach for evaluating and comparing different A/B testing tools. The main idea is to build a web application where two variants (baseline and challenger) are implemented and different A/B testing scenarios can be analyzed with the help of a tool.

The dummy application

To do any evaluation, we need an application that we can test. As web pages / web applications are the most common use case for A/B tests and are supported by all relevant A/B testing tools, we created a simple dummy web page simulating a landing page with a subscribe button, which uses the tool-specific SDK.

In addition, we created a driver (using Selenium and Python) which simulates the users. We will use the click rate as the metric to optimize for.

As this is a dummy page, the only difference between the variants is the text after the list, which is removed in the challenger.

Baseline and challenger variants for the dummy page.

The driver thus knows the real conversion rates in addition to the number of users to simulate. It reacts to the bucket assignment made by the tool's SDK in order to reach the desired conversion rates. We (the experimenter) then use the tool to compare the displayed values with our expectations.
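A minimal sketch of such a driver could look like the following. The element id, the way the assigned bucket is exposed to the driver and the conversion rates are assumptions for illustration; the real driver differs in its details.

import random
from selenium import webdriver
from selenium.webdriver.common.by import By

# The "real" conversion rates the driver tries to reach per bucket (illustrative values)
TARGET_RATES = {"baseline": 0.05, "challenger": 0.06}

def simulate_user(url):
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Hypothetical hook: the page exposes the bucket chosen by the SDK,
        # e.g. via a global JavaScript variable set once the features are loaded
        # (explicit waiting/retry logic is omitted here)
        variant = driver.execute_script("return window.assignedVariant;")
        # Click the subscribe button with the probability configured for this bucket
        if random.random() < TARGET_RATES.get(variant, 0.0):
            driver.find_element(By.ID, "subscribe-button").click()
    finally:
        driver.quit()

for _ in range(1000):
    simulate_user("http://localhost:8000/")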

Evaluation aspects

Our goal is to evaluate the tools mainly in the following areas:

  • Statistical evaluation: presentation and reliability of the test results
  • Developer experience: integration of the SDK into the existing application

Any other aspect relevant to our evaluation will also be considered.

Testing scenarios

With the setup described above we have the ability to compare the interactions recorded by the driver with the ones displayed in the tool and additionally examine the interpretation of the results.

To make sure we cover different aspects of A/B testing, we will consider multiple scenarios:

  • A/A Test Scenario. As we mentioned in the previous article, A/A tests can be useful to verify that the tools are working as expected. In our case, for the A/A test we will use the same conversion rate of 5% for both buckets. We will use a sample size of 8 144 users per variant, as in the first A/B test scenario described below.
  • A/B Test Scenarios. Following the table with the different conversion rates and sample sizes introduced in the previous article, we will evaluate the A/B testing tools in these two cases (a sketch of how the frequentist sample sizes can be reproduced follows this list):
    • Baseline with 5% conversion rate. Challenger with 6% conversion rate. This gives an MDE (minimum detectable effect, i.e. the percentage change between conversion rates) of 20% and results in a sample size of 8 144 per variant in the frequentist approach and of 2 859 per variant in the Bayesian approach (to reach a significant chance to beat control).
    • Baseline with 3% conversion rate. Challenger with 3.3% conversion rate (MDE of 10%). This results in a sample size of 53 183 per variant for the frequentist approach and of 18 743 for the Bayesian approach (to reach a significant chance to beat control).
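As a side note, the frequentist sample sizes above can be reproduced approximately with statsmodels' power calculations. This is only a sketch of the calculation (the exact numbers depend on the approximation used), not the way the tools derive their values:

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

def sample_size_per_variant(cr_baseline, cr_challenger, alpha=0.05, power=0.8):
    """Frequentist sample size per variant for a two-sided test on two proportions."""
    effect_size = proportion_effectsize(cr_challenger, cr_baseline)  # Cohen's h
    return NormalIndPower().solve_power(effect_size=effect_size,
                                        alpha=alpha,
                                        power=power,
                                        ratio=1.0,
                                        alternative='two-sided')

print(sample_size_per_variant(0.05, 0.06))   # roughly 8 100 per variant
print(sample_size_per_variant(0.03, 0.033))  # roughly 53 300 per variant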

GrowthBook: Introduction

GrowthBook is an open source A/B testing tool which can be self-hosted, although a managed (hosted) variant is available. You manage experiments in GrowthBook, and the SDK performs the assignment into buckets based on consistent hashing. GrowthBook does not provide its own data storage for bucket assignments or metrics, but it offers broad support and a nice interface for importing this data from multiple source types.
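The idea behind the hash-based bucket assignment can be sketched roughly as follows (an illustration of the principle, not GrowthBook's actual algorithm):

import hashlib

def assign_bucket(attribute_value, experiment_key, weights):
    """Deterministically map a user attribute (e.g. a session-id) to a bucket.
    The same input always lands in the same bucket, so no assignment state
    has to be kept on the server."""
    digest = hashlib.sha256(f"{experiment_key}:{attribute_value}".encode()).hexdigest()
    # Map the hash onto the interval [0, 1)
    position = int(digest[:8], 16) / 0x100000000
    cumulative = 0.0
    for bucket, weight in enumerate(weights):
        cumulative += weight
        if position < cumulative:
            return bucket
    return len(weights) - 1

# A 50/50 split between baseline (bucket 0) and challenger (bucket 1)
print(assign_bucket("session-4711", "subscribe-button-test", [0.5, 0.5]))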

Use cases

Like other A/B testing tools, GrowthBook allows developers to control the visibility of specific variations to different segments of users using Feature Flags.

There are three main use cases in which GrowthBook’s functionalities can be applied:

  1. Full Experimentation Platform:
    • Employ feature flags and GrowthBook's SDKs to conduct experiments within an application
    • Analyze experiment results using GrowthBook's Experiment Analysis feature to determine the most successful variant
    • This approach is suitable for companies new to experimentation or seeking a comprehensive switch from existing practices
  2. Feature Flags Only: Exclusively utilize GrowthBook Feature flags within an engineering team. This approach is:
    • Suitable for companies with insufficient traffic for full experiments but still desiring feature flags’ benefits
    • Suitable for companies preparing for future experiments by adding GrowthBook controlled feature flags to their application now
  3. Experiment Analysis Only: Use GrowthBook exclusively to enhance and automate the analysis process of experiments that are already implemented and running. This approach is:
    • Particularly beneficial for companies with established experiment procedures, aiming to save time and enhance decision-making.
    • Ideal for those currently relying on home-built reporting systems or self-configured data analytics reports like Jupyter notebooks.

In the next sections we will focus on the first use case (Full Experimentation Platform).

General design

This is how GrowthBook works as a full experimentation platform:

The application under test uses the GrowthBook SDK (available for 11 programming languages) to contact the GrowthBook Server and fetch the experiment setup, i.e. the buckets and their corresponding assignment probability (weight). While the experiment is running, the application under test has to write the bucket assignment information and any data needed to compute the goal metric (for example the conversion rate) to the Tracking Database.

The GrowthBook Server is used by the experimenter to set up the experiments but also to examine the results. To calculate the results, GrowthBook will import data from the Tracking Database.

For the Server you can choose between the managed GrowthBook Server as in the picture above (requires payment) and hosting your own server (open source). For the Tracking Database you can use a basic SQL database (which you may host yourself as part of your application) or import the data from specialized tracking systems like Google Analytics.

Experiment setup

We will skip initial setup steps like installation or user management; for more information on these, have a look at the GrowthBook documentation. We will touch on another important preparation step (defining the import of data from the tracking database and how the target metric is calculated) below. To launch and evaluate an A/B test with GrowthBook, the following steps are usually taken:

Step 1: Add a new feature

In this step you have to define everything required to make a bucket assignment decision. This includes:

  1. A Feature Key for the feature
  2. A Value Type like boolean, string or number
  3. An Attribute to enable consistent hashing like user-id or session-id
  4. The variations with their values and weights, which define the traffic split. In addition, other advanced features (limit overall exposure, limit to a subset of users, limit to certain environments, etc.) can be configured here.
    Adding a Feature that defines the color of a button on your page.

Step 2: Integrate SDK into application

Change the application under test to include the SDK, query the status of the feature during runtime and react in different ways to the feature value. Deploy the application. More details about this will be given in a section below.

Now the data for your A/B test should start to be collected! However, a further step is needed to be able to analyze and visualize the results with GrowthBook:

Step 3: Define an experiment

This is a manual step since features and experiments are split in GrowthBook. 

Add an Experiment for an existing Feature.

Notice that there are differences in design between Features and Experiments: this is likely caused by the desire to support the various use cases outlined above. However, the split is still somewhat confusing, and we hope that the developers will refine this aspect in future iterations.

Non-covered Features

In addition to the general properties outlined above, GrowthBook offers further features which are not covered here.

Statistical model support

As stated in the documentation, GrowthBook provides both frequentist and Bayesian stats engines, although the latter is preferred and strongly recommended by the tool's vendors for its advantages of requiring fewer samples and offering easier interpretation. For the Bayesian approach in the case of binomial data (like the click/no-click on a button in our experiment), it uses a beta distribution with parameters a=b=1 (uninformative prior). At the moment, using an informative prior is not supported. As soon as data comes in (number of clicks, number of users), it updates the distribution to get the posterior (see the Bayesian Approach section in our previous post for how the posterior is calculated). For metrics like count, duration or revenue, GrowthBook uses a normal prior.

For the final evaluation and decision making, GrowthBook calculates the chance to beat control. Additionally, it uses a violin plot to show the distribution of the relative uplift (how much better is it?) and provides an estimation of the risk (how many conversions would I lose if I choose B and it's actually worse?).

The percentage change value is more likely to be in the thicker part of the graph. The shorter the tails of the graph, the less uncertainty there is.
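To make the beta-binomial mechanics more concrete, here is a small sketch that updates a Beta(1, 1) prior with observed data and estimates the chance to beat control and the risk by sampling from the posteriors. It illustrates the general approach only; GrowthBook's own implementation is described in its whitepaper. The counts are illustrative.

import numpy as np

rng = np.random.default_rng(42)

def bayesian_summary(users_a, clicks_a, users_b, clicks_b, sim_count=100_000):
    """Beta-binomial update with a Beta(1, 1) prior plus Monte Carlo estimates
    of the chance to beat control and the expected loss (risk) of choosing B."""
    # Posterior for each variant: Beta(1 + clicks, 1 + non-clicks)
    samples_a = rng.beta(1 + clicks_a, 1 + users_a - clicks_a, size=sim_count)
    samples_b = rng.beta(1 + clicks_b, 1 + users_b - clicks_b, size=sim_count)

    chance_to_beat_control = np.mean(samples_b > samples_a)
    # Conversions lost on average if we pick B although A is actually better
    risk_of_choosing_b = np.mean(np.maximum(samples_a - samples_b, 0))
    return chance_to_beat_control, risk_of_choosing_b

# Illustrative counts: 5% observed click rate vs. 6%
print(bayesian_summary(8000, 400, 8000, 480))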

By choosing the frequentist engine instead, GrowthBook will show the p-value (instead of the chance to beat control) and the 95% confidence interval of the percentage change (instead of the violin plot). See Wikipedia for a correct interpretation of the confidence interval.
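As a rough analogue, a Wald confidence interval for the absolute difference of the conversion rates can be computed by hand (note that GrowthBook displays the interval for the relative percentage change instead); the counts below are taken from the first A/B test scenario evaluated further down:

import numpy as np
from scipy.stats import norm

def diff_confidence_interval(users_a, clicks_a, users_b, clicks_b, alpha=0.05):
    """Wald confidence interval for the absolute difference of two conversion rates."""
    p_a, p_b = clicks_a / users_a, clicks_b / users_b
    standard_error = np.sqrt(p_a * (1 - p_a) / users_a + p_b * (1 - p_b) / users_b)
    z = norm.ppf(1 - alpha / 2)
    difference = p_b - p_a
    return difference - z * standard_error, difference + z * standard_error

# Counts from the first A/B test scenario evaluated below
print(diff_confidence_interval(8161, 408, 8127, 487))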

If the Bayesian engine is not feasible for your experiment, consider sequential frequentist testing (a premium feature). Plain frequentist testing is recommended only as a last resort due to potential issues related to the peeking problem.

GrowthBook: Evaluation

Following the evaluation approach introduced above, we now present our review of GrowthBook as an A/B testing tool.

Preparing the application for GrowthBook

For the experiment with GrowthBook we had to implement a simple backend (we used Django here) which stores the assignment and subscription events in a PostgreSQL database. GrowthBook can read from this database without issues. At least two small SQL statements have to be written; the GUI provides very good support for constructing them.
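For reference, the backend models could look roughly like this: a sketch with field names inferred from the queries below; the real implementation may differ in its details.

from django.db import models

class VariantAssignment(models.Model):
    """One row per bucket assignment reported by the frontend."""
    session_id = models.CharField(max_length=64)
    experiment_id = models.CharField(max_length=64)
    variant_id = models.CharField(max_length=64)
    timestamp = models.DateTimeField(auto_now_add=True)

class Event(models.Model):
    """Generic user events; only type = 'subscribed' is used for the click-rate metric."""
    session_id = models.CharField(max_length=64)
    type = models.CharField(max_length=32)
    timestamp = models.DateTimeField(auto_now_add=True)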

The first query is for the bucket assignment:

SELECT session_id   as session_id,
       timestamp     as timestamp,
       experiment_id as experiment_id,
       variant_id    as variation_id
FROM abtest_backend_app_variantassignment

The second is the source query for the click rate metric:

SELECT session_id as session_id,
       timestamp   as timestamp
FROM abtest_backend_app_event
WHERE type = 'subscribed'

Following the documentation, the integration of the JavaScript SDK itself was easy: construct the GrowthBook object, load the features and determine the bucket-id based on experiment-id and session-id. Finally, react to the bucket-id with some DOM changes.

const GROWTHBOOK_URL = …
const CLIENT_KEY = …
const sessionId = …
const featureId = …

// Create a GrowthBook instance
const gb = new window.growthbook.GrowthBook({
    apiHost: GROWTHBOOK_URL,
    clientKey: CLIENT_KEY,
    // Enable easier debugging during development
    enableDevMode: true,
    // Targeting attributes
    attributes: {
        sessionId: sessionId
    },
    trackingCallback: (experiment, result) => {
        // TODO: Use your real analytics tracking system
        console.log("trackingCallback: Viewed Experiment", {
            experimentId: experiment.key,
            variationId: result.value
        });
    }
});

// Wait for features to be available
let loadFeatures = gb.loadFeatures({autoRefresh: true});
loadFeatures.then(value => {
    console.log("features loaded. Asking for a variant ... ");
    const variant = gb.getFeatureValue(
        featureId,
        "fallback-param"
    );
    console.log("got variant:", variant);
    if (variant === "fallback-param") {
        console.log('Unknown feature id ' + featureId);
    }
    // Insert here: Change the DOM in some way based on variant
});

Statistical presentation and reliability

In the evaluation section above, we devised different A/B testing scenarios to be used for evaluating the tools. Following the instructions given in the experiment setup, we created a Feature and a corresponding Experiment in the GrowthBook UI for each of these scenarios. After sending the simulated user data to the backend, we can now view the results in GrowthBook and compare them with our own evaluation.

Expected results and comparing them

Similarly to what we did in the statistical section of our previous blog post, we compare the evaluation of the A/B test scenarios given by GrowthBook with our own analysis using Python libraries. Notice that one should be able to reproduce exactly the same results given in the GrowthBook Experiments UI by using the GrowthBook stats engine Python library gbstats.

First of all, some general remarks about comparing the values displayed by GrowthBook with the Python-calculated ones are important:

  • In all cases we expect the significance of the test to match. 
  • For the frequentist approach we expect the p-value to be slightly different, since the estimation done by GrowthBook does not seem to follow the classical formula for the p-value of an independent two-sided t-test (which is what we chose for our Python evaluation). Have a look at the source code of the gbstats library developed by GrowthBook for more information on that.
  • For the Bayesian approach it is a bit more complex, since the chance to beat control is the outcome of one or multiple simulation runs. However, simulation runs use a pseudo-random number generator (PRNG), which can cause the result to vary. For this reason, we perform multiple simulation runs and obtain a range of chances to beat control. We expect the value shown in the A/B testing tool to be inside this range. Also, the bayesian-testing library and GrowthBook use different methods to estimate the chance to beat control. Have a look at the GrowthBook whitepaper and this post for a detailed explanation of how this estimation is done by GrowthBook. Finally, we use different simulation sizes in our calculation, as this may also have an effect on the results.

For the frequentist approach we used the statsmodels library to calculate the test statistic and p-value for randomly generated data.

For the Bayesian approach we used the bayesian_testing library to evaluate a Bayesian test. 

import numpy as np


SEED = 42


def create_random_rawdata(num_users, num_clicks):
    """Generate an array of zeros and ones having:
       length = num_users and num_clicks random ones"""
    np.random.seed(SEED)
    rawdata = np.zeros(num_users)
    random_indices = np.random.choice(num_users, num_clicks, replace=False)
    rawdata[random_indices] = 1
    return rawdata


def print_frequentist_evaluation(users_baseline, users_challenger, clicks_baseline, clicks_challenger):
    """Print click rates, p-value and test statistic for a two-sided two-sample t-test"""
    from statsmodels.stats import weightstats

    # Create "fake" rawdata using randomly generated arrays of zeros and ones
    data_baseline = create_random_rawdata(users_baseline, clicks_baseline)
    data_challenger = create_random_rawdata(users_challenger, clicks_challenger)

    cr_baseline = np.mean(data_baseline)
    cr_challenger = np.mean(data_challenger)

    # Calculate test statistic and p value
    tstat, p, _ = weightstats.ttest_ind(
        x1=data_challenger,
        x2=data_baseline,
        alternative='two-sided',
        usevar='pooled',
        weights=(None, None),
        value=0
    )

    print("Click-rate baseline: {:.2f}%".format(cr_baseline * 100))
    print("Click-rate challenger: {:.2f}%".format(cr_challenger * 100))
    print()
    print("T-test statistics: ")
    print("  p-value: {:.3f}".format(p))
    print("  tstat: {:.2f}".format(tstat))


def print_bayesian_evaluation(users_baseline, users_challenger, clicks_baseline, clicks_challenger):
    """Start a binary bayesian test between baseline and challenger for different simulation
       sizes and return min and max of chance to beat baseline"""
    from bayesian_testing.experiments import BinaryDataTest

    bayesian_test_agg = BinaryDataTest()

    bayesian_test_agg.add_variant_data_agg(name="baseline",
                                           totals=users_baseline,
                                           positives=clicks_baseline,
                                           a_prior=1,
                                           b_prior=1)

    bayesian_test_agg.add_variant_data_agg(name="challenger",
                                           totals=users_challenger,
                                           positives=clicks_challenger,
                                           a_prior=1,
                                           b_prior=1)

    chances_to_beat_control = []
    min_sim_size = 10000
    max_sim_size = 100000
    num_sizes_to_test = 10
    for sim_size in np.linspace(min_sim_size, max_sim_size, num=num_sizes_to_test).astype(int):
        # Get bayesian test evaluation
        evaluation = bayesian_test_agg.evaluate(sim_count=sim_size, seed=SEED)
        # Extract chance to beat control
        chances_to_beat_control.append(evaluation[1]["prob_being_best"] * 100)

    print(
        f"For a sim size between {min_sim_size} and {max_sim_size} ({num_sizes_to_test} different values tested), the "
        f"chance\n  to beat control is between {min(chances_to_beat_control):.2f}%"
        f" and {max(chances_to_beat_control):.2f}%.")

    chances_to_beat_control = []
    num_seeds_to_test = 10
    for i in range(num_seeds_to_test):
        # Get bayesian test evaluation
        evaluation = bayesian_test_agg.evaluate(sim_count=min_sim_size, seed=SEED + i)
        # Extract chance to beat control
        chances_to_beat_control.append(evaluation[1]["prob_being_best"] * 100)

    print(f"For {num_seeds_to_test} different seeds and a sim size of {min_sim_size}, the "
          f"chance to beat control is\n  between {min(chances_to_beat_control):.2f}%"
          f" and {max(chances_to_beat_control):.2f}%.")

A/A Test Scenario

Frequentist Engine

We see a p-value (0.94) close to 1, which indicates no significant difference between the variants. Also, the percentage change between the conversion rates is centered at zero.

To verify we call:

print_frequentist_evaluation(8181, 8163, 410, 407)

which outputs:

Click-rate baseline: 5.01%
Click-rate challenger: 4.99%

T-test statistics: 
  p-value: 0.940
  tstat: -0.08

and we see that the p-values are equal.

Bayesian Engine

For the Bayesian approach we see a chance to beat control close to 50% and a percentage change centered at zero. This clearly shows no difference between the variants.

To verify we call the other helper method:

print_bayesian_evaluation(8181, 8163, 410, 407)

and get:

For a sim size between 10000 and 100000 (10 different values tested), the chance
  to beat control is between 46.51% and 47.14%.
For 10 different seeds and a sim size of 10000, the chance to beat control is
  between 46.51% and 47.18%.

We see that the value computed by GrowthBook (47%) is within the range.

In summary, the tool does not find any significant difference between the variants with either approach. This matches our expectations.

First A/B Test Scenario (Baseline 5% - Challenger 6%)

Frequentist Engine

In this case the Python code is called with print_frequentist_evaluation(8161, 8127, 408, 487) and outputs:

Click-rate baseline: 5.00%
Click-rate challenger: 5.99%

T-test statistics: 
  p-value: 0.005
  tstat: 2.78

As we have mentioned before, we see a minor difference between the p-values. But the final decision on the statistical significance is the same.

Bayesian Engine

In this case the Python code is called with print_bayesian_evaluation(3134, 3166, 157, 190) and outputs:

For a sim size between 10000 and 100000 (10 different values tested), the chance
  to beat control is between 95.71% and 95.86%.
For 10 different seeds and a sim size of 10000, the chance to beat control is
  between 95.46% and 96.09%.

We see that the evaluation result of GrowthBook is in both Python-computed ranges.

It is possible to change the way GrowthBook analyzes data even after the experiment is finished. When we look at the frequentist data from above with the Bayesian approach, we get the following picture. In it we see that with more sessions (8 144 sessions per bucket) the chance to beat control gets higher and the violin plot of the percentage change gets narrower (more certainty).
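For reference, the same counts can also be fed into our Bayesian helper from above (output omitted here):

# Bayesian evaluation of the counts used in the frequentist run above
print_bayesian_evaluation(8161, 8127, 408, 487)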

Second A/B Test Scenario (Baseline 3% – Challenger 3.3%)

Frequentist Engine

The Python function call of print_frequentist_evaluation(53076, 53290, 1592, 1759) outputs:

Click-rate baseline: 3.00%
Click-rate challenger: 3.30%

T-test statistics: 
  p-value: 0.005
  tstat: 2.81

The situation is similar to the one in the first A/B test scenario.

Bayesian Engine

The output of print_bayesian_evaluation(53076, 53290, 1592, 1759) is:

For a sim size between 10000 and 100000 (10 different values tested), the chance
  to beat control is between 95.71% and 96.10%.
For 10 different seeds and a sim size of 10000, the chance to beat control is
  between 95.80% and 96.33%.

Here the GrowthBook value is also in range.

Summary of test scenarios

Testing GrowthBook with our three scenarios showed that, while there are some minor (and expected) differences for the frequentist approach, the statistical results match our computations. The way GrowthBook presents the results is also clear.

Our opinion on GrowthBook

We summarize here some Pros and Cons that we have observed using GrowthBook for implementing our tests.

Pros

  • Versatility: One significant advantage of GrowthBook is its versatility. Unlike other analytics platforms, it is not limited solely to web applications. This flexibility allows businesses operating across various domains to benefit from its features. On the other hand, this means that you also have to write some glue code even for the basic web application case.
  • Solid Statistical Methods: The platform uses validated statistical methods to evaluate the results. In this way businesses can make informed decisions based on trustworthy insights.
  • Flexible Statistical Approach: Although GrowthBook favors the Bayesian approach, it is possible to switch the engine and evaluate your test using frequentist methodologies. Additionally, GrowthBook offers the possibility to enable sequential testing for the frequentist approach (although this is only available as a premium feature). Changing the engine (Bayesian to frequentist and vice versa) can be done with a simple change in the settings menu; no restart is required. A minor con here is that this setting is global for all experiments.
  • Cost and Setup: All the features we have tested are free to use and easy to set up. This makes the tool very appealing, in particular for private users or small businesses but also for teams who want to start using a tool to get their feet wet.

Cons

  • Missing details in documentation: Although most of the instructions and concepts are documented on the docs page, some valuable details are missing. For example, while trying to download our results as a Jupyter Notebook we ran into difficulties setting up the right configuration. After searching for more details, we ended up modifying the data source settings directly in the config.yml.
  • Requirement for Custom Data Tracking: Users have to implement their own data tracking mechanisms or rely on third-party solutions for bucket assignment. While not necessarily a dealbreaker, this additional requirement can add complexity to the setup process.
  • Feature vs. Experiment: Distinguishing between features and experiments within GrowthBook may be required to support the different use cases, but it is not intuitive for a new user, and the relationship between them becomes clear only over time.

Summary

In summary, GrowthBook presents an appealing entry point into the world of A/B testing as an open-source solution, eliminating the need for an initial purchase. Developed by individuals well-versed in the intricacies of A/B testing, its self-hosted or cloud-hosted options offer users added flexibility. It serves as an excellent introductory tool for those beginning their experimentation journey. Stay tuned for our exploration of the next tool in our upcoming blog post.
