Behaviour-driven development (BDD) of an Alexa Skill with Cucumber.js – Part 1

4.3.2019 | 11 minutes reading time

In this blog article we will learn how to do behaviour-driven development (BDD) of an Alexa Skill utilizing the Cucumber.js framework.

Why do you even want to do this?

I didn’t even intend to do a blog article on BDD of Alexa Skills. I just wanted to get into voice UIs and learn more about the current state of the art. I must admit: I even tried to skip writing tests altogether. But I quickly learned that there are just too many things that could possibly go wrong, and the turnaround times from writing code to seeing whether it actually works are just too long:

define the voice UI
write the code locally
build the code
upload your code
start the Alexa Skill in the Alexa Simulator or on your Echo device
Alexa reports an error
search for the log file
realize you forgot to register your new intent handler
fix your code and restart the loop

Imagine we are building an Alexa Skill to write down points in a game of dice. What if we could define our voice interaction upfront like this…


 Scenario: A new game can be started, the number of players is stored
    Given the user has opened the skill
    When the user says: Start a new game with 4 players
    Then Alexa replies with: Okay, I started a new game for 4 players
    When the user says: How many players do I have?
    Then Alexa replies with: You are playing with 4 players

…and then be able to actually run this as an automated acceptance test?

This is exactly what we are going to do in this blog entry.

About Cucumber.js

Cucumber.js is the JavaScript version of Cucumber, a framework for running automated acceptance tests. There are already a lot of good introductions to Cucumber, so I will keep this part short. If you are new to Cucumber, I recommend reading their very good introduction first.

Cucumber allows us to define test scenarios in a text file.

Each test scenario starts with the keyword Scenario, followed by the name of the test.

The test itself consists of different steps. Each step has to be started with one of these keywords:

Given: Defines the initial context of the system prior to our test. In the example above, we assume that the user has already opened our Alexa Skill.
When: Describes an action taken in the test. In our case, this could be the user saying something to their Echo device.
Then: Specifies the expected outcome of the prior steps. For us, this would typically be the reply Alexa gave to the user.

There is more to it, but this is a very short description of how a Cucumber test might be structured.

But out of the box Cucumber has no clue how to execute these steps. That’s something we need to do, and that’s what this article is about.

Ingredients of an Alexa Skill

To be able to implement the Cucumber steps, we first need to look at the different parts of an Alexa Skill. To create a new skill, we have to create at least two artifacts:

The first part is the Voice Interaction Model. This model defines all commands (intents) a user can trigger in our skill and which speech input (utterances) will trigger which intent.
The second part is the code handling these intents. Typically, this code will receive an intent request and will answer with a voice reply (though there are other possible results). It could be any web service, but Amazon makes our life easier if we use an AWS Lambda function for this purpose (which is what we will do here).

So a user interaction is processed as follows:
Figure: Processing of a user interaction with an Alexa Skill

The user talks to their Echo device.
From the device the speech input is sent to the Alexa Voice Service.
The Alexa Voice Service uses our Voice Interaction Model to find a fitting intent and (if one is found) creates a proper intent request.
Our Lambda function receives the intent request and executes the appropriate intent handler, which generates the response that is sent back to the device.
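
To make the last step more tangible, here is a minimal sketch of what such a Lambda function could look like without any SDK. The handler table, the welcome text and the in-memory `state` object are assumptions made for illustration; the skill in the repository may well be structured differently.

```javascript
// Hypothetical in-memory state. A real skill would rather use session attributes.
const state = { players: 0 };

// Hypothetical intent handlers, keyed by the intent names of the interaction model.
const intentHandlers = {
    starteSpiel: (intent) => {
        state.players = Number(intent.slots.spieleranzahl.value);
        return `Okay, I started a new game for ${state.players} players`;
    },
    wieVieleSpieler: () => `You are playing with ${state.players} players`,
};

// Wraps plain text in the response envelope used for speech output.
function speak(text) {
    return {
        version: '1.0',
        response: {
            outputSpeech: { type: 'SSML', ssml: `<speak>${text}</speak>` },
            shouldEndSession: false,
        },
    };
}

// Lambda-style entry point with the (event, context, callback) signature.
function handler(event, context, callback) {
    const { request } = event;
    if (request.type !== 'IntentRequest') {
        // Anything that is not an intent (e.g. a LaunchRequest) gets a generic reply here.
        callback(null, speak('Welcome'));
        return;
    }
    const handle = intentHandlers[request.intent.name];
    if (!handle) {
        // The "forgot to register the new intent handler" case from the introduction.
        callback(new Error(`No handler registered for intent ${request.intent.name}`));
        return;
    }
    callback(null, speak(handle(request.intent)));
}
```

The point of the sketch is the dispatch on `request.intent.name`: an intent that exists in the voice interaction model but has no entry in the handler table only fails at runtime, which is exactly the kind of mistake an automated acceptance test catches early.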

Our voice interaction model

Before we take care of the Alexa Voice Service in our tests, let’s take a look at our interaction model. We will need this locally for the tests, so we first make a local copy:

Open the Alexa Developer Console, navigate to the Alexa Skill and open the JSON editor. From there you can copy the content and create a local version of the voice interaction model.

Figure: The JSON editor of the voice interaction model

The code of this blog article is available in this GitLab repository. The example is localized to English and German, so there are actually two voice interaction models: the English one (interactionModel.en.json) and the German one (interactionModel.de.json).

Our skill for the blog entry has two intents: one to start a new game (remember: it’s about a fictional game of dice) and one to ask how many players there are.

The English voice interaction model looks like this:

{
    "interactionModel": {
        "languageModel": {
            "invocationName": "five of a kind",
            "intents": [
                {
                    "name": "AMAZON.FallbackIntent",
                    "samples": []
                },
                {
                    "name": "AMAZON.CancelIntent",
                    "samples": []
                },
                {
                    "name": "AMAZON.HelpIntent",
                    "samples": []
                },
                {
                    "name": "AMAZON.StopIntent",
                    "samples": []
                },
                {
                    "name": "AMAZON.NavigateHomeIntent",
                    "samples": []
                },
                {
                    "name": "starteSpiel",
                    "slots": [
                        {
                            "name": "spieleranzahl",
                            "type": "AMAZON.NUMBER"
                        },
                        {
                            "name": "players",
                            "type": "playersSlot"
                        }
                    ],
                    "samples": [
                        "Start a new game with {spieleranzahl} {players}"
                    ]
                },
                {
                    "name": "wieVieleSpieler",
                    "slots": [],
                    "samples": [
                        "How many players do we have",
                        "How many players do I have"
                    ]
                }
            ],
            "types": [
                {
                    "name": "playersSlot",
                    "values": [
                        {
                            "name": {
                                "value": "persons"
                            }
                        },
                        {
                            "name": {
                                "value": "people"
                            }
                        },
                        {
                            "name": {
                                "value": "players"
                            }
                        }
                    ]
                }
            ]
        }
    }
}

We see some default AMAZON intents (like Cancel or Stop) and our game intents starteSpiel (startGame) and wieVieleSpieler (howManyPlayers). Intent names and slot names are in German, because that is the voice interaction model I started with and localizing the internal values is probably not a good idea (note to myself: start with the English blog next time).

The starteSpiel intent can be triggered with:

Start a new game with {spieleranzahl} {players}

There are two slots (the values in curly braces):

spieleranzahl (number of players) of type AMAZON.NUMBER and
players (in English, because it is specific to the English voice interaction model) of type playersSlot.

The type playersSlot is a user-defined slot type that accepts the values “players”, “persons” or “people”. We don’t actually care which of these words the user chooses to address players; we just want to support a wide variety of voice inputs.

The mock voice service

To be able to run our acceptance tests locally, we have to replace Amazon’s Alexa Voice Service with a local mock voice service.

Figure: Interaction flow in Cucumber tests with our mock voice service

A generic `When the user says` step

Of course, we could create Cucumber steps specifically for each individual intent. However, we strive for a more generic approach: we will create a generic When the user says step, and the mock voice service will decide which intent should be triggered.

To do this, our mock voice service reads the voice interaction model and stores all possible utterances our Alexa Skill understands, and for each utterance the associated intent.
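
As a rough illustration of this collection step, the following sketch walks over a language model and pairs every sample utterance with its intent. The function name `collectUtterances` and the returned object shape are assumptions for this example and not necessarily what the project's own helper looks like.

```javascript
// Collects every sample utterance together with its intent name and slot definitions.
// `languageModel` has the shape of interactionModel.languageModel shown earlier.
function collectUtterances(languageModel) {
    return languageModel.intents.flatMap((intent) =>
        (intent.samples || []).map((sample) => ({
            intentName: intent.name,
            sample,
            slots: intent.slots || [],
        }))
    );
}

// Trimmed-down copy of the English model from this article.
const languageModel = {
    invocationName: 'five of a kind',
    intents: [
        { name: 'AMAZON.StopIntent', samples: [] },
        {
            name: 'starteSpiel',
            slots: [
                { name: 'spieleranzahl', type: 'AMAZON.NUMBER' },
                { name: 'players', type: 'playersSlot' },
            ],
            samples: ['Start a new game with {spieleranzahl} {players}'],
        },
        {
            name: 'wieVieleSpieler',
            slots: [],
            samples: ['How many players do we have', 'How many players do I have'],
        },
    ],
};

const utterances = collectUtterances(languageModel);
console.log(utterances.map((u) => `${u.intentName}: ${u.sample}`));
```

Note that in this sketch the built-in AMAZON intents contribute nothing, because their sample lists in the model are empty. A phrase-based lookup like this can therefore only reach intents that have explicit samples.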

For our Cucumber step, we have to decide whether the given phrase from the test step matches any of our utterances. Therefore, we convert each utterance to a regular expression. For me, regular expressions are always a bit scary, because I seldom use them and therefore find them hard to read. But this is pretty simple, I promise. It’s best described with an example:

The utterance:

Start a new game with {spieleranzahl} {players}

is converted to the following regular expression:

^Start a new game with (.*) (players|persons|people)$

While converting the utterance from the voice interaction model to a regular expression, we did the following:

We added a ^ at the beginning and a $ at the end, so the regular expression only matches input with exactly this beginning and end; substrings do not match.
We replaced all predefined slots (of a type beginning with AMAZON.) with the expression (.*), which matches any input.
And finally, we replaced all user-defined slots with an expression matching all possible values (in the given example: “players”, “persons” or “people”).
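
The three rules can be sketched in a few lines of JavaScript. This is an illustrative version under my own assumptions (the names `utteranceToRegExp` and `escapeRegExp` are made up here, and escaping of special characters is an addition of this sketch), not the code from the repository.

```javascript
// Escapes characters that have a special meaning in regular expressions.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Converts one sample utterance into an anchored regular expression.
// `slots` are the intent's slot definitions, `types` the model's user-defined slot types.
function utteranceToRegExp(sample, slots, types) {
    const pattern = sample
        .split(/(\{\w+\})/) // keeps the {slot} placeholders as separate parts
        .map((part) => {
            const placeholder = /^\{(\w+)\}$/.exec(part);
            if (!placeholder) return escapeRegExp(part);

            const slot = slots.find((s) => s.name === placeholder[1]);
            if (!slot) throw new Error(`Slot ${placeholder[1]} is not defined`);

            // Predefined slot types match any input.
            if (slot.type.startsWith('AMAZON.')) return '(.*)';

            // User-defined slot types match one of their listed values.
            const type = types.find((t) => t.name === slot.type);
            if (!type) throw new Error(`Slot type ${slot.type} is not defined`);
            const alternatives = type.values.map((v) => escapeRegExp(v.name.value));
            return `(${alternatives.join('|')})`;
        })
        .join('');
    // Anchor the pattern so that only the complete phrase matches.
    return new RegExp(`^${pattern}$`);
}

const slots = [
    { name: 'spieleranzahl', type: 'AMAZON.NUMBER' },
    { name: 'players', type: 'playersSlot' },
];
const types = [
    {
        name: 'playersSlot',
        values: [
            { name: { value: 'persons' } },
            { name: { value: 'people' } },
            { name: { value: 'players' } },
        ],
    },
];

const regExp = utteranceToRegExp('Start a new game with {spieleranzahl} {players}', slots, types);
console.log(regExp.source);
```

For the example utterance this yields the same expression as above, except that the alternatives appear in the order in which the model lists them (persons, people, players).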

To execute a test, we only have to match the speech input from the test step against all regular expressions generated from our voice interaction model. If we have a match, we know the intent we have to call and the values used for the different slots. With this information, we can build our own IntentRequest which is then given to the Lambda function that is being tested.
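
A matching routine along these lines might look as follows. The data structure and the name `matchIntent` are assumptions for this sketch. The sketch also strips trailing punctuation before matching, so that a phrase like "How many players do I have?" from the scenario finds its utterance; whether the original project normalizes input this way is not stated in the article.

```javascript
// Hypothetical precompiled utterances, following the conversion rules described above.
const compiledUtterances = [
    {
        intentName: 'starteSpiel',
        regExp: /^Start a new game with (.*) (players|persons|people)$/,
        slotNames: ['spieleranzahl', 'players'],
    },
    { intentName: 'wieVieleSpieler', regExp: /^How many players do we have$/, slotNames: [] },
    { intentName: 'wieVieleSpieler', regExp: /^How many players do I have$/, slotNames: [] },
];

// Returns the first intent whose pattern matches, plus the captured slot values,
// or undefined if no utterance matches.
function matchIntent(compiled, spoken) {
    // Assumption of this sketch: trailing punctuation is not part of the utterance.
    const normalized = spoken.trim().replace(/[.?!]+$/, '');
    for (const { intentName, regExp, slotNames } of compiled) {
        const match = regExp.exec(normalized);
        if (match) {
            // Capture group i + 1 belongs to the i-th slot of the utterance.
            const slots = slotNames.map((name, i) => ({ name, value: match[i + 1] }));
            return { intentName, slots };
        }
    }
    return undefined;
}

console.log(matchIntent(compiledUtterances, 'Start a new game with 4 players'));
console.log(matchIntent(compiledUtterances, 'How many players do I have?'));
```

Because the slot values come straight from the capture groups, they are always strings, just like the slot values in a real request.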

To build the IntentRequest (or any other request, like the LaunchRequest), there is a folder with request templates in the GitLab project. These templates are read during test execution and intent names and slot values are replaced with the values we determined before.

The implementation of the When the user says step looks like this:

import fs from 'fs';
import expect from 'expect';
// Published as `cucumber` when this article was written; newer releases use `@cucumber/cucumber`.
import { When } from 'cucumber';

// getVoiceUiModel, getAllUtterances, findMatchingIntent and executeRequest are helpers
// from the example project; toHaveMatchingIntentFor is a custom matcher registered there.

async function theUserSays(utterance: string, locale: string) {
    const voiceUiModel = getVoiceUiModel(locale);

    // All utterances of the skill, converted to regular expressions
    const allUtterances = getAllUtterances(voiceUiModel.interactionModel);

    const matchingIntent: ?IntentInvocation = findMatchingIntent(allUtterances, utterance);

    // Fail the test if no utterance matches the spoken phrase
    expect(allUtterances).toHaveMatchingIntentFor(utterance);
    if (!matchingIntent) return;

    // Build the slot dictionary in the shape the skill request expects
    const slots = matchingIntent.slots.reduce((acc, cur: Slot) => ({
        ...acc,
        [cur.name]: {
            name: cur.name,
            value: cur.value,
            confirmationStatus: 'NONE',
            source: 'USER'
        }
    }), {});

    // Fill the request template with the intent name, slots and locale
    const json = fs.readFileSync('src/test/mockVoiceService/requestJsonTemplates/intentRequest.json', 'utf-8');
    const intentRequest = JSON.parse(json);

    intentRequest.request.intent.name = matchingIntent.intentName;
    intentRequest.request.intent.slots = slots;
    intentRequest.request.locale = locale;

    this.lastRequest = intentRequest;

    await executeRequest(this, this.skill, intentRequest);
}

When(/^der Anwender sagt[:]? (.*)$/, async function(utterance) {
    await theUserSays.call(this, utterance, 'de');
});

When(/^the user says[:]? (.*)$/, async function(utterance) {
    await theUserSays.call(this, utterance, 'en');
});

The function theUserSays does all the heavy lifting, but it is not itself the Cucumber.js step implementation. Actually, two steps are implemented, one for German and one for English. Since the skill is localized, it makes sense that we also localize our acceptance tests. Both step definitions just call the theUserSays function and set the locale parameter.

In the theUserSays function, we start with getAllUtterances, which will get us all utterances of our Alexa Skill as regular expressions according to the description above.

The function findMatchingIntent will return the matching intent and all slot values.

expect(allUtterances).toHaveMatchingIntentFor(utterance) is a custom matcher (we use the expect package from Jest here) to make sure we actually have a match; otherwise the test will fail.

In the following reduce, we build a dictionary object with all given slot values (as expected by the skill request).

Finally we execute the built request with executeRequest.

Checking the response of our Alexa Skill

So far, we can successfully call the Lambda function of our skill from within our test. But we also need to look at the response to be able to check whether it meets our expectations.

Let’s take a look at the executeRequest method first:

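async function executeRequest(world, skill, request) {
    return new Promise((resolve) => {
        // Call the Lambda handler with the request, an empty context and a callback
        skill(request, {}, (error, result) => {
            // Keep both values so that later steps can inspect them
            world.lastError = error;
            world.lastResult = result;
            resolve();
        });
    });
}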

The executeRequest function returns a Promise which is resolved as soon as we get a reply from our skill. This reply is delivered by calling the callback function we provide as the third parameter. The callback function receives two parameters: depending on the success of the call, either the parameter error or the parameter result is set. Either way, we store both values in the world object. This world object is where we store the current state of our test, and Cucumber.js takes care of providing it to each Cucumber step implementation.
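
To see the wrapper in isolation, here is a small self-contained sketch that runs executeRequest against a stand-in handler. The `fakeSkill` function and its reply text are hypothetical and exist only for this example.

```javascript
// Same idea as in the article: wrap the callback-style Lambda call in a Promise.
function executeRequest(world, skill, request) {
    return new Promise((resolve) => {
        skill(request, {}, (error, result) => {
            world.lastError = error;
            world.lastResult = result;
            resolve();
        });
    });
}

// Hypothetical stand-in for the skill's Lambda handler.
function fakeSkill(event, context, callback) {
    if (event.request.type === 'IntentRequest') {
        callback(null, {
            response: { outputSpeech: { type: 'SSML', ssml: '<speak>Hello</speak>' } },
        });
    } else {
        callback(new Error('Unsupported request type'));
    }
}

async function demo() {
    const world = {};

    await executeRequest(world, fakeSkill, { request: { type: 'IntentRequest' } });
    console.log(world.lastResult.response.outputSpeech.ssml);

    // The Promise resolves even when the skill reports an error.
    await executeRequest(world, fakeSkill, { request: { type: 'Unknown' } });
    console.log(world.lastError.message);
}

demo();
```

One consequence of this design is that the Promise never rejects: a failing skill does not abort the When step but leaves its error in world.lastError, so a later Then step can decide whether the error was expected.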

So implementing an Alexa replies with step is now actually quite easy:

function alexaReplies(expectedResponse) {
    expect(this.lastResult.response.outputSpeech.ssml).toEqual(`<speak>${expectedResponse}</speak>`);
}

Then(/^antwortet Alexa mit[:]? (.*)$/, alexaReplies);
Then(/^Alexa replies with[:]? (.*)$/, alexaReplies);

Again, the main implementation is an extra function called by Cucumber steps for each language. The expectation is using the this reference to get access to the world object, which is Cucumber.js’s way of providing us with the test context (that’s the reason why we don’t use arrow functions for Cucumber steps since arrow functions have a different way of handling the this keyword).
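
The difference between the two function styles can be demonstrated without Cucumber.js at all. In the following sketch, `runStep` is a simplified, hypothetical stand-in for the way a test runner invokes a step with the world object bound to `this`.

```javascript
// Simplified stand-in for a test runner: calls the step with `world` as `this`.
function runStep(world, stepFunction, ...args) {
    return stepFunction.apply(world, args);
}

const world = { lastResult: 'Okay' };

// A regular function receives the world object as `this`.
function regularStep() {
    return this === world;
}

// An arrow function keeps the `this` of the place where it was defined
// and ignores the value passed via apply().
const arrowStep = () => this === world;

console.log(runStep(world, regularStep)); // true
console.log(runStep(world, arrowStep)); // false
```

This is why the step implementations above are written as regular functions and why theUserSays is invoked with theUserSays.call(this, ...).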

Since we expect speech output in this step, we know this is given in the outputSpeech attribute of the response object.

Conclusion and outlook

With the described technique we already have a good base framework to work with Cucumber.js acceptance tests against our Alexa Skill. Adding more Cucumber steps for slot confirmation or screen output is very straightforward and you can see more implemented steps in the linked GitLab project.

Handling attributes for storing values (per request, session, or permanent) is something we’ll look into in part 2 of this blog series. Until then I’m happy to hear your thoughts and feedback in the comments below.

Blog author

Stefan Spittank
