AWS Lambda: Cold boot and mean response times in Scala vs. Java

1.2.2019 | 6 minutes reading time

AWS Lambda is a popular service for hosting microservice functions in the cloud without provisioning actual servers. It supports Node.js, Python, Go, C#, PowerShell and Java – more specifically: java-1.8.0-openjdk. As Scala 2.12 is compatible with JVM 8, we can also run Scala code serverless in the cloud! But does using Scala have any impact on performance compared to plain old Java? How do the cold start and mean response times compare? Let’s find out!

tl;dr: Mean response times are equal; cold starts take longer with Scala than with Java, but improve with increased memory.

Project structure

First we create two projects: one Java project using Maven and one Scala project using sbt, so that we build completely independent JAR files. When using AWS Lambda, we have to supply all dependencies in a fat JAR, and by splitting the projects we get a minimal JAR for each Lambda function. Both build files declare dependencies on the AWS Lambda libraries com.amazonaws » aws-lambda-java-core and com.amazonaws » aws-lambda-java-events, which provide the application with the APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent and Context data structures. These encapsulate the HTTP request and response of an AWS API Gateway and offer a safe way to read the request and build a valid response. The API Gateway is the gate between the internet and our functions. The Scala JAR file additionally includes the Scala library.

build.sbt

lazy val root = (project in file("."))
  .settings(
    name := "aws_lambda_bench_scala",
    organization := "de.codecentric.amuttsch",
    description := "Benchmark Service for AWS Lambda written in Scala",
    licenses += "Apache License, Version 2.0" -> url("https://www.apache.org/licenses/LICENSE-2.0"),

    version := "0.1",
    scalaVersion := "2.12.8",

    assemblyJarName in assembly := "aws_lambda_bench_scala.jar",

    libraryDependencies ++= Seq(
      "com.amazonaws" % "aws-lambda-java-core" % "1.2.0",
      "com.amazonaws" % "aws-lambda-java-events" % "2.2.5",
    )
  )

pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>de.codecentric.amuttsch</groupId>
    <artifactId>aws_lambda_bench_java</artifactId>
    <version>0.1</version>

    <packaging>jar</packaging>

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-lambda-java-core</artifactId>
            <version>1.2.0</version>
        </dependency>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-lambda-java-events</artifactId>
            <version>2.2.5</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.1</version>

                <configuration>
                    <createDependencyReducedPom>false</createDependencyReducedPom>
                </configuration>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Lambda functions

Next, we implement the actual handler functions in both Scala and Java. They just return an HTTP 200 response and don’t do any processing, so that we measure the impact of the language itself rather than that of some arbitrary computation.

ScalaLambda.scala

package de.codecentric.amuttsch.awsbench.scala

import com.amazonaws.services.lambda.runtime.Context
import com.amazonaws.services.lambda.runtime.events.{APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent}

class ScalaLambda {
  def handleRequest(event: APIGatewayProxyRequestEvent, context: Context): APIGatewayProxyResponseEvent = {
    new APIGatewayProxyResponseEvent()
      .withStatusCode(200)
  }
}

JavaLambda.java

package de.codecentric.amuttsch.awsbench.java;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyRequestEvent;
import com.amazonaws.services.lambda.runtime.events.APIGatewayProxyResponseEvent;

public class JavaLambda {
    public APIGatewayProxyResponseEvent handleRequest(APIGatewayProxyRequestEvent event, Context context) {
        return new APIGatewayProxyResponseEvent()
                .withStatusCode(200);
    }
}

The bytecode of the two functions is almost identical. The only difference is how Scala and Java handle the 200 argument of withStatusCode. Java uses java.lang.Integer.valueOf, whereas Scala makes use of its implicit conversion scala.Predef.int2Integer.
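The Java half of that difference can be tried out without any AWS classes. The following is a minimal sketch, assuming a hypothetical stand-in class BoxingSketch whose withStatusCode takes a boxed Integer like the method used in the handler. It shows that passing the int literal 200 behaves the same as calling Integer.valueOf(200) by hand, which is the call the Java compiler inserts for autoboxing.

```java
// Hypothetical stand-in for a response type whose setter takes a boxed Integer,
// mirroring the shape of withStatusCode in the handler above. Not an AWS class.
final class BoxingSketch {
    private Integer statusCode;

    BoxingSketch withStatusCode(Integer statusCode) {
        this.statusCode = statusCode;
        return this;
    }

    Integer getStatusCode() {
        return statusCode;
    }

    public static void main(String[] args) {
        // The int literal is autoboxed; javac compiles this to Integer.valueOf(200).
        BoxingSketch implicit = new BoxingSketch().withStatusCode(200);
        // The same call with the boxing written out explicitly.
        BoxingSketch explicit = new BoxingSketch().withStatusCode(Integer.valueOf(200));

        // Both objects hold an equal Integer value.
        System.out.println(implicit.getStatusCode().equals(explicit.getStatusCode()));
    }
}
```

Running javap -c on the compiled class is one way to see the Integer.valueOf call in the bytecode.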

After building the fat JARs with sbt assembly and mvn package, we see the first big difference: the Scala JAR is more than eight times as large as the Java one – 5.8 MB vs. 0.7 MB. This is due to the included Scala library, which is around 5 MB in size.
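To check where the bytes in a fat JAR come from, the entries can be grouped by their top-level directory. The sketch below is one possible way to do this with the JDK's java.util.jar API; the class name JarSizeByPackage and its method are made up for illustration and are not part of the benchmark project. For the Scala JAR, one would expect the scala directory to account for most of the size.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Illustrative helper, not part of the original projects.
final class JarSizeByPackage {
    // Sums the compressed size in bytes of all file entries, grouped by top-level directory.
    static Map<String, Long> compressedBytesByTopLevel(String jarPath) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                String name = entry.getName();
                int slash = name.indexOf('/');
                String top = slash < 0 ? "(root)" : name.substring(0, slash);
                // getCompressedSize returns -1 if the size is unknown; count that as 0.
                long size = Math.max(entry.getCompressedSize(), 0L);
                totals.merge(top, size, Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of a fat JAR as the first argument,
        // for example the file produced by sbt assembly.
        compressedBytesByTopLevel(args[0])
            .forEach((top, bytes) -> System.out.println(top + ": " + bytes + " bytes"));
    }
}
```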

Serverless

Now we have to deploy the services to the cloud. For this we use Serverless, a toolkit for building serverless applications. We define our two functions in a YAML configuration file and give each of them a separate API Gateway HTTP endpoint. With a single command we can then deploy our serverless application to the cloud.

serverless.yml

service: lambda-java-scala-bench

provider:
  name: aws
  runtime: java8
  region: eu-central-1
  logRetentionInDays: 1

package:
  individually: true

functions:
  ScalaLambda:
    handler: de.codecentric.amuttsch.awsbench.scala.ScalaLambda::handleRequest
    reservedConcurrency: 1
    package:
      artifact: scala/target/scala-2.12/aws_lambda_bench_scala.jar
    events:
    - http:
        path: scala
        method: get
  JavaLambda:
    handler: de.codecentric.amuttsch.awsbench.java.JavaLambda::handleRequest
    reservedConcurrency: 1
    package:
      artifact: java/target/aws_lambda_bench_java-0.1.jar
    events:
    - http:
        path: java
        method: get

After defining the name of our service, we set the provider to AWS and the runtime to java8. Since we use separate JAR files for our services, we have to set the individually key to true in the package section. Otherwise Serverless will look for a global package. For each function we set the handler, the package and an http event. We do not take concurrent execution into consideration, so we limit the number of simultaneously active Lambdas to one using the reservedConcurrency key. We use the default memorySize of 1024 MB.

Now we deploy our stack with serverless deploy. After successful execution we get our service information containing the URLs to our functions:

endpoints:
  GET - https://example.execute-api.eu-central-1.amazonaws.com/dev/scala
  GET - https://example.execute-api.eu-central-1.amazonaws.com/dev/java

Using curl, we can test whether they are available and return an HTTP 200 response: curl -v https://example.execute-api.eu-central-1.amazonaws.com/dev/java
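The same availability check can also be scripted. The following is a minimal sketch using only the JDK's HttpURLConnection, so it runs on Java 8; the class name EndpointCheck and the timeout values are assumptions chosen for illustration, not part of the original project.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative helper, not part of the original projects.
final class EndpointCheck {
    // Sends a GET request to the given URL and returns the HTTP status code.
    static int statusOf(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("GET");
        connection.setConnectTimeout(5_000);
        // Generous read timeout (assumed value), since a cold Lambda answers slowly.
        connection.setReadTimeout(30_000);
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass an endpoint URL printed by serverless deploy as the first argument.
        String url = args[0];
        System.out.println(url + " -> " + statusOf(url));
    }
}
```

Note that the first call against a freshly deployed function already warms it up, so it should not be run right before a cold start measurement.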

Benchmarking

The next step is to build a benchmark. For this we use Gatling, a load testing tool written in Scala. It makes it easy to build a load test and export a graphical report after the execution. For our case we are interested in two metrics: the response time of cold and of warm Lambdas. AWS kills inactive Lambda instances after an unspecified time to free up resources. When the function is triggered afterwards, the JVM has to start up again, which takes some time. So we create a third project and build a test case:

LambdaBench.scala

package de.codecentric.amuttsch.awsbench

import ch.qos.logback.classic.{Level, LoggerContext}
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import org.slf4j.LoggerFactory

import scala.concurrent.duration._

class LambdaBench extends Simulation {
  val context: LoggerContext = LoggerFactory.getILoggerFactory.asInstanceOf[LoggerContext]
  // Suppress logging
  context.getLogger("io.gatling").setLevel(Level.valueOf("WARN"))
  context.getLogger("io.netty").setLevel(Level.valueOf("WARN"))

  val baseFunctionUrl: String = sys.env("AWS_BENCH_BASE_URL")

  val httpProtocol = http
    .baseUrl(baseFunctionUrl)
    .acceptHeader("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
    .acceptLanguageHeader("en-US,en;q=0.5")
    .acceptEncodingHeader("gzip, deflate")
    .userAgentHeader("Mozilla/5.0 (X11; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0")

  val scalaScenario = scenario("ScalaScenario")
    .exec(http("Scala")
      .get("/scala"))

  val javaScenario = scenario("JavaScenario")
    .exec(http("Java")
      .get("/java"))

  setUp(
    scalaScenario.inject(constantConcurrentUsers(1) during(120 seconds)),
    javaScenario.inject(constantConcurrentUsers(1) during(120 seconds))
  ).protocols(httpProtocol)
}

First we suppress some logging, as Gatling logs every request to the console. We read our endpoint URL from the environment variable AWS_BENCH_BASE_URL and define an HTTP protocol, in which we set the base URL, some headers and the user agent. It is used later for executing the individual requests. Next, we define two scenarios that point to the Scala and Java HTTP endpoints of our serverless application. In the last step we set up both scenarios so that each keeps exactly one request active at a time for a duration of 120 seconds. Now we can start sbt and run the benchmark using gatling:test. We have to make sure the Lambdas are cold, otherwise we won’t get any cold boot timings. We can either wait for a few minutes or remove and redeploy the stack. As soon as the benchmark finishes, it prints a text report and provides us with a URL to the graphical report:

Benchmark with 1024MB RAM

Each function was called around 3100 times within the two-minute time span. The time in the max column is the time of the first request when the Lambda function was cold. We can observe that the time until the first response is around 1.6 times as long for Scala as it is for Java. This observation holds true for multiple runs. The mean response time for both Scala and Java is around 38 ms.
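The relationship between these figures can be checked with a little arithmetic. The sketch below assumes a hypothetical sample in which one cold request is followed by warm ones; the 1500 ms cold value is a made-up placeholder, not a measured result, while 38 ms and 3100 requests are the rounded figures from the text. It illustrates why a single cold request shows up in the max column but hardly moves the mean, and why roughly 3100 sequential requests fit into 120 seconds.

```java
import java.util.Arrays;
import java.util.LongSummaryStatistics;
import java.util.stream.LongStream;

// Illustrative helper, not part of the original benchmark.
final class ResponseTimeStats {
    // Summarises response times given in milliseconds (count, min, max, average).
    static LongSummaryStatistics summarize(long[] responseTimesMs) {
        return LongStream.of(responseTimesMs).summaryStatistics();
    }

    public static void main(String[] args) {
        // Hypothetical sample: 3100 requests at 38 ms each ...
        long[] sample = new long[3100];
        Arrays.fill(sample, 38L);
        // ... except the first one, which hits a cold Lambda (placeholder value).
        sample[0] = 1500L;

        LongSummaryStatistics stats = summarize(sample);
        System.out.println("max  = " + stats.getMax() + " ms");    // the cold request
        System.out.println("mean = " + stats.getAverage() + " ms"); // close to 38 ms

        // One user sending requests back to back at about 38 ms each for 120 s:
        System.out.println("requests in 120 s: about " + 120_000 / 38);
    }
}
```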

Assigning 2048 MB RAM improved the startup time by ~300 ms for the Scala and ~200 ms for the Java function. A plausible explanation is that Lambda allocates CPU power in proportion to the configured memory. The mean response time improved only negligibly:

Benchmark with 2048MB RAM

Conclusion

Scala works great with AWS Lambda, as it can be compiled to compatible Java 8 bytecode. You can use all the great features of the language when programming serverless applications. The startup time of a cold function is a bit longer than that of its Java counterpart, but improves when the function memory is increased. This test only focuses on the overhead of using the Scala runtime on top of the JVM. The results may vary for production-grade functions that actually perform CPU- or network-intensive tasks, and depend heavily on the implementation and the libraries used.

You can find the code of the projects and the benchmark here: GitLab

Blog author

Andreas Muttscheller
