Cost-effective batch jobs on AWS’ serverless infrastructure

3.6.2020 | 7 minutes reading time

Some batch jobs require substantial engineering and fine-tuning on serious hardware to be feasible. Many others, however, run on oversized infrastructure and incur far higher costs than necessary. Migrating these jobs to a serverless approach has two advantages: a simplified workflow and substantial savings in the long run. In this blog post, I will walk you through one example architecture on AWS that shows how batch jobs can run both reliably and cost-effectively in a serverless environment.

Note: this post provides an overview of the architecture and the basic principles. If you want to reproduce the architecture, take a look at the accompanying repository, which includes all the necessary code.

What are the requirements?

For the sake of our example, let’s consider three requirements:

The interface for data in- and output of the service should be S3 buckets, i.e., general cloud storage.
The service needs to run daily and should be as cheap as possible.
The client’s developers want to focus on the processing logic, not the cloud infrastructure.

I will show you how these requirements translate to a serverless cloud architecture in a minute. But first, let’s talk about why serverless lends itself as a design approach.

Why choose a serverless approach?

There is a myriad of content about what serverless can and cannot do. I will, therefore, not spend time on generalized ruminations. Instead, let me justify serverless along the requirements defined above. One of the main reasons relates to the development workflow; the other is about costs.

From a workflow perspective, a serverless approach provides a level of abstraction that developers highly appreciate. It frees them from managing and maintaining infrastructure. Since the cloud providers shoulder most of the usual responsibilities, developers can focus on implementing new features and adding business value. Such focus accelerates deployment and builds strong ownership.

When it comes to costs, serverless is – as a rule of thumb – cheaper than self-managed alternatives. Even when self-managed infrastructure seems to be more affordable, the wages for specialized teams typically outweigh the savings. Given the current state of the industry, it is tough to be more effective than the workforce behind the cloud services of AWS, Azure, or Google.

How to build the architecture for a serverless batch job?

Let me walk you through the proposed architecture in three steps that cover the organizational context, the overall architecture, and the specific building blocks and workflows.

Who will work with the implemented architecture?

Frequently, people develop cloud architectures without thinking about their users. In our case, we have to account for two roles to make it work. First, developers implement the processing logic. They should not be bothered with infrastructure details. Second, cloud engineers keep the system running. They should not be bothered with the details of the implemented logic.

In other words, we want to build a system with distinct realms of responsibilities and unambiguous touchpoints between them. To be clear, I do not advocate an organizational or cultural border between two camps. Instead, such a design aims at streamlining communication to decrease frustration and delays.

How is the overall architecture structured?

The big picture of this architecture contains three parts: networking, IAM roles and policies, and cloud services. The networking layer enables the services to communicate and protects against unauthorized access. The IAM roles and policies configure which actions services are allowed to take. The cloud services provide the infrastructure to integrate the business logic and execute the batch job. Due to the complexity in each part, this blog post focuses on the general workflow to keep things straightforward. As mentioned above, details will be available in related blog posts. Here is a graphical overview of the proposed architecture:

Blue indicates parts of the architecture that belong to development. In contrast, the orange box shows which building blocks run the batch job during operations. As you can see, both workflows share only one building block.

By conceptualizing the service in this way, developers and cloud engineers are largely decoupled and can focus on their respective responsibilities. Whenever developers push code to the master branch, the system builds a new Docker image. The next time the batch job runs, it will use the updated container image, i.e., run the updated business logic.

Developers need cloud engineers’ direct support in just two scenarios:

An update to the business logic requires access to additional cloud resources. In this case, a cloud engineer adjusts the IAM roles and policies.
An update necessitates different computational resources to run than the previous version. In this case, a cloud engineer needs to change the configuration of the hardware assigned to the batch job.

Both of these scenarios are easy to communicate and to account for by developers and cloud engineers.
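To make the first scenario more tangible, here is a minimal sketch of what such a policy adjustment could look like when the batch job needs to read from one S3 bucket and write to another. The function name and the bucket names are assumptions for illustration only; the policy in the actual repository may be structured differently.

```python
import json


def s3_access_policy(input_bucket: str, output_bucket: str) -> dict:
    """Build an IAM policy document for the role the batch job runs under.

    It allows listing and reading the input bucket and writing to the
    output bucket, and nothing else.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListInputBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                # ListBucket applies to the bucket itself, not its objects.
                "Resource": [f"arn:aws:s3:::{input_bucket}"],
            },
            {
                "Sid": "ReadInputObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{input_bucket}/*"],
            },
            {
                "Sid": "WriteOutputObjects",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{output_bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket names, used only to print an example document.
    print(json.dumps(s3_access_policy("example-input", "example-output"), indent=2))
```

When the business logic later needs another resource, the cloud engineer adds one more statement here, and the developers do not have to touch anything else.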

Which role do the individual building blocks play?

The four development components

CodeCommit repositories are Git repositories hosted on AWS with limited functionality. Yet, they are a simple solution for keeping all the code within the AWS ecosystem. It is also possible to connect other version control services, such as GitHub or Bitbucket, to the service via webhooks.

CodeBuild is the AWS flavor of the build stage of a CI pipeline. It reads a specification file directly from the connected repository and executes the instructions it contains. Here, the CodeBuild project uses Docker to build a container image and pushes it to the registry for later use.
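The build step boils down to three shell commands: log in to the registry, build the image, and push it. The sketch below only assembles these commands as strings so that the naming scheme becomes visible; the helper names, account ID, region, and repository name are assumptions, and the specification file in the repository may arrange the steps differently.

```python
from typing import List


def ecr_registry(account_id: str, region: str) -> str:
    """Return the host name of the private ECR registry of an account."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com"


def ecr_image_uri(account_id: str, region: str, repository: str,
                  tag: str = "latest") -> str:
    """Return the full image reference that Docker pushes to and pulls from."""
    return f"{ecr_registry(account_id, region)}/{repository}:{tag}"


def build_and_push_commands(account_id: str, region: str, repository: str,
                            tag: str = "latest") -> List[str]:
    """Return the shell commands a build step would run, in order."""
    registry = ecr_registry(account_id, region)
    image = ecr_image_uri(account_id, region, repository, tag)
    return [
        # Authenticate the Docker client against the registry.
        f"aws ecr get-login-password --region {region} "
        f"| docker login --username AWS --password-stdin {registry}",
        # Build the image from the Dockerfile in the repository root.
        f"docker build -t {image} .",
        # Publish the image so that the batch job can pull it later.
        f"docker push {image}",
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    for command in build_and_push_commands("123456789012", "eu-central-1", "batch-job"):
        print(command)
```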

CodePipeline takes care of the orchestration and artifacts of CodeCommit and CodeBuild. In this case, it reacts to updates in the master branch of the code repository and triggers a new build.

The Elastic Container Registry (ECR) is the final building block of the developer workflow. The ECR is where CodeBuild pushes the new container images, and the batch job pulls the latest image from here when it runs. The ECR is also where the realms of developers and engineers overlap.

The two batch job components

The computational heart of the service is the Elastic Container Service (ECS) in its Fargate flavor. Fargate is the serverless capacity provider of AWS. The user does not have to allocate VMs or other resources. Instead, a task definition specifies how much computational power and memory the job needs.
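As a rough illustration of such a task definition, the sketch below assembles the parameters as a plain dictionary and checks the CPU and memory pair beforehand, because Fargate accepts only certain combinations. The function name and all argument values are assumptions; the table covers the sizes from 0.25 to 4 vCPU and leaves out larger ones.

```python
from typing import Dict, List

# Memory sizes (MiB) that Fargate accepts for each CPU size (CPU units,
# where 1024 units equal one vCPU). Larger sizes are omitted in this sketch.
FARGATE_SIZES: Dict[int, List[int]] = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def fargate_task_definition(family: str, image: str, cpu: int, memory: int,
                            execution_role_arn: str, task_role_arn: str) -> dict:
    """Return task definition parameters for a single-container Fargate task."""
    if memory not in FARGATE_SIZES.get(cpu, []):
        raise ValueError(f"unsupported Fargate size: cpu={cpu}, memory={memory}")
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # the only network mode Fargate supports
        "cpu": str(cpu),          # task-level sizes are passed as strings
        "memory": str(memory),
        "executionRoleArn": execution_role_arn,  # used to pull the image and write logs
        "taskRoleArn": task_role_arn,            # used by the business logic itself
        "containerDefinitions": [
            {"name": family, "image": image, "essential": True},
        ],
    }
```

With boto3, a dictionary like this could be passed on as `boto3.client("ecs").register_task_definition(**params)`. Changing the hardware for a new version of the business logic then means changing only the `cpu` and `memory` values.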

The second building block is a CloudWatch trigger. Triggers can fire based on cron expressions or at fixed intervals. Depending on whether the exact timing matters, either can be a viable option. When the trigger fires, a new instance of the task is sent to Fargate and executed.
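The two kinds of schedule differ only in the expression string attached to the trigger. Here is a small sketch of how both could be generated; the helper names are made up for this example. Note that AWS cron expressions have six fields (minutes, hours, day of month, month, day of week, year) and are evaluated in UTC.

```python
def rate_expression(value: int, unit: str) -> str:
    """Return an interval schedule such as 'rate(1 day)' or 'rate(12 hours)'."""
    if value < 1:
        raise ValueError("value must be a positive integer")
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour' or 'day'")
    # The unit is singular for a value of 1 and plural otherwise.
    suffix = "" if value == 1 else "s"
    return f"rate({value} {unit}{suffix})"


def daily_cron_expression(hour: int, minute: int = 0) -> str:
    """Return a cron schedule that fires once a day at hour:minute (UTC)."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # '?' for the day of week, because the day of month is already given as '*'.
    return f"cron({minute} {hour} * * ? *)"


if __name__ == "__main__":
    print(rate_expression(1, "day"))
    print(daily_cron_expression(3, 30))
```

A rate expression is enough when the job merely has to run once a day. A cron expression is the better fit when the job has to start at a specific time, for example after the input data has arrived.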

These are already the main building blocks of this architecture. In production, some more services are necessary. For instance, there are S3 buckets for data storage, connections to CloudWatch logs for monitoring and debugging, and network configurations, such as VPC, subnets, and routing tables.

What are the alternatives?

Serverless is not a panacea; neither is this architecture. To leave you with an idea of when it is promising to follow this path, I want to sketch two alternative approaches.

Virtual Machines

The first alternative is to execute the workload on a dedicated virtual machine, i.e., an EC2 instance on AWS. There are two main ways to do this for the scenario described above.

First, one can keep an instance running and implement a cron job (or something comparable) on it. I saw this pattern in a previous project, but it is a terrible choice for (at least) two reasons. The costs are much higher than with the serverless approach because the VM costs money regardless of whether its load is high or low. In terms of flexibility, changing the logic only works by terminating the process, updating the code, and restarting it.

Second, one could boot an instance for the batch job and terminate it afterward. That is similar to the approach above, but with one significant caveat: there are way more moving parts that can break during the process.
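The cost argument against an always-on instance can be made concrete with a bit of arithmetic. The sketch below compares paying for every hour of the month with paying only for the minutes the job actually runs. The hourly rates in the example are hypothetical placeholders, not AWS prices, and the function names are made up for this illustration.

```python
HOURS_PER_MONTH = 730  # average month, a common approximation


def always_on_cost(hourly_rate: float) -> float:
    """Monthly cost of an instance that runs around the clock."""
    return hourly_rate * HOURS_PER_MONTH


def per_run_cost(hourly_rate: float, run_minutes: float,
                 runs_per_month: int = 30) -> float:
    """Monthly cost when only the actual runtime of each run is billed."""
    return hourly_rate * (run_minutes / 60) * runs_per_month


if __name__ == "__main__":
    # Hypothetical rate of 0.10 per hour and a daily job that takes 20 minutes.
    print(f"always on: {always_on_cost(0.10):.2f}")
    print(f"per run:   {per_run_cost(0.10, 20):.2f}")
```

With these placeholder numbers, the always-on instance costs roughly 73 per month, whereas the per-run model costs roughly 1. The gap stays large even if the per-hour rate of the serverless option is noticeably higher than that of the VM, because a daily job is idle for most of the day.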

Serverless Functions

On the other side of the spectrum are serverless functions or Lambda functions in the AWS ecosystem. They are lean and very cheap. However, you need to consider two things before using them:

Compared to container images, Lambda functions are less flexible. For instance, using third-party Python libraries can be painful since you have to provide a .zip file with the dependencies. Compare this hassle with a pip command as part of a Dockerfile.
Lambda functions are limited in computational power and runtime duration. These limitations can become problematic when the amount of data or the complexity of the processing increases.
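A quick feasibility check can help decide whether a job is a candidate for Lambda at all. The sketch below compares the expected runtime and memory of a job with two Lambda quotas. The constants reflect the documented limits as I understand them at the time of writing this sketch (15 minutes and 10,240 MB) and should be checked against the current AWS documentation; the function name is an assumption.

```python
from typing import List

# Assumed Lambda quotas; verify them against the current AWS documentation.
LAMBDA_MAX_TIMEOUT_SECONDS = 900
LAMBDA_MAX_MEMORY_MB = 10240


def lambda_blockers(runtime_seconds: float, memory_mb: int) -> List[str]:
    """Return the reasons why a job does not fit into a single Lambda invocation.

    An empty list means that neither of the two limits is exceeded.
    """
    blockers = []
    if runtime_seconds > LAMBDA_MAX_TIMEOUT_SECONDS:
        blockers.append("runtime")
    if memory_mb > LAMBDA_MAX_MEMORY_MB:
        blockers.append("memory")
    return blockers


if __name__ == "__main__":
    # A hypothetical job that runs for one hour with 4 GB of memory.
    print(lambda_blockers(runtime_seconds=3600, memory_mb=4096))
```

A job that regularly comes close to either limit is usually better off in a container on Fargate, where the runtime is not capped in the same way.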

Thanks for reading!

If you want to know more about how to save cloud costs in general, have a look at our Cloud Cost Cleanup offer! If you need other help or would like to discuss these topics in more depth, my colleagues and I are happy to be there for you!

Blog author

Timo Böhm
