Deep Learning diesel car detection with AWS DeepLens

12.11.2018 | 10 minutes reading time

With this series, we would like to give you an understanding of different machine and deep learning approaches, illustrated by the example of recognizing diesel vehicles. In this article, we have summarized the approach based on deep learning in neural networks. For other techniques, please refer to the other parts of this series.

Relevant links and the other parts can be found here:
>Data-driven solutions for your business
>Deep Diesel – Part 1: Machine & Deep Learning for diesel car detection
>Deep Diesel – Part 2: Machine-Learning-Diesel Car Detection using a HOG detector
>Deep Diesel – Part 3: Deep Learning Diesel Car Detection using the AWS DeepLens
>codecentric.ai youtube channel

As hardware, we used the AWS DeepLens, which makes it possible to run neural networks efficiently on an edge device thanks to its built-in Intel GPU. For the implementation, we used Google's deep learning framework TensorFlow and adapted a neural network to our task. The objective is summarized in part 1. As an example task, we recognize the green environmental zone badge, which can serve as a marker for driving bans as soon as the blue sticker for cleaner EURO6 cars is introduced. In part 2, we also showed that it is possible to recognize diesel type plates (e.g. TDI). The same approach can be applied to car parts or vehicle types.

Object recognition with neural networks

In part 2, we recognized diesel cars based on geometric features, and the limitations of that approach became clear. Neural networks take more information into account when recognizing the features of a diesel vehicle (environmental stickers, type plates, vehicle types), e.g. color, position, and context. Compared to pure image classification, object recognition is a harder task: instead of outputting just a class with a single output neuron, the network has to return a bounding box and a probability. Different methods can be used here (e.g. Single Shot Detection, Faster R-CNN or R-FCN). Details on these methods can be found in the paper Speed/accuracy trade-offs for modern convolutional object detectors.

In contrast to “classical” machine learning methods, we do not build the detector from scratch but draw on a pre-trained neural network, which we merely teach the classes of the additional objects. This drastically reduces the amount of training data required, so we were able to achieve initial results with just a few hundred images. Since we want to run the network on AWS DeepLens, its architecture has to meet special requirements. We decided on an Inception v2 architecture, although we had to switch to a different version of it along the way when new findings made this unavoidable.

AWS Deeplens

As our hardware platform, we use an AWS DeepLens that our colleagues brought back from the last re:Invent. The “Deep Learning Camera” is well suited for testing our scenario for several reasons: it runs stand-alone and needs only power, has an Intel Gen-9 GPU, supports multiple machine learning frameworks, has built-in WiFi, and integrates with AWS IoT messaging services, logging, easy deployment and more. AWS DeepLens runs Ubuntu Linux and can be connected to a monitor, keyboard, and mouse if required. Both the input and the output stream (after inference by our trained detector) can be accessed locally or via stream if required.

DeepLens Unboxing

An interesting concept is the integration of DeepLens into AWS as well as the deployment of the neural network (model) and the executing code (Lambda). DeepLens is treated like any other IoT device in AWS. A Greengrass instance runs locally and takes over code execution and messaging. Greengrass allows tasks that would otherwise run in the AWS cloud, such as a Lambda function, to be moved to an ‘edge device’. During deployment, a package, in our case consisting of a neural network and a Python code package, is simply pushed onto the device, unpacked, and executed there.

Deeplens architecture

AWS DeepLens comes with some sample projects that can be deployed more or less with a single click. Once on the camera, they are immediately executable. The sample projects include:
– Object recognition (based on Apache MXNet)
– A hotdog classifier (hotdog or not?)
– Dog or cat classification
– Style transfer
– Face recognition
– Detect activities
– Recognize head posture

In addition, models from Amazon SageMaker are available and can be integrated and easily deployed on AWS Deeplens. SageMaker is a convenient way to train the Machine Learning Model in AWS. You can find an intro video to SageMaker here: codecentric.ai.

But that’s too easy for us in this case, which is why we have to:

– use an external machine learning framework
– train and evaluate the model in the cloud
– optimize the model outside of DeepLens
– deploy the result to AWS DeepLens

AWS DeepLens supports several machine learning frameworks, such as Apache MXNet, Google TensorFlow, and Berkeley's Caffe. ‘Support’ refers in particular to the availability of the Model Optimizer on DeepLens, which is a wrapper around the Intel OpenVINO toolkit. It optimizes externally trained models for the Intel GPU. With some effort, it will probably be possible to get more frameworks running on the platform.

Setting up a Machine Learning cloud instance

Since we need a powerful GPU for the training, we have chosen a p3.2xlarge instance with 61 GB memory, 1 NVIDIA V100 GPU, and 8 virtual cores.

To set up our training environment, we used the Deep Learning Base AMI (Amazon Linux) image from the AWS Marketplace and upgraded it with the latest TensorFlow version, the TensorFlow Models repository, Google's protobuf, Jupyter Notebook and TensorBoard. We recommend loading the training data either onto a separate Elastic Block Store (EBS) volume or into a Git repository, since this allows you to re-attach your data to a different EC2 instance. We needed two attempts to select the correct neural network as the basis for our project, since OpenVINO does not seem to support all versions of Inception v2 correctly. We only found the final compatible version (.tgz) as a link in the OpenVINO documentation. To optimize the model for the Intel GPU, we set up a Deep Learning Base AMI (Ubuntu) instance, because setting up Intel OpenVINO turned out to be a little complicated on other images.

Recommended Deep Learning EC2 Images

Collect & label training data

As training data, we used the same kind of pictures that we used for the HOG Detector. For the AWS Deeplens, we limited ourselves to static photos of our company cars and did not take any additional videos of passing vehicles.

For labeling, we used LabelImg, which gave us a proper Pascal VOC dataset for further use with TensorFlow.

LabelImg – petrol engine with green environmental badge

TensorFlow uses so-called TFRecords, which combine the image and training data as a byte stream. This makes the data more efficiently accessible during training. Details on the file format can be found here.

The Pascal VOC format produced by LabelImg can easily be converted into TFRecords. Hint: the Git repository SSD TensorFlow contains an example implementation of the conversion, pascalvoc_to_tfrecords.py. (We didn’t try it because we wrote the converter ourselves.)
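To give an idea of what such a converter has to do, here is a minimal sketch of the first half of the job: reading one Pascal VOC annotation and extracting the fields that later go into a TFRecord entry. It is not our converter. The label map, the class name green_badge and the sample annotation are made up for illustration, and writing the actual TFRecord with TensorFlow is left out so that the example runs with the Python standard library alone.

```python
import xml.etree.ElementTree as ET

# Hypothetical label map for illustration; the real class names and ids
# come from the project's own label map file.
LABEL_MAP = {"green_badge": 1}


def parse_voc_annotation(xml_text, label_map=LABEL_MAP):
    """Extract the fields an object detection TFRecord entry needs."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)

    boxes = []
    for obj in root.findall("object"):
        name = obj.find("name").text
        box = obj.find("bndbox")
        boxes.append({
            "class_text": name,
            "class_id": label_map[name],
            # Pixel coordinates are normalized to [0, 1], the convention
            # used by the TensorFlow Object Detection API.
            "xmin": int(box.find("xmin").text) / width,
            "ymin": int(box.find("ymin").text) / height,
            "xmax": int(box.find("xmax").text) / width,
            "ymax": int(box.find("ymax").text) / height,
        })
    return {
        "filename": root.find("filename").text,
        "width": width,
        "height": height,
        "boxes": boxes,
    }


# Made-up annotation in the layout LabelImg writes for Pascal VOC.
SAMPLE = """<annotation>
  <filename>car_001.jpg</filename>
  <size><width>400</width><height>200</height><depth>3</depth></size>
  <object>
    <name>green_badge</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>200</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>"""

record = parse_voc_annotation(SAMPLE)
print(record["boxes"][0])
```

A full converter would additionally read the encoded image bytes and serialize everything into one record per image.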

Training and evaluation

With a powerful GPU, training the network with a few hundred images takes only a few minutes. We divided the available labeled images into two non-overlapping sets: 80% training data and 20% evaluation data. We use the evaluation data to assess whether we are overfitting during training. For this purpose, we compare how the loss function develops on the training data with how it develops on the evaluation data.
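The split itself can be sketched in a few lines. This is only an illustration under assumptions: the file names are placeholders, and the fixed seed is our choice here to make the split reproducible between runs.

```python
import random


def split_dataset(items, eval_fraction=0.2, seed=42):
    """Shuffle deterministically and split into disjoint train/eval lists."""
    shuffled = sorted(items)  # start from a stable order
    random.Random(seed).shuffle(shuffled)
    n_eval = int(round(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


# Hypothetical file names standing in for the labeled images.
images = ["img_%03d.jpg" % i for i in range(300)]
train, evaluation = split_dataset(images)
print(len(train), len(evaluation))
```

Because both lists are slices of the same shuffled list, no image can end up in both sets.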

Starting the training with:

python object_detection/train.py \
    --logtostderr \
    --pipeline_config_path=/data/DeepDiesel/tdata/ssd_inception_v2_diesel.config \
    --train_dir=/data/DeepDiesel/tdata/train_out_new/

and, at the same time, validating the result against the evaluation data:

python object_detection/eval.py \
    --logtostderr \
    --checkpoint_dir=/data/DeepDiesel/tdata/train_out_new/ \
    --pipeline_config_path=/data/DeepDiesel/tdata/ssd_inception_v2_diesel.config \
    --eval_dir=/data/DeepDiesel/tdata/train_out_new/

A suitable template for the Inception v2 config file can also be found in the TensorFlow models repository. The file describes the training and model parameters. It also contains important inputs for the later optimization of the model.

Tensorboard is used to monitor/visualize the training. It displays information about the performance of the training and the evaluation set images that were processed by the detector. Furthermore, the structure of the model can be displayed.

tensorboard --logdir /data/DeepDiesel/tdata/train_out_new/

TensorBoard can be accessed on port 6006 of the corresponding machine (provided the correct inbound rule is set in the security group: allow inbound TCP 6006).

Successfully recognized environment zone badge

At the end of the training, the model is ‘frozen’ to be used efficiently for detection.

python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path /data/DeepDiesel/tdata/ssd_inception_v2_diesel.config \
    --trained_checkpoint_prefix /data/DeepDiesel/tdata/train_out_new/model.ckpt-263 \
    --output_directory /data/DeepDiesel/frozen_diesel_new263.pb

resulting in

(tensorflow):frozen_diesel_263.pb ec2-user$ ls
checkpoint model.ckpt.meta
frozen_inference_graph.pb pipeline.config
model.ckpt.data-00000-of-00001 saved_model
model.ckpt.index
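The export call names a specific checkpoint (model.ckpt-263). If you do not want to look up the step number by hand, a small helper can pick the newest checkpoint from the training directory. The sketch below assumes the usual file naming of TensorFlow checkpoints (model.ckpt-<step>.index); the directory listing is made up.

```python
import re

# Matches the index file TensorFlow writes for each checkpoint.
CKPT_PATTERN = re.compile(r"^model\.ckpt-(\d+)\.index$")


def latest_checkpoint_prefix(filenames):
    """Return the 'model.ckpt-<step>' prefix with the highest step, or None."""
    steps = []
    for name in filenames:
        match = CKPT_PATTERN.match(name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return "model.ckpt-%d" % max(steps)


# Hypothetical listing of a training output folder.
listing = [
    "checkpoint",
    "model.ckpt-98.index", "model.ckpt-98.meta",
    "model.ckpt-263.index", "model.ckpt-263.meta",
    "graph.pbtxt",
]
prefix = latest_checkpoint_prefix(listing)
print(prefix)
```

The steps are compared as integers on purpose, since a plain string comparison would rank step 98 above step 263.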

You can find an explanation on the process here (DE).

Optimization of the model for AWS Deeplens

In principle, it should be possible to optimize the existing frozen_inference_graph.pb directly in the deployed Lambda function by calling mo.optimize at runtime. Unfortunately, DeepLens shipped with an old OpenVINO version, and an upgrade is not easy at the moment. We therefore decided to do the optimization externally on our Deep Learning Base AMI (Ubuntu) instance. OpenVINO can be downloaded directly from Intel: https://software.intel.com/en-us/openvino-toolkit/choose-download/free-download-linux

The following call optimizes the existing model.

./mo.py \
    --input_model /tmp/frozen_inference_graph.pb \
    --tensorflow_use_custom_operations_config extensions/front/tf/ssd_support.json \
    --output="detection_boxes,detection_scores,num_detections" \
    --input_shape="(1,300,300,3)"

In case of problems, we recommend a look at the OpenVINO documentation. In particular, docs/TensorFlowObjectDetectionSSD.html explains the key points. This is also where we recognized our mistake of using the wrong version of the underlying model.

Afterwards, the following three files should be present:

frozen_inference_graph.bin
frozen_inference_graph.mapping
frozen_inference_graph.xml

Deeplens Project and Deployment

We create our own empty AWS DeepLens project with a custom model. Now we have to specify the model we want to upload and create a Lambda function that runs the detection (inference) on AWS DeepLens.

Regarding the model, we face the challenge of having to upload three model files instead of a single ‘.pb’ file (as in the TensorFlow case). Unfortunately, this is not yet supported. Workaround: either pack the three files into an archive and have the Lambda function unpack it first, or copy them manually with scp to /opt/awscam/artifacts. The model must be placed in an S3 bucket whose name starts with “deeplens-”.
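The archive variant of the workaround could look roughly like the following sketch. It assumes a zip archive, which is our choice for the example, and uses temporary directories with placeholder files instead of the real model and the artifacts directory on the device.

```python
import os
import tempfile
import zipfile

MODEL_FILES = [
    "frozen_inference_graph.bin",
    "frozen_inference_graph.mapping",
    "frozen_inference_graph.xml",
]


def pack_model(src_dir, archive_path, names=MODEL_FILES):
    """Bundle the optimized model files into a single zip archive."""
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            zf.write(os.path.join(src_dir, name), arcname=name)


def unpack_model(archive_path, target_dir):
    """Extract the archive and return the sorted names found in target_dir."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(target_dir)
    return sorted(os.listdir(target_dir))


# Demo with placeholder files in temporary directories.
with tempfile.TemporaryDirectory() as work:
    src = os.path.join(work, "src")
    dst = os.path.join(work, "artifacts")
    os.makedirs(src)
    os.makedirs(dst)
    for name in MODEL_FILES:
        with open(os.path.join(src, name), "wb") as handle:
            handle.write(b"placeholder")
    archive_path = os.path.join(work, "model.zip")
    pack_model(src, archive_path)
    unpacked = unpack_model(archive_path, dst)
    print(unpacked)
```

In the Lambda function, the unpack step would run once at start-up, before the model is loaded.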

We create the lambda function starting from the template ‘greengrass-hello-world’, which contains all important modules and is accessible in the Lambda Template Library. A good example implementation for the required inference function can be found in the AWS documentation.

Deployment Succeeded

Detection of vehicles

To test the results in a real use case, we set up the DeepLens with our trained model in our company car park. With the first versions, the implementation still had to be readjusted. Here is an example where an incorrect label map was used and the detection threshold was set too low.

There are airplanes everywhere in the office.
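Both effects are easy to reproduce in a small sketch. The detection format (a list of dictionaries with label, prob and box coordinates), the label map and all values are assumptions for illustration, not the actual output of our Lambda function.

```python
# Hypothetical label map: class id -> name. A label map that does not match
# the trained model assigns wrong names to the detections.
LABEL_MAP = {1: "green_badge"}


def filter_detections(detections, label_map, threshold=0.5):
    """Keep detections at or above the threshold and attach class names."""
    kept = []
    for det in detections:
        if det["prob"] < threshold:
            continue
        named = dict(det)
        named["name"] = label_map.get(det["label"], "unknown")
        kept.append(named)
    return kept


# Made-up raw detections for a single frame.
raw = [
    {"label": 1, "prob": 0.91, "xmin": 120, "ymin": 80, "xmax": 160, "ymax": 120},
    {"label": 1, "prob": 0.12, "xmin": 10, "ymin": 10, "xmax": 290, "ymax": 290},
    {"label": 5, "prob": 0.08, "xmin": 0, "ymin": 0, "xmax": 50, "ymax": 50},
]

strict = filter_detections(raw, LABEL_MAP, threshold=0.5)
loose = filter_detections(raw, LABEL_MAP, threshold=0.05)
print(len(strict), len(loose))
```

With the low threshold, the weak boxes are kept as well, and with a mismatched label map they would also be drawn with the wrong class names.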

The DeepLens set up in the parking lot

Consent to the use of the data 🙂

Evaluation of the live data

As a result, detection performance improved greatly compared to the HOG classifier. Practically all passing vehicles were detected correctly. We ran the detection at different driving speeds (6 km/h, 20 km/h, 30 km/h, 40 km/h and almost 50 km/h) and detected the environmental zone badge almost every time in at least one frame. We could not test higher speeds in our parking lot. In principle, however, the usable speed range is limited by the speed up to which a frame can still be recorded with little motion blur. Better optics and recording technology would further extend the speed range.
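The relationship between speed and motion blur can be estimated with simple arithmetic: during one exposure, the vehicle moves by speed times exposure time. The sketch below uses an exposure time of 1/500 s, which is purely an assumed value and not a DeepLens specification.

```python
def motion_blur_cm(speed_kmh, exposure_s):
    """Distance in centimeters a vehicle travels during one exposure."""
    speed_ms = speed_kmh / 3.6  # km/h -> m/s
    return speed_ms * exposure_s * 100.0


EXPOSURE_S = 1.0 / 500.0  # assumed exposure time, not a DeepLens specification

for speed in (6, 20, 30, 40, 50):
    print("%2d km/h -> %.1f cm" % (speed, motion_blur_cm(speed, EXPOSURE_S)))
```

Under this assumption, the blur at 50 km/h is just under 3 cm, and it grows linearly with both the speed and the exposure time.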

Conclusion and outlook

The implemented detector has made a considerable performance leap with regard to the application scenario. The implementation still has some framework-induced workarounds, but this might change as the framework implementations progress. The approach can be extended to better fit real-world scenarios. For example, events could be triggered in response to a detection. These events can be sent via the AWS IoT Message Broker towards consumers such as apps or IoT actuators. In addition, production systems for individual use cases can be built with relatively little effort.
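As a rough idea of what such an event could look like, the following sketch builds a JSON payload for a single detection. The topic name, the field names and the values are hypothetical. The sketch stops at building the message and does not publish it.

```python
import json
import time

# Hypothetical topic name; the actual topic depends on the device setup.
TOPIC = "deepdiesel/detections"


def build_detection_event(name, prob, box, timestamp=None):
    """Serialize one detection as a JSON message payload."""
    event = {
        "object": name,
        "probability": round(prob, 3),
        "box": box,
        "timestamp": time.time() if timestamp is None else timestamp,
    }
    return json.dumps(event, sort_keys=True)


payload = build_detection_event(
    "green_badge", 0.9134,
    {"xmin": 120, "ymin": 80, "xmax": 160, "ymax": 120},
    timestamp=1542000000,
)
print(payload)

# On the device, a Greengrass Lambda function would hand this payload to the
# message broker, for example via the greengrasssdk IoT data client
# (client.publish(topic=TOPIC, payload=payload)). That call is not run here.
```

Consumers subscribed to the topic could then react to each message, for example by notifying an app.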

The links to all parts and our Youtube channel can be found here:

>Data-driven solutions for your business
>Deep Diesel – Part 1: Machine & Deep Learning for diesel car detection
>Deep Diesel – Part 2: Machine-Learning-Diesel Car Detection using a HOG detector
>Deep Diesel – Part 3: Deep Learning Diesel Car Detection using the AWS DeepLens
>codecentric.ai youtube channel

If you are interested in this topic or have additions or questions, please contact me at kai.herings@codecentric.de. Follow me on Twitter: https://twitter.com/kherings

Blog author

Kai Herings
