DISH-O-TRON – Train that vision model!

11.10.2020 | 11 minutes reading time

With this article we continue our endeavor of building dish-o-tron – an AI system designed to prevent the sudden appearance of dirty dishes in the community kitchen sink and thereby turn the community kitchen into a place of peace and harmony.

This is part 3 of the dish-o-tron series. You may want to start with the first part, where we introduce the idea and the concept behind dish-o-tron, and the second part, where we collect the initial data set.

In this article, we use the data gathered in the previous part to build “the heart” (or – perhaps better – “the brains”) of dish-o-tron, enabling it to detect dirty dishes. In concrete terms, we train a machine learning model that classifies images of sinks into clean (no dirty dishes) and not_clean (dirty dishes), using the fast.ai library and AutoML from Google Cloud.

If this is the point where you think to yourself “oops, I did not gather any data” – we warned you several times. It is absolutely necessary that you gather training data yourself to have the real dish-o-tron experience. We strongly encourage you to revisit the previous article and gather your own data and, in particular, not to download our pre-prepared dataset.

If this is the point where you think to yourself “Yay, I did gather my own data”: congratulations, you may now continue your journey and indulge in one of the favourite occupations of every deep learner: watching an AI model during training.

If you are a developer, you may be familiar with watching your program compile or your CI/CD pipeline run tests. But watching an AI model train is something special. And if you watch harder (but not too hard) you might even influence the accuracy of the resulting model! Depending on the architecture it might be necessary to watch deeper instead of harder – you will find out with further practice.

dish-o-tron: codes compiling, model is training meme

Comics taken from 1 and 2; kudos to XKCD.

We start with a short excursion about the requirements of dish-o-tron.

Excursion: Revisit some requirements of dish-o-tron

The dish-o-tron needs to be able to set off an alarm when a coworker violates the general rules of using the community kitchen. In most kitchens there are rules like:

1. DO NOT PUT DIRTY DISHES IN THE SINK!
2. Please, respect rule number 1 !!!1!eleven!
3. If the dishwasher is running, take your stuff and leave it at your desk until the dishwasher has finished.
4. EVERYBODY can empty the dishwasher.
5. NO EXCUSES. DO NOT PUT DIRTY DISHES IN THE SINK. NEVER.

Exemplary community kitchen rules

In many kitchens these rules are manifested on various posters, stickers and even laminated printouts! Some even colorize words (OMG!) to emphasize that really everybody should take care of this. But we all know it. We are rebels. While reading these signs one always thinks: One day, when nobody sees me, I will just put my cup in the sink and run!

So far, we are not sure whether this is only a German thing. So if you have rules like this in your community kitchen, please share a pic by replying to this tweet.

Because there is nothing we can do about this, we have to find another solution. The next reasonable step obviously is permanent control and punishment. That’s where dish-o-tron enters the arena. Inspired by the DEFCON levels of the United States Armed Forces, we therefore propose the DISHCON levels (see this Wikipedia article for reference).

The DISHCON levels.

Since we are peace-loving problem solvers, the escalations for DISHCON 1 and 2 WILL NEVER be implemented. Privacy is also important to us, so we will not record or save any images, and we will not transfer any footage to the cloud. Dish-o-tron sees, maybe beeps, and then it just forgets.

Approach and Reasoning

Until recently, training a machine learning model for image classification required specialist knowledge in data science. However, recent progress, in particular among public cloud providers, has significantly simplified this task for problem solvers looking for rapid end-to-end progress.

This low barrier to entry allows us to rely on existing libraries such as fast.ai and services like AutoML from Google Cloud to obtain a reasonable state-of-the-art vision model for our classification task. In this way we can build the first functioning prototype and focus on solving the actual problem at hand. At a later stage it might be useful to revisit the model training. However, the best model is useless as long as it is not integrated.

For many people, dealing with AI and building neural networks from scratch is a lot of fun. However, be honest with yourself! There is close to zero chance that you will create something that comes close to existing solutions. In fact, you will spend lots of time on a worse outcome. It is essential that you focus! Don’t get sidetracked! You are a problem solver. Your goal is to solve an actual real-world problem. The AI model is merely a tool for you to bring peace and harmony to your community kitchen.

In the following, we pursue two options to obtain a vision model in just a few steps:

We utilize the fast.ai library
We use AutoML in the Google Cloud

Short sidenote: Yes, it might be useful to revisit the vision model at some point. At this stage of the project it is helpful to think about this point in time in terms of “as soon as 80% of all community kitchen sinks are equipped with a dish-o-tron”.

fast.ai

fast.ai is a great starting point if you want to get into deep learning and machine learning. With the mission of “Making neural nets uncool again”, it provides a competitive high-level Python library that allows for rapid progress while building an AI system.

The fast.ai library allows you to train state-of-the-art vision models in a few lines of code. To get started, you can use the following Colab notebook:
Colab notebook

Once you finish this notebook, you will end up with a fast.ai model, which is basically a PyTorch model. This model can also be exported and used outside of the Colab notebook environment. However, so far we have struggled a lot to deploy fast.ai models on edge devices, in particular on a Google Coral device; we did not find a painless way to do so. Feel free to investigate on your own, and we would be very happy to hear from you if you find a nice way.
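To give an impression of what “a few lines of code” can mean here, the following is a minimal sketch and not the content of the notebook linked above. It assumes that your images live in two folders, `data_dir/clean` and `data_dir/not_clean`; the function names, the choice of `resnet34`, the number of epochs and the export file name are illustrative assumptions. Note that `vision_learner` is the name in recent fastai releases, while older releases call it `cnn_learner`.

```python
from pathlib import Path


def count_images(data_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Count images per label folder, e.g. {'clean': 120, 'not_clean': 95}."""
    counts = {}
    for folder in sorted(Path(data_dir).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1 for f in folder.iterdir() if f.suffix.lower() in extensions
            )
    return counts


def train_dish_classifier(data_dir, epochs=3, export_name="dish_o_tron.pkl"):
    """Fine-tune a pretrained ResNet on the clean / not_clean folders.

    Hypothetical helper: architecture, epochs and file name are assumptions.
    """
    # Imported lazily so that count_images() works without fastai installed.
    import fastai.vision.all as fv

    dls = fv.ImageDataLoaders.from_folder(
        data_dir,
        valid_pct=0.2,             # hold out 20% of the images for validation
        seed=42,                   # make the random split reproducible
        item_tfms=fv.Resize(224),  # resize every image to 224x224 pixels
    )
    learn = fv.vision_learner(dls, fv.resnet34, metrics=fv.accuracy)
    learn.fine_tune(epochs)        # transfer learning on the pretrained weights
    learn.export(export_name)      # saved relative to learn.path (data_dir here)
    return learn
```

Calling `count_images("data")` before `train_dish_classifier("data")` is a cheap way to confirm that both classes actually contain images before you spend GPU time.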

AutoML

AutoML is a machine learning service from Google Cloud which allows you to automate the training of your own custom vision models. It comes with a graphical interface and the option to, e.g., export models to edge devices such as the Google Coral device. The only things you have to provide are labeled images and money. Yes, that’s basically it: you trade money for AI expertise and speed. For training a model with ~10,000 labeled images we expect costs of ~$25.

Does this mean AutoML is always the right solution? Not at all! But it is a nice tool to have if you are looking for rapid end-to-end progress. This is particularly the case if the goal is to validate ideas. In that situation, learning slowly and struggling to make any real end-to-end progress just to save a few bucks on your cloud bill is often the worst choice.

Obtaining an AutoML vision model requires four simple steps:

A tiny bit more data preparation and uploading the data
Creating the Dataset in AutoML
Training a readily available computer vision model in AutoML
Exporting the model (in a suitable format for the Coral device)

To follow along, you need access to Google Cloud and a Google Cloud project, ideally with project-owner privileges.

ATTENTION: Not everything we do is covered by the free tier, so some charges may apply.

1. Data preparation

Before we can use AutoML to train a vision model, we have to upload our data to Google Cloud and also prepare a CSV file containing meta information about the data, such as the labels of the images. This is a necessary evil before we can finally lean back and throw some money at Google to do the rest of the work.

This Colab notebook should help you clear the final hurdle. Here, we provide a possible way to:

Upload our data into a Storage Bucket in Google Cloud
Generate the necessary metadata CSV-file for AutoML
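If you prefer to see the two steps as plain Python, here is a rough sketch; it is not the code from the notebook. It assumes a local layout of `data_dir/clean` and `data_dir/not_clean`, a bucket name and a path prefix that you choose yourself, and a CSV with one `gs://...,label` row per image, which is our understanding of the AutoML Vision import format (please check the current Google Cloud documentation). All function names are hypothetical. The upload uses the `google-cloud-storage` client and needs working credentials.

```python
import csv
from pathlib import Path

LABELS = ("clean", "not_clean")
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def iter_labeled_images(data_dir):
    """Yield (label, path) for every image in data_dir/<label>/."""
    for label in LABELS:
        for path in sorted((Path(data_dir) / label).glob("*")):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                yield label, path


def gcs_uri(bucket_name, prefix, label, path):
    """Build the bucket location an image is uploaded to."""
    return f"gs://{bucket_name}/{prefix}/{label}/{path.name}"


def write_automl_csv(data_dir, bucket_name, csv_path, prefix="dish-o-tron"):
    """Write one 'gs://...,label' row per image and return the row count."""
    rows = [
        (gcs_uri(bucket_name, prefix, label, path), label)
        for label, path in iter_labeled_images(data_dir)
    ]
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return len(rows)


def upload_images(data_dir, bucket_name, prefix="dish-o-tron"):
    """Upload all labeled images to the bucket (requires credentials)."""
    # Imported lazily so the CSV helpers work without the client library.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    for label, path in iter_labeled_images(data_dir):
        blob = bucket.blob(f"{prefix}/{label}/{path.name}")
        blob.upload_from_filename(str(path))
```

The CSV file itself also has to end up in the bucket so that AutoML can read it during the import.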

Finally, we are in a position to use AutoML.

2. Creating the dataset in AutoML

The starting point for using AutoML is creating a dataset. Because we already uploaded the data into a GCS bucket and prepared the CSV metadata file, we can create the dataset with a few clicks in the UI. After you trigger the import, it will take some time. This is your chance to ponder life and do some meditation. You could also watch some cat videos – if this is your thing – or just grab a cup of coffee. While you are in the kitchen there might be an opportunity to collect another dirty dishes video. Don’t get mad – you already made fantastic progress on your journey to build dish-o-tron.

Creating a data set in AutoML.

Using the prepared CSV file.

As soon as the import is finished, we can inspect the dataset in AutoML. It is useful to make a few sanity checks at this point to ensure that the data is uploaded correctly.

Inspecting a dataset in AutoML
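Some of these sanity checks can also be done locally on the metadata CSV before or after the import. The following sketch assumes the label sits in the last column of each row and that the expected labels are `clean` and `not_clean`; the imbalance threshold of 3:1 is an arbitrary assumption and not a recommendation from AutoML.

```python
import csv
from collections import Counter


def label_counts(csv_path):
    """Count rows per label; the label is taken from the last column."""
    with open(csv_path, newline="") as f:
        return Counter(row[-1] for row in csv.reader(f) if row)


def check_dataset(counts, expected_labels=("clean", "not_clean"), max_ratio=3.0):
    """Return a list of human-readable warnings (empty if all looks fine)."""
    warnings = []
    for label in expected_labels:
        if counts.get(label, 0) == 0:
            warnings.append(f"no images for label '{label}'")
    unexpected = sorted(set(counts) - set(expected_labels))
    if unexpected:
        warnings.append(f"unexpected labels: {unexpected}")
    present = [counts[l] for l in expected_labels if counts.get(l, 0) > 0]
    if len(present) >= 2 and max(present) / min(present) > max_ratio:
        warnings.append("classes are heavily imbalanced")
    return warnings
```

If `check_dataset(label_counts("meta.csv"))` returns an empty list, the label distribution at least matches what you intended to upload.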

3. Training the vision model

And now, finally, it is going to happen. We can start training the model with a few clicks in the UI. Because we plan on deploying the model on a Coral device, we choose the option “Edge”. For simplicity we select “optimize for best trade-off between latency and accuracy” and set (depending on the number of images) a suitable amount of node hours.

Suggested node hours depending on number of images in the dataset.

Please be aware that for each unit of time, Google Cloud uses 8 nodes in parallel, where each node is equivalent to an n1-standard-8 machine with an attached NVIDIA® Tesla® V100 GPU. Hence, 8 node hours correspond to approximately 1 “wall clock” hour. It is advisable to use the early stopping feature so that training stops once no further accuracy improvement is expected. In the end, you pay only for the compute hours that are actually used.
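The relation between node hours and waiting time is simple arithmetic, sketched below. The factor of 8 comes from the paragraph above; the price per node hour is deliberately a parameter, because this article does not state it and it may change, so look it up on the current pricing page. Both function names are hypothetical.

```python
PARALLEL_NODES = 8  # nodes running in parallel, as described above


def wall_clock_hours(node_hours, parallel_nodes=PARALLEL_NODES):
    """Approximate real waiting time for a given node-hour budget."""
    return node_hours / parallel_nodes


def training_cost(node_hours_used, price_per_node_hour):
    """Cost of the node hours actually consumed (early stopping lowers this)."""
    return node_hours_used * price_per_node_hour
```

For example, a budget of 4 node hours corresponds to roughly half an hour of waiting, since `wall_clock_hours(4)` evaluates to `0.5`.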

Now push the final button.

Start model training in AutoML

You did it! You are now a real Deep Learner! Feel free to relax for a few hours and check at irregular intervals if the training is finished. This is your time to take a break without feeling bad about it. That is what being a Deep Learner is all about.

Training a model is a magical experience. Don’t forget to check on your model and observe it during the training every once in a while: Rumour has it that observing the training procedure will change the outcome of the experiment. There are even stories that the intensity of the observing influences the accuracy of the model.

When the training is complete (or, at the latest, when you are back at your desk and observe that it is complete), it is time for a few sanity checks of the model. Again, this is possible with the built-in validations of AutoML. If the accuracy is below 95%, there is good reason to believe that something went wrong with the data or the data preparation.

Evaluating a trained model in AutoML
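AutoML reports its evaluation metrics in the UI. If you additionally want to check predictions on a few held-out images of your own, the 95% rule of thumb from above can be sketched as follows. The function names are hypothetical, and the label lists are assumed to be plain Python lists of `clean` / `not_clean` strings that you collected yourself.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of images whose predicted label matches the true label."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count (true, predicted) pairs to see which class gets confused."""
    return Counter(zip(true_labels, predicted_labels))


def looks_suspicious(acc, threshold=0.95):
    """Apply the rule of thumb: below 95% accuracy, re-check the data."""
    return acc < threshold
```

The confusion counts tell you whether the model tends to miss dirty dishes or to raise false alarms on a clean sink, which matters more for kitchen peace than the single accuracy number.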

If everything looks fine, we export the model for Coral devices.

Exporting the trained model in AutoML

That’s it! We have our first vision model for our dish-o-tron. Peace and harmony for your community kitchen were rarely as tangible as at this point in time.

Conclusion

Finishing this part of the tutorial is an important step for you and your future career as a professional problem solver. Frankly, that’s one (very) small step for Deep Learning, one giant leap for you – but that is okay. Be proud of yourself! This is how successful real-world problem solvers tackle AI tasks for the first iterations.

Okay, let’s make this more official: you have earned the AI TRAINING WATCHER badge (silver level)

Don’t be shy, you earned it! Feel free to print it out and proudly wear it however you enjoy!

In the next article, we will build the first physical version of DISH-O-TRON which can (and should) be put into use at a real community kitchen sink. Stay tuned!

Blog authors

Marcel Mikl

Service Lead Data & ML & AI

Oliver Moser
