Sadly, to tell you the truth, doing dishes is still a thing. However, so far most of our readers still like our non-standard Deep Learning tutorial.
Typically, AI is demonstrated as solving various toy problems. AI plays chess and Go, AI plays video games, AI makes people dance. It is time to stop this madness and finally apply AI in a meaningful way. Therefore, we proudly present the dish-o-tron. The dish-o-tron is an AI system designed to solve an actual real-world problem impacting millions of people around the world every day: facing dirty dishes in the community kitchen sink.
Reading this blog series will equip you with the ultimate power to solve this long-lasting problem in your community kitchen once and for all by using state-of-the-art AI technology.
At first glance, the dish-o-tron is an inconspicuous, well-positioned webcam in the kitchen observing the shared kitchen sink. In its natural state the dish-o-tron is just happy and enjoys life. The dish-o-tron doesn’t care whether you prefer tea or coffee and it likes all kinds of kitchen talk. However, there is one single thing that the dish-o-tron absolutely hates: watching someone put dirty dishes in the community sink.
Detecting dirty dishes in the sink enrages the peace-loving dish-o-tron so much that it starts beeping. The only way to return it to its natural peaceful state, and thus stop the noise, is to admit one’s mistake and remove all dirty dishes from the community sink, leaving it neat and clean again.
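The beeping behaviour described above can be sketched as a tiny state machine. This is only an illustration, not the actual dish-o-tron code: the function name `beeper_states`, the label strings "dirty" and "clean", and the `min_consecutive` smoothing parameter are all assumptions made for this example. The smoothing (waiting for a few identical predictions before switching state) is our own addition to keep a single misclassified frame from toggling the beeper.

```python
from typing import Iterable, List


def beeper_states(labels: Iterable[str], min_consecutive: int = 2) -> List[bool]:
    """Return, for each per-frame label, whether the beeper should be on.

    Hypothetical helper: `labels` is assumed to be a stream of "dirty" /
    "clean" predictions from some image classifier. The beeper only changes
    state after `min_consecutive` identical labels in a row.
    """
    beeping = False          # the dish-o-tron starts out happy
    streak_label = None      # label of the current run of identical predictions
    streak = 0               # length of that run
    states = []
    for label in labels:
        if label == streak_label:
            streak += 1
        else:
            streak_label, streak = label, 1
        # Switch state only once the run is long enough to be trusted.
        if streak >= min_consecutive:
            beeping = (label == "dirty")
        states.append(beeping)
    return states


if __name__ == "__main__":
    frames = ["clean", "dirty", "dirty", "dirty", "clean", "clean"]
    print(beeper_states(frames))
```

With `min_consecutive=1` the beeper simply follows the latest prediction, which matches the plain description in the text; larger values trade reaction time for fewer false alarms.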
Building the dish-o-tron requires three high-level steps:
- Gathering and preparing data
- Training an AI model
- Deployment of the model
In the following, we will discuss these steps further.
Gathering and preparing data
Trying to solve real-world problems with AI often starts with the realisation that there is little or even no data available. This issue prevents many problem solvers from actually solving the problem. “If only data collection had started years ago!”, they say, “then we could now actually solve the problem”. While this is a reasonable thought, it simply doesn’t help.
Telling users who face a problem right now that we must first gather lots of data for quite some time before we can start building a solution is challenging, to say the least. A more promising approach is typically to build a system that addresses the problem immediately and is able to improve over time.
In this way, we will not solve the problem completely in the first step. However, we tackle it right away and put ourselves in a position to adjust the solution iteratively, matching requirements that themselves become clearer while we work on the problem.
Since our problem is unique in the sense that there is no Kaggle dataset readily available, we start our journey to building the dish-o-tron by doing our best to collect a suitable dataset for a first working system. To do so, we will record videos of various kitchen sinks, both clean and not clean, and split them up into individual images that form a first labeled dataset.
In this way, we started collecting the DIRTY-DISHES-DATASET with thousands of pictures that we will share with you in the next article.
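To make the idea of turning videos into a labeled dataset more concrete, here is a minimal sketch of the bookkeeping involved. It does not decode any video (a library such as OpenCV or a tool such as ffmpeg could do that part); it only plans which frames to keep and where to store them. The function name `plan_frames`, the one-folder-per-label layout, the file naming scheme and the `every_n` sampling rate are assumptions for illustration, not the layout of the actual DIRTY-DISHES-DATASET.

```python
from pathlib import Path
from typing import List, Tuple


def plan_frames(video_name: str, label: str, total_frames: int,
                every_n: int = 15) -> List[Tuple[int, str]]:
    """Plan which frames of a video to keep and where to save them.

    Hypothetical helper: each video is assumed to show a sink in a single
    state, so every sampled frame inherits the video's label. Keeping only
    every `every_n`-th frame avoids storing many near-identical images.
    """
    stem = Path(video_name).stem  # "sink_a.mp4" -> "sink_a"
    return [
        (index, f"{label}/{stem}_{index:06d}.jpg")
        for index in range(0, total_frames, every_n)
    ]


if __name__ == "__main__":
    for index, target in plan_frames("sink_a.mp4", "dirty", total_frames=40):
        print(index, target)
```

Labeling whole videos rather than single images keeps the manual effort low: one decision per recording yields many labeled pictures.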
Training an AI model
Not so long ago, training an AI model was tedious and required expert knowledge. In many cases this is still true today. Depending on the problem, we have to figure out a suitable model architecture and feature engineering and this requires some experimentation before we can train a suitable AI model. This is another issue which prevents problem solvers from building a solution tackling the whole problem even if data is available.
Fortunately, image classification is one of the best understood use cases in AI. There are lots of established best practices regarding model architectures and training of models. Among others this led to two things:
- High-level software libraries such as fast.ai, which abstract away lots of the nitty-gritty details of image classification, providing a black-box kind of approach where state-of-the-art practices are simply utilised without burdening the user with the details.
- Machine-Learning-as-a-Service offerings from various public cloud providers, such as Google Cloud AutoML and Amazon Rekognition, which allow training image classification models on custom data in a few simple steps.
Both approaches will typically not lead to the absolutely best solution. However, most of the time this is not necessary and ‘good enough’ will be just fine and a nice trade-off between time & money spent vs. result. For our first version of the dish-o-tron, we will employ the AutoML Service from Google Cloud to train a first model.
We can use various tools to inspect the model and try to explain if the black box learns what we expect.
The training of the AI model with AutoML and its technical details will be discussed in a follow-up blog post.
Deployment of the model
Having an AI model will generally not solve an actual real-life problem by itself. For a viable solution, the AI model has to be integrated into a suitable context. Many times, this is the key step to generating any value at all. Nevertheless, this step is often postponed to the distant future, after “collecting high quality data” and “building the best AI model”. This is, more often than not, a mistake because integrating the model into its context poses various challenges of its own. Hence, it should not be ignored but tackled early in order to learn about and identify the associated challenges.
While building the dish-o-tron, we tried multiple options to run the model. We deployed it on a Pi Zero which is a really small and cheap device that can be glued anywhere with a small powerbank. But it is rather slow. We ran the model in the browser using our notebook’s webcam with TensorFlow.js. We used the Google AIY Kit, which is much faster than the Pi Zero and also comes with a beeper and blinking lights (but it is quite old and deploying state-of-the-art models is hacky). Finally, we used the Google Coral device, which is made for this kind of workload and well-integrated into Google AutoML but comes with a price tag.
The community kitchen is a special place. It’s a place where rumors are born, where gossip is produced and where you can openly chat about the most secret secrets of your company! That’s why dish-o-tron is living on the edge. Edge devices enable you to run audio and video analytics AND respect the privacy of your community kitchen. No image is transferred to the cloud. Nothing is saved. Dish-o-tron sees and forgets.
Moreover, the hardware we consider and buy in order to actually build the dish-o-tron sets basic constraints on our solution space. In other words, we have to make sure that the AI model can be deployed painlessly on our preferred edge device. For the first version of the dish-o-tron, we decided to use a Google AIY kit (see video below). For the next version, we chose a Google Coral edge device, which allows us to run advanced computer vision tasks on a Raspberry Pi-sized mini computer. Fortunately, AutoML allows us to export models in a viable format.
The construction of the dish-o-tron including the deployment of the model on the Coral device and its technical details will be discussed in an upcoming blog post.
AI research has brought us new technology that can solve problems that couldn’t be solved before. Have you read the book AI Superpowers by Kai-Fu Lee? He says that you don’t need to be one of the best AI researchers any more to apply AI and find new business opportunities. You need to collect (lots of) data and can “just” use existing algorithms, services and open source frameworks. Well, in our opinion building AI solutions is not easy, but it is indeed getting easier and easier every day.
See the first prototype running on the Google AIY kit here (mind the green/red LED on the box):
Follow this blog series if you want to know how to build and run such a model on an edge device yourself. Building the dish-o-tron will fundamentally change the way you experience the community kitchen. Instead of being a place of constant anger and hostility, the community kitchen will become a peaceful meeting ground for sharing ideas and connecting with co-workers.
In the upcoming blog posts, we will guide you through the process of building your own dish-o-tron for your community kitchen sink. Hence, we will tackle a real-world problem and playfully learn how to build and improve an AI system from scratch. Stay tuned!
Continue with the second part of our series, where we start with gathering data.