Core ML – inference on iOS

19.8.2019 | 7 minutes reading time

In machine learning, we train a model for a particular task, e.g. distinguishing dogs from cats in pictures. Inference refers to applying the trained model to new data. Most inference workloads are served through a client-server API or run in batch mode. In contrast, some applications (such as Apple’s Face ID) run directly on the mobile device. On-device inference has the benefit of low latency, which makes for an excellent user experience. As a result, inference on mobile devices is gaining more and more attention. In addition to Apple, Google is exploring various hardware options for deploying resource-intensive models on mobile devices. The goal of this article is to show what is possible with inference on an iOS device. In addition to the advantages and disadvantages of mobile inference, we take a look at the Core ML framework, the Neural Engine and hardware innovation in the mobile area.

Inference on-device

Let’s look at the pros and cons of putting a model on a mobile device.

Pros

Latency: On-device inference generates no network traffic. The prediction is computed directly on the device’s hardware, which has the side benefit that the application also works offline.
Data security: Computing the prediction involves no data movement. The data does not have to leave the device, which provides a certain level of data protection.

Cons

Updating the models: When we want to publish a newly trained model, we have to release a new version of the application itself. In my experience in mobile development, it usually takes some time until all users have updated their app. Furthermore, models can take up a lot of storage space, which is also a downside for some users.
Speed of the hardware: Depending on the device, the computing time of a model may vary significantly. While newer devices such as the iPhone XS include specialized machine learning hardware, performance can degrade on older devices.

Both low latency and data security are compelling reasons to look more closely at mobile machine learning. One of the most critical factors in applying the models successfully is the speed of the hardware. Apple has equipped the A11 and A12 Bionic chips with specialized hardware to run neural networks efficiently on the iPhone. For these reasons, we want to dive deeper into the subject of machine learning on the iPhone.

Core ML

Core ML is a machine learning framework developed by Apple. Compared to PyTorch and TensorFlow, which are used to train models, Core ML focuses on deploying and running models. On-device training only became possible with Core ML 3. In the typical workflow, the developer has already trained a model and then executes it with Core ML or integrates it into an iOS app. Before we can integrate the model into the application with Core ML, it has to be converted to the Core ML format. Note that Core ML can only be used within the Apple ecosystem and not for Android applications.

Core ML stack (https://developer.apple.com/documentation/coreml)

Core ML builds on the Accelerate, Basic Neural Network Subroutines (BNNS) and Metal Performance Shaders (MPS) libraries, which provide the low-level neural network and inference operations on the CPU and GPU. These libraries greatly facilitate access to machine learning on iOS. Furthermore, Apple has developed the frameworks Vision and Natural Language to perform feature extraction on image and text data. For example, existing models of the Vision framework can recognize faces, text and barcodes in images. This information can then act as features for your own models.

Alternatives to Core ML

In addition to Core ML, there are of course other ways to run a model on an iOS device, for example TensorFlow Lite. Its significant advantage is that the same model can also be used on other platforms such as Android. However, this comes with some disadvantages. Core ML is accessible directly from Xcode, so we don’t need to set up a complex environment to start developing. Furthermore, Core ML is optimized to run on iOS. As a result, Core ML can perform significantly better than TensorFlow Lite. It is worth taking a look at the article by Andrey Logvinenko, who has studied the performance differences in detail.

Neural Engine

After taking a look at the software aspects of the iOS ecosystem, let’s look at the hardware features of the iPhone XS, introduced in 2018. The A12 Bionic chip consists of a CPU (six cores), a GPU (four cores) and a neural processing unit (also known as the Neural Engine). The Neural Engine is the centrepiece for running models. For example, Face ID and Siri use the Neural Engine to make predictions. Although these applications could also run on the CPU, this would significantly increase computing time and energy consumption. The Neural Engine consists of eight cores and can theoretically perform up to 5 trillion operations per second. As developers, we have access to the Neural Engine and can run our models on it.
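
To get a feeling for what a theoretical peak of 5 trillion operations per second means, the following back-of-the-envelope sketch computes a lower bound for the duration of a single forward pass. The operation count of one billion is a purely hypothetical figure chosen for illustration, not a measurement, and real inference times are higher because of memory access and scheduling overhead.

```python
# Theoretical peak throughput of the Neural Engine as stated in the text.
PEAK_OPS_PER_SECOND = 5e12


def min_inference_time_ms(ops_per_inference, peak_ops_per_second=PEAK_OPS_PER_SECOND):
    """Lower bound in milliseconds for one forward pass at peak throughput."""
    if ops_per_inference < 0:
        raise ValueError("ops_per_inference must not be negative")
    if peak_ops_per_second <= 0:
        raise ValueError("peak_ops_per_second must be positive")
    return ops_per_inference / peak_ops_per_second * 1000.0


# Assumption: a hypothetical model that needs one billion operations per inference.
hypothetical_ops = 1e9
print(f"{min_inference_time_ms(hypothetical_ops):.3f} ms")
```

The result is only a best case: it says how fast the hardware could be if every core were busy all the time, not how fast a particular model will actually run.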

Looking at the evolution of the A-series Bionic chips, it becomes clear that Apple puts a lot of effort into further development. On the A12 Bionic chip, Core ML runs up to 9 times faster than on its predecessor, the A11 Bionic. Various experiments from the community, such as Yolo and Core ML, show similar improvements.

Image recognition with Core ML

To use a trained model with Core ML, we need to convert it with the Python package coremltools. Besides coremltools, there are also third-party conversion tools, such as the TensorFlow converter.

We use a Keras model that distinguishes between dogs and cats. In the app, the picture to classify is either taken with the camera or selected from the photo library.

Keras model to Core ML model with coremltools

Before the conversion, the Keras model must be saved with model.save("model.h5"). The model can then be converted using the method coremltools.converters.keras.convert. You must specify metadata such as the class labels. Furthermore, additional preprocessing steps such as a normalization of the data can be specified. In our case, we have the two classes Cat (0) and Dog (1). The image_scale, red_bias, green_bias, and blue_bias fields specify the preprocessing values. In this example, we use MobileNet preprocessing. After conversion, the model must be saved as a ".mlmodel" file. Core ML can then read this file in an app.
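
The effect of image_scale and the bias fields can be sketched in plain Python: each channel value is transformed as scale * value + bias. The concrete values below (a scale of 2/255 and a bias of -1 for every channel) are an assumption based on the common MobileNet convention of mapping pixel values from [0, 255] to [-1, 1]; they are not taken from the original conversion script, and the helper name is hypothetical.

```python
# Assumed MobileNet-style preprocessing values: map [0, 255] to [-1, 1].
IMAGE_SCALE = 2.0 / 255.0
RED_BIAS = GREEN_BIAS = BLUE_BIAS = -1.0


def preprocess_pixel(rgb, image_scale=IMAGE_SCALE,
                     biases=(RED_BIAS, GREEN_BIAS, BLUE_BIAS)):
    """Apply scale * value + bias to each channel of one RGB pixel."""
    if len(rgb) != 3:
        raise ValueError("expected exactly three channel values (R, G, B)")
    return tuple(image_scale * value + bias for value, bias in zip(rgb, biases))


# Darkest, middle and brightest channel values end up near -1, 0 and 1.
print(preprocess_pixel((0, 127.5, 255)))
```

If the preprocessing used during training differs from the values passed to the converter, the model receives inputs in a range it has never seen, so the two should always be kept in sync.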

To integrate the model into an app, the file must be added to the Xcode project. In Xcode, you can see which parameters the model expects as input and which it returns as output. In our case, the input is an RGB image with 224×224 pixels. The output of the model is the highest-probability label and a dictionary that contains the probability of each label.
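
How the two outputs relate can be shown in a few lines: the predicted label is simply the key with the highest value in the map of label probabilities. The probability values and the helper name below are made up for illustration.

```python
def top_label(probabilities):
    """Return the (label, probability) pair with the highest probability."""
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Hypothetical model output for one picture.
example_output = {"Cat": 0.08, "Dog": 0.92}
label, probability = top_label(example_output)
print(f"{label}: {probability:.0%}")  # prints "Dog: 92%"
```

Keeping the full map around is useful in the app, for example to display the probability of both classes instead of only the winning label.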

Integration of the Core ML model into Xcode

The prediction is made with the model.prediction(image: features) method. For this, the model must first be loaded. The image data can be processed with the class UIImage. In addition, we have extended the class with the methods resize and pixelBuffer. The resize method scales images to 224×224 pixels to prepare them for the prediction. The pixel buffer serves as the input vector for the model.
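
The resize and pixelBuffer helpers from the app are Swift extensions of UIImage and are not reproduced here. As a language-neutral illustration of what the resize step does, the following sketch scales an image, represented as a nested list of pixel values, to a fixed target size with nearest-neighbour sampling. The function name and the sampling strategy are assumptions; a real implementation would typically use smoother interpolation.

```python
def resize_nearest(image, target_width=224, target_height=224):
    """Resize a row-major image (list of rows) with nearest-neighbour sampling."""
    if not image or not image[0]:
        raise ValueError("image must contain at least one pixel")
    if target_width <= 0 or target_height <= 0:
        raise ValueError("target size must be positive")
    source_height, source_width = len(image), len(image[0])
    return [
        [
            # Map every target coordinate back to the nearest source pixel.
            image[y * source_height // target_height][x * source_width // target_width]
            for x in range(target_width)
        ]
        for y in range(target_height)
    ]


# Hypothetical 2x2 image with one value per pixel, scaled up to 224x224.
tiny = [[0, 1],
        [2, 3]]
resized = resize_nearest(tiny)
print(len(resized), len(resized[0]))
```

Whatever the source resolution of the camera or library picture, the model always receives the fixed 224×224 input it was trained on.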

Application that uses deep learning to predict cats and dogs and shows the probability of each class

Summary

This article introduced Core ML and the hardware innovations from Apple that enable inference on iOS. While frameworks such as TensorFlow can both train new models and run inference, Core ML focuses on inference (on-device training only became possible with Core ML 3). A model trained with TensorFlow or another third-party library needs to be converted to the Core ML format using the Python library coremltools. The converted model can then be integrated into an app and run through Core ML. In addition to Core ML, there are other frameworks such as TensorFlow Lite to perform inference on iOS. One of the core strengths of Core ML compared to the other frameworks is its performance: thanks to its hardware optimizations, Core ML can be much faster. In addition to software development, Apple is investing in hardware innovations. With the Neural Engine, Apple has created a vital component that gives iOS devices sufficient resources for inference on the end device. This keeps the data on the device without compromising the performance of the models. In conclusion, Apple has created an ecosystem through Core ML and hardware innovation that makes it easy to use machine learning in apps.

Blog author

Nico Axtmann
