Core ML Models for whisper.cpp

A guide for setting up and converting Whisper models to Core ML format on Apple Silicon Macs, following the official whisper.cpp Core ML instructions.

Prerequisites

Make sure the Xcode command-line tools are installed:

$ xcode-select --install

macOS Sonoma (14) or newer is recommended; older releases can suffer from transcription hallucinations with Core ML models.
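To confirm both prerequisites, a quick check (macOS-only commands, both shipped with the OS) is:

```shell
# Print the active developer directory; this errors out if the
# command-line tools are not installed
xcode-select -p

# Report the macOS version; 14.x or newer is recommended
sw_vers -productVersion
```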

Miniconda Installation Instructions

These commands install Miniconda on an Apple Silicon (ARM64) Mac. The first downloads the installer script with curl; the second runs it in batch mode (-b), updating any existing installation (-u) and installing into the given prefix (-p); the third removes the downloaded installer. Replace <MINICONDA_DIRECTORY> with your desired installation path, such as ~/miniconda3.

$ curl -LO https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
$ bash Miniconda3-latest-MacOSX-arm64.sh -b -u -p <MINICONDA_DIRECTORY>
$ rm Miniconda3-latest-MacOSX-arm64.sh

Create the Python Environment

These commands create and activate a new conda environment. The source command activates conda in your current shell session, making the conda command available; replace <MINICONDA_DIRECTORY> with your Miniconda installation path. conda create then makes an environment named “py311-whisper” with Python 3.11 (the version recommended by the upstream project): -n sets the environment name, python=3.11 pins the Python version, and -y answers “yes” to prompts automatically. Finally, conda activate switches into the new environment, where you can install and use packages.

$ source <MINICONDA_DIRECTORY>/bin/activate
$ conda create -n py311-whisper python=3.11 -y
$ conda activate py311-whisper

Install the Required Python Packages

These commands install the Python packages needed to convert Whisper models to Core ML format: ane_transformers provides Apple Neural Engine optimizations, openai-whisper provides the upstream OpenAI Whisper models, and coremltools performs the conversion to Apple’s Core ML format.

$ pip install ane_transformers
$ pip install openai-whisper
$ pip install coremltools
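A quick way to verify the environment before converting is to import all three packages (this assumes the py311-whisper environment is still active):

```shell
# Fails with an ImportError if any of the packages is missing
python -c "import ane_transformers, whisper, coremltools; print('tooling ok')"
```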

Download the whisper.cpp Models

These commands fetch the GGML version of the Whisper model. First, change into the models directory of your whisper.cpp checkout (replace <WHISPER_DIRECTORY> with its actual path). Then run the download script to fetch the “base” GGML model, a version of Whisper packaged for CPU inference with whisper.cpp. The base model offers a good balance between accuracy and resource usage.

$ cd <WHISPER_DIRECTORY>/models
$ ./download-ggml-model.sh base

Note: Other available models are tiny, small, medium, large-v1, large-v2, large-v3 and large-v3-turbo. English-only variants (e.g. base.en, small.en) are also available and tend to be more accurate for English audio.
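The download script saves each model as ggml-<name>.bin inside the models directory. If you script over several of the model names above, that naming convention can be captured in a small helper (ggml_file is a hypothetical name, not part of whisper.cpp):

```shell
# Map a whisper.cpp model name to the filename produced by
# download-ggml-model.sh
ggml_file() {
  printf 'ggml-%s.bin\n' "$1"
}

ggml_file base            # prints ggml-base.bin
ggml_file large-v3-turbo  # prints ggml-large-v3-turbo.bin
```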

Convert the whisper.cpp Models

This command converts the base Whisper model to Core ML format. The script fetches the corresponding OpenAI Whisper model (via the openai-whisper package installed earlier) and converts its encoder into Apple’s Core ML format, optimized for the Apple Neural Engine. The result is a models/ggml-base-encoder.mlmodelc directory that whisper.cpp loads automatically when built with -DWHISPER_COREML=1.

$ ./generate-coreml-model.sh base

Note: Repeat for other models if needed, e.g. tiny, small, medium, large-v1, large-v2, large-v3 or large-v3-turbo. The first run on a device is slow because the ANE service compiles the model to a device-specific format; subsequent runs are fast.
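With the Core ML model in place, whisper.cpp itself must be built with Core ML support enabled. A sketch of the upstream CMake flow follows (run from the whisper.cpp root; the binary name and sample file match recent versions of the repository and may differ in yours):

```shell
# Configure and build with the Core ML backend enabled
cmake -B build -DWHISPER_COREML=1
cmake --build build -j --config Release

# Transcribe a bundled sample; the matching ggml-base-encoder.mlmodelc
# in the models directory is picked up automatically
./build/bin/whisper-cli -m models/ggml-base.bin -f samples/jfk.wav
```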

Remove the Python Environment

These commands clean up the conda environment. conda deactivate exits the current environment and returns to the base environment; conda env remove then deletes the “py311-whisper” environment and all of its installed packages, freeing disk space. This is useful once you no longer need the conversion tooling.

$ conda deactivate
$ conda env remove -n py311-whisper