Core ML Models for whisper.cpp
A guide for setting up and converting Whisper models to Core ML format on Apple Silicon Macs, following the official whisper.cpp Core ML instructions.
Prerequisites
Make sure Xcode and the command-line tools are installed:
$ xcode-select --install
macOS Sonoma (14) or newer is recommended; older releases can suffer from transcription hallucinations with Core ML models.
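Before proceeding, you can confirm the tools are actually present. This check is a sketch and assumes a macOS shell; `xcode-select -p` prints the active developer directory and exits non-zero when the Command Line Tools are missing.

```shell
# Optional pre-flight check: print the active developer directory.
# `xcode-select -p` fails if the Command Line Tools are not installed.
if command -v xcode-select >/dev/null 2>&1; then
  xcode-select -p || echo "Command Line Tools missing; run: xcode-select --install"
else
  echo "xcode-select not found (not running on macOS?)"
fi
```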
Miniconda Installation Instructions
These commands install Miniconda on an Apple Silicon (arm64) Mac. The first command downloads the installation script with curl. The second runs the script in batch mode (-b), updating any existing installation (-u) and installing to the specified directory (-p). The third removes the downloaded installer.
Replace <MINICONDA_DIRECTORY> with your desired installation path, like ~/miniconda3.
$ curl -LO https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
$ bash Miniconda3-latest-MacOSX-arm64.sh -b -u -p <MINICONDA_DIRECTORY>
$ rm Miniconda3-latest-MacOSX-arm64.sh
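To confirm the installation succeeded, you can ask the bundled conda binary for its version. A sketch; the `$HOME/miniconda3` default below is an assumption, so substitute the path you chose for <MINICONDA_DIRECTORY>.

```shell
# Verify the install (sketch): the default path is an assumption;
# point MINICONDA_DIRECTORY at your actual installation path.
MINICONDA_DIRECTORY="${MINICONDA_DIRECTORY:-$HOME/miniconda3}"
if [ -x "$MINICONDA_DIRECTORY/bin/conda" ]; then
  "$MINICONDA_DIRECTORY/bin/conda" --version
else
  echo "conda not found under $MINICONDA_DIRECTORY"
fi
```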
Create the Python Environment
These commands set up and activate a new conda environment. First, the source command activates conda in your current shell session, making the conda command available. Replace <MINICONDA_DIRECTORY> with your Miniconda installation path.
The conda create command makes a new environment named “py311-whisper” with Python 3.11 installed (the version recommended by the upstream project). The -n flag sets the environment name, while python=3.11 specifies the Python version. The -y flag automatically answers “yes” to prompts.
Finally, conda activate switches into the newly created environment, where you can start installing and using packages.
$ source <MINICONDA_DIRECTORY>/bin/activate
$ conda create -n py311-whisper python=3.11 -y
$ conda activate py311-whisper
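Once activated, a quick check confirms the shell is using the environment's interpreter. This is a sketch to be run inside the activated environment:

```shell
# Confirm the active interpreter comes from the new environment.
if command -v python >/dev/null 2>&1; then
  python --version     # expect: Python 3.11.x inside py311-whisper
  command -v python    # path should contain envs/py311-whisper
fi
```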
Install the Required Python Packages
These commands install the Python packages needed to convert Whisper models to Core ML format. ane_transformers provides Apple Neural Engine optimizations, openai-whisper is the upstream speech recognition model, and coremltools performs the conversion to Apple’s Core ML format.
$ pip install ane_transformers
$ pip install openai-whisper
$ pip install coremltools
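A quick import check (a sketch, run inside the activated environment) confirms all three packages installed cleanly. Note that the pip package openai-whisper is imported under the module name whisper.

```shell
# Report which of the three converter dependencies import cleanly.
# The pip package openai-whisper is imported as "whisper".
if command -v python >/dev/null 2>&1; then PY=python; else PY=python3; fi
"$PY" - <<'EOF'
for mod in ("ane_transformers", "whisper", "coremltools"):
    try:
        __import__(mod)
        print(f"{mod}: OK")
    except ImportError as exc:
        print(f"{mod}: missing ({exc})")
EOF
```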
Download the whisper.cpp Models
These commands fetch the GGML version of the Whisper model. The first changes to the models directory of your whisper.cpp checkout (replace <WHISPER_DIRECTORY> with its actual path). The second runs a script that downloads the "base" size GGML model, an optimized version of Whisper designed for CPU inference. The base model offers a good balance between accuracy and resource usage.
$ cd <WHISPER_DIRECTORY>/models
$ ./download-ggml-model.sh base
Note: Other available models are tiny, small, medium, large-v1, large-v2, large-v3 and large-v3-turbo. English-only variants (e.g. base.en, small.en) are also available and tend to be more accurate for English audio.
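You can confirm the download by listing the GGML files, which land in the models directory as ggml-<name>.bin. A small sketch:

```shell
# List any downloaded GGML models in the current directory.
ls -lh ggml-*.bin 2>/dev/null || echo "no ggml-*.bin files here yet"
```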
Convert the whisper.cpp Models
This command converts the base Whisper model to Core ML format. It runs a script that transforms the previously downloaded GGML base model into Apple’s Core ML format, optimized for the Apple Neural Engine. The result is a models/ggml-base-encoder.mlmodelc directory that whisper.cpp will load automatically when built with -DWHISPER_COREML=1.
$ ./generate-coreml-model.sh base
Note: Repeat for other models if needed, e.g. tiny, small, medium, large-v1, large-v2, large-v3 or large-v3-turbo. The first run on a device is slow because the ANE service compiles the model to a device-specific format; subsequent runs are fast.
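With the Core ML encoder generated, whisper.cpp must be rebuilt with Core ML enabled before it will pick the model up. A sketch following the upstream build instructions, assuming you are in the whisper.cpp source directory with CMake available (whisper-cli and samples/jfk.wav ship with the repository):

```shell
# Build whisper.cpp with Core ML support and run a quick transcription test.
# Assumes the current directory is a whisper.cpp checkout with cmake installed.
if command -v cmake >/dev/null 2>&1 && [ -f CMakeLists.txt ]; then
  cmake -B build -DWHISPER_COREML=1
  cmake --build build -j --config Release
  ./build/bin/whisper-cli -m models/ggml-base.bin -f samples/jfk.wav
else
  echo "run this from the whisper.cpp source directory with cmake installed"
fi
```

On a successful run, the log reports that the Core ML encoder was loaded alongside the GGML model.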
Remove the Python Environment
These commands clean up the conda environment. conda deactivate exits the current environment and returns to the base environment; conda env remove then deletes the "py311-whisper" environment and all its installed packages, freeing disk space once the conversion work is done.
$ conda deactivate
$ conda env remove -n py311-whisper