Convert an Image to CVPixelBuffer for Machine Learning in Swift
Introduction
Most Core ML image models do not want a UIImage directly. They want a CVPixelBuffer with a specific size and pixel format, so converting the image correctly is part of getting reliable predictions, not just plumbing.
What the Model Actually Expects
Before converting anything, check the model input description in Xcode or in code. The important details are:
- width and height
- color format
- whether the model expects an image or a multi-array
If the model expects 224 x 224 RGB-style image input and you feed a differently sized buffer, the prediction may fail or the framework may add resizing that you did not intend.
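Checking the input description in code can be sketched roughly as follows. The helper name `describeInputs` is hypothetical; the properties it reads (`modelDescription`, `inputDescriptionsByName`, `imageConstraint`, `multiArrayConstraint`) are part of the Core ML API, and the output format is just one possible choice.

```swift
import CoreML

// Hypothetical helper: summarizes each model input so you can confirm
// the size, the pixel format, and whether it is an image or a multi-array.
func describeInputs(of model: MLModel) -> [String] {
    let inputs = model.modelDescription.inputDescriptionsByName
    return inputs.sorted { $0.key < $1.key }.map { name, feature -> String in
        switch feature.type {
        case .image:
            guard let c = feature.imageConstraint else {
                return "\(name): image (no constraint reported)"
            }
            // pixelFormatType is an OSType code such as kCVPixelFormatType_32BGRA.
            return "\(name): image \(c.pixelsWide)x\(c.pixelsHigh), pixel format code \(c.pixelFormatType)"
        case .multiArray:
            guard let c = feature.multiArrayConstraint else {
                return "\(name): multi-array (no constraint reported)"
            }
            return "\(name): multi-array, shape \(c.shape)"
        default:
            return "\(name): other feature type"
        }
    }
}
```

Printing these lines once during development is usually enough to decide which width, height, and pixel format the conversion step should target.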
A Practical UIImage to CVPixelBuffer Conversion
The common pattern is:
- create a pixel buffer
- lock its memory
- draw the image into that memory with Core Graphics
- unlock the buffer
This is the standard shape of the solution: create, lock, draw, unlock.
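A minimal sketch of those four steps is shown here. The function name `makePixelBuffer` is hypothetical, and the choice of `kCVPixelFormatType_32BGRA` is an assumption; use whatever format your model's input description reports. Note that `image.cgImage` ignores the `UIImage` orientation and is `nil` for images backed only by a `CIImage`.

```swift
import UIKit
import CoreVideo

// Hypothetical helper: draws a UIImage into a new BGRA pixel buffer
// of the requested size. Returns nil if any step fails.
func makePixelBuffer(from image: UIImage, width: Int, height: Int) -> CVPixelBuffer? {
    guard let cgImage = image.cgImage else { return nil }

    // 1. Create the pixel buffer.
    let attributes: [CFString: Any] = [
        kCVPixelBufferCGImageCompatibilityKey: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true
    ]
    var buffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                                     kCVPixelFormatType_32BGRA,
                                     attributes as CFDictionary, &buffer)
    guard status == kCVReturnSuccess, let pixelBuffer = buffer else { return nil }

    // 2. Lock its memory; 4. unlock on every exit path.
    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }

    // 3. Draw the image into that memory with Core Graphics.
    // BGRA corresponds to little-endian 32-bit with alpha first.
    let bitmapInfo = CGImageAlphaInfo.premultipliedFirst.rawValue
        | CGBitmapInfo.byteOrder32Little.rawValue
    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                                  width: width,
                                  height: height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: bitmapInfo) else { return nil }

    // Scales the whole image to fill the target rectangle.
    context.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
    return pixelBuffer
}
```

A call such as `makePixelBuffer(from: photo, width: 224, height: 224)` would match a model that expects 224 x 224 input. The `defer` block keeps the lock and unlock paired even when context creation fails.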
Using the Buffer with Core ML
Once you have the pixel buffer, pass it to the model's prediction call. The details of the generated API vary by model, but the conversion step stays about the same.
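Because the generated class differs per model, the sketch below uses the generic `MLModel` API instead. The function name `predict` and the default input name `"image"` are assumptions; substitute the input name your model actually declares.

```swift
import CoreML
import CoreVideo

// Hypothetical helper: wraps a pixel buffer in a feature provider and
// runs a prediction through the generic MLModel interface.
// "image" is an assumed input name, not something every model uses.
func predict(with model: MLModel,
             pixelBuffer: CVPixelBuffer,
             inputName: String = "image") throws -> MLFeatureProvider {
    let input = try MLDictionaryFeatureProvider(
        dictionary: [inputName: MLFeatureValue(pixelBuffer: pixelBuffer)]
    )
    return try model.prediction(from: input)
}
```

With an Xcode-generated model class, the equivalent call is typically a typed `prediction` method that takes the pixel buffer directly, which is more convenient when the model is known at compile time.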
Orientation and Resizing Matter
A frequent bug is that the image looks correct on screen but arrives rotated or mirrored in the pixel buffer. UIImage can carry orientation metadata, while CGImage is just raw pixels. If the source image came from the camera or photo library, normalize orientation before converting, or make sure your drawing step handles it.
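One way to normalize orientation is to redraw the image through UIKit, which applies the orientation metadata while drawing. This is a sketch under that assumption, and `normalizedOrientation` is a hypothetical name.

```swift
import UIKit

// Hypothetical helper: returns an image whose pixels are already upright,
// so its cgImage matches what the user sees on screen.
func normalizedOrientation(of image: UIImage) -> UIImage {
    guard image.imageOrientation != .up else { return image }

    let format = UIGraphicsImageRendererFormat.default()
    format.scale = image.scale  // keep the original pixel density

    let renderer = UIGraphicsImageRenderer(size: image.size, format: format)
    return renderer.image { _ in
        // UIImage.draw(in:) honors imageOrientation; CGContext.draw does not.
        image.draw(in: CGRect(origin: .zero, size: image.size))
    }
}
```

Running this before reading `cgImage` costs one extra render, so it suits occasional predictions better than per-frame camera work.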
Resizing is another hidden decision. A plain context.draw call into the full target rectangle scales the whole image to that rectangle, which distorts the aspect ratio whenever the source and target shapes differ. Some models tolerate that, but others perform better if you crop or letterbox first.
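Letterboxing comes down to computing a centered rectangle that preserves the source aspect ratio and drawing into that instead of the full target. The helper name `aspectFitRect` is hypothetical.

```swift
import CoreGraphics

// Hypothetical helper: the largest centered rect inside `target` that
// keeps the aspect ratio of `source` (letterbox / aspect fit).
func aspectFitRect(for source: CGSize, in target: CGSize) -> CGRect {
    guard source.width > 0, source.height > 0 else { return .zero }

    // Use the smaller scale factor so the whole image stays visible.
    let scale = min(target.width / source.width, target.height / source.height)
    let size = CGSize(width: source.width * scale, height: source.height * scale)

    return CGRect(x: (target.width - size.width) / 2,
                  y: (target.height - size.height) / 2,
                  width: size.width,
                  height: size.height)
}
```

For a 512 x 256 source and a 256 x 256 target, this yields a 256 x 128 rectangle at origin (0, 64). Replacing `min` with `max` gives a center crop (aspect fill), since the drawn rectangle then extends past the target and is clipped. When letterboxing, fill the buffer with a known color first, because the uncovered bands are otherwise left with whatever the buffer contained.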
Performance Considerations
For occasional predictions, the conversion above is fine. For real-time camera inference, repeated UIImage conversion can become a bottleneck. In that case:
- prefer using the camera's existing CVPixelBuffer
- reuse buffers when possible
- avoid unnecessary image-format hops
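For buffer reuse, Core Video provides `CVPixelBufferPool`. The sketch below assumes BGRA buffers of a fixed size; the function names `makePool` and `dequeueBuffer` are hypothetical.

```swift
import CoreVideo

// Hypothetical helper: creates a pool that hands out same-sized BGRA buffers.
func makePool(width: Int, height: Int) -> CVPixelBufferPool? {
    let bufferAttributes: [CFString: Any] = [
        kCVPixelBufferPixelFormatTypeKey: Int(kCVPixelFormatType_32BGRA),
        kCVPixelBufferWidthKey: width,
        kCVPixelBufferHeightKey: height,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true
    ]
    var pool: CVPixelBufferPool?
    let status = CVPixelBufferPoolCreate(kCFAllocatorDefault, nil,
                                         bufferAttributes as CFDictionary, &pool)
    return status == kCVReturnSuccess ? pool : nil
}

// Hypothetical helper: takes a buffer from the pool. The pool recycles
// the underlying memory once the returned buffer is released.
func dequeueBuffer(from pool: CVPixelBufferPool) -> CVPixelBuffer? {
    var buffer: CVPixelBuffer?
    let status = CVPixelBufferPoolCreatePixelBuffer(kCFAllocatorDefault, pool, &buffer)
    return status == kCVReturnSuccess ? buffer : nil
}
```

Creating the pool once and dequeuing per frame avoids allocating fresh memory for every prediction.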
If your input already comes from AVCaptureVideoDataOutput, you often receive a CVPixelBuffer directly and should skip UIImage entirely.
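A capture delegate that hands the frame's pixel buffer straight to your inference code might look like this. The class name `FrameHandler` and the `onFrame` callback are hypothetical; the delegate method and `CMSampleBufferGetImageBuffer` are the standard AVFoundation and Core Media APIs.

```swift
import AVFoundation
import CoreVideo

// Hypothetical delegate: extracts the CVPixelBuffer from each camera frame
// without going through UIImage.
final class FrameHandler: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    // Assumed callback; wire this to your prediction code.
    var onFrame: ((CVPixelBuffer) -> Void)?

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        onFrame?(pixelBuffer)
    }
}

// Asking the output for BGRA frames keeps the format consistent with a
// model that expects kCVPixelFormatType_32BGRA (an assumption here).
func makeVideoOutput() -> AVCaptureVideoDataOutput {
    let output = AVCaptureVideoDataOutput()
    output.videoSettings = [
        kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
    ]
    output.alwaysDiscardsLateVideoFrames = true
    return output
}
```

The camera frame is usually larger than the model input, so some resizing still has to happen, but it can operate on pixel buffers rather than on `UIImage`.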
Common Pitfalls
- Converting to the wrong width, height, or pixel format for the model.
- Ignoring image orientation and feeding the model rotated or mirrored input.
- Stretching images without thinking about aspect ratio.
- Converting camera frames through UIImage when a CVPixelBuffer already exists.
- Forgetting to lock and unlock the pixel buffer around memory access.
Summary
- Core ML image models commonly expect CVPixelBuffer, not UIImage.
- Correct conversion requires the right size, pixel format, and drawing step.
- Orientation and aspect-ratio handling can affect model accuracy.
- For camera pipelines, avoid unnecessary UIImage conversions when you already have pixel buffers.
- Treat the conversion as part of model input correctness, not just a format workaround.

