Core ML and Vision for iOS

Apple showed off some spectacular tech demos throughout this year’s WWDC, particularly, those related to ARKit and it’s base framework Core ML. I’ve only dabbled with basic machine learning in the past (K Nearest Neighbor barely counts), and was intrigued at the amount of abstraction Apple provides in implementing computer vision, natural language processing, and working with custom models. After watching the Introducing Core ML session, I decided to get my hands dirty and create a little image recognition app fueled by machine learning.

Starting off was easy enough, after setting up a basic single page application with an UIImageView, a couple of labels, and a set of buttons, I picked a demo model (I went with Inception V3), and dragged it into Xcode 9. With the model in the project, all that was left was to reference the model, and have it make a prediction based on an image the user provides.

model
Once the model has been imported, clicking on the file in Xcode 9 will reveal details specific to the model, including what inputs and outputs. In the case of Inception V3, the model expects an image and will return a dictionary of labels and probabilities.

Using the model

let model = try VNCoreMLModel(for: Inceptionv3().model)
let request = VNCoreMLRequest(model: model, completionHandler: displayPredictions)
let handler = VNImageRequestHandler(cgImage: image.cgImage!)
try handler.perform([request])

These two code blocks are where the magic happens. Above, I referenced the model I imported, specify a completion handler, provide the image, and initiate the prediction.

Viewing predictions

func displayPredictions(request: VNRequest, error: Error?) {
   // Make sure we have a result
   guard let results = request.results as? [VNClassificationObservation]
   else { fatalError("Bad prediction") }

   // Sort results by confidence
   results.sorted(by: {$0.confidence > $1.confidence})

   // Show prediction results
   print("\(results[0].identifier) - \(results[0].confidence)%")
   print("\(results[1].identifier) - \(results[1].confidence)%")
   print("\(results[2].identifier) - \(results[2].confidence)%")
}

When the prediction request completes, the completion handler above is called, handling the results. I did a simple sort based on the confidence percentage provided by the model and displayed the top three results.

? Success: 98.32451%

Pretty dang easy. If you’d like to take a look at my example project, head on on over to GitHub.