Inside Microsoft’s Custom Vision AI: Code-Free Automated ML for Image Classification

Microsoft has been working to establish itself in artificial intelligence, and Custom Vision AI is its latest entry in the image recognition space. Custom Vision is part of Cognitive Services, the suite of machine learning services offered on Microsoft’s Azure cloud platform. It was introduced to complement the existing Computer Vision API with a customisable image classification API that offers a code-free user experience. It lets developers quickly add image recognition capabilities to iOS and Android apps and build sophisticated vision applications with minimal time and effort.


Custom Vision follows Seeing AI, Microsoft’s AI-based app that uses the phone’s camera to describe the surroundings, including people and objects, to visually impaired users. It is another attempt by the company to use intelligent services, this time to let developers train, deploy and optimise image classifiers that identify objects within their applications.

“Our mission is to bring AI to every developer and every organisation on the planet, and help businesses augment human ingenuity in unique and differentiated ways”, Joseph Sirosh, corporate vice president of Artificial Intelligence and Research at Microsoft, said in a blog post. The new tool allows developers to create and train a custom vision model and export it in a few clicks. This gives developers a quick way to take their custom model to any environment, whether their scenario requires the model to run on-premises, in the cloud, or on mobile and edge devices, he said.

Getting into details

Microsoft has been making progress in AI by bringing capabilities such as transfer learning and automated machine learning to developers, and Custom Vision Service is one such effort. It is a cloud-based tool for training, deploying and improving custom image classifiers. With just a handful of images per category, developers can train their own image classifier in minutes.

It can be described as a collection of application programming interfaces (APIs) and software development kits (SDKs) under one roof that helps developers add machine learning to their applications.

Cornelia Carapcea, Senior Project Manager at Microsoft, demonstrated the technology at Microsoft’s Build conference in Seattle. She said that creating a custom vision model requires only a small sample of training data, perhaps a few dozen photographs, and Custom Vision does the rest. The model is hosted on Microsoft’s servers and accessed through a REST (Representational State Transfer) API, and it can be used for tasks such as identifying food and landmarks, or in retail environments. She also explained that Custom Vision can select the images most likely to improve your model and let you tag them manually, which then improves overall accuracy and reliability.
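The article does not say how Custom Vision ranks the images it suggests for manual tagging. One common heuristic for this kind of suggestion is uncertainty sampling: show the user the images the model is least sure about first. The sketch below illustrates that idea only; the function name and the input format (a mapping from image name to the predicted probability of a single tag) are hypothetical and not part of the service.

```python
def rank_by_uncertainty(car_probabilities):
    """Order images so the ones the model is least sure about come first.

    `car_probabilities` maps an image name to the predicted probability
    that the image shows a car (a hypothetical format for this sketch).
    """
    # A probability near 0.5 means the model can barely decide either way,
    # so a human-supplied tag for that image is likely to be most useful.
    return sorted(
        car_probabilities,
        key=lambda name: abs(car_probabilities[name] - 0.5),
    )


scores = {"img_a.jpg": 0.98, "img_b.jpg": 0.52, "img_c.jpg": 0.15}
print(rank_by_uncertainty(scores))
# → ['img_b.jpg', 'img_c.jpg', 'img_a.jpg']
```

Here the image scored 0.52 is surfaced first because the model is closest to undecided on it.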

“Today, in addition to hosting your classifiers at a REST endpoint, you can now export models to run offline, starting with export to the CoreML format for iOS 11 and to the TensorFlow format for Android”, Sirosh had said.

Making your own model using Custom Vision AI

A tool for building custom image classifiers, Custom Vision AI generates results based on the categories it has been trained on. According to the company, it needs only 50 images to get started and can return results in as little as one minute. As Carapcea explains, the API is split into two parts: the Prediction API and the Training API. The Prediction API can be thought of as a public service for making predictions with trained classifiers, while the Training API is a private management system where users can add projects, add images and edit text, among other tasks.
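To make the Prediction API side concrete, here is a minimal sketch of how a client might prepare a classification request for an image URL. The URL path layout and the `Prediction-Key` header are assumptions based on the v3.0 prediction REST API, which is newer than the version described in this article, so check them against the current Azure documentation. The endpoint, project ID, iteration name and key are placeholders. The code only builds the request; it does not send it.

```python
import json
import urllib.request


def build_prediction_request(endpoint, project_id, iteration_name,
                             prediction_key, image_url):
    """Prepare (but do not send) a classify-by-URL request.

    The path layout and header name are assumptions, not taken from the article.
    """
    url = (
        f"{endpoint.rstrip('/')}/customvision/v3.0/Prediction/"
        f"{project_id}/classify/iterations/{iteration_name}/url"
    )
    # The image to classify is passed as JSON in the request body.
    body = json.dumps({"Url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Prediction-Key": prediction_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_prediction_request(
    "https://example.invalid",          # placeholder endpoint
    "my-project-id",                    # placeholder project ID
    "Iteration1",                       # placeholder iteration name
    "my-prediction-key",                # placeholder key
    "https://example.invalid/car.jpg",  # placeholder image URL
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without credentials.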

The steps to create your model are as follows:

  • To get started, choose a set of 30 to 50 images of whatever you would like the model to recognise, such as cars, leaves or sheepdogs.

  • Sign in to the Custom Vision page. If you do not have an account, you can create your login details there.

  • Once logged in, create a new project and start adding images.

  • Save all the images (cars, in this case).

  • Once all the images are added, click the ‘Train’ button to perform the initial training. The results are displayed once training completes. The service provides a Prediction API that classifies new images from an image URL or an image file.

  • The final step is testing, which can be done by clicking ‘Quick Test’ and selecting a photo that wasn’t used in training. The service returns the probability that the image belongs to the class you trained and saves the result in the Predictions tab.

  • If the model reports that the image is 100% a car (in this case), the machine learning model is working and has identified the right image class. If it reports 0%, the image does not resemble the class that was trained (e.g. a hot air balloon).
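The last two steps amount to reading a probability per tag and deciding whether it is high enough to trust. A minimal sketch of that check follows. It assumes the prediction result is JSON with a `predictions` list whose entries carry `tagName` and `probability` fields; that shape, the function name and the 0.5 threshold are assumptions for illustration, and the sample data is made up.

```python
import json


def top_prediction(response_text, threshold=0.5):
    """Return (tag, probability) for the most likely tag, or None.

    None means either no predictions were returned or the best one
    fell below `threshold` (an arbitrary cut-off for this sketch).
    """
    payload = json.loads(response_text)
    predictions = payload.get("predictions", [])
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["probability"])
    if best["probability"] < threshold:
        return None
    return best["tagName"], best["probability"]


# Made-up response for a photo of a car.
sample = json.dumps({
    "predictions": [
        {"tagName": "car", "probability": 0.97},
        {"tagName": "hot air balloon", "probability": 0.03},
    ]
})
print(top_prediction(sample))
# → ('car', 0.97)
```

A photo that resembles none of the trained classes would score low on every tag and return `None` here.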

“Once you have created and trained your custom vision model through the service, it’s a matter of a few clicks to get your model exported from the service. This allows developers a quick way to take their custom model with them to any environment whether their scenario requires that the model run on-premises, in the cloud, or on mobile and edge devices. This provides the most flexible and easy way for developers to export and embed custom vision models in minutes with “zero” coding”, said Sirosh in the blogpost.

On a concluding note

Microsoft is aggressively venturing into crucial areas of research such as image identification and moderation, language analytics, speech recognition and knowledge mapping. With its latest service, Microsoft has made it easy for non-coders to get into image classification, and all the APIs used by the website are set to go public. Its competitors Google and Amazon are also exploring the space.

In June 2017, Google open-sourced more of its machine learning computer vision technologies, making them available to developers via its TensorFlow Object Detection API. The API helps developers and researchers create systems that automatically detect and identify multiple objects in a single image.

Amazon is also using its computer vision technologies to eliminate checkout lines at brick-and-mortar stores.

Srishti Deoras
Srishti currently works as Associate Editor at Analytics India Magazine. When not covering the analytics news, editing and writing articles, she could be found reading or capturing thoughts into pictures.
