Why Is India A Hotspot For Data Labelling Services?

An AI or machine learning model is only as good as the data it is trained on. As things stand, 80 percent of the time on an AI project is earmarked for wrangling training data, including data labelling. Data labelling takes up the bulk of data scientists’ time, which could otherwise be devoted to building the algorithm.

Many companies outsource data labelling and annotation so the data scientists can focus on algorithm development and avoid project delays. According to a 2019 Cognilytica report, the market valuation of third-party data labelling services is projected to cross $1 billion by 2023.

Indian market

India has emerged as the top outsourcing destination for data labelling for clear reasons: globalisation, a demographic advantage, and cheap labour, to name a few. Thanks to the BPO boom in the country at the turn of the century, the Indian workforce was more than ready to take up data labelling jobs.



Since data labelling is a process-driven task, even a person with a high-school education can pick up the required skills. The market for data annotation and labelling is exploding in India.

“AI requires properly annotated, classified and anonymised data. For this, whether you like it or not, you will use automation, but you will also have to use a skilled human workforce, and that is the opportunity it presents for India,” said Sangeeta Gupta, Senior VP at NASSCOM, in an earlier interview.

In the initial stages, Amazon’s MTurk was the go-to portal for finding data labelling and annotation jobs. Freelancers were paid based on the tasks they completed. However, Amazon soon placed a restriction on non-US workers (lifted later).

Amazon MTurk paved the way for similar organisations. The popular ones include:

Playment: A complete data labelling platform founded by ex-Flipkart employees Siddharth Mall, Ajinkya Malasane and Akshay Kumar Lal. It breaks down large labelling tasks into micro-tasks and distributes them among its community of trained annotators. As of 2019, Playment’s platform had 300,000 labellers and annotators. A labeller attached to Playment can earn up to 30,000 per month.

iMerit: iMerit’s data labelling services are used in advanced machine learning algorithms, computer vision, natural language processing, augmented reality, and data analytics. According to the company, its workforce is trained to label data for transformative applications such as cancer research, driverless car training, and crop yield optimisation. The company is funded by Omidyar Network and the Michael & Susan Dell Foundation.

Infolks: Founded in 2016, Infolks is among India’s top data labelling companies. It offers services in machine learning, artificial intelligence, training data as a service, image annotation and data categorisation.
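
The micro-task model described for Playment above, where a large labelling job is broken into small pieces and spread across annotators, can be illustrated with a minimal Python sketch. This is not Playment's actual implementation: the function names, file names, batch size and round-robin assignment are all assumptions made for illustration.

```python
from itertools import cycle


def split_into_microtasks(items, batch_size):
    """Split a large labelling job into fixed-size batches (micro-tasks)."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def assign_round_robin(microtasks, annotators):
    """Hand out micro-tasks to annotators in turn (hypothetical policy)."""
    assignments = {name: [] for name in annotators}
    for task, name in zip(microtasks, cycle(annotators)):
        assignments[name].append(task)
    return assignments


# Hypothetical job: ten images to label, three per micro-task.
images = [f"image_{n:03d}.jpg" for n in range(10)]
microtasks = split_into_microtasks(images, batch_size=3)
assignments = assign_round_robin(microtasks, ["annotator_a", "annotator_b"])

for name, tasks in assignments.items():
    print(name, tasks)
```

A real platform would likely also weigh annotator skill, availability and quality checks when routing work; simple rotation is used here only to keep the sketch short.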

Hot takes

“Numerous data labelling firms have sprung up to address this growing need, and many of them are tapping into a global pool of ‘gig workers’ that can get this done effectively. Software and algorithms make it easier to divvy up tasks and have people work at their convenience. India offers a huge talent pool with ready access to smartphones and the ability to tap into a new income source or to supplement their earnings. Time difference, in this case, can even be an asset,” said Girish Muckai, Chief Sales & Marketing Officer of HEAL Software Inc.

“Training AI models to deliver high levels of accuracy is critical to success. However, labelling training data sets is tedious work. It’s time consuming, complex and requires significant workforce. The tech industry’s outsourcing boom in India and its large population, make it a growing hotbed of this precision work. Its people and skills position India as a key resource for years to come in an increasingly digital world,” said Lori McKellar, Senior Director, Product Marketing at OpenText. 

“India has emerged as a huge pool of employable workers to undertake data labelling jobs. The reasons are essentially the same which led to the expansion of the BPO/KPO service industry in India in the past 20 years:

  • Cost-effective workforce
  • English literacy and basic computing skills
  • High speed and cheap internet
  • Stable economy – compared to some other East-European/African/South-Asian countries

The need to provide a reliable and cheap way to produce training data is paramount now. Most of these are quite low pay + low skill jobs (compared to an average software developer) and require considerable basic training for the employee to become autonomous. Very soon, other developing economies like Romania, Indonesia, Vietnam, the Philippines etc. are likely to follow through and join this sphere, mostly due to the same factors/reasons. If India wants to maintain a lead in this market, we’ll have to keep evolving consistently by providing similar support to other AI operations which require more complexity and mid to high level of technical competency,” said Shishir Thakur, CEO and founder, Cranberry Tech.


Shraddha Goled
I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.
