Why Is India a Hotspot for Data Labelling Services?

An AI or machine learning model is only as good as the data it is trained on. As things stand, 80 percent of the time on an AI project goes to wrangling training data, including data labelling. Data labelling takes up the bulk of data scientists’ time, which could otherwise be devoted to building the algorithm.

Many companies outsource data labelling and annotation so the data scientists can focus on algorithm development and avoid project delays. According to a 2019 Cognilytica report, the market valuation of third-party data labelling services is projected to cross $1 billion by 2023.

Indian market

India has emerged as the top outsourcing destination for data labelling for several clear reasons: globalisation, a demographic advantage, and cheap labour, to name a few. Thanks to the BPO boom in the country at the turn of the century, the Indian workforce was well prepared to take up data labelling jobs.

Since data labelling is a process-driven task, even a person with a high-school education can pick up the required skills. The market for data annotation and labelling is growing rapidly in India.

“AI requires properly annotated, classified and anonymised data. For this, whether you like it or not, you will use automation, but you will also have to use a skilled human workforce, and that is the opportunity it presents for India,” said Sangeeta Gupta, Senior VP at NASSCOM, in an earlier interview.

In the initial stages, Amazon’s MTurk was the go-to portal for finding data labelling and annotation jobs, with freelancers paid based on the tasks they completed. However, Amazon soon placed restrictions on non-US workers, which were later lifted.

Amazon MTurk paved the way for similar organisations. The popular ones include:

Playment: A complete data labelling platform founded by ex-Flipkart employees Siddharth Mall, Ajinkya Malasane and Akshay Kumar Lal. It breaks down large labelling tasks into micro-tasks and distributes them among its community of trained annotators. As of 2019, Playment’s platform had 300,000 labellers and annotators. A labeller working with Playment can earn up to 30,000 per month.

iMerit: iMerit’s data labelling services are used in advanced machine learning algorithms, computer vision, natural language processing, augmented reality, and data analytics. According to the company, its workforce is trained to label data for applications such as cancer research, driverless car training, and crop yield optimisation. The company is funded by Omidyar Network and the Michael & Susan Dell Foundation.

Infolks: Founded in 2016, Infolks is among India’s top data labelling companies. It offers services in machine learning, artificial intelligence, training data as a service, image annotation and data categorisation.

Hot takes

“Numerous data labelling firms have sprung up to address this growing need, and many of them are tapping into a global pool of ‘gig workers’ that can get this done effectively. Software and algorithms make it easier to divvy up tasks and have people work at their convenience. India offers a huge talent pool with ready access to smartphones and the ability to tap into a new income source or to supplement their earnings. Time difference, in this case, can even be an asset,” said Girish Muckai, Chief Sales & Marketing Officer of HEAL Software Inc.

“Training AI models to deliver high levels of accuracy is critical to success. However, labelling training data sets is tedious work. It’s time consuming, complex and requires significant workforce. The tech industry’s outsourcing boom in India and its large population make it a growing hotbed of this precision work. Its people and skills position India as a key resource for years to come in an increasingly digital world,” said Lori McKellar, Senior Director, Product Marketing at OpenText.

“India has emerged as a huge pool of employable workers to undertake data labelling jobs. The reasons are essentially the same which led to the expansion of the BPO/KPO service industry in India in the past 20 years:

  • Cost-effective workforce
  • English literacy and basic computing skills
  • High speed and cheap internet
  • Stable economy – compared to some other East-European/African/South-Asian countries

The need to provide a reliable and cheap way to produce training data is paramount now. Most of these are quite low pay + low skill jobs (compared to an average software developer) and require considerable basic training for the employee to become autonomous. Very soon, other developing economies like Romania, Indonesia, Vietnam, the Philippines etc. are likely to follow through and join this sphere, mostly due to the same factors/reasons. If India wants to maintain a lead in this market, we’ll have to keep evolving consistently by providing similar support to other AI operations which require more complexity and mid to high level of technical competency,” said Shishir Thakur, CEO and founder, Cranberry Tech.


Shraddha Goled

I am a technology journalist with AIM. I write stories focused on the AI landscape in India and around the world with a special interest in analysing its long term impact on individuals and societies. Reach out to me at shraddha.goled@analyticsindiamag.com.