Inside The Gold Rush For Indian Language-Based AI Products

When in Rome, do as the Romans do, they say. With one report suggesting that the Indian consumer spends more than 50 percent of their time on Hindi videos and more than 40 percent on regional content, it comes as no surprise that several local players and startups in India have made great strides into the Indian local language ecosystem, especially media. As this trend grows, there is a special emphasis on advanced voice products.

The technology giants from Silicon Valley are looking to capitalise on the trend of growing local language consumption in the Indian market. India is one of the fastest growing markets for mobile as well as internet usage. By December 2017, the number of internet users in India was about 481 million, which was more than 11 percent growth over the previous year. The same report also predicted that by June 2018 the number of internet users would go up to 500 million, taking internet adoption to 35 percent of the total population.
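As a quick sanity check, the arithmetic behind these figures can be sketched as below. The function and variable names are illustrative, and "growth over the previous year" is assumed to mean growth over the count twelve months earlier. The results are only what the quoted numbers imply, not independently sourced statistics.

```python
# Back-of-the-envelope check of the figures quoted in the article.
# All inputs come from the text; the helper names are illustrative.

def implied_base(current: float, growth_pct: float) -> float:
    """Value one period earlier, given the current value and percent growth."""
    return current / (1 + growth_pct / 100)


def implied_total(part: float, share_pct: float) -> float:
    """Total implied by a part and the percentage share it represents."""
    return part / (share_pct / 100)


users_dec_2017 = 481.0  # million users, December 2017
growth_pct = 11.0       # percent ("more than 11 percent", so a lower bound)
users_jun_2018 = 500.0  # million users, predicted for June 2018
adoption_pct = 35.0     # percent of the total population

prev_year_users = implied_base(users_dec_2017, growth_pct)
population = implied_total(users_jun_2018, adoption_pct)

print(f"Implied users a year earlier: about {prev_year_users:.0f} million")
print(f"Implied total population: about {population:.0f} million")
```

Because the growth figure is quoted as "more than" 11 percent, the implied earlier user count is an upper bound rather than an exact value.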

Another report by Google and KPMG claims that video content will account for more than 80 percent of internet consumption around the world by 2019. These statistics are signal enough for technology giants to buckle up and improve their standing amongst the Indian masses when it comes to local language content.

Google believes that around 40 million new consumers adopt the internet every year, a combination of users from both rural and urban areas. This development has already transformed the culture and the economy. The startups that rose by adapting to local languages are among the greatest by-products of this burst of internet growth.

Voice Products

The most popular AI and language-based applications so far are voice assistants. Reportedly, over 90 percent of the current digital voice assistants in India work only with English. As there is a considerable difference between the queries and the nature of interaction of English and vernacular language users, the market for vernacular voice assistants is wide open.

Recently, Indian e-commerce giant Flipkart acquired AI company Liv.AI. The acquisition, valued at around $40 million, underlines the importance that voice-based AI applications have gained. The aim was to compete with the likes of Amazon Echo and Google Home, both AI-driven voice products looking to make a foray into Indian languages.

Commenting on the success of the product, Subodh Kumar, cofounder of Liv.AI, says, “The API recognises 10 major Indian languages: English, Hindi, Bengali, Punjabi, Marathi, Gujarati, Kannada, Tamil, Telugu and Malayalam. Our system works with most accents and performs remarkably well in noisy environments.”

Amazon Web Services, the cloud services arm of Amazon, recently added Hindi support to Amazon Polly, an ML service that turns text into life-like speech. This adds to Amazon's suite of voice products, such as Amazon Echo and Alexa. Amazon Polly helps users build and deploy speech-enabled applications. The Hindi support helps to build the image of Amazon Polly as a bilingual voice ecosystem.

Navdeep Manaktala, who heads business at Amazon Internet Services Private Limited, said, “Customers can now provide mixed input in a variety of dialects including Devanagari Hindi, Romanised Hindi, and English, as well as complete Hindi text input in Devanagari script.”
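To make the idea of mixed input concrete, here is a minimal sketch that labels each word of a sentence by the script it is written in. It is not how Amazon Polly handles input; the function names are hypothetical and the only technique used is a check against the Unicode Devanagari block (U+0900 to U+097F). Note that a script check alone cannot tell Romanised Hindi from English, since both use Latin letters.

```python
from typing import List, Tuple


def script_of(ch: str) -> str:
    """Classify one character as devanagari, latin, or other by code point."""
    if "\u0900" <= ch <= "\u097F":  # Unicode Devanagari block
        return "devanagari"
    if ch.isascii() and ch.isalpha():
        return "latin"
    return "other"  # digits, punctuation, other scripts


def label_tokens(text: str) -> List[Tuple[str, str]]:
    """Label each whitespace-separated token by the script of its letters."""
    labeled = []
    for token in text.split():
        scripts = {script_of(ch) for ch in token} - {"other"}
        if not scripts:
            label = "other"
        elif len(scripts) == 1:
            label = scripts.pop()
        else:
            label = "mixed"  # both scripts inside one token
        labeled.append((token, label))
    return labeled


# Romanised Hindi, English, and Devanagari Hindi in one sentence.
for token, label in label_tokens("mera order कब आएगा please"):
    print(token, label)
```

A real system would need language identification on top of this to separate Romanised Hindi words such as "mera" from English words such as "please".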

Sreeraman Thiagarajan, co-founder at Agrahyah Technologies, is of the opinion that many internet companies are committing serious errors by not focussing on local languages in voice products. Underlining the importance of voice, he says, “Whatever idea and business model you come up with, the process starts with creating a voice user interface (VUI), which allows people to use voice input to control computers, connected devices, and voice assistants like Alexa.”

News And Literature  

AI technologies are also helping the big technology giants and startups to foray into the news and literature space. For example, Google’s new project, Navlekha, promises to make life easier for writers and publishers with automated tools that extract content from existing PDF files and create publish-worthy text, all via enhanced image processing and computer vision. It will compete with existing players like Pratilipi, which connects large audiences to writers and creators.

Rajan Anandan, VP, India and SEA at Google created ripples throughout the startup world when he said, “Right now, the amount of online content in Indian languages is only 1 percent of what’s available in English. Our Navlekhā project comprises a tool that uses AI to render any PDF containing Indian language content into editable text, making it easy for print publishers to create mobile-friendly web content. It also provides Indian language publishers with free web hosting with AdSense support, so they can immediately start monetising their content.”

Many businesses have been adding value to millions of Indians every week. Take, for example, Dailyhunt, a multilingual news app that has reached around 100 million monthly active users. Umang Bedi, president at Dailyhunt, told a leading newspaper, “The single biggest opportunity on the Indian Internet is the local language, and the future certainly lies in winning the regional language audiences.”

Another success story has been ShareChat India, which in its own words is the “best social app to communicate with friends, share jokes, and avail daily news” from India within seconds. Ankush Sachdeva, CEO at ShareChat, says, “After studying user behaviour from 2014 to 2015, we realised that most of the users were only seeking local content and did not care for the services. So we pivoted again. In October 2015, we launched the product that is currently running.”

Research firm Tracxn says there are nearly 146 startups in the content space that focus particularly on news in India, and 9 of them have raised a round of funding since November 2015. Counting startups across news, user generated content platforms and content aggregators together would give a number higher than 146.

Web Content

With purely video content platforms like Netflix launching new series in Indian local languages, the sector has garnered renewed interest. These platforms are also using AI to serve tailor-made recommendations to users. With the current growth in consumption of vernacular content, they will also have to work on their recommendation engines to better match tastes in India.


User generated content (UGC) platforms are also a considerably large market. But there is a limitation to UGC platforms: they need to build both an audience and a set of creators to function profitably. These kinds of platforms also have a wide demographic reach and can host a variety of content catering to different types of users. But all this is only possible when there is a considerable technology advantage and a platform can connect creators and users while keeping in mind local language users and their tastes. Adoption of AI and NLP to understand language will improve the stickiness of these platforms. Use of ML and AI will be really important in the ever growing space of vernacular content to analyse and build features for UGC platforms based on Indian local languages.

It remains to be seen how far these insights take both local and global startups and tech businesses in India.

Abhijeet Katte
As a thorough data geek, most of Abhijeet's day is spent in building and writing about intelligent systems. He also has deep interests in philosophy, economics and literature.
