How Google’s GLOW Is Increasing Local Language Content On Wikipedia

Officially, India has 122 languages, while the People’s Linguistic Survey of India has identified 780, of which 50 have gone extinct in the past five decades. Project Tiger, launched by Google in 2017 to increase the number of Wikipedia articles written in underrepresented languages in India, is now expanding to include 10 new languages in a handful of countries and regions. It will now be called GLOW, Growing Local Language Content on Wikipedia.

Google’s new investments in Wikipedia, especially in GLOW, address a genuine problem. The majority of Wikipedia’s tens of millions of articles are in English or other European languages like French, German, and Russian. (There are also lots of articles in Swedish and two versions of Filipino, but most of these pages were created by a prolific bot.)

GLOW And Project Tiger

Google pumped an additional $3.1 million into Wikipedia, bringing its total contribution to the free encyclopedia over the past decade to more than $7.5 million, the company announced at the World Economic Forum.

Along with the financial boost, Google is also granting Wikipedia editors free access to its machine learning tools. Wikimedia and Google’s plan to intensify their efforts to make information available in native languages aligns with our government’s effort to make India a powerhouse of the digital age.

Google is also providing Wikipedia free access to its Custom Search API and its Cloud Vision API. The Search API will allow editors to quickly look up sources on the web without having to leave Wikipedia, while the vision tool will let them automatically digitize books so they can be used to support Wikipedia articles.

When the initiative first launched in India, Google provided Chromebooks and internet access to editors, while the Centre for Internet and Society and the Wikimedia India Chapter organized a three-month article writing competition that resulted in nearly 4,500 new Wikipedia articles in 12 different Indic languages. With the number of internet users on smartphones eclipsing the number on personal computers, the road to digital inclusion in India looks smooth from here.

Google Also Has A Lot To Benefit From Collaborating With Wikipedia

Wikipedia’s rich database helps Google carry out machine learning tasks such as natural language processing (NLP). Recently, Google set a new benchmark with its Natural Questions (NQ) dataset, which relies on information available on Wikipedia to train models for NLP tasks like question answering, the technology behind many voice assistants and chatbots around the world.

The company has also used Wikipedia articles to train machine learning algorithms, as well as to fight misinformation on YouTube.

These machine learning tools should make it easier for Wikipedia to reach people who speak languages currently underrepresented on the web. But the encyclopedia is also the reason many AI programs exist in the first place. It is used by hundreds of other AI platforms, particularly because every Wikipedia article is licensed under Creative Commons, meaning it can be reproduced for free, subject to the licence’s attribution and share-alike terms.

Future Looks Bright For GLOW

Wikimedia also announced Google Translate was coming to Wikipedia, allowing editors to convert content into 15 additional languages, bringing the total available to 121.

Google looks forward to extending GLOW to languages from Indonesia, Mexico, and Nigeria, as well as the Middle East and North Africa, a move that can help Google extend its own reach as well.

Many parts of the world still lack the infrastructure to support internet access. The near future will see these nations coming to the fore, but having the information available by the time the infrastructure is ready is key to the ambitious goal of making information more accessible and inexpensive.

 

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
