How Google’s GLOW Is Increasing Local Language Content On Wikipedia

India officially has 122 languages, while the People’s Linguistic Survey of India has identified 780, of which 50 have gone extinct in the past five decades. Project Tiger, launched by Google in 2017 to increase the number of Wikipedia articles written in underrepresented languages in India, is now expanding to include 10 new languages in a handful of countries and regions. It will now be called GLOW, Growing Local Language Content on Wikipedia.

Google’s new investments in Wikipedia, especially in GLOW, will address a genuine problem. The majority of Wikipedia’s tens of millions of articles are in English or other European languages like French, German, and Russian. (There are also lots of articles in Swedish and two versions of Filipino, but most of these pages were created by a prolific bot.)

GLOW And Project Tiger

Google pumped an additional $3.1 million into Wikipedia, bringing its total contribution to the free encyclopedia over the past decade to more than $7.5 million, the company announced at the World Economic Forum.

Along with the financial boost, Google is also granting Wikipedia editors free access to its machine learning tools. Wikimedia and Google’s plan to intensify their efforts to make information available in native languages aligns with our government’s effort to make India a powerhouse of the digital age.

Google is also providing Wikipedia free access to its Custom Search API and its Cloud Vision API. The Search API will allow editors to quickly look up sources on the web without having to leave Wikipedia, while the vision tool will let editors automatically digitize books so they can be used to support Wikipedia articles.

When the initiative first launched in India, Google provided Chromebooks and internet access to editors, while the Centre for Internet and Society and the Wikimedia India Chapter organized a three-month article writing competition that resulted in nearly 4,500 new Wikipedia articles in 12 different Indic languages. With the number of internet users on smartphones eclipsing that of personal computers, the road to digital inclusion in India looks smooth from here.

Google Also Has A Lot To Benefit From Collaborating With Wikipedia

Wikipedia’s rich database helps Google carry out machine learning tasks such as natural language processing (NLP). Google recently set a new benchmark with its Natural Questions (NQ) dataset, which relies on the information available on Wikipedia to train models for NLP tasks like question answering, the kind of task that typically powers voice assistants and chatbots around the world.

The company has also used Wikipedia articles to train machine learning algorithms, as well as to fight misinformation on YouTube.

These machine learning tools will make it easier for Wikipedia to reach people who speak languages currently underrepresented on the web. But the encyclopedia is also the reason many AI programs exist in the first place. It is used by hundreds of other AI platforms, particularly because every Wikipedia article is published under a Creative Commons license, meaning it can be reproduced for free as long as the license terms, such as attribution, are followed.

Future Looks Bright For GLOW

Wikimedia also announced Google Translate was coming to Wikipedia, allowing editors to convert content into 15 additional languages, bringing the total available to 121.

Google looks forward to extending GLOW to languages from Indonesia, Mexico, and Nigeria, as well as the Middle East and North Africa, a move that can help Google extend its own reach as well.

Many parts of the world still lack the infrastructure to support internet access. The near future will see these nations coming to the fore, but having the information available by the time the infrastructure is ready is key to the ambitious goal of making information more accessible and inexpensive.

 

Ram Sagar
I have a master's degree in Robotics and I write about machine learning advancements.
