India officially has 122 languages, while the People’s Linguistic Survey of India has identified 780, of which 50 have gone extinct in the past five decades. Project Tiger, launched by Google in 2017 to increase the number of Wikipedia articles written in underrepresented languages in India, is being expanded to include 10 new languages in a handful of countries and regions. It will now be called GLOW, Growing Local Language Content on Wikipedia.
Google’s new investments in Wikipedia, especially in GLOW, will address a genuine problem. The majority of Wikipedia’s tens of millions of articles are in English or other European languages such as French, German, and Russian. (There are also many articles in Swedish and two versions of Filipino, but most of these pages were created by a prolific bot.)
GLOW And Project Tiger
Google pumped an additional $3.1 million into Wikipedia, bringing its total contribution to the free encyclopedia over the past decade to more than $7.5 million, the company announced at the World Economic Forum.
Along with the financial boost, Google is also granting access to its machine learning tools so that Wikipedia editors can use them for free. Wikimedia and Google’s plan to intensify their efforts to make information available in native languages aligns with our government’s effort to make India a powerhouse of the digital age.
Google is also providing Wikipedia with free access to its Custom Search API and its Cloud Vision API. The Search API will allow editors to quickly look up sources on the web without having to leave Wikipedia, while the vision tool will let them automatically digitize books so they can be used to support Wikipedia articles.
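To give a rough idea of what a source lookup through the Custom Search API could look like, here is a minimal Python sketch. It only builds the request URL for Google's Custom Search JSON API and does not send it. The helper name `build_source_lookup_url`, the sample query, and the placeholder key and engine ID are assumptions made for illustration; how Wikipedia's own tooling actually calls the API is not described here.

```python
from urllib.parse import urlencode

# Public endpoint of Google's Custom Search JSON API.
CUSTOM_SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_source_lookup_url(query, api_key, engine_id, max_results=5):
    """Build the request URL for a source lookup (hypothetical helper)."""
    params = {
        "key": api_key,      # API key; a placeholder value is used below
        "cx": engine_id,     # Programmable Search Engine ID; also a placeholder
        "q": query,          # search terms the editor wants sources for
        "num": max_results,  # results per request; the API accepts 1 to 10
    }
    return f"{CUSTOM_SEARCH_ENDPOINT}?{urlencode(params)}"


# Example: the URL an editing tool might request for a given topic.
url = build_source_lookup_url(
    "Project Tiger Wikipedia", "YOUR_API_KEY", "YOUR_ENGINE_ID"
)
print(url)
```

Sending this URL with any HTTP client, using a real key and engine ID, would return a JSON document whose `items` list holds the matching pages, which an editor could then review as candidate sources.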
When the initiative first launched in India, Google provided Chromebooks and internet access to editors, while the Centre for Internet and Society and the Wikimedia India Chapter organized a three-month article writing competition that resulted in nearly 4,500 new Wikipedia articles in 12 different Indic languages. With the number of people accessing the internet via smartphones eclipsing the number using personal computers, the road to digital inclusion in India looks smooth from here.
Google Also Has A Lot To Gain From Collaborating With Wikipedia
Wikipedia's rich database helps Google carry out machine learning tasks such as natural language processing (NLP). Google recently set a new benchmark with its Natural Questions (NQ) dataset, which relies on information available on Wikipedia to train models for NLP tasks such as question answering, the technology that powers many voice assistants and chatbots around the world.
The company has also used Wikipedia articles to train machine learning algorithms, as well as to fight misinformation on YouTube.
These machine learning tools will make it easier for Wikipedia to reach people who speak languages currently underrepresented on the web. But the encyclopedia is also the reason many AI programs exist in the first place. It is used by hundreds of other AI platforms, particularly because every Wikipedia article is published under a Creative Commons license, meaning it can be reproduced for free as long as the license terms, such as attribution, are followed.
Future Looks Bright For GLOW
Wikimedia also announced that Google Translate is coming to Wikipedia, allowing editors to convert content into 15 additional languages and bringing the total available to 121.
Google looks forward to extending GLOW to languages from Indonesia, Mexico, and Nigeria, as well as the Middle East and North Africa, a move that can help Google extend its own reach as well.
Many parts of the world still lack the infrastructure to support internet access. These nations will come to the fore in the near future, and having information available by the time the infrastructure is ready is key to the ambitious goal of making information more accessible and inexpensive.