How Google’s Cloud Vision APIs Analyse A Decade Of Television News And Half A Billion Images

At Computer Vision DevCon 2020, Kalev Leetaru, internet entrepreneur and global thought leader, spoke about the open-data GDELT Project, which has used Google’s Vision and Video APIs to analyse more than half a billion news images and a decade of television news. He explained what it looks like to analyse the visual side of the news through the eyes of cloud AI, and how to turn the resulting tens of terabytes of JSON annotations into actionable insights.

The talk starts with the idea of taking the world’s open information: reaching around the globe, scooping up that information as data, and using it to understand the world, from events and narratives to emotions. In essence, the goal is to build a catalogue of human society.

GDELT Project

According to Leetaru, the GDELT Project has many different pieces. One is the technology that gathers data from all around the world and goes through each news article. Here the project uses statistical neural techniques to read through each article and extract the factual information it contains.

Case in point: given an example sentence, the Cloud Vision API parses out its meaning and converts it into a code that can be represented and analysed. The analysis not only captures the physical events described in the news but also goes beyond them to capture the narrative around those events. This makes it possible to map emotions around the world, such as happiness, as seen through the eyes of the news media. GDELT is an open data project supported by Alphabet, Jigsaw and Google Cloud.

How It Works & Use Cases

According to the speaker, it all starts with news. GDELT has drawn on news media over the last three years, trying to reach across the planet and essentially monitoring news media around the world in real time. GDELT currently machine-translates 65 languages and uses the translated catalogue to process all this content, which in turn leads to multiple different data sets. GDELT also uses various GCP tools to assess that worldwide content.

Explaining further, Kalev said, “So since 2016, we have taken each news article and extracted out the information, getting rid of advertisements and focusing on the actual body of the article itself. We do some initial pre-processing with traditional histogram-like perceptual hashes, followed by some basic filtering to identify whether the image is large enough and has enough resolution, and so whether it is worth processing.”
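The pre-processing Kalev describes can be illustrated with a minimal sketch. It assumes a simple average hash as the perceptual hash and a hypothetical `min_side` threshold for the size filter; the talk does not say which hash or thresholds GDELT actually uses, and the test images here are synthetic gradients rather than news photos.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def is_large_enough(pixels: Gray, min_side: int = 100) -> bool:
    """Basic size filter; min_side is a hypothetical threshold."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    return height >= min_side and width >= min_side


def average_hash(pixels: Gray, grid: int = 8) -> int:
    """Shrink the image to grid x grid cells by block averaging, then set
    one bit per cell that is brighter than the mean of all cells.
    Assumes the image is at least `grid` pixels on each side."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits; a small distance suggests near-duplicates."""
    return bin(a ^ b).count("1")


def gradient(width: int, height: int, offset: int = 0) -> Gray:
    """Synthetic left-to-right gradient used only for this demonstration."""
    return [[min(255, x * 255 // (width - 1) + offset) for x in range(width)]
            for _ in range(height)]


original = gradient(128, 128)
brighter = gradient(128, 128, offset=10)            # same picture, slightly brighter
inverted = [[255 - v for v in row] for row in original]  # a different picture

same = hamming(average_hash(original), average_hash(brighter))
different = hamming(average_hash(original), average_hash(inverted))
```

A pipeline along these lines would drop images that fail the size filter and skip any image whose hash is within a few bits of one it has already seen, so that the same wire photo is not annotated repeatedly.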

“So from that, we end up with about a million images a day that we run through Google’s Cloud Vision tool and produce something we call the visual Global Knowledge Graph,” said Kalev. Since 2016, GDELT has processed roughly half a billion news images from around the world this way.
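To give a sense of what running an image through Cloud Vision involves, here is a minimal sketch that builds the JSON body for the Cloud Vision REST method `images:annotate`. The feature types are real Vision API values, but which features GDELT requests, the `max_results` setting and the example URL are assumptions made for illustration. The sketch only builds the payload and does not call the service.

```python
import json
from typing import Dict, List

# Feature types defined by the Cloud Vision REST API. Which of them GDELT
# actually requests is an assumption based on the annotations named in the talk.
FEATURES = [
    "LABEL_DETECTION",     # objects and activities
    "TEXT_DETECTION",      # OCR
    "LANDMARK_DETECTION",  # geography
    "FACE_DETECTION",      # facial emotion likelihoods
    "WEB_DETECTION",       # reverse image search
]


def build_annotate_request(image_urls: List[str], max_results: int = 10) -> Dict:
    """Build the body for POST https://vision.googleapis.com/v1/images:annotate."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": url}},
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in FEATURES
                ],
            }
            for url in image_urls
        ]
    }


# Hypothetical image URL, used only to show the payload shape.
payload = build_annotate_request(["https://example.com/news-photo.jpg"])
body = json.dumps(payload, indent=2)
```

The response is a JSON document with one entry per image, which is where the tens of terabytes of annotations mentioned earlier come from at a million images a day.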

With these capabilities, GDELT can be used to search the news media for images of protests about climate change, or of polar bears. Case in point: suppose an image is released that claims to show a protest that happened in Iran within the last 10 minutes. Using optical character recognition (OCR), one can check whether the text in the background of that image is Persian or, say, Egyptian Arabic. In this way, OCR alone can call an image’s reliability into question, which helps curb fake news.

With these features, an arbitrary image from the news media can be taken and, with OCR included among the annotations, checked for whether it is likely to depict what it claims to depict, simply by looking at its different characteristics.
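The language check described above can be sketched as follows. The field names (`textAnnotations[0].locale` and `fullTextAnnotation.pages[].property.detectedLanguages[].languageCode`) follow the Cloud Vision text detection response, but the sample response is hand-written, and the helper names and the mapping from a claimed country to expected language codes are assumptions. Note that language codes such as `fa` (Persian) and `ar` (Arabic) may not separate dialects such as Egyptian Arabic.

```python
from typing import Iterable, Optional, Set


def ocr_languages(annotation: dict) -> Set[str]:
    """Collect the language codes reported in one image's OCR annotation."""
    langs: Set[str] = set()
    texts = annotation.get("textAnnotations", [])
    if texts and texts[0].get("locale"):
        langs.add(texts[0]["locale"])
    for page in annotation.get("fullTextAnnotation", {}).get("pages", []):
        for lang in page.get("property", {}).get("detectedLanguages", []):
            if lang.get("languageCode"):
                langs.add(lang["languageCode"])
    return langs


def language_matches_claim(annotation: dict, expected: Iterable[str]) -> Optional[bool]:
    """True if any detected language is expected, False if none is,
    None if the image carries no recognised text at all."""
    detected = ocr_languages(annotation)
    if not detected:
        return None
    return bool(detected & set(expected))


# Hand-written sample: an image claimed to come from Iran whose
# background text is detected as Arabic rather than Persian.
sample_response = {
    "textAnnotations": [{"locale": "ar", "description": "placeholder text"}],
    "fullTextAnnotation": {
        "pages": [
            {"property": {"detectedLanguages": [
                {"languageCode": "ar", "confidence": 0.9}
            ]}}
        ]
    },
}

claim_holds = language_matches_claim(sample_response, {"fa"})
```

A `False` result does not prove an image is fake; it only flags it for a closer look alongside the other signals, such as reverse image search.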

Visual Knowledge Graph

  • From 2016 to the present, 500 million news images from around the world have been annotated through Google’s Cloud Vision, totalling a quarter of a trillion pixels, with updates every 15 minutes.
  • The annotations cover labels (objects and activities), OCR, geography, facial emotions, EXIF metadata and reverse image search.
  • Can be used to search for images of protests with “climate change” signs.
  • Can be used to compare polar bears vs deserts as visual representations of climate.
  • Can also help verify images through reverse image search, captioning and the language of the text in the background.
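The searches listed above amount to filtering annotation records by label and OCR text. The sketch below assumes a simplified, hypothetical record layout (`url`, `labels`, `ocr_text`) and made-up records; the real Visual Global Knowledge Graph files have their own schema, which is not described in the talk.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Made-up records in a hypothetical, simplified layout.
records = [
    {"url": "https://example.com/1.jpg", "labels": ["Protest", "Crowd"],
     "ocr_text": "ACT NOW ON CLIMATE CHANGE"},
    {"url": "https://example.com/2.jpg", "labels": ["Polar bear", "Arctic"],
     "ocr_text": ""},
    {"url": "https://example.com/3.jpg", "labels": ["Protest"],
     "ocr_text": "Fair wages"},
    {"url": "https://example.com/4.jpg", "labels": ["Desert", "Sand"],
     "ocr_text": ""},
    {"url": "https://example.com/5.jpg", "labels": ["Polar bear"],
     "ocr_text": ""},
]


def find_images(rows: List[dict], label: str, phrase: str) -> List[str]:
    """URLs of images that carry the label and whose OCR text contains the phrase."""
    label, phrase = label.lower(), phrase.lower()
    return [
        row["url"]
        for row in rows
        if label in {name.lower() for name in row.get("labels", [])}
        and phrase in row.get("ocr_text", "").lower()
    ]


def label_counts(rows: List[dict], labels: Iterable[str]) -> Dict[str, int]:
    """Count how many images carry each of the given labels."""
    wanted = {name.lower() for name in labels}
    counts: Counter = Counter()
    for row in rows:
        for name in {n.lower() for n in row.get("labels", [])}:
            if name in wanted:
                counts[name] += 1
    return dict(counts)


climate_protests = find_images(records, "Protest", "climate change")
climate_imagery = label_counts(records, ["Polar bear", "Desert"])
```

Counting labels over time in this way is one route to comparing how often polar bears and deserts are used to illustrate climate coverage.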

Sejuti Das
Sejuti currently works as Associate Editor at Analytics India Magazine (AIM). Reach out at
