
Guide To Twitter’s Image Crop Analysis

Twitter's automated image cropping system crops images that users submit in order to show image previews at different aspect ratios across devices such as tablets, mobile phones, and desktops.


Automated image cropping constructs a viewport with a given dimension or aspect ratio and crops the image to fit it, preserving the most relevant or interesting part of the image within the viewport. It has a wide range of applications: in the cinematography and broadcasting industry, for example, film footage is cropped so that its aspect ratio matches that of the display. Many modern platforms serve multiple types of devices with varying aspect ratios, increasing the number of crops needed even for a single image.
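
As a rough illustration of the aspect-ratio fitting step (not code from Twitter's system), the largest crop window with a target aspect ratio can be computed from the image size alone; only one dimension needs to shrink:

 def crop_window_size(img_w, img_h, target_aspect):
     """Largest (width, height) with aspect ratio target_aspect (w / h)
     that fits inside an img_w x img_h image; only one dimension shrinks."""
     if img_w / img_h > target_aspect:
         # Image is wider than the target: keep the full height, narrow the width.
         return int(round(img_h * target_aspect)), img_h
     # Image is taller than the target: keep the full width, shorten the height.
     return img_w, int(round(img_w / target_aspect))

 # Example: a 1200x800 photo cropped for a 9:16 portrait preview -> (450, 800)
 print(crop_window_size(1200, 800, 9 / 16))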

No such system is ideal. Automated cropping systems built on machine learning are typically designed to keep the average error low, but they do not consider the disparate impact that arises when the error is not uniformly distributed across demographic groups.

The developers focused on an automated system that crops images users submit on Twitter to show image previews at different aspect ratios across devices such as tablets, mobile phones, and desktops. The system employs a supervised machine learning model, trained on existing saliency maps, to predict a saliency score over any given input image. In computer vision, a saliency map is an image that shows each pixel's unique quality; it aims to simplify and change the representation of an image into something more meaningful and easier to analyze.

The saliency score is meant to capture the importance of each region of the image. Once the saliency scores are available, the model selects the crop by trying to centre it around the most salient point, shifting as needed to stay within the original image.

Source: Official Release  

Twitter’s image cropping algorithm relies on a machine learning model trained to predict saliency. The Twitter algorithm finds the most salient point, and then a set of heuristics is used to create a suitable center crop around that point for a given aspect ratio. 

Cropping is conducted as follows:

  1. For a given image, the image is discretized into a grid of points, and each grid point is associated with a saliency score predicted by the model. 
  1. The image, the coordinates of the most salient point, and the desired aspect ratio are passed as inputs to a cropping algorithm. This is repeated for each aspect ratio needed to show the image on multiple devices.
  1. If the saliency map is almost symmetric horizontally, then a center crop is performed irrespective of the aspect ratio.
  1. Otherwise, the cropping algorithm tries to ensure that the most salient point is within the crop of the desired aspect ratio. This is done by cropping along only one dimension (a minimal sketch of this heuristic follows the list).
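
The steps above can be condensed into a minimal sketch. This is not the library's implementation; it assumes the saliency map is a 2D NumPy array of scores over the discretised grid, omits the symmetric-map special case, and only illustrates how a crop of the desired aspect ratio could be centred on the most salient point and clamped to the image bounds, cropping a single dimension:

 import numpy as np

 def salient_crop(img_w, img_h, saliency, target_aspect):
     """Return (x, y, w, h) of a crop with aspect ratio target_aspect (w / h),
     centred on the most salient grid point and kept inside the image."""
     # 1. Most salient grid point, mapped back to pixel coordinates.
     gy, gx = np.unravel_index(np.argmax(saliency), saliency.shape)
     cx = (gx + 0.5) / saliency.shape[1] * img_w
     cy = (gy + 0.5) / saliency.shape[0] * img_h
     # 2. Largest window with the target aspect ratio (only one dimension shrinks).
     if img_w / img_h > target_aspect:
         w, h = int(round(img_h * target_aspect)), img_h
     else:
         w, h = img_w, int(round(img_w / target_aspect))
     # 3. Centre the window on the salient point, shifting it to stay inside the image.
     x = int(min(max(cx - w / 2, 0), img_w - w))
     y = int(min(max(cy - h / 2, 0), img_h - h))
     return x, y, w, h

 # Example: a 1200x800 image, a coarse 6x9 saliency grid, and a square (1:1) preview.
 print(salient_crop(1200, 800, np.random.rand(6, 9), 1.0))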

The analysis uses WikiCeleb, a dataset of images and labels of celebrities in Wikidata. WikiCeleb consists of images from individuals' Wikipedia pages obtained through the Wikidata Query Service. It contains 4,073 images of individuals, along with gender identity and ethnic group labels as recorded on Wikidata.
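
For illustration only, the snippet below sketches the kind of query that could be sent to the Wikidata Query Service to collect such images and labels. It is not the authors' actual query; the selection criteria and result handling are assumptions about how a dataset like this could be assembled (P18 is Wikidata's image property, P21 sex or gender, P172 ethnic group):

 import requests

 # Hypothetical query: humans on Wikidata with an image, gender, and ethnic group recorded.
 SPARQL = """
 SELECT ?person ?personLabel ?image ?genderLabel ?ethnicLabel WHERE {
   ?person wdt:P31 wd:Q5;      # instance of: human
           wdt:P18 ?image;     # image
           wdt:P21 ?gender;    # sex or gender
           wdt:P172 ?ethnic.   # ethnic group
   SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
 }
 LIMIT 10
 """

 resp = requests.get(
     "https://query.wikidata.org/sparql",
     params={"query": SPARQL, "format": "json"},
     headers={"User-Agent": "image-crop-analysis-demo"},
 )
 for row in resp.json()["results"]["bindings"]:
     print(row["personLabel"]["value"], row["image"]["value"])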

More details regarding this research can be found here.

Code Implementation of Image Crop Analysis:

Load all dependencies

 import logging
 import shlex
 import subprocess
 import sys
 from collections import namedtuple
 from pathlib import Path
 import matplotlib.image as mpimg
 import matplotlib.pyplot as plt
 import numpy as np
 from matplotlib.collections import PatchCollection
 from matplotlib.patches import Rectangle 

Clone the repository and install the dependencies when running on Google Colab; on a local machine, follow the instructions mentioned in the repository.

 import platform
 # Map the OS name to the folder containing the prebuilt candidate-crop binary.
 BIN_MAPS = {"Darwin": "mac", "Linux": "linux"}
 HOME_DIR = Path("../").expanduser()
 try:
     import google.colab
     # Running on Colab: install extra dependencies and clone the repository.
     ! pip install pandas scikit-learn scikit-image statsmodels requests dash
     ! [[ -d image-crop-analysis ]] || git clone https://github.com/twitter-research/image-crop-analysis.git
     HOME_DIR = Path("./image-crop-analysis").expanduser()
     IN_COLAB = True
 except ImportError:
     IN_COLAB = False
 sys.path.append(str(HOME_DIR / "src"))
 # Paths to the candidate-crop binary, the saliency model, and the data folder.
 bin_dir = HOME_DIR / Path("./bin")
 bin_path = bin_dir / BIN_MAPS[platform.system()] / "candidate_crops"
 model_path = bin_dir / "fastgaze.vxm"
 data_dir = HOME_DIR / Path("./data/")
 data_dir.exists()

Put your image in ‘/content/image-crop-analysis/data’; by default, there will be a dummy.jpeg image there.

Mind the file extension: the extension of your uploaded image and the extension used in the glob pattern below should match.

 img_path = next(data_dir.glob("./*.jpg"))
 img_path 

Output:

PosixPath('image-crop-analysis/data/kohli.jpg')     

See the uploaded image.

 img = mpimg.imread(img_path)
 plt.imshow(img)
 # Draw a 200x112 reference rectangle at the top-left corner of the image.
 plt.gca().add_patch(
     Rectangle((0, 0), 200, 112, linewidth=1, edgecolor="b", facecolor="none"))

Load and run the model 

 from crop_api import ImageSaliencyModel
 # Wrap the candidate-crop binary and the saliency model shipped with the repository.
 model = ImageSaliencyModel(crop_binary_path=bin_path, crop_model_path=model_path)
 # Plot the saliency map and the crops for every .jpg image in the data directory.
 for img_path in data_dir.glob("*.jpg"):
     print(img_path)
     model.plot_img_crops(img_path)

Output:

In the output above, the automatic crop is drawn as a yellow rectangle on the image. The leftmost image is a heat map of the saliency scores produced by the Twitter image cropping model. Regions with high predicted saliency are typically humans, objects, text, and high-contrast backgrounds.

The remaining images show the original image with its maximum saliency point marked as a yellow dot and the automatic crop drawn as a yellow rectangle, one for each aspect ratio.

Conclusion: 

This article mainly discussed Twitter’s saliency-based image cropping algorithm, which automatically crops images to different aspect ratios by centring the crop around the most salient region. The crops for the different aspect ratios are produced from the saliency score, as shown in the output above. The research also addresses major concerns raised about such algorithms, such as favouring light-skinned over dark-skinned individuals and favouring crops of women’s bodies over their heads. Readers interested in the detailed research can refer to the paper in the reference below.

References:
