A beginner’s guide to image matting in Python

Image matting is an image-processing technique for extracting a targeted part of an image.

Published on January 7, 2022

by Vijaysinh Lendave

Image matting is a common technique in image and video editing and composition. Many modern approaches use convolutional neural networks to infer the alpha matte from the entire input image and an associated trimap. These methods define the state of the art in image processing, video editing, and filmmaking applications. In this article, we will talk about image matting and the procedures involved in it. Finally, we will see how PyMatting, a Python-based toolbox, combines these procedures and produces the desired result. The major points to be discussed in this article are the following.

What is Image Matting?
Need for Image Matting
Modes of Image Matting
Image Matting with PyMatting

Let’s start the discussion by understanding what image matting is.

What is Image Matting?

Image matting is the task of extracting a target of interest (the foreground) from a static image or a video sequence. It plays a significant role in many image and video editing applications. Image compositing is the inverse procedure of image matting. Although image matting has been studied for a long time, the traditional methods relied primarily on optical means.

With the advancement of computer and digital photography technology, most image matting operations are now performed using digital image processing software. Current image matting techniques can be divided into two categories based on image characteristics: blue screen matting, where the image background is user-defined, and natural image matting, where the background is arbitrary and unknown.
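Since compositing is the inverse of matting, it helps to see the compositing step written out. The following is a minimal sketch using NumPy, assuming the standard compositing equation I = alpha * F + (1 - alpha) * B; the function name composite and the tiny test image are made up for illustration. Matting has to recover alpha (and ideally F) from I alone, which is why it is the harder direction.

```python
import numpy as np

def composite(foreground, background, alpha):
    """Blend a foreground over a background: I = alpha * F + (1 - alpha) * B."""
    # Add a trailing axis so the (H, W) alpha broadcasts over the colour channels.
    a = alpha[..., np.newaxis]
    return a * foreground + (1.0 - a) * background

# Hypothetical 1x3 image: pure red foreground over pure blue background.
foreground = np.zeros((1, 3, 3))
foreground[..., 0] = 1.0
background = np.zeros((1, 3, 3))
background[..., 2] = 1.0

# One alpha value per pixel: fully background, half transparent, fully foreground.
alpha = np.array([[0.0, 0.5, 1.0]])

image = composite(foreground, background, alpha)
print(image[0])
```

The middle pixel comes out as an equal mix of red and blue, which is exactly the kind of mixed pixel that a matting algorithm must later separate again.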

Traditional matting methods fall into two types. The first type is sampling-based methods.

Given an unknown pixel, these methods sample matched pixels from the foreground and background regions and then find the best combination of these pixels to predict the unknown pixel’s alpha value. These techniques include boundary sampling, ray casting sampling, and so on.
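The idea of picking the best foreground/background pair can be sketched in a few lines. This is only an illustration, not any specific published method: the helper names estimate_alpha and best_alpha are hypothetical, the sample colours are made up, and the selection criterion is assumed to be the plain reconstruction error, whereas real sampling-based methods usually add spatial and other terms.

```python
import numpy as np

def estimate_alpha(pixel, fg, bg):
    """Project the pixel colour onto the line from bg to fg and clip to [0, 1]."""
    direction = fg - bg
    denom = float(direction @ direction)
    if denom == 0.0:
        return 0.0  # identical samples carry no information about alpha
    return float(np.clip((pixel - bg) @ direction / denom, 0.0, 1.0))

def best_alpha(pixel, fg_samples, bg_samples):
    """Try every foreground/background pair and keep the one that best explains the pixel."""
    best_error, best = np.inf, 0.0
    for fg in fg_samples:
        for bg in bg_samples:
            alpha = estimate_alpha(pixel, fg, bg)
            # How far is the pixel from the colour this pair would composite to?
            error = np.linalg.norm(pixel - (alpha * fg + (1.0 - alpha) * bg))
            if error < best_error:
                best_error, best = error, alpha
    return best

# Made-up colour samples taken from the known regions.
fg_samples = np.array([[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]])  # reddish foreground
bg_samples = np.array([[0.0, 0.0, 1.0], [0.0, 0.2, 0.8]])  # bluish background

# Unknown pixel built as 25% of the first foreground and 75% of the first background sample.
pixel = np.array([0.25, 0.0, 0.75])

alpha = best_alpha(pixel, fg_samples, bg_samples)
print(alpha)
```

The exhaustive double loop is fine for a handful of samples; the sampling strategies named above (boundary sampling, ray casting, and so on) exist precisely to keep the candidate sets small and relevant.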

The second type is propagation-based methods. These methods include the Poisson equation-based method, random walks for interactive matting, and closed-form matting, which formulates a cost function based on local smoothness and then solves a linear equation system to find the globally optimized alpha matte.

Need for Image Matting

In image segmentation, there is a broad class of objects that are difficult to handle because of their shape, such as human hair, animal fur, peacock feathers, and spider webs, whose structures are usually narrower than one pixel. The same applies to translucent objects such as clouds, waterfalls, glasses, plastic bags, and flames, where the colour of each pixel is a mixture of two parts: the foreground object and the background colour.

How can such special objects be segmented? Traditional image segmentation algorithms typically fail on these problems. Digital image matting for transparent or elongated objects may be the best solution in this case.

Modes of Image Matting

Trimap

To properly extract semantically meaningful foreground objects, the user can manually label an input image into three regions prior to matte pulling: foreground, background, and unknown. This three-region labelling is the trimap, as illustrated in Fig. 1 (b). For a given trimap, the image matting problem is thus reduced to estimating the foreground colours, background colours, and alpha values for pixels in the unknown regions based on the known foreground and background pixels.
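To make the three regions concrete, here is a small sketch that splits a trimap into boolean masks with NumPy. It assumes the convention described later in this article, where 0.0 marks background and 1.0 marks foreground; treating every other value (0.5 here) as unknown is an assumption, and both the function name split_trimap and the tiny trimap are made up for illustration.

```python
import numpy as np

def split_trimap(trimap):
    """Return boolean masks for the background, foreground, and unknown regions."""
    is_background = trimap == 0.0
    is_foreground = trimap == 1.0
    # Anything that is neither definitely background nor definitely foreground.
    is_unknown = ~(is_background | is_foreground)
    return is_background, is_foreground, is_unknown

# Hypothetical 3x4 trimap: 0.0 = background, 1.0 = foreground, 0.5 = unknown.
trimap = np.array([
    [0.0, 0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0, 1.0],
    [0.0, 0.5, 0.5, 1.0],
])

background, foreground, unknown = split_trimap(trimap)
print(background.sum(), foreground.sum(), unknown.sum())  # → 4 4 4
```

Only the pixels in the unknown mask need an estimated alpha value; the other two masks supply the known colours that the estimate is based on.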

Strokes

In contrast to the trimap method, which requires the marking of all image regions, the strokes method only requires a few foreground and background scribbles in the appropriate image regions, as shown in Fig. 1 (c). Stroke-based algorithms take these scribbles as input and use them to extract the alpha matte. Stroke-based algorithms therefore require less user interaction than trimap-based algorithms.

The crucial step in the trimap method is to create an accurate trimap for the desired foreground in the given image. To achieve satisfactory matting results, the unknown regions around the foreground boundaries should be specified as finely as possible, because an accurate trimap provides more detailed information about the background and foreground, which improves matting accuracy.

The trimap method has a significant advantage in that drawing the trimap is a natural operation that allows the user to intuitively determine where to place the labels. However, creating a high-quality trimap is usually a time-consuming process, especially for images with a lot of fine detail such as hair and leaves.

In comparison, the strokes method provides more flexibility because it does not require a strict interactive operation. However, the results of strokes-based image matting methods are typically dependent on the locations of the user-specified scribbles.

Making matters worse, the user may be unaware of which labels can produce better matting results, and improper labelling operations can degrade matting quality. Furthermore, unknown image regions generated by the stroke methods are typically larger than those generated by the trimap methods, increasing the overall computation.

Image Matting with PyMatting

In this section, we will take a look at PyMatting, a Python-based toolbox that extracts an object from images or video sequences using the alpha matting technique. It also aims to be computationally efficient and easy to use.

The process involves two inputs from the user: the first is the original image and the second is the trimap of the object within that image. Here I am using the dataset from alphamatting.com, which provides a variety of sample images together with their trimaps.

The toolbox can be installed using pip (the leading ! is only needed inside a notebook cell):

!pip install pymatting

To know more about the working of the toolbox, check the official technical report and GitHub repository. The following are the input image and the trimap that we have chosen to obtain a cutout.

Input image and trimap for foreground extraction

The simplest way to obtain a foreground image with this toolbox is the cutout function, which takes the paths of the two images above and the path where the result should be stored.

from pymatting import cutout
 
cutout(
   # input image path
   "/content/GT25.png",
   # input trimap path
   "/content/GT25_tri1.png",
   # output cutout path
   "cutout1.png")

The resulting cutout is saved as cutout1.png.
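For readers who want to see the individual stages rather than a single call, the same result can be approached step by step. The sketch below uses functions that PyMatting exports (load_image, estimate_alpha_cf, estimate_foreground_ml, stack_images, save_image); the wrapper name cutout_step_by_step is hypothetical, and whether cutout performs exactly these steps internally is an assumption, so treat this as an illustration rather than the library's definitive pipeline.

```python
def cutout_step_by_step(image_path, trimap_path, output_path):
    """Load image and trimap, estimate alpha and foreground, and save an RGBA cutout."""
    # Imported inside the function so this sketch can be loaded without PyMatting installed.
    from pymatting import (
        load_image,
        estimate_alpha_cf,
        estimate_foreground_ml,
        stack_images,
        save_image,
    )

    # The image is loaded as RGB, the trimap as a single grayscale channel.
    image = load_image(image_path, "RGB")
    trimap = load_image(trimap_path, "GRAY")
    if image.shape[:2] != trimap.shape[:2]:
        raise ValueError("image and trimap must have the same height and width")

    # Closed-form alpha matting fills in alpha for the unknown trimap region.
    alpha = estimate_alpha_cf(image, trimap)

    # Estimate the foreground colours so that background colour does not bleed into the cutout.
    foreground = estimate_foreground_ml(image, alpha)

    # Stack colour and alpha into one RGBA image and write it to disk.
    rgba = stack_images(foreground, alpha)
    save_image(output_path, rgba)
    return alpha

# Example call with the paths used above (the output name is arbitrary):
# alpha = cutout_step_by_step("/content/GT25.png", "/content/GT25_tri1.png", "cutout_steps.png")
```

Returning the alpha matte makes it easy to inspect or save it separately, which is useful when the cutout looks wrong and you want to check whether the trimap or the foreground estimate is to blame.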

Final Words

Through this article, we have discussed image matting and the process involved in it. To summarize the process, first, the input image is taken and a trimap is formed for the foreground object of interest. In PyMatting, these trimaps will most likely be NumPy ndarrays of type np.float64 that have the same shape as the input image but only one colour channel. Trimap values of 0.0 denote pixels that are entirely background, and values of 1.0 denote pixels that are entirely foreground. Lastly, the alpha matte is estimated and used to extract the foreground precisely. This whole process can be implemented simply with a Python toolbox called PyMatting, as we saw.