The performance of a deep learning model depends heavily on the size and diversity of its training dataset. However, there are situations where the dataset is simply not large enough or diverse enough. In such cases, data augmentation is used. Data augmentation is a technique that lets you significantly increase the diversity of the data available for training models without actually collecting new data. Although deep learning frameworks come with built-in methods to augment data, these can be inefficient or lack some required functionality.
In this article, we will learn about Albumentations, an image augmentation package for machine learning that works with PyTorch as well as other frameworks.
What is the Albumentations library?
Albumentations is a fast image augmentation library that is easy to use as a wrapper around other libraries. The package is built on NumPy, OpenCV, and imgaug. What sets this library apart is the number of data augmentation techniques available. While most augmentation libraries include techniques like cropping, flipping, rotating and scaling, Albumentations also provides a much wider range of image augmentation techniques, such as contrast adjustment, blur and channel shuffle. Here is the range of augmentations that can be performed.
The above image was downloaded from the official GitHub repository of the Albumentations library.
Why is Albumentations better?
This library gained popularity in a short period of time because of the features it offers. Some of the reasons why this library is better are:
- Performance: Albumentations delivers the best performance on most of the commonly used augmentations. It does this by wrapping several low-level image manipulation libraries and selecting the fastest implementation.
- Variety: This library contains not only the common image manipulation techniques but also a wide variety of other image transforms. This is helpful for task-specific and domain-specific applications.
- Flexibility: Because the package is fairly new, new image transformations are proposed regularly and the package keeps changing to include them. Albumentations has proven to be quite flexible in research and adapts easily to these changes.
Hands-on implementation of Albumentations transformations
As mentioned earlier, this library offers a wide range of transformations beyond the ones commonly found in other libraries. Let us see how we can apply these transformations to one image.
Let us start with a single transformation. I have chosen a random image from Google and will perform a horizontal flip.
    import random

    import cv2
    from matplotlib import pyplot as plt
    import albumentations as A

    def view_transform(image):
        """Display an RGB image without axes."""
        plt.figure(figsize=(5, 5))
        plt.axis('off')
        plt.imshow(image)

    # OpenCV loads images in BGR order; convert to RGB for matplotlib.
    figure = cv2.imread('image.jpg')
    figure = cv2.cvtColor(figure, cv2.COLOR_BGR2RGB)
    view_transform(figure)
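    # p=0.5 means the flip is applied on roughly half of the calls.
    transform = A.HorizontalFlip(p=0.5)
    # Seeding only makes this repeatable in versions that use Python's random module.
    random.seed(7)
    augmented_image = transform(image=figure)['image']
    view_transform(augmented_image)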
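To make the role of the p argument concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the helper name maybe_hflip and the nested-list "image" are assumptions made for illustration, whereas Albumentations itself operates on NumPy arrays.

```python
import random

def maybe_hflip(image, p=0.5, rng=random):
    """Mirror each row left-to-right with probability p (hypothetical helper)."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image

# A tiny 2x3 "image" of pixel values, standing in for a real array.
tiny = [[1, 2, 3],
        [4, 5, 6]]

always = maybe_hflip(tiny, p=1.0)  # the flip is always applied
never = maybe_hflip(tiny, p=0.0)   # the flip is never applied
```

With p=0.5, about half of the calls would return the mirrored image and the other half would return the input unchanged, which is why a fixed seed is useful when you want a repeatable result.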
The real power of Albumentations lies in pipelining several transformations for an image at once. Let us implement such a pipeline. The pipeline below combines:
- CLAHE: Contrast Limited Adaptive Histogram Equalization, which equalizes the contrast of the image locally.
- Cutout: masks out a randomly placed square region of the image.
- RandomRotate90: rotates the image by a random multiple of 90 degrees.
- Transpose: swaps the rows and columns of the image.
- Blur: smooths the image by averaging each pixel with its neighbours.
- Optical distortion: applies a lens-like distortion to the image.
- ShiftScaleRotate: shifts, scales and rotates the image within the given limits.
    transform = A.Compose([
        A.CLAHE(),
        A.RandomRotate90(),
        A.Transpose(),
        # Note: newer releases deprecate Cutout in favor of CoarseDropout.
        A.Cutout(num_holes=1, max_h_size=16, max_w_size=16, p=1),
        A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.50, rotate_limit=45, p=.75),
        A.Blur(blur_limit=3),
        A.OpticalDistortion(),
    ])
    random.seed(42)
    augmented_image = transform(image=figure)['image']
    view_transform(augmented_image)
As you can see, all of these transformations are applied in order, each with its own probability, in a single call to the pipeline.
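The idea behind such a pipeline can be sketched in a few lines of plain Python. This is a toy stand-in under stated assumptions, not the Albumentations Compose class: SimpleCompose, hflip and transpose are hypothetical names, and nested lists stand in for image arrays.

```python
import random

class SimpleCompose:
    """Toy pipeline: applies each step in order with its own probability."""

    def __init__(self, steps, seed=None):
        # steps is a list of (function, probability) pairs.
        self.steps = steps
        self.rng = random.Random(seed)

    def __call__(self, image):
        for fn, p in self.steps:
            # Each step is decided independently of the others.
            if self.rng.random() < p:
                image = fn(image)
        return image

def hflip(image):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in image]

def transpose(image):
    """Swap rows and columns."""
    return [list(col) for col in zip(*image)]

tiny = [[1, 2, 3],
        [4, 5, 6]]

flip_then_transpose = SimpleCompose([(hflip, 1.0), (transpose, 1.0)])(tiny)
transpose_then_flip = SimpleCompose([(transpose, 1.0), (hflip, 1.0)])(tiny)
```

The two results differ, which illustrates that the order of the transforms in the list matters.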
Another interesting feature is the OneOf block. Each transformation defined inside a OneOf block is assigned a probability. These probabilities are normalized so that they sum to 1, and exactly one transformation is then picked at random according to the normalized values and applied to the image. The block's own p decides whether the block is applied at all.
    transform = A.Compose([
        A.RandomRotate90(),
        A.Flip(),
        A.Transpose(),
        # Apply at most one of the three blur variants.
        A.OneOf([
            A.MotionBlur(p=.2),
            A.MedianBlur(blur_limit=3, p=0.3),
            A.Blur(blur_limit=3, p=0.1),
        ], p=0.2),
        A.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.2, rotate_limit=45, p=0.2),
        A.OneOf([
            A.OpticalDistortion(p=0.3),
            A.GridDistortion(p=.1),
        ], p=0.2),
        A.OneOf([
            A.CLAHE(clip_limit=2),
            A.RandomBrightnessContrast(),
        ], p=0.3),
        A.HueSaturationValue(p=0.3),
    ])
    random.seed(42)
    augmented_image = transform(image=figure)['image']
    view_transform(augmented_image)
In the example above, the first OneOf block contains motion blur, median blur and blur, each with an assigned probability. Let us normalize these to see how likely each one is to be chosen.
Motion blur = (0.2)/(0.2+0.3+0.1) = 0.33
Median blur = (0.3)/(0.2+0.3+0.1) = 0.5
Blur = (0.1)/(0.2+0.3+0.1) = 0.17
The above calculations show that median blur is the most likely choice: when this OneOf block is applied (its own p is 0.2), median blur is picked about half of the time, motion blur about a third of the time, and blur about a sixth of the time. Because only one transformation from the block runs per image, the pipeline avoids stacking several similar operations on the same image.
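The normalization above can be checked with a short simulation. This is a sketch in plain Python rather than the library's code: the function names normalize and one_of, and the use of transform names as strings, are assumptions for illustration.

```python
import random

def normalize(weights):
    """Scale the weights so that they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

def one_of(weights, p, rng):
    """Return the name of the chosen transform, or None if the block is skipped."""
    if rng.random() >= p:
        return None  # the whole block is skipped
    names = list(weights)
    probs = normalize(weights)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]

blur_weights = {"MotionBlur": 0.2, "MedianBlur": 0.3, "Blur": 0.1}
probs = normalize(blur_weights)

# Simulate many images passing through the block with p=0.2.
rng = random.Random(0)
counts = {name: 0 for name in blur_weights}
skipped = 0
for _ in range(10_000):
    choice = one_of(blur_weights, p=0.2, rng=rng)
    if choice is None:
        skipped += 1
    else:
        counts[choice] += 1
```

In such a run, most images skip the block entirely, and among those that do not, median blur is chosen most often but not always.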
Here is another example where I have applied multiple transformations to an image using Albumentations.
[Image: shift, scale and rotate]
This article covered the different transformations that can be applied using the Albumentations library. The code samples provided here are a starting point for using the package in classification, segmentation, and object detection tasks in machine learning projects. The library is still under active development and adapts well to changes. Its many transformation methods help in diversifying the data and creating larger datasets.