Have you ever wondered how Facebook labels people in a group photo? Behind the attractive, polished user interface lies a complex algorithm that recognises the faces in every picture you upload, and it is constantly learning to improve. Image classification is one of the most common problems that AI is applied to solve. In this article, we will explain the basics of convolutional neural networks (CNNs) and how to use them for an image classification task.
This tutorial is aimed at beginners who want to work with AI and Keras.
Prerequisites:
- Basic knowledge of Python
- Basic understanding of classification problems
What Is Image Classification
There are a few basic things about an image classification problem that you must know before you dive into building the convolutional neural network. The machine’s perception of an image is completely different from what we see: a machine sees only numbers. Each pixel in the image is given a value between 0 and 255. Thus, for the machine to classify any image, it requires some preprocessing to find the patterns or features that distinguish one image from another.
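To make this concrete, here is a minimal sketch (in plain Python, with made-up pixel values) of what an image looks like to a machine:

```python
# A machine "sees" an image as a grid of pixel intensities.
# This is a hypothetical 4 x 4 grayscale image: each entry is a
# brightness value between 0 (black) and 255 (white).
image = [
    [  0,  50, 100, 150],
    [ 50, 100, 150, 200],
    [100, 150, 200, 255],
    [150, 200, 255, 255],
]

# A colour image adds a third dimension: one such grid per
# channel (red, green, blue), e.g. shape (height, width, 3).
height, width = len(image), len(image[0])
print(height, width)  # 4 4
print(min(min(row) for row in image),
      max(max(row) for row in image))  # 0 255
```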
The Key Processes
Here’s a look at the key stages that help machines to identify patterns in an image:
Convolution: Convolution is performed on an image to identify certain features in it. Operations such as blurring, sharpening, edge detection and noise reduction can all be expressed as convolutions, and they help the machine learn specific characteristics of an image.
Pooling: A convolved image can be too large and therefore needs to be reduced. Pooling is done mainly to shrink the image without losing its features or patterns.
Flattening: Flattening transforms a two-dimensional matrix of features into a vector of features that can be fed into a neural network or classifier.
Full-Connection: Full connection simply refers to feeding the flattened feature vector into a regular, fully connected neural network.
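The four stages above can be traced through the layer sizes used later in this article (64 x 64 input, 3 x 3 filters, 2 x 2 pooling). Here is a rough shape-bookkeeping sketch in plain Python, assuming 'valid' (no-padding) convolutions and non-overlapping pooling, which is what the layers below default to:

```python
def conv_output(size, kernel):
    # A 'valid' convolution shrinks each spatial dimension by kernel - 1
    return size - kernel + 1

def pool_output(size, window):
    # Max pooling with stride == window size integer-divides the dimension
    return size // window

side = 64                      # 64 x 64 input image
side = conv_output(side, 3)    # after 3x3 convolution -> 62
side = pool_output(side, 2)    # after 2x2 max pooling -> 31
side = conv_output(side, 3)    # second convolution    -> 29
side = pool_output(side, 2)    # second pooling        -> 14
flattened = side * side * 32   # 32 feature maps, flattened into a vector
print(side, flattened)         # 14 6272
```

The flattened vector of 6,272 values is what the full-connection stage feeds into the dense layers.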
Understanding The Problem
We will use Keras and TensorFlow frameworks for building our Convolutional Neural Network.
Consider any classification problem that requires you to sort a set of images into two categories: cats vs. dogs, apples vs. oranges, and so on.
Have your images stored in directories with the directory names as labels.
For example, for a problem of classifying apples and oranges, say we have 1,000 images of each fruit for training and 100 images of each for testing. Then:
- have a directory named /training_set with directories /apple and /orange containing the 1000 images of apple and orange respectively.
- have a directory named /test_set with directories /apple and /orange containing the 100 images of apple and orange respectively.
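A minimal sketch that creates this folder layout (the directory names match the example above; a real dataset would of course contain the .jpg files inside each label folder):

```python
from pathlib import Path

# Build the expected folder layout: dataset/<split>/<label>/
base = Path("dataset")
for split in ("training_set", "test_set"):
    for label in ("apple", "orange"):
        (base / split / label).mkdir(parents=True, exist_ok=True)

print(sorted(p.name for p in (base / "training_set").iterdir()))
# ['apple', 'orange']
```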
Let’s start coding
Installing Packages:
- TensorFlow: Install TensorFlow for the desired platform from https://www.tensorflow.org/install
- Keras:
(Make sure ‘pip’ is installed on your machine)
pip install --upgrade keras
Importing the Libraries and Packages
from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
Initialising the CNN
The first step in creating a neural network is to initialise the network using the Sequential class from Keras.
model = Sequential()
Convolutional Layer
model.add(Convolution2D(filters = 32, kernel_size = (3, 3),
input_shape = (64, 64, 3),
activation = 'relu'))
Arguments:
- filters : the number of feature detectors (filters).
- kernel_size : the shape of each feature detector; (3, 3) denotes a 3 x 3 matrix.
- input_shape : the expected shape of the input images; here, 64 x 64 pixels with 3 colour channels.
- activation : the activation function applied to introduce non-linearity.
Pooling Layer
model.add(MaxPooling2D(pool_size = (2, 2)))
Arguments:
- pool_size : the shape of the pooling window.
Adding a second layer of Convolution and Pooling
model.add(Convolution2D(filters = 32, kernel_size = (3, 3), activation = 'relu'))
model.add(MaxPooling2D(pool_size = (2, 2)))
Flattening Layer
model.add(Flatten())
Full-Connection Layer
Adding the Hidden layer
model.add(Dense(units = 128, activation = 'relu'))
Adding the Output Layer
model.add(Dense(units = 1, activation = 'sigmoid'))
Arguments:
- units: Number of nodes in the layer.
- activation : the activation function in each node.
Compiling the CNN
model.compile(optimizer = 'adam',
loss = 'binary_crossentropy',
metrics = ['accuracy'])
Arguments:
- optimizer: the optimisation algorithm used to minimise the loss (here, Adam)
- loss: the loss function used to calculate the error; binary cross-entropy suits a two-class problem
- metrics: the metrics used to measure the performance of the model
Generating Image Data
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
shear_range = 0.1,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
Arguments:
- rescale: Rescaling factor. Defaults to None. If None or 0, no rescaling is applied, otherwise we multiply the data by the value provided
- shear_range: Shear Intensity. Shear angle in a counter-clockwise direction in degrees.
- zoom_range: Range for random zooming of the image.
- horizontal_flip: Randomly flips images horizontally, a cheap way to augment the training data.
Fitting images to the CNN
This function lets the classifier identify the labels directly from the names of the directories the images lie in.
training_set = train_datagen.flow_from_directory('dataset/training_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
test_set = test_datagen.flow_from_directory('dataset/test_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
Arguments:
- directory: Location of the training_set or test_set
- target_size : the dimensions to which all images found will be resized. This must match the input_shape of the network.
- batch_size : size of the batches of data (default: 32).
- class_mode : determines the type of label arrays that are returned. One of "categorical", "binary", "sparse", "input", or None.
Training and Evaluating the model
model.fit_generator(training_set,
steps_per_epoch = 2000 // 32,
epochs = 15,
validation_data = test_set,
validation_steps = 200 // 32)
Arguments:
- generator : the generator that yields the training data (training_set).
- steps_per_epoch : number of steps (batches of samples) to draw from the generator before declaring one epoch finished and starting the next. It should typically equal the number of training samples divided by the batch size.
- epochs : total number of epochs. One complete pass over the training data is called an epoch.
- validation_data : the generator used to evaluate the predictions of the neural network (test_set).
- validation_steps : number of steps (batches of samples) to draw from the validation generator at the end of every epoch.
The above function trains the neural network on the training set and evaluates its performance on the test set. It returns two metrics for each epoch, ‘acc’ and ‘val_acc’, which are the accuracy of predictions attained on the training set and the test set respectively.
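Once training finishes, you would typically want to classify a new image. A hedged sketch of how that could look with the model above (the file name some_fruit.jpg is hypothetical; with class_mode = 'binary', flow_from_directory assigns labels alphabetically, so apple maps to 0 and orange to 1, and the sigmoid output is the probability of class 1):

```python
def label_from_score(score, threshold=0.5, classes=("apple", "orange")):
    # The sigmoid output is the probability of the second (index-1) class;
    # scores at or above the threshold are read as that class.
    return classes[1] if score >= threshold else classes[0]

# Hypothetical usage once the model above has been trained:
# from keras.preprocessing import image
# import numpy as np
# img = image.load_img("some_fruit.jpg", target_size=(64, 64))
# x = image.img_to_array(img) / 255.0   # same rescaling as during training
# x = np.expand_dims(x, axis=0)         # add the batch dimension
# score = model.predict(x)[0][0]
# print(label_from_score(score))
```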