What Is A Convolutional Layer?

Convolution is the simple application of a filter to an input image that results in an activation.

Most classification tasks are based on images and videos, and we have seen that the convolutional layer plays a key role in performing them. “In mathematics, convolution is an operation on two functions that produces a third function expressing how the shape of one is modified by the other.”

Applying this definition to CNNs, convolution denotes an operation on two images, represented as matrices, that are multiplied to give an output used to extract features from an image. Convolution is the simple application of a filter to an input image that results in an activation, and repeated application of the same filter across the image produces a map of activations called a feature map, indicating the location and strength of a detected feature in the input image.

In this article, we will learn:


  1. The intuition of convolution in CNN.
  2. How can filters be handcrafted?
  3. How to calculate a feature map from 1D and 2D data?

1. Intuition of convolution in CNN:

The CNN is a special type of neural network model designed to work on image data, which can be one-dimensional, two-dimensional, or sometimes three-dimensional. Their applications range from image and video recognition, image classification and medical image analysis to computer vision and natural language processing.

In the context of a CNN, convolution is a linear operation involving the multiplication of a set of weights with the input image, which is represented as a matrix, similar to a traditional neural network. This array of weights is called a filter or kernel.


Fig1: Convolution operation

Stride defines the motion of the filter; if you set stride = 1, which is the default value, the kernel takes one step at a time.
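As a quick sketch in plain Python (the helper name is ours, for illustration only), the stride controls how many positions the kernel visits along the input:

```python
def conv1d_positions(signal_len, kernel_len, stride=1):
    # Number of positions a kernel can occupy along a 1D input.
    return (signal_len - kernel_len) // stride + 1

# With the default stride = 1, a size-3 kernel visits 4 positions on a
# length-6 input; with stride = 2, it skips every other position.
print(conv1d_positions(6, 3, stride=1))  # 4
print(conv1d_positions(6, 3, stride=2))  # 2
```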

Usually, the filter size is smaller than the input data, and the type of multiplication applied between the filter and a filter-sized sample of the input data is the dot product. A dot product is an element-wise multiplication between the filter weights and a filter-sized patch of the input, summed up into a single value.
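To make the dot product concrete, here is a minimal NumPy sketch (the values are made up for illustration):

```python
import numpy as np

# A dot product between a 3-element filter and a filter-sized patch of input:
# element-wise multiply, then sum to a single scalar.
kernel = np.array([0., 1., 0.])
patch = np.array([2., 5., 3.])
activation = np.sum(kernel * patch)  # same as np.dot(kernel, patch)
print(activation)  # 5.0
```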

Intentionally, the filter size is chosen smaller than the input data, as this allows the same set of filter weights to be multiplied with the input array many times at different points on the image. In simple words, the filter is applied systematically to each filter-sized patch of the input data, from left to right and top to bottom.

Fig2: One layer of CNN

This systematic application of the same filter across the same image is used to detect a specific type of feature in the input data. As mentioned earlier, each dot product of the filter with the input image yields a single scalar value. Applying the filter many times across the input image results in a two-dimensional output array that represents a filtering of the input image. Such a two-dimensional output array is called a feature map, and this feature map is then passed through a non-linearity such as ReLU.
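The sliding-and-summing process described above can be sketched in plain NumPy for a 1D input (the helper name is ours, not a Keras function):

```python
import numpy as np

def feature_map_1d(x, kernel):
    # Slide the kernel left to right and collect one dot product per position.
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

x = np.array([0., 1., 0., 1., 1., 0.])
print(feature_map_1d(x, np.array([0., 1.])))  # [1. 0. 1. 1. 0.]
```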

2. How can filters be handcrafted?

Before filters were learned, they were designed by hand by computer vision experts and then applied to the input image to produce a feature map.

Some examples of 3 x 3 filters;

Horizontal line detector 

 array([[0., 0., 0.],
        [1., 1., 1.],
        [0., 0., 0.]])

Vertical line detector 

 array([[0., 1., 0.],
        [0., 1., 0.],
        [0., 1., 0.]])

Applying these filters to an image produces outputs containing only the horizontal or vertical lines from the input image; these outputs are called feature maps. The main idea of the convolution operation in a neural network is that the weights of the filters are learned by the network during training.
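As an illustration of how such a handcrafted filter responds to its target pattern, here is a minimal NumPy sketch (the helper and the toy image are ours, for illustration) that slides the vertical line detector over a small image containing one vertical line:

```python
import numpy as np

def feature_map_2d(img, kernel):
    # Valid cross-correlation: slide a 2D kernel over the image,
    # taking one dot product per position.
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(img[r:r + kh, c:c + kw] * kernel)
    return out

# An image with a single bright vertical line in the middle column.
img = np.zeros((5, 5))
img[:, 2] = 1.0
vertical = np.array([[0., 1., 0.],
                     [0., 1., 0.],
                     [0., 1., 0.]])
fmap = feature_map_2d(img, vertical)
print(fmap)  # strongest responses (3.0) where the line sits under the kernel centre
```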

A convolutional neural network does not learn just a single filter; it learns multiple filters, producing multiple feature maps from a given input image. For example, if you set the number of filters to 30, the network will learn 30 different ways to extract features from the input image.

To be more specific about the channels of input images: the filter must have the same number of channels as the input image.
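A small NumPy sketch of why channels do not change the shape of the output: the kernel matches the input's channels, and the dot product still collapses each filter-sized patch to a single scalar (the random values are only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
patch = rng.random((3, 3, 3))   # a 3x3 patch of a 3-channel (RGB) image
kernel = rng.random((3, 3, 3))  # a 3x3 kernel with matching channels

# The dot product runs over height x width x channels and still
# produces one scalar per position, regardless of the channel count.
activation = np.sum(patch * kernel)
print(activation.shape)  # () -- a single scalar
```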

3. How to calculate a feature map from 1D and 2D data

Here we can better understand the convolution operation and how feature maps are extracted.

We can define one-dimensional and two-dimensional data as below,

 data_1D = [0, 1, 0, 1, 1, 0]
 data_2D = [[0, 1, 0, 1, 1, 0],
            [0, 1, 0, 1, 1, 0],
            [0, 1, 0, 1, 1, 0],
            [0, 1, 0, 1, 1, 0],
            [0, 1, 0, 1, 1, 0],
            [0, 1, 0, 1, 1, 0]]

The input to the Keras Conv1D layer must be three-dimensional, and for Conv2D it must be four-dimensional. In both cases, the first dimension represents the number of samples; here we have only one. The second dimension in 1D refers to the length of each sample (six in this case), while in 2D it refers to the number of rows (also six). The third dimension in 1D refers to the number of channels of each sample (one here), while in 2D it refers to the number of columns (six here). The fourth dimension in 2D refers to the number of channels of each sample.

Therefore the input shape for Conv1D is [samples, sample length, channels], which in our case is [1, 6, 1], and for Conv2D it is [samples, rows, columns, channels], which in our case is [1, 6, 6, 1].

Convert the data into arrays and reshape them:

 import numpy as np

 data_1D = np.array(data_1D)
 data_1D = data_1D.reshape(1,6,1)
 data_2D = np.array(data_2D)
 data_2D = data_2D.reshape(1,6,6,1)

Now we will define a sequential model consisting of a Conv1D layer, which expects an input shape of [6, 1]. The model will have one filter with a kernel of size two; in other words, the kernel is two elements wide. The same will be carried out for Conv2D.

 from keras.models import Sequential
 from keras.layers import Conv1D,Conv2D
 model1 = Sequential()
 model1.add(Conv1D(1, kernel_size = 2, input_shape = (6,1)))

Here we explicitly set the weights of the filter; we define a filter that is capable of detecting changes in the input data.

 weights = [np.array([[[0]],[[1]]]), np.array([0.0])]
 model1.set_weights(weights)

And finally, we can apply our input data to the model to see the convolution operation; for that, we use the predict method.

 yhat = model1.predict(data_1D)
 array([[[1.],
         [0.],
         [1.],
         [1.],
         [0.]]], dtype=float32)

Now we are going to understand what exactly happened in the convolution operation.

First, the two elements of the filter, [0, 1], are applied to the first two elements of the input data, [0, 1], and the dot product between them gives an output of 1. The same operation is repeated until the last two values of the input.

Note that the length of the feature map is 5, whereas our input data has a length of 6; this is a consequence of how the filter was applied to the input sequence. You can change the shape of the feature map by setting padding = ‘same’ in the Conv1D layer; it will give the same shape as the input sequence.
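A minimal NumPy sketch of what padding = ‘same’ does under the hood, assuming zero padding (Keras handles this internally; the helper name is ours):

```python
import numpy as np

def feature_map_1d_same(x, kernel):
    # Zero-pad the input so the output has the same length as the input.
    k = len(kernel)
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    xp = np.concatenate([np.zeros(pad_left), x, np.zeros(pad_right)])
    return np.array([np.dot(xp[i:i + k], kernel) for i in range(len(x))])

x = np.array([0., 1., 0., 1., 1., 0.])
out = feature_map_1d_same(x, np.array([0., 1.]))
print(len(out))  # 6 -- same length as the input
```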

In a similar way, you can calculate the feature map for 2D data, as shown below:

 model2 = Sequential()
 model2.add(Conv2D(1, kernel_size = (3,3), input_shape = (6,6,1), padding = 'same'))
 detectors = [[[[1]],[[0]],[[0]]],
              [[[1]],[[0]],[[0]]],
              [[[1]],[[0]],[[0]]]]
 weights = [np.array(detectors), np.array([0.0])]
 model2.set_weights(weights)
 yhat = model2.predict(data_2D)
 yhat.reshape(6, 6)
 array([[0., 0., 2., 0., 2., 2.],
        [0., 0., 3., 0., 3., 3.],
        [0., 0., 3., 0., 3., 3.],
        [0., 0., 3., 0., 3., 3.],
        [0., 0., 3., 0., 3., 3.],
        [0., 0., 2., 0., 2., 2.]], dtype=float32)

As we set padding = ‘same’, the output feature map has the same shape as the input data.



In this article, we mainly discussed the intuition of convolution in convolutional neural networks. We saw how the multiplication between the filter and the input data is carried out and how the feature map is created from it. After that, we saw what a filter is and how to apply custom filters to our data to detect features. And finally, with the help of Python code, we observed all of the theoretical discussion in practice.


Vijaysinh Lendave
Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, model building.
