Complete Tutorial On LeNet-5 | Guide To Begin With CNNs

To start with CNNs, LeNet-5 is a good model to learn first because of its simple, compact architecture. In this article, I'll be discussing the architecture of LeNet-5, one of the earliest convolutional neural networks to be built.

In deep learning, Convolutional Neural Networks (CNNs or ConvNets) play a major role. CNNs are widely used in computer vision problems, natural language processing, time series analysis and recommendation systems. A ConvNet architecture mainly has 3 kinds of layers: convolutional layers, pooling layers and fully connected layers. These layers extract features of the input by finding patterns through mathematical operations. Like other neural network architectures, CNNs are trained with the backpropagation algorithm.


What is LeNet-5?


LeNet-5 was developed by Yann LeCun, one of the pioneers of deep learning, in 1998 in his paper ‘Gradient-Based Learning Applied to Document Recognition’. LeNet was used by banks to recognise handwritten digits on cheques and was trained on the MNIST dataset. Fully connected layers and activation functions were already known in neural networks; LeNet-5 introduced convolutional and pooling layers. LeNet-5 is widely regarded as the base for later ConvNets.




Source – Yann LeCun’s website showing LeNet-5 demo

A convolution is a linear operation. The convolutional layer does the major job: it slides a small weight matrix (the kernel or filter) over the input and, at each position, multiplies the kernel element-wise with the overlapping patch and sums the result.
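As a rough illustration only (not how Keras implements it internally), a plain-NumPy sketch of a single-channel, stride-1 convolution could look like this:

import numpy as np

def conv2d_valid(image, kernel):
    # Slide the kernel over the image with stride 1 and no padding;
    # at each position, multiply element-wise and sum.
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # toy 2x2 filter
print(conv2d_valid(image, kernel))                 # 3x3 output feature map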

A pooling layer generally comes after a convolutional layer. It reduces the spatial dimensionality of the feature maps produced by the convolutional layers, which lowers the computation needed and helps curb overfitting.
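Along the same lines, here is a minimal sketch of 2 × 2 average pooling with stride 2 (the default behaviour of Keras’ AveragePooling2D), again for illustration only:

import numpy as np

def avg_pool2d(feature_map, size=2):
    # Average each non-overlapping size x size window, halving height and width.
    h, w = feature_map.shape
    out = np.zeros((h // size, w // size))
    for i in range(0, (h // size) * size, size):
        for j in range(0, (w // size) * size, size):
            out[i // size, j // size] = feature_map[i:i + size, j:j + size].mean()
    return out

feature_map = np.arange(16, dtype=float).reshape(4, 4)
print(avg_pool2d(feature_map))   # 2x2 downsampled output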

Architecture

LeNet-5 consists of 7 layers: two convolutional and two average pooling layers in alternation, followed by two fully connected layers and the output layer. In the implementation below, the output layer uses a softmax activation.

Original Image of LeNet-5 architecture

1) MNIST images are 28 × 28 pixels, but they are zero-padded to 32 × 32 pixels and normalized before being fed to the network. The spatial dimensions of the feature maps shrink as the input moves deeper into the network.

2) In the average pooling (subsampling) layers, each neuron computes the mean of its inputs, multiplies the result by a learnable coefficient, adds a learnable bias term, and finally applies the activation function (see the sketch after this list).

3) Most neurons in the third convolutional layer are connected to only three or four of the six feature maps produced by the second average pooling layer, rather than to all of them.

4) In the output layer, each neuron outputs the square of the Euclidean distance between its input vector and its weight vector; each output therefore measures how strongly the image belongs to a particular digit class. Modern implementations typically use a softmax output trained with the cross-entropy cost function instead.
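To make points 1 and 2 concrete, here is a small NumPy sketch (with made-up coefficient and bias values) of the zero-padding step and of a single subsampling unit as described in the paper:

import numpy as np

image = np.random.rand(28, 28)        # a dummy MNIST-sized image
padded = np.pad(image, pad_width=2)   # zero-pad to 32 x 32
print(padded.shape)                   # (32, 32)

# One original LeNet-5 subsampling unit:
# output = activation(coefficient * mean(window) + bias)
window = padded[0:2, 0:2]             # a 2x2 pooling window
coefficient, bias = 0.5, 0.1          # learnable parameters in the original network
print(np.tanh(coefficient * window.mean() + bias))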

Implementation of LeNet-5

We implement LeNet-5 on the MNIST dataset for handwritten digit recognition.

Importing libraries:

import numpy as np                  # used later for np.argmax on predictions
import matplotlib.pyplot as plt     # used later to display a test image
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, AveragePooling2D

Loading MNIST and splitting into training and testing datasets

mnist = tf.keras.datasets.mnist
(x_train, y_train),(x_test, y_test) = mnist.load_data()

Reshaping image dimensions 

x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)   # add a channel dimension
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)      # shape: (samples, 28, 28, 1)

Normalization

x_train = tf.keras.utils.normalize(x_train, axis=1)   # L2-normalize along axis 1
x_test = tf.keras.utils.normalize(x_test, axis=1)
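Note that tf.keras.utils.normalize performs L2 normalization along the given axis. A simpler alternative that many MNIST tutorials use instead of the calls above (not what this article does) is to scale the pixel values to the [0, 1] range:

x_train, x_test = x_train / 255.0, x_test / 255.0   # alternative: plain rescaling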

Model Building

model = Sequential()
# First convolutional layer: 6 feature maps with 3x3 kernels (the paper uses 5x5)
model.add(Conv2D(filters=6, kernel_size=(3, 3), activation='tanh', input_shape=(28, 28, 1)))
model.add(AveragePooling2D())   # 2x2 average pooling, stride 2
# Second convolutional layer: 16 feature maps
model.add(Conv2D(filters=16, kernel_size=(3, 3), activation='tanh'))
model.add(AveragePooling2D())
model.add(Flatten())            # flatten the feature maps into a vector
model.add(Dense(units=128, activation='tanh'))
model.add(Dense(units=10, activation='softmax'))   # one output per digit class
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 26, 26, 6)         60        
_________________________________________________________________
average_pooling2d_4 (Average (None, 13, 13, 6)         0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 11, 11, 16)        880       
_________________________________________________________________
average_pooling2d_5 (Average (None, 5, 5, 16)          0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 400)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 128)               51328     
_________________________________________________________________
dense_5 (Dense)              (None, 10)                1290      
=================================================================
Total params: 53,558
Trainable params: 53,558
Non-trainable params: 0

Model compilation

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',   # labels are integer class indices
              metrics=['accuracy'])

Fitting the model

model.fit(x_train, y_train, 
          epochs=10, 
          validation_data=(x_test, y_test))
Epoch 1/10
1875/1875 [==============================] - 28s 15ms/step - loss: 0.2933 - accuracy: 0.9136 - val_loss: 0.1394 - val_accuracy: 0.9579
Epoch 2/10
1875/1875 [==============================] - 25s 13ms/step - loss: 0.1209 - accuracy: 0.9637 - val_loss: 0.1100 - val_accuracy: 0.9677
Epoch 3/10
1875/1875 [==============================] - 25s 13ms/step - loss: 0.0829 - accuracy: 0.9746 - val_loss: 0.0799 - val_accuracy: 0.9752
Epoch 4/10
1875/1875 [==============================] - 25s 14ms/step - loss: 0.0625 - accuracy: 0.9811 - val_loss: 0.0612 - val_accuracy: 0.9810
Epoch 5/10
1875/1875 [==============================] - 26s 14ms/step - loss: 0.0510 - accuracy: 0.9841 - val_loss: 0.0609 - val_accuracy: 0.9804
Epoch 6/10
1875/1875 [==============================] - 25s 13ms/step - loss: 0.0417 - accuracy: 0.9875 - val_loss: 0.0531 - val_accuracy: 0.9832
Epoch 7/10
1875/1875 [==============================] - 25s 13ms/step - loss: 0.0355 - accuracy: 0.9890 - val_loss: 0.0518 - val_accuracy: 0.9826
Epoch 8/10
1875/1875 [==============================] - 25s 14ms/step - loss: 0.0300 - accuracy: 0.9906 - val_loss: 0.0585 - val_accuracy: 0.9809
Epoch 9/10
1875/1875 [==============================] - 26s 14ms/step - loss: 0.0257 - accuracy: 0.9919 - val_loss: 0.0503 - val_accuracy: 0.9844
Epoch 10/10
1875/1875 [==============================] - 30s 16ms/step - loss: 0.0210 - accuracy: 0.9937 - val_loss: 0.0515 - val_accuracy: 0.9836
<tensorflow.python.keras.callbacks.History at 0x7fcf73ee0ef0>
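Evaluating the model

As a quick additional check (a small addition, not part of the original walkthrough), the final test accuracy can be confirmed with model.evaluate; it should match the last val_accuracy reported above.

test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(test_acc)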

Model prediction

predictions = model.predict(x_test)
print(np.argmax(predictions[0]))   # index of the most probable digit class

OUTPUT : 7

plt.imshow(x_test[0].reshape(28, 28), cmap=plt.cm.binary)   # drop the channel axis for display
plt.show()

So our model has performed well, with high accuracy, and the prediction matches the displayed image.

Note that I’ve strictly followed the architecture and created the model with the specified layers and activation functions; these can be tweaked and experimented with. For example, the ‘tanh’ activation function could be replaced with ‘ReLU’, as in the sketch below.
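For instance, one such variation (a sketch assuming the same imports as above, and not part of the original article) pads the input to 32 × 32 and uses 5 × 5 kernels and ReLU activations, which is closer to the layer sizes in the paper:

from tensorflow.keras.layers import ZeroPadding2D

lenet_variant = Sequential([
    ZeroPadding2D(padding=2, input_shape=(28, 28, 1)),   # pad 28x28 inputs to 32x32
    Conv2D(filters=6, kernel_size=(5, 5), activation='relu'),
    AveragePooling2D(),
    Conv2D(filters=16, kernel_size=(5, 5), activation='relu'),
    AveragePooling2D(),
    Flatten(),
    Dense(units=120, activation='relu'),
    Dense(units=84, activation='relu'),
    Dense(units=10, activation='softmax')
])
lenet_variant.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
lenet_variant.summary()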

Conclusion

LeNet-5 is an excellent model for beginners to learn about convolutional neural networks. It helps build a basic understanding of how CNNs work, and it illustrates clearly what convolutional, pooling and fully connected layers do.

The complete code of the above implementation is available in AIM’s GitHub repositories. Please visit this link to find the code.

Jayita Bhattacharyya
Machine learning and data science enthusiast. Eager to learn about new technology advances. A self-taught techie who loves to do cool and worthwhile stuff with technology for fun.
