Hands-on Guide To Implementing AlexNet With Keras For Multi-Class Image Classification

Dr. Vaibhav Kumar

Computer vision is being applied across a wide range of domains, helped by deep learning, which keeps producing new architectures and frameworks for the field. There are now hundreds of deep learning models that have proven capable of handling millions of images and producing accurate results. Each model has its own architecture and its own training procedure. Convolutional neural networks (CNNs) are among the most popular of these models and have a wide range of applications in computer vision.

There are many Convolutional Neural Network (CNN) architectures. AlexNet is one of them and is often referred to as a Deep Convolutional Neural Network. In this article, we will discuss the architecture of AlexNet and implement it using the Keras library without a transfer learning approach. At the end, we will evaluate the classification performance of this model.


AlexNet is a deep learning model and a variant of the convolutional neural network. It was proposed by Alex Krizhevsky as part of his research work, which was supervised by Geoffrey E. Hinton, a well-known name in the field of deep learning research. Krizhevsky entered the AlexNet model in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC2012) in 2012, where it achieved a top-5 error of 15.3%, more than 10.8 percentage points lower than that of the runner-up.

Architecture of AlexNet

The AlexNet proposed by Alex Krizhevsky has eight layers: five convolutional layers followed by three fully connected layers. Some of the convolutional layers are followed by max-pooling layers. The network uses ReLU as its activation function, which shows improved performance over the sigmoid and tanh functions.
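
One commonly cited reason for the advantage of ReLU is that its gradient does not shrink for large positive inputs, whereas sigmoid and tanh saturate. The plain-Python sketch below compares the three derivatives at a few input values. The helper names are illustrative only and are not part of the article's code or of Keras.

```python
import math

def relu_grad(x):
    # Derivative of max(0, x): 1 for positive inputs, 0 otherwise
    return 1.0 if x > 0 else 0.0

def sigmoid_grad(x):
    # Derivative of the logistic sigmoid: s * (1 - s)
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2
    return 1.0 - math.tanh(x) ** 2

for x in (0.5, 2.0, 5.0):
    print(f"x={x}: relu'={relu_grad(x):.4f}  "
          f"sigmoid'={sigmoid_grad(x):.4f}  tanh'={tanh_grad(x):.4f}")
```

As the input grows, the sigmoid and tanh gradients approach zero while the ReLU gradient stays at 1, which is one way to see why deep networks with ReLU can be easier to train.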

[Figure: AlexNet model architecture (Source: original research paper)]

The five convolutional layers use kernels (filters) of size 11 x 11, 5 x 5, 3 x 3, 3 x 3 and 3 x 3, respectively. The remaining parameters of the network can be tuned depending on training performance.

AlexNet with transfer learning, which reuses the weights of a network pre-trained on the ImageNet dataset, has shown exceptional performance. In this article, however, we will not use the pre-trained weights and will simply define the CNN according to the proposed architecture.

Implementing in Keras

Here, we will implement AlexNet in Keras as per the model description given in the research work. Please note that we will not use a pre-trained model.


This code was implemented in Google Colab and the .py file was downloaded.


In the first step, we will define the AlexNet network using the Keras library. The parameters of the network follow the description above: 5 convolutional layers with kernel sizes 11 x 11, 5 x 5, 3 x 3, 3 x 3 and 3 x 3 respectively, 3 fully connected layers, and ReLU as the activation function at all layers except the output layer. Since we will test this model on CIFAR10 classification, the output layer is a Dense layer with 10 nodes.

#Importing library
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D
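# BatchNormalization is exported from keras.layers; the older
# keras.layers.normalization import path fails on recent Keras releases
from keras.layers import BatchNormalization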
import numpy as np


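# NOTE: the BatchNormalization, Activation, Flatten, Dropout and later Dense
# calls were missing from this listing. They are reconstructed here from the
# imports and the description above. The dropout rate (0.4) and the size of
# the 3rd fully connected layer (1000) are assumptions.
AlexNet = Sequential()

#1st Convolutional Layer
AlexNet.add(Conv2D(filters=96, input_shape=(32,32,3), kernel_size=(11,11), strides=(4,4), padding='same'))
AlexNet.add(BatchNormalization())
AlexNet.add(Activation('relu'))
AlexNet.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))

#2nd Convolutional Layer
AlexNet.add(Conv2D(filters=256, kernel_size=(5, 5), strides=(1,1), padding='same'))
AlexNet.add(BatchNormalization())
AlexNet.add(Activation('relu'))
AlexNet.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))

#3rd Convolutional Layer
AlexNet.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='same'))
AlexNet.add(BatchNormalization())
AlexNet.add(Activation('relu'))

#4th Convolutional Layer
AlexNet.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='same'))
AlexNet.add(BatchNormalization())
AlexNet.add(Activation('relu'))

#5th Convolutional Layer
AlexNet.add(Conv2D(filters=256, kernel_size=(3,3), strides=(1,1), padding='same'))
AlexNet.add(BatchNormalization())
AlexNet.add(Activation('relu'))
AlexNet.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='same'))

#Passing it to a Fully Connected layer
AlexNet.add(Flatten())
# 1st Fully Connected Layer
AlexNet.add(Dense(4096))
AlexNet.add(BatchNormalization())
AlexNet.add(Activation('relu'))
# Add Dropout to prevent overfitting
AlexNet.add(Dropout(0.4))

#2nd Fully Connected Layer
AlexNet.add(Dense(4096))
AlexNet.add(BatchNormalization())
AlexNet.add(Activation('relu'))
#Add Dropout
AlexNet.add(Dropout(0.4))

#3rd Fully Connected Layer
AlexNet.add(Dense(1000))
AlexNet.add(BatchNormalization())
AlexNet.add(Activation('relu'))
#Add Dropout
AlexNet.add(Dropout(0.4))

#Output Layer
AlexNet.add(Dense(10))
AlexNet.add(Activation('softmax'))

#Model Summary
AlexNet.summary()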
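
To see how the 32 x 32 CIFAR10 input shrinks as it passes through the layers defined above, the following sketch traces the spatial size layer by layer. It relies on the rule that, with padding='same', the output size is the input size divided by the stride, rounded up. The `layers` list and the function names are illustrative and only mirror the strides used in the model definition.

```python
import math

def same_out(size, stride):
    # With padding='same', output size = ceil(input size / stride)
    return math.ceil(size / stride)

# (layer name, stride) in the order the layers are added to the model
layers = [
    ("conv1 11x11", 4), ("pool1", 2),
    ("conv2 5x5", 1), ("pool2", 2),
    ("conv3 3x3", 1), ("conv4 3x3", 1),
    ("conv5 3x3", 1), ("pool3", 2),
]

def trace_shapes(size=32):
    # Returns the spatial size (height = width) after each layer
    sizes = []
    for name, stride in layers:
        size = same_out(size, stride)
        sizes.append((name, size))
    return sizes

for name, size in trace_shapes():
    print(f"{name:12s} -> {size} x {size}")
```

The feature map is already 1 x 1 after the last pooling layer, so only 256 values reach the fully connected layers. This is a consequence of applying a network designed for much larger ImageNet images to 32 x 32 inputs.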


Once the model is defined, we will compile this model and use Adam as an optimizer. We could use stochastic gradient descent (sgd) as well.

# Compiling the model
AlexNet.compile(loss = keras.losses.categorical_crossentropy, optimizer= 'adam', metrics=['accuracy'])
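
The categorical cross-entropy loss passed to compile can be illustrated by hand. The sketch below, in plain Python with illustrative function names, assumes a softmax output and a one-hot label, and shows that the loss is small when the true class receives the highest probability and larger otherwise.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probs, eps=1e-7):
    # -sum(y_true * log(y_pred)); eps guards against log(0)
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

probs = softmax([2.0, 1.0, 0.1])
loss_right = categorical_crossentropy([1, 0, 0], probs)  # true class scored highest
loss_wrong = categorical_crossentropy([0, 0, 1], probs)  # true class scored lowest
print(probs, loss_right, loss_wrong)
```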

Now that the model is ready, we will check its classification performance. For this, we will use the CIFAR10 dataset, a popular benchmark in image classification. CIFAR-10 is a publicly available image dataset provided by the Canadian Institute for Advanced Research (CIFAR). It consists of 60000 32×32 colour images in 10 classes, with 6000 images per class. The 10 classes are airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. There are 50000 training images and 10000 test images in this dataset.

For more information on the CIFAR10 dataset and its preprocessing for a convolutional neural network, please read my article ‘Transfer Learning for Multi-Class Image Classification Using Deep Convolutional Neural Network’. 


#Keras library for CIFAR dataset
from keras.datasets import cifar10
(x_train, y_train),(x_test, y_test)=cifar10.load_data()

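#Train-validation-test split
from sklearn.model_selection import train_test_split
# NOTE: the split call is missing from this listing; the 0.3 validation
# fraction below is an assumed value
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.3)

#Dimension of the CIFAR10 dataset
print((x_train.shape, y_train.shape))
print((x_val.shape, y_val.shape))
print((x_test.shape, y_test.shape))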

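#Onehot Encoding the labels.
from sklearn.utils.multiclass import unique_labels
from keras.utils import to_categorical

#Since we have 10 classes we should expect the shape[1] of y_train,y_val and y_test to change from 1 to 10
y_train = to_categorical(y_train)
y_val = to_categorical(y_val)
y_test = to_categorical(y_test)

#Verifying the dimension after one hot encoding
print((x_train.shape, y_train.shape))
print((x_val.shape, y_val.shape))
print((x_test.shape, y_test.shape))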
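
To make the one-hot encoding step concrete, here is a minimal plain-Python sketch of the transformation that to_categorical performs on integer labels. The `one_hot` helper is illustrative and is not the Keras implementation.

```python
def one_hot(labels, num_classes):
    # Each integer label k becomes a vector with a 1 at index k
    encoded = []
    for k in labels:
        row = [0.0] * num_classes
        row[k] = 1.0
        encoded.append(row)
    return encoded

labels = [3, 0, 9]  # cat, airplane, truck in CIFAR-10 class order
encoded = one_hot(labels, num_classes=10)
print(len(encoded), len(encoded[0]))  # number of samples, number of classes
```

Each label goes from a single integer to a vector of length 10, which is why the second dimension of the label arrays changes from 1 to 10.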


#Image Data Augmentation
from keras.preprocessing.image import ImageDataGenerator

train_generator = ImageDataGenerator(rotation_range=2, horizontal_flip=True,zoom_range=.1 )

val_generator = ImageDataGenerator(rotation_range=2, horizontal_flip=True,zoom_range=.1)

test_generator = ImageDataGenerator(rotation_range=2, horizontal_flip= True,zoom_range=.1)

#Fitting the augmentation defined above to the data
train_generator.fit(x_train)
val_generator.fit(x_val)
test_generator.fit(x_test)

After preprocessing the CIFAR10 dataset, we are ready to train the AlexNet model defined above. We will use a learning rate annealer in this experiment. The annealer decreases the learning rate when a monitored metric stops improving for a set number of epochs. Here, we monitor the validation accuracy, and if it plateaus for 3 epochs, the learning rate is multiplied by a factor of 0.01, down to a minimum of 1e-5.

#Learning Rate Annealer
from keras.callbacks import ReduceLROnPlateau
# The metric is logged as 'val_accuracy' when the model is compiled with metrics=['accuracy']
lrr = ReduceLROnPlateau(monitor='val_accuracy', factor=.01, patience=3, min_lr=1e-5)
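
The plateau rule can be sketched in plain Python to show how the learning rate evolves. This is a simplified illustration, not the Keras implementation: it ignores options such as min_delta and cooldown, and the function name and the accuracy values are made up for the example. The starting rate of 1e-3 is Adam's default in Keras.

```python
def simulate_plateau(val_acc_history, lr=1e-3, factor=0.01, patience=3, min_lr=1e-5):
    # Tracks the best validation accuracy seen so far and counts
    # consecutive epochs without improvement
    best = float("-inf")
    wait = 0
    lrs = []
    for acc in val_acc_history:
        if acc > best:
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Multiply by the factor, but never go below min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        lrs.append(lr)
    return lrs

history = [0.40, 0.50, 0.55, 0.55, 0.54, 0.55, 0.56]
lrs = simulate_plateau(history)
print(lrs)
```

With these settings, a single reduction takes the learning rate from 1e-3 straight to the 1e-5 floor, so a factor of 0.01 is a fairly aggressive choice.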

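To train the model, we define the batch size and the number of epochs below. The learning rate is left at Adam's default, since the model was compiled with optimizer='adam'.

#Defining the parameters
batch_size= 100
# NOTE: the epochs value is missing from this listing; 100 is an assumed
# placeholder, so adjust it to your compute budget
epochs = 100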

Now, we will train our defined AlexNet model.

#Training the model
# Model.fit accepts generators directly; fit_generator is deprecated in recent Keras releases
AlexNet.fit(train_generator.flow(x_train, y_train, batch_size=batch_size),
            epochs=epochs,
            steps_per_epoch=x_train.shape[0]//batch_size,
            validation_data=val_generator.flow(x_val, y_val, batch_size=batch_size),
            validation_steps=250,
            callbacks=[lrr],
            verbose=1)


#After successful training, we will visualize its performance.

import matplotlib.pyplot as plt
#Plotting the training and validation loss

f,ax=plt.subplots(2,1) #Creates 2 subplots under 1 column

#Assigning the first subplot to graph training loss and validation loss
ax[0].plot(AlexNet.history.history['loss'],color='b',label='Training Loss')
ax[0].plot(AlexNet.history.history['val_loss'],color='r',label='Validation Loss')
ax[0].legend() #Displays the labels defined above

#Plotting the training accuracy and validation accuracy
ax[1].plot(AlexNet.history.history['accuracy'],color='b',label='Training Accuracy')
ax[1].plot(AlexNet.history.history['val_accuracy'],color='r',label='Validation Accuracy')
ax[1].legend()
plt.show()



We will examine the classification performance using both a non-normalized and a normalized confusion matrix. For this purpose, we first define a function that plots the confusion matrix.

#Defining function for confusion matrix plot
# NOTE: several lines of this function were missing from the listing and are
# reconstructed here; the default colormap (plt.cm.Blues) is an assumption.
def plot_confusion_matrix(y_true, y_pred, classes,
                          normalize=False,
                          title=None,
                          cmap=plt.cm.Blues):
    if not title:
        if normalize:
            title = 'Normalized confusion matrix'
        else:
            title = 'Confusion matrix, without normalization'

    # Compute confusion matrix
    cm = confusion_matrix(y_true, y_pred)
    if normalize:
        # Divide each row by its total so entries become per-class fractions
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')

    #Print Confusion matrix
    fig, ax = plt.subplots(figsize=(7,7))
    im = ax.imshow(cm, interpolation='nearest', cmap=cmap)
    ax.figure.colorbar(im, ax=ax)
    # We want to show all ticks...
    ax.set(xticks=np.arange(cm.shape[1]),
           yticks=np.arange(cm.shape[0]),
           xticklabels=classes, yticklabels=classes,
           title=title,
           ylabel='True label',
           xlabel='Predicted label')

    # Rotate the tick labels and set their alignment.
    plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
             rotation_mode="anchor")
    # Loop over data dimensions and create text annotations.
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, format(cm[i, j], fmt),
                    ha="center", va="center",
                    color="white" if cm[i, j] > thresh else "black")
    return ax
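
To clarify what the two kinds of confusion matrix contain, the following plain-Python sketch builds one from toy labels and then normalizes each row. The function names and the toy data are illustrative; in the article's code, scikit-learn's confusion_matrix does the counting.

```python
def confusion_counts(y_true, y_pred, num_classes):
    # cm[i][j] = number of samples with true class i predicted as class j
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def normalize_rows(cm):
    # Divide each row by its total, so entries become per-class fractions
    out = []
    for row in cm:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0, 2]
cm = confusion_counts(y_true, y_pred, num_classes=3)
print(cm)
print(normalize_rows(cm))
```

The diagonal of the normalized matrix gives the fraction of each class that was classified correctly, which makes classes of different sizes directly comparable.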


In the next step, we will predict the class labels for the test images using the trained AlexNet model.

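#Making prediction
# predict returns class probabilities; argmax recovers the predicted class index
y_pred = np.argmax(AlexNet.predict(x_test), axis=1)
# y_test is one-hot encoded, so argmax recovers the integer labels
y_true = np.argmax(y_test, axis=1)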

#Plotting the confusion matrix
from sklearn.metrics import confusion_matrix

class_names=['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

# Plotting non-normalized confusion matrix
plot_confusion_matrix(y_true, y_pred, classes = class_names,title = 'Confusion matrix, without normalization')


# Plotting normalized confusion matrix
plot_confusion_matrix(y_true, y_pred, classes=class_names, normalize=True, title='Normalized confusion matrix')


Next, we compute the overall accuracy score on the unseen test data.

#Classification accuracy
from sklearn.metrics import accuracy_score
acc_score = accuracy_score(y_true, y_pred)
print('Accuracy Score = ', acc_score)


As the confusion matrices and the accuracy score show, the performance of AlexNet here is not very good, with an overall accuracy score of 64.8%. A likely reason is that we trained the network from scratch instead of using the transfer learning approach. Our main purpose in this article was to demonstrate the architecture of the AlexNet model and how it can be defined using the Keras library. In the next article, we will use the AlexNet model with transfer learning applied through the pre-trained weights.


Copyright Analytics India Magazine Pvt Ltd
