Guide to Autoencoders, with Python code

The autoencoder is a specific type of feed-forward neural network where input is the same as output.

An autoencoder is an artificial neural network that compresses and decompresses its input in an unsupervised manner. The compression and decompression are data-specific and lossy. The network learns a representation of the data, known as the encoding, which typically has fewer dimensions than the input; alongside this reduction side, a reconstruction side is learned as well. Data-specific means that an autoencoder can only compress data similar to what it was trained on: for example, an autoencoder trained on images of dogs will perform poorly on images of cats. Lossy means that the reconstruction is a degraded version of the input, much as an image shared on WhatsApp loses quality. In the image below, compare the quality of the reconstructed image with that of the original.

Fig1: Schematic of autoencoder 

As shown in the figure above, building an autoencoder requires three things: an encoding method, a decoding method, and a loss function that compares the output with the target.

First, the input passes through the encoder, a fully connected artificial neural network that produces the code. The decoder, which has a similar ANN structure, then produces the output using only that code. Here the code is simply the compressed version of the input.
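The encoder, code and decoder described above can be sketched in a few lines of NumPy. This is only an illustration of the data flow: the weights are random and untrained, and the sizes (784 inputs, a 15-value code) and the batch of 4 random "images" are assumptions chosen to resemble a flattened 28×28 picture.

```python
import numpy as np

rng = np.random.default_rng(0)

input_dim, code_dim = 784, 15          # assumed sizes: flattened 28x28 image, 15-value code

# Randomly initialised (untrained) weights, for illustration only
W_enc = rng.normal(scale=0.01, size=(input_dim, code_dim))
b_enc = np.zeros(code_dim)
W_dec = rng.normal(scale=0.01, size=(code_dim, input_dim))
b_dec = np.zeros(input_dim)

def encode(x):
    """Compress a batch of inputs into the low-dimensional code (ReLU activation)."""
    return np.maximum(0.0, x @ W_enc + b_enc)

def decode(code):
    """Reconstruct inputs from the code (sigmoid keeps outputs in (0, 1))."""
    return 1.0 / (1.0 + np.exp(-(code @ W_dec + b_dec)))

x = rng.random((4, input_dim))          # hypothetical batch of 4 "images"
code = encode(x)
x_hat = decode(code)

# The loss compares the reconstruction with the input itself
mse = np.mean((x - x_hat) ** 2)
print(code.shape, x_hat.shape, mse)
```

Note that the target of the loss is the input `x` itself, which is why no labels are needed.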


Code implementation: 

Like conventional ANNs, autoencoders are trained through backpropagation.
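To make the backpropagation step concrete, here is a minimal NumPy sketch that trains a tiny linear autoencoder by hand. The toy data, the layer sizes, the learning rate and the number of steps are all illustrative assumptions, not values used elsewhere in this article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 samples in 8 dimensions that lie on a 2-D subspace
n, d, k = 200, 8, 2
X = rng.normal(size=(n, k)) @ rng.normal(size=(k, d))
X = X / X.std()                       # rescale so the learning rate below is stable

# Small random initial weights for a linear encoder and decoder (no activations)
W_enc = rng.normal(scale=0.1, size=(d, k))
W_dec = rng.normal(scale=0.1, size=(k, d))

lr = 0.05
losses = []
for step in range(1000):
    # Forward pass: encode, decode, and measure the reconstruction error
    code = X @ W_enc
    X_hat = code @ W_dec
    err = X_hat - X
    losses.append(np.mean(err ** 2))

    # Backward pass: chain rule through the decoder, then the encoder
    grad_out = 2.0 * err / err.size          # dLoss/dX_hat
    grad_W_dec = code.T @ grad_out           # dLoss/dW_dec
    grad_code = grad_out @ W_dec.T           # dLoss/dcode
    grad_W_enc = X.T @ grad_code             # dLoss/dW_enc

    # Gradient-descent update
    W_enc -= lr * grad_W_enc
    W_dec -= lr * grad_W_dec

print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

Keras performs the same forward pass, gradient computation and update internally when `fit` is called, with Adam in place of plain gradient descent.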

We will cover three kinds of autoencoder:

  1. Simple autoencoder 
  2. Deep CNN autoencoder 
  3. Denoising autoencoder

For the implementation part, we are using a popular MNIST digits data set.

  1. Simple Autoencoder:
 Import all the dependencies
 from keras.layers import Dense,Conv2D,MaxPooling2D,UpSampling2D
 from keras import Input, Model
 from keras.datasets import mnist
 import numpy as np
 import matplotlib.pyplot as plt 

Build the model. The encoding dimension determines how much the image is compressed: the smaller the dimension, the greater the compression.

 encoding_dim = 15 
 input_img = Input(shape=(784,))
 # encoded representation of input
 encoded = Dense(encoding_dim, activation='relu')(input_img)
 # decoded representation of code 
 decoded = Dense(784, activation='sigmoid')(encoded)
 # model that takes an input image and returns its reconstruction
 autoencoder = Model(input_img, decoded) 
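As a quick sanity check on how much this model compresses, the arithmetic can be worked out directly. The sketch below uses only the two sizes from the model above (784 inputs, `encoding_dim = 15`); the helper name `dense_params` is made up for illustration.

```python
input_dim = 784      # flattened 28x28 image, as in the model above
encoding_dim = 15    # size of the code

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

compression_ratio = input_dim / encoding_dim
encoder_params = dense_params(input_dim, encoding_dim)
decoder_params = dense_params(encoding_dim, input_dim)

print(f"compression ratio: {compression_ratio:.1f}x")   # 52.3x
print(f"encoder parameters: {encoder_params}")          # 11775
print(f"decoder parameters: {decoder_params}")          # 12544
```

Each image is therefore squeezed to roughly one fifty-second of its original number of values before being reconstructed.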

Build the encoder and decoder models separately so that the encoded and decoded images can be obtained independently.

 # This model shows encoded images
 encoder = Model(input_img, encoded)
 # Creating a decoder model
 encoded_input = Input(shape=(encoding_dim,))
 # last layer of the autoencoder model
 decoder_layer = autoencoder.layers[-1]
 # decoder model
 decoder = Model(encoded_input, decoder_layer(encoded_input)) 

Compile the model with the Adam optimizer and the binary cross-entropy loss function, then fit it.

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

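 (x_train, y_train), (x_test, y_test) = mnist.load_data()
 # scale pixel values to [0, 1] and flatten each 28x28 image into a 784-vector
 x_train = x_train.astype('float32').reshape(-1, 784) / 255.
 x_test = x_test.astype('float32').reshape(-1, 784) / 255.
 # add epochs and batch_size arguments as needed
 autoencoder.fit(x_train, x_train,
                 validation_data=(x_test, x_test))
 encoded_img = encoder.predict(x_test)
 decoded_img = decoder.predict(encoded_img)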
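For intuition about the loss used above, here is a rough NumPy sketch of per-pixel binary cross-entropy. It is an approximation for illustration, not the Keras implementation; the clipping constant `eps` and the four example pixel values are assumptions.

```python
import numpy as np

def binary_crossentropy(target, pred, eps=1e-7):
    """Mean per-pixel binary cross-entropy; eps avoids log(0)."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))))

target = np.array([0.0, 1.0, 1.0, 0.0])       # hypothetical pixel values
good = np.array([0.1, 0.9, 0.8, 0.2])         # close reconstruction
bad = np.array([0.9, 0.1, 0.2, 0.8])          # poor reconstruction

loss_good = binary_crossentropy(target, good)
loss_bad = binary_crossentropy(target, bad)
print(loss_good, loss_bad)
```

The closer reconstruction gets the smaller loss, which is the signal that backpropagation uses to adjust the weights. This loss only makes sense because the pixel values and the sigmoid outputs both lie between 0 and 1.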

Using a plot function, you can view the encoded and decoded images, as shown below.

  2. Deep CNN Autoencoder:

Since the inputs are images, it makes more sense to use a convolutional network. The encoder consists of a stack of Conv2D and max-pooling layers, whereas the decoder consists of a stack of Conv2D and upsampling layers.

 from keras import Sequential

 model = Sequential()
 # encoder network
 model.add(Conv2D(30, 3, activation='relu', padding='same', input_shape=(28, 28, 1)))
 model.add(MaxPooling2D(2, padding='same'))
 model.add(Conv2D(15, 3, activation='relu', padding='same'))
 model.add(MaxPooling2D(2, padding='same'))
 # decoder network; the UpSampling2D layers restore the 28x28 size, as in the summary below
 model.add(Conv2D(15, 3, activation='relu', padding='same'))
 model.add(UpSampling2D(2))
 model.add(Conv2D(30, 3, activation='relu', padding='same'))
 model.add(UpSampling2D(2))
 model.add(Conv2D(1, 3, activation='sigmoid', padding='same'))  # output layer
 model.compile(optimizer='adam', loss='binary_crossentropy')
 model.summary()

 Model: "sequential"
 Layer (type)                 Output Shape              Param #   
 conv2d_17 (Conv2D)           (None, 28, 28, 30)        300       
 max_pooling2d_7 (MaxPooling2 (None, 14, 14, 30)        0         
 conv2d_18 (Conv2D)           (None, 14, 14, 15)        4065      
 max_pooling2d_8 (MaxPooling2 (None, 7, 7, 15)          0         
 conv2d_19 (Conv2D)           (None, 7, 7, 15)          2040      
 up_sampling2d_7 (UpSampling2 (None, 14, 14, 15)        0         
 conv2d_20 (Conv2D)           (None, 14, 14, 30)        4080      
 up_sampling2d_8 (UpSampling2 (None, 28, 28, 30)        0         
 conv2d_21 (Conv2D)           (None, 28, 28, 1)         271       
 Total params: 10,756
 Trainable params: 10,756
 Non-trainable params: 0
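The shapes and parameter counts in the summary above can be reproduced by hand. The sketch below walks through the same sequence of layers in plain Python, assuming 3×3 kernels, `padding='same'`, 2×2 pooling and 2× upsampling as listed; the helper `conv_params` is a name made up for this illustration.

```python
import math

def conv_params(kernel, c_in, c_out):
    """Conv2D parameters: one kernel x kernel x c_in filter per output channel, plus biases."""
    return kernel * kernel * c_in * c_out + c_out

# (layer kind, filters) in the order of the summary above; kernel size 3 throughout
layers = [("conv", 30), ("pool", None), ("conv", 15), ("pool", None),
          ("conv", 15), ("up", None), ("conv", 30), ("up", None), ("conv", 1)]

size, channels, total = 28, 1, 0
shapes = []
for kind, filters in layers:
    if kind == "conv":            # padding='same' keeps the spatial size
        total += conv_params(3, channels, filters)
        channels = filters
    elif kind == "pool":          # 2x2 pooling with padding='same' halves the size, rounding up
        size = math.ceil(size / 2)
    else:                         # 2x upsampling doubles the size
        size *= 2
    shapes.append((size, size, channels))

print(shapes)
print(total)   # 10756, matching "Total params" in the summary
```

The spatial size goes 28 → 14 → 7 in the encoder and 7 → 14 → 28 in the decoder, so the output has the same shape as the input image.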

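Reshape the images to 28×28 with a single channel, scale the values to the range 0 to 1, and fit the model:

 # x_train and x_test are assumed here to be already reshaped to (n, 28, 28, 1) and scaled to [0, 1];
 # add epochs and batch_size arguments as needed
 model.fit(x_train, x_train,
           validation_data=(x_test, x_test))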
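The reshaping and scaling step mentioned above might look like the following. This is a sketch under assumptions: `prepare_images` is a hypothetical helper, and random arrays stand in for the MNIST data so that the snippet runs without downloading anything.

```python
import numpy as np

def prepare_images(images):
    """Scale uint8 images to [0, 1] floats and add a trailing channel axis."""
    images = images.astype("float32") / 255.0
    return images.reshape(-1, 28, 28, 1)

# Hypothetical stand-in for the MNIST arrays: 10 random 28x28 uint8 images
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(10, 28, 28), dtype=np.uint8)

prepared = prepare_images(fake_images)
print(prepared.shape, prepared.dtype, prepared.min(), prepared.max())
```

With the real data, the same helper would be applied to the `x_train` and `x_test` arrays returned by `mnist.load_data()` before calling `model.fit`.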

Here are the input images and the decoded images produced by the CNN-based autoencoder.

  3. Denoising autoencoder:

Let’s check whether the autoencoder can deal with noise in images. Noise here means things such as blurry images, white marks on the images, or changes in the color of the images.

We now add some noise to the original digits and then try to recover the original images as well as possible.

Introduce noise as below

 noise_factor = 0.7
 x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape) 
 x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape) 
 x_train_noisy = np.clip(x_train_noisy, 0., 1.)
 x_test_noisy = np.clip(x_test_noisy, 0., 1.) 
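The same noise step can be wrapped in a small function to see what the clipping does. This is a sketch with assumptions: `add_gaussian_noise` and its `seed` parameter are hypothetical, and all-black images stand in for the scaled MNIST data.

```python
import numpy as np

def add_gaussian_noise(images, noise_factor=0.7, seed=None):
    """Add zero-mean, unit-variance Gaussian noise scaled by noise_factor, then clip to [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = images + noise_factor * rng.normal(loc=0.0, scale=1.0, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

# Hypothetical stand-in for scaled MNIST data: 100 all-black images
clean = np.zeros((100, 28, 28, 1), dtype="float32")
noisy = add_gaussian_noise(clean, noise_factor=0.7, seed=0)

# On a black image, negative noise is clipped back to 0 and positive noise stays visible
zero_fraction = np.mean(noisy == 0.0)
print(noisy.shape, noisy.min(), noisy.max())
print("fraction of pixels left at 0:", zero_fraction)
```

Because of the clipping, the noise is no longer symmetric: on a dark background about half of the pixels are brightened while the rest stay black, which is why the noisy digits look speckled.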

Here are some examples of noisy images:

 plt.figure(figsize=(20, 2))
 for i in range(1, 5 + 1):
     ax = plt.subplot(1, 5, i)
     plt.imshow(x_test_noisy[i].reshape(28, 28))

You can see that the digits are barely identifiable. We intentionally added heavy noise to check how far the autoencoder can recover the image.

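Modify the layers of the model defined above, for example by increasing the number of filters, so that it performs as well as possible, and fit it on the noisy inputs with the clean images as targets:

 # add epochs and batch_size arguments as needed
 model.fit(x_train_noisy, x_train,
           validation_data=(x_test_noisy, x_test))
 pred = model.predict(x_test_noisy)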

Plot function 

 plt.figure(figsize=(20, 4))
 for i in range(5):
     # Display noisy input
     ax = plt.subplot(2, 5, i + 1)
     plt.imshow(x_test_noisy[i].reshape(28, 28))
     # Display reconstruction
     ax = plt.subplot(2, 5, i + 1 + 5)
     plt.imshow(pred[i].reshape(28, 28))
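Besides inspecting the plots, the recovery can be quantified with a simple per-image error. The sketch below is one possible way to do that, not something from the original workflow: `reconstruction_error` is a hypothetical helper, and random arrays stand in for `x_test` and `x_test_noisy`.

```python
import numpy as np

def reconstruction_error(clean, other):
    """Mean squared error per image, averaged over all pixels of each image."""
    clean = clean.reshape(len(clean), -1)
    other = other.reshape(len(other), -1)
    return np.mean((clean - other) ** 2, axis=1)

# Hypothetical stand-ins for x_test and x_test_noisy
rng = np.random.default_rng(0)
clean = rng.random((5, 28, 28, 1))
noisy = np.clip(clean + 0.7 * rng.normal(size=clean.shape), 0.0, 1.0)

noisy_error = reconstruction_error(clean, noisy)
print(noisy_error.shape, noisy_error.mean())
```

With a trained model, `reconstruction_error(x_test, pred)` being lower than `reconstruction_error(x_test, x_test_noisy)` would suggest that the network is removing noise rather than merely copying its input.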


We have seen the structure of autoencoders and implemented some basic ones. Autoencoders have a wide range of applications, such as dimensionality reduction, image compression, and recommendation systems. Here we trained our models for only a few epochs; performance can be improved by training for more epochs and by increasing the size of the network.


Vijaysinh Lendave
Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, model building.
