Active Hackathon

Hands-On Guide to Implement Deep Autoencoder in PyTorch for Image Reconstruction

In this article, we will demonstrate the implementation of a Deep Autoencoder in PyTorch for reconstructing images. This deep learning model will be trained on the MNIST handwritten digits and it will reconstruct the digit images after learning the representation of the input images. 
Deep Autoencoder in PyTorch

Artificial Neural Networks have many popular variants that are applied in supervised and unsupervised learning problems. The Autoeconders are also a variant of neural networks that are mostly applied in unsupervised learning problems. When they come with multiple hidden layers in the architecture, they are referred to as the Deep Autoencoders. These models can be applied in a variety of applications including image reconstruction. In image reconstruction, they learn the representation of the input image pattern and reconstruct the new images matching to the original input image pattern. Image reconstruction has many important applications especially in the medical field where the decoded and noise-free images are required from the available incomplete or noisy images. 

In this article, we will demonstrate the implementation of a Deep Autoencoder in PyTorch for reconstructing images. This deep learning model will be trained on the MNIST handwritten digits and it will reconstruct the digit images after learning the representation of the input images. 


Sign up for your weekly dose of what's up in emerging technology.


Autoencoders are the variants of Artificial Neural Networks which are generally used to learn the efficient data codings in an unsupervised manner. They usually learn in a representation learning scheme where they learn the encoding for a set of data. The network reconstructs the input data in a much similar way by learning its representation. The basic architecture of am Autoencoder is shown below.

deep Autoencoder in pytorch

(Image Source: Wikipedia)

The architecture generally comprises an input layer, an output layer and one or more hidden layers that connect input and output layers. The output layer has the same number of nodes as of input layers because of the purpose that it reconstructs the inputs. In its general form, there is only one hidden layer, but in case of deep autoencoders, there are multiple hidden layers. This increased depth reduces the computational cost of representing some functions and it decreases the amount of training data required to learn some functions. The popular applications of autoencoder include anomaly detection, image processing, information retrieval, drug discovery etc.

Implementing Deep Autoencoder in PyTorch

First of all, we will import all the required libraries.

import os
import torch 
import torchvision
import torch.nn as nn
import torchvision.transforms as transforms
import torch.optim as optim
import matplotlib.pyplot as plt
import torch.nn.functional as F
from torchvision import datasets
from import DataLoader
from torchvision.utils import save_image
from PIL import Image

Now, we will define the values for the hyperparameters.

Epochs = 100
Lr_Rate = 1e-3
Batch_Size = 128

The below function will be used for image transformation that is required for the PyTorch model.

transform = transforms.Compose([
    transforms.Normalize((0.5,), (0.5,))

Using the below code snippet, we will download the MNIST handwritten digit dataset and get it ready for further processing.

train_set = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_set = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_set, Batch_Size=Batch_Size, shuffle=True)
test_loader = DataLoader(test_set, Batch_Size=Batch_Size, shuffle=True)

Let us see some information on the training data and its classes.


deep Autoencoder in pytorch


deep Autoencoder in pytorch

In the next step, we will define the Autoencoder class that will be used to define the main model.

class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()

        self.enc1 = nn.Linear(in_features=784, out_features=256) # Input image (28*28 = 784)
        self.enc2 = nn.Linear(in_features=256, out_features=128)
        self.enc3 = nn.Linear(in_features=128, out_features=64)
        self.enc4 = nn.Linear(in_features=64, out_features=32)
        self.enc5 = nn.Linear(in_features=32, out_features=16)

        self.dec1 = nn.Linear(in_features=16, out_features=32)
        self.dec2 = nn.Linear(in_features=32, out_features=64)
        self.dec3 = nn.Linear(in_features=64, out_features=128)
        self.dec4 = nn.Linear(in_features=128, out_features=256)
        self.dec5 = nn.Linear(in_features=256, out_features=784) # Output image (28*28 = 784)

    def forward(self, x):
        x = F.relu(self.enc1(x))
        x = F.relu(self.enc2(x))
        x = F.relu(self.enc3(x))
        x = F.relu(self.enc4(x))
        x = F.relu(self.enc5(x))

        x = F.relu(self.dec1(x))
        x = F.relu(self.dec2(x))
        x = F.relu(self.dec3(x))
        x = F.relu(self.dec4(x))
        x = F.relu(self.dec5(x))

        return x

Now, we will create the Autoencoder model as an object of the Autoencoder class that we have defined above.

model = Autoencoder()

deep Autoencoder in pytorch

Now, the loss criteria and the optimization methods will be defined.

criterion = nn.MSELoss()
optimizer = optim.Adam(net.parameters(), lr=Lr_Rate)

The below function will enable the CUDA environment.

def get_device():
    if torch.cuda.is_available():
        device = 'cuda:0'
        device = 'cpu'
    return device

The below function will create a directory to save the results.

def make_dir():
    image_dir = 'MNIST_Out_Images'
    if not os.path.exists(image_dir):

Using the below function, we will save the reconstructed images as generated by the model.

def save_decod_img(img, epoch):
    img = img.view(img.size(0), 1, 28, 28)
    save_image(img, './MNIST_Out_Images/Autoencoder_image{}.png'.format(epoch))

The below function will be called to train the model.

def training(model, train_loader, Epochs):
    train_loss = []
    for epoch in range(Epochs):
        running_loss = 0.0
        for data in train_loader:
            img, _ = data
            img =
            img = img.view(img.size(0), -1)
            outputs = model(img)
            loss = criterion(outputs, img)
            running_loss += loss.item()

        loss = running_loss / len(train_loader)
        print('Epoch {} of {}, Train Loss: {:.3f}'.format(
            epoch+1, Epochs, loss))

        if epoch % 5 == 0:
            save_decod_img(outputs.cpu().data, epoch)

    return train_loss

 The below function will test the trained model on image reconstruction.

def test_image_reconstruct(model, test_loader):
     for batch in test_loader:
        img, _ = batch
        img =
        img = img.view(img.size(0), -1)
        outputs = model(img)
        outputs = outputs.view(outputs.size(0), 1, 28, 28).cpu().data
        save_image(outputs, 'MNIST_reconstruction.png')

Before training, the model will be pushed to the CUDA environment and the directory will be created to save the result images using the functions defined above.

device = get_device()

Now, the training of the model will be performed.

train_loss = training(model, train_loader, Epochs)

After successful training, we will visualize the loss during training.

plt.title('Train Loss')

deep Autoencoder in PyTorch

We will visualize several images that are saved during training.'/content/MNIST_Out_Images/Autoencoder_image0.png')



In the last step, we will test our autoencoder model to reconstruct the images.

test_image_reconstruct(model, testloader)'/content/MNIST_reconstruction.png')
deep Autoencoder in PyTorch

So, as we could see that the autoencoder model started reconstructing the images since the start of the training process. After the first epoch, this reconstruction was not proper and was improved until the 50th epochs. After the complete training, as we can see in the image generated after the 95th epoch and on testing, it can construct the images very well matching to the original input images. Further, it opens a scope to train the model for more number of epochs as 100 or 200 because we have seen a heavy loss during training that was getting decreased epoch by epoch. After a long training, it is expected to obtain more clear reconstructed images. However, we could understand using this demonstration how to implement deep autoencoders in PyTorch for image reconstruction.


  1. Sovit Ranjan Rath, “Implementing Deep Autoencoder in PyTorch”
  2. Abien Fred Agarap, “Implementing an Autoencoder in PyTorch”
  3. Reyhane Askari, “Auto Encoders”

More Great AIM Stories

Dr. Vaibhav Kumar
Vaibhav Kumar has experience in the field of Data Science and Machine Learning, including research and development. He holds a PhD degree in which he has worked in the area of Deep Learning for Stock Market Prediction. He has published/presented more than 15 research papers in international journals and conferences. He has an interest in writing articles related to data science, machine learning and artificial intelligence.

Our Upcoming Events

Conference, in-person (Bangalore)
Machine Learning Developers Summit (MLDS) 2023
19-20th Jan, 2023

Conference, in-person (Bangalore)
Data Engineering Summit (DES) 2023
21st Apr, 2023

Conference, in-person (Bangalore)
MachineCon 2023
23rd Jun, 2023

3 Ways to Join our Community

Discord Server

Stay Connected with a larger ecosystem of data science and ML Professionals

Telegram Channel

Discover special offers, top stories, upcoming events, and more.

Subscribe to our newsletter

Get the latest updates from AIM