Guide to Feed-Forward Network using PyTorch with MNIST Dataset

Feed Forward Neural Network

Neural networks are a family of algorithms, loosely inspired by the human brain, that learn the relationships present in large amounts of data. Each “neuron” in a neural network is a mathematical function that collects its inputs, weights them, and passes the result on according to the network's architecture. The network as a whole consists of interconnected nodes, known as perceptrons. A multi-layer perceptron, or MLP, consists of perceptrons arranged in interconnected layers. The input layer receives the input patterns. The output layer produces the classifications or output signals to which the input patterns are mapped.


A feed-forward neural network consists of a large number of perceptrons organized in layers, where each unit in a layer is connected to all the units in the previous layer; it is commonly used for classification. These connections are not all equal and can differ in strength, or weight. The weights on these connections encode the knowledge of the network.

Data enters at the inputs and passes through the network layer by layer until it arrives at the outputs, with no feedback between the layers along the way. This is why such networks are known as feed-forward neural networks.
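To make the layer-by-layer flow concrete, here is a minimal sketch of a forward pass in plain Python, without PyTorch. The network size, weights, and biases are made-up illustrative values, not trained parameters, and the helper names `relu` and `dense` are hypothetical.

```python
# Minimal pure-Python sketch of a forward pass (illustrative weights, not trained).

def relu(values):
    """Element-wise ReLU: negative values become 0."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Fully connected layer: each output unit sums over all inputs."""
    return [
        sum(w * x for w, x in zip(unit_weights, inputs)) + b
        for unit_weights, b in zip(weights, biases)
    ]

# Hypothetical network: 2 inputs -> 2 hidden units -> 1 output.
hidden_weights = [[1.0, -1.0], [0.5, 0.5]]
hidden_biases = [0.0, -1.0]
output_weights = [[2.0, 1.0]]
output_biases = [0.5]

x = [3.0, 1.0]
hidden = relu(dense(x, hidden_weights, hidden_biases))  # input -> hidden
output = dense(hidden, output_weights, output_biases)   # hidden -> output

print(hidden)  # [2.0, 1.0]
print(output)  # [5.5]
```

Each value only ever moves forward, from `x` to `hidden` to `output`; nothing computed in a later layer is fed back into an earlier one.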

About PyTorch

PyTorch is an open-source machine learning and deep learning framework widely used in applications such as natural language processing, image classification, and computer vision. It was developed by Facebook’s AI Research lab and later adopted by companies such as Uber, Twitter, Salesforce, and NVIDIA.

PyTorch comes with domain-specific libraries such as torchtext and torchvision, along with modules and classes such as torch.nn, torch.optim, Dataset, and DataLoader, to help you create and train neural networks across different areas of machine learning and deep learning.

About the Dataset 

The MNIST dataset, short for Modified National Institute of Standards and Technology dataset, consists of 70,000 small square 28×28 grayscale images of handwritten digits from 0 to 9, split into 60,000 training images and 10,000 test images and divided into ten classes. The dataset is mainly used to benchmark image classification with machine learning and deep learning models.

Creating a Feed-Forward Neural Network using PyTorch on MNIST Dataset

Our task will be to create a Feed-Forward classification model on the MNIST dataset.

To achieve this, we will do the following:

  •  Use the DataLoader class from PyTorch to load our dataset and apply transforms to it
  •  Implement a neural network with an input, a hidden, and an output layer
  •  Apply activation functions
  •  Set up the loss and optimizer, and implement a training loop that uses batch training
  •  Finally, evaluate the model and calculate its accuracy

The below code is in reference to the official implementation, which you can find here.

Installing the PyTorch Package and Importing

We can install the PyTorch package using pip.

!pip3 install torch==1.9.0+cu102 torchvision==0.10.0+cu102 torchaudio===0.9.0 -f

To import them, we will use the following code. 

 import torch
 import torch.nn as nn 
 import torchvision
 import torchvision.transforms as transforms
 import matplotlib.pyplot as plt 

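Here, the torch.nn module helps us create and train the neural network, torchvision provides the MNIST dataset and image transforms, and matplotlib is used to plot sample images.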

Creating our Network & Loading The Dataset

We will first define the hyperparameters for our neural network. We set the input size to 784 because our dataset contains images of size 28×28, which we flatten into a one-dimensional array of 784 values. The other hyperparameters can be tuned or set according to one’s choice.

 input_size = 784 # 28x28
 hidden_size = 500 
 num_classes = 10
 num_epochs = 2
 batch_size = 100
 learning_rate = 0.001 
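As a quick sanity check on these numbers, the plain-Python sketch below shows where 784 comes from and how many batches one epoch contains. The all-zero image is only a stand-in, and the training-split size of 60,000 is the standard MNIST figure rather than something read from the data here.

```python
# Plain-Python check of the numbers behind the hyperparameters above.
image_height, image_width = 28, 28
input_size = image_height * image_width  # 784 features per image

# A stand-in 28x28 "image" as nested lists (all zeros, purely illustrative).
image = [[0.0] * image_width for _ in range(image_height)]
flat = [pixel for row in image for pixel in row]  # row-by-row flattening

num_train_images = 60_000  # size of the MNIST training split
batch_size = 100
steps_per_epoch = num_train_images // batch_size

print(input_size, len(flat), steps_per_epoch)  # 784 784 600
```

The 600 steps per epoch computed here match the step counter that appears in the training log later in the article.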
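
Loading the MNIST Dataset with torchvision

 # Import MNIST dataset (downloaded to ./data on first use)
 train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transforms.ToTensor(), download=True)
 test_dataset = torchvision.datasets.MNIST(root='./data', train=False, transform=transforms.ToTensor())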
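
Creating the DataLoaders and Specifying the Batch Size

 # Data loader
 train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
 test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)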
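
To see how a DataLoader splits a dataset into batches without downloading anything, the sketch below uses a small synthetic dataset in place of MNIST. The 250 all-zero "images" and the variable names are assumptions made for illustration only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 250 fake 1x28x28 images with labels 0-9.
images = torch.zeros(250, 1, 28, 28)
labels = torch.arange(250) % 10
dataset = TensorDataset(images, labels)

# Same batch size as in the article; shuffle is off so the order is fixed.
loader = DataLoader(dataset, batch_size=100, shuffle=False)

batch_sizes = [batch_images.shape[0] for batch_images, _ in loader]
print(len(loader), batch_sizes)  # 3 [100, 100, 50]
```

The last batch is smaller because 250 is not a multiple of 100. With the real MNIST splits (60,000 and 10,000 images) and a batch size of 100, every batch is full.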

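Now, let’s have a look at a batch of our data.

 examples = iter(test_loader)
 example_data, example_targets = next(examples)
 # Plot the first six images of the batch in a 2x3 grid
 for i in range(6):
     plt.subplot(2, 3, i+1)
     plt.imshow(example_data[i][0], cmap='gray')
 plt.show()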

Creating our Fully Connected Network with One Hidden Layer

We will build our network by subclassing nn.Module from PyTorch and use ReLU as our activation function.

 # Fully connected neural network with one hidden layer
 class NeuralNet(nn.Module):
     def __init__(self, input_size, hidden_size, num_classes):
         super(NeuralNet, self).__init__()
         self.input_size = input_size
         self.l1 = nn.Linear(input_size, hidden_size) 
         self.relu = nn.ReLU()
         self.l2 = nn.Linear(hidden_size, num_classes)   

Next, we define the forward method, which applies our layers in order. We will not apply the Softmax function here, because the cross-entropy loss used later applies it internally.

     def forward(self, x):
         out = self.l1(x)
         out = self.relu(out)
         out = self.l2(out)
         # no activation and no softmax at the end
         return out 
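
The claim that the loss applies softmax internally can be checked with a small sketch. It compares nn.CrossEntropyLoss on raw outputs against log-softmax followed by negative log-likelihood; the random logits and hand-picked targets are illustrative stand-ins for real model outputs and labels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Random stand-ins for raw model outputs (4 samples, 10 classes) and labels.
logits = torch.randn(4, 10)
targets = torch.tensor([3, 0, 9, 1])

# Built-in loss applied directly to the raw outputs.
loss_builtin = nn.CrossEntropyLoss()(logits, targets)

# The same quantity computed in two explicit steps.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(loss_builtin, loss_manual))  # True
```

Because the two values agree, adding a softmax layer at the end of the network would apply softmax twice during training.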

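Next, we select a device (the GPU if one is available, otherwise the CPU) and create our model on it:

 device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
 model = NeuralNet(input_size, hidden_size, num_classes).to(device)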
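
A quick way to check the architecture is to pass a random batch through it and count its parameters. The sketch below rebuilds the same layer sizes with nn.Sequential, purely to stay short and self-contained; the name `sketch_model` and the random input are assumptions, not part of the article's code.

```python
import torch
import torch.nn as nn

# Same layer sizes as the article's NeuralNet, written with nn.Sequential
# purely to keep this sketch short.
sketch_model = nn.Sequential(
    nn.Linear(784, 500),
    nn.ReLU(),
    nn.Linear(500, 10),
)

dummy_batch = torch.randn(100, 784)  # random stand-in for 100 flattened images
logits = sketch_model(dummy_batch)
num_params = sum(p.numel() for p in sketch_model.parameters())

print(logits.shape)  # torch.Size([100, 10])
print(num_params)    # 397510
```

The parameter count is (784 × 500 + 500) for the first layer plus (500 × 10 + 10) for the second, which gives 397,510.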

Setting our Loss and Optimizer Functions 

We use cross-entropy loss and the Adam optimizer here.

 # Loss and optimizer
 criterion = nn.CrossEntropyLoss()
 optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate) 

Creating our Training Loop

Here we loop over our epochs and batches. We reshape each batch of images to match our input size, compute the loss from the forward pass, and then run the backward pass and update the weights.

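 n_total_steps = len(train_loader)
 for epoch in range(num_epochs):
     for i, (images, labels) in enumerate(train_loader):  
         # origin shape: [100, 1, 28, 28]
         # resized: [100, 784]
         images = images.reshape(-1, 28*28).to(device)
         labels = labels.to(device)
         # Forward pass
         outputs = model(images)
         loss = criterion(outputs, labels)
         # Backward and optimize
         optimizer.zero_grad()   # clear gradients from the previous step
         loss.backward()         # compute gradients of the loss
         optimizer.step()        # update the weights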

To print the loss at every 100th step, along with the total number of steps:

         if (i+1) % 100 == 0:
             print(f'Epoch [{epoch+1}/{num_epochs}], Step [{i+1}/{n_total_steps}], Loss: {loss.item():.4f}')

This gives us the following output:

 Epoch [1/2], Step [100/600], Loss: 0.2965
 Epoch [1/2], Step [200/600], Loss: 0.3050
 Epoch [1/2], Step [300/600], Loss: 0.3245
 Epoch [1/2], Step [400/600], Loss: 0.4304
 Epoch [1/2], Step [500/600], Loss: 0.1765
 Epoch [1/2], Step [600/600], Loss: 0.1007
 Epoch [2/2], Step [100/600], Loss: 0.1382
 Epoch [2/2], Step [200/600], Loss: 0.0671
 Epoch [2/2], Step [300/600], Loss: 0.0705
 Epoch [2/2], Step [400/600], Loss: 0.1174
 Epoch [2/2], Step [500/600], Loss: 0.0741
 Epoch [2/2], Step [600/600], Loss: 0.1731 
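
The forward pass, backward pass, and optimizer step used in the training loop can be tried in isolation. The sketch below repeats them on one fixed batch of random data, so no dataset is needed; the model, batch size of 32, and 20 steps are illustrative assumptions, and the printed losses will differ from the MNIST log above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small stand-in model with the article's layer sizes.
toy_model = nn.Sequential(nn.Linear(784, 500), nn.ReLU(), nn.Linear(500, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.001)

# One fixed batch of random "images" and labels (not real MNIST data).
images = torch.randn(32, 784)
labels = torch.randint(0, 10, (32,))

losses = []
for step in range(20):
    outputs = toy_model(images)        # forward pass
    loss = criterion(outputs, labels)
    optimizer.zero_grad()              # clear old gradients
    loss.backward()                    # backpropagate
    optimizer.step()                   # update the weights
    losses.append(loss.item())

print(f'first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}')
```

Since the same batch is reused at every step, the loss should fall as the model memorizes it. This only confirms that the update steps are wired correctly and says nothing about accuracy on real data.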

Final Testing of the Model and Evaluating Accuracy

In the test phase, we don’t need to compute gradients, so we wrap the loop in torch.no_grad() to save memory.

 with torch.no_grad():
     n_correct = 0
     n_samples = 0
     for images, labels in test_loader:
         images = images.reshape(-1, 28*28).to(device)
         labels = labels.to(device)
         outputs = model(images)
         # max returns (value, index)
         _, predicted = torch.max(outputs.data, 1)
         n_samples += labels.size(0)
         n_correct += (predicted == labels).sum().item()

Printing the total accuracy:

     acc = 100.0 * n_correct / n_samples
     print(f'Accuracy of the network on the 10000 test images: {acc} %') 

This gives us our final output:

Accuracy of the network on the 10000 test images: 97.3%
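
To see how the predicted class and the accuracy are derived from the raw outputs, here is a small sketch with hand-written values. The four rows of scores and the labels are made up for illustration and do not come from the trained model.

```python
import torch

# Hypothetical raw outputs for 4 samples and 3 classes (values chosen by hand).
outputs = torch.tensor([
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 3.0],
    [2.0, 1.0, 0.0],
])
labels = torch.tensor([1, 0, 2, 1])

# torch.max along dim 1 returns (largest value, index of largest value) per row;
# the index is the predicted class.
values, predicted = torch.max(outputs, 1)
n_correct = (predicted == labels).sum().item()
acc = 100.0 * n_correct / labels.size(0)

print(predicted.tolist())  # [1, 0, 2, 0]
print(acc)                 # 75.0
```

Three of the four predictions match their labels, which gives 75% accuracy on this toy batch.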

The accuracy of the model can likely be improved by tuning the hyperparameters and increasing the number of epochs.


This article has implemented a simple feed-forward neural network on the MNIST dataset for image classification using the PyTorch library and tested its accuracy.

The Colab implementation of the above code can be found here.


Victor Dey
Victor is an aspiring Data Scientist & is a Master of Science in Data Science & Big Data Analytics. He is a Researcher, a Data Science Influencer and also an Ex-University Football Player. A keen learner of new developments in Data Science and Artificial Intelligence, he is committed to growing the Data Science community.
