How ML Frameworks Like TensorFlow And PyTorch Handle Gradient Descent

Optimisation is the most important component when we consider machine learning algorithms. It helps in reducing the error and improving the accuracy in the solution of a problem. Gradient Descent is one such algorithm which is used for the purpose of optimisation. Here we take a deeper look at what Gradient Descent is and how it helps in optimisation.

Understanding Gradient Descent

Gradient Descent is the most common optimisation strategy used in ML frameworks. It is basically an iterative algorithm used to minimise a function to its local or global minima. In simple words, Gradient Descent iterates overs a function, adjusting it’s parameters until it finds the minimum. A gradient can be called the partial derivative of a function with respect to its inputs. Basically, it is a measure of the variation in weights with respect to change in error or change in input.

Let us visualise with the simplest example. Consider the following image of a curve:

AIM Daily XO

Join our editors every weekday evening as they steer you through the most significant news of the day, introduce you to fresh perspectives, and provide unexpected moments of joy
Your newsletter subscriptions are subject to AIM Privacy Policy and Terms and Conditions.

For better understanding visualise two-dimensional section the curve. We will get something like this:

Download our Mobile App

Now imagine a ball being rolled from the top most end of the curve. The objective is to reach the lowest point. The ball will roll down and then up, repeatedly until it rests at the steepest point. This is how Gradient Descent works. The algorithm repeats and adjusts its parameters or coefficients to find the steepest point.

In the ML context, the Gradient Descent is used to minimise the error by adjusting weights after passing through all the samples in the training set. If the weights are updated after a specified subset of training samples, or after each sample in the training set, then it is called a Stochastic Gradient Descent. The higher the gradient, the steeper the slope and the faster a model can learn. But if the slope is zero, the model stops learning.

With this basic understanding, let us now take a look at how the popular ML packages like TensorFlow and PyTorch solve Gradient Descent.

Gradient Descent With TensorFlow

TensorFlow has a Class called  GradientDescentOptimizer to handle Gradient Descent.

Consider the simplest example that illustrates the usage of GradientDescentOptimizer class.

The highlighted part is where the GradientDescentOptimizer is invoked. GradientDescentOptimizer is called with a step of 0.01 which is the standard value.The minimise function minimises the value of the variable error which is defined as the square difference of the actual and predicted set.

The minimise function is a combination of two functions

  • Compute_gradients() : This method returns a list of (gradient, variable) pairs where “gradient” is the gradient for “variable”.
  • apply_gradients() : This is the second part of minimize(). It returns an Operation that applies gradients.

Gradient Descent with PyTorch

PyTorch uses the Class torch.optim.SGD to  Implement stochastic Gradient Descent.

Consider the following illustration.


The lr parameter stands for learning rate or step of the Gradient Descent and model.parameters returns the parameters learned from the data. The gradient buffer is set to zero by the function optimizer.zero_grad() once for every training iteration to reset the gradient computed by the last data batch

Sign up for The Deep Learning Podcast

by Vijayalakshmi Anandan

The Deep Learning Curve is a technology-based podcast hosted by Vijayalakshmi Anandan - Video Presenter and Podcaster at Analytics India Magazine. This podcast is the narrator's journey of curiosity and discovery in the world of technology.

Amal Nair
A Computer Science Engineer turned Data Scientist who is passionate about AI and all related technologies. Contact:

Our Upcoming Events

24th Mar, 2023 | Webinar
Women-in-Tech: Are you ready for the Techade

27-28th Apr, 2023 I Bangalore
Data Engineering Summit (DES) 2023

23 Jun, 2023 | Bangalore
MachineCon India 2023 [AI100 Awards]

21 Jul, 2023 | New York
MachineCon USA 2023 [AI100 Awards]

3 Ways to Join our Community

Telegram group

Discover special offers, top stories, upcoming events, and more.

Discord Server

Stay Connected with a larger ecosystem of data science and ML Professionals

Subscribe to our Daily newsletter

Get our daily awesome stories & videos in your inbox

Is Foxconn Conning India?

Most recently, Foxconn found itself embroiled in controversy when both Telangana and Karnataka governments simultaneously claimed Foxconn to have signed up for big investments in their respective states