# How ML Frameworks Like TensorFlow And PyTorch Handle Gradient Descent

Optimisation is a central component of machine learning: it reduces the error and improves the accuracy of a model's solution to a problem. Gradient Descent is one such algorithm used for optimisation. Here we take a deeper look at what Gradient Descent is and how it helps in optimisation.

Gradient Descent is the most common optimisation strategy used in ML frameworks. It is an iterative algorithm used to minimise a function towards its local or global minimum. In simple words, Gradient Descent iterates over a function, adjusting its parameters until it finds the minimum. The gradient is the vector of partial derivatives of a function with respect to its inputs; in ML terms, it measures how the error changes with respect to a change in the weights.
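To make the definition concrete, here is a small framework-free sketch that recovers the gradient of f(x, y) = x² + y² numerically; the function names are illustrative, not from any library:

```python
# Numerical illustration: the gradient of f(x, y) = x**2 + y**2 is the
# vector of partial derivatives (2x, 2y). A finite-difference
# approximation recovers it without any calculus library.

def f(x, y):
    return x**2 + y**2

def numerical_gradient(func, x, y, h=1e-6):
    """Central-difference estimate of (df/dx, df/dy)."""
    dfdx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    dfdy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return dfdx, dfdy

gx, gy = numerical_gradient(f, 3.0, -2.0)  # close to (6.0, -4.0)
```

At the point (3, -2) the estimate comes out close to (2·3, 2·(-2)) = (6, -4), matching the analytic partial derivatives.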

Let us visualise this with the simplest example. Consider the following image of a curve:


For a better understanding, visualise a two-dimensional section of the curve. We will get something like this:

Now imagine a ball being rolled from the topmost end of the curve. The objective is to reach the lowest point. The ball will roll down and then up, repeatedly, until it comes to rest at the lowest point. This is how Gradient Descent works: the algorithm repeatedly adjusts its parameters or coefficients until it reaches the point of minimum error.

In the ML context, Gradient Descent minimises the error by adjusting the weights after a pass through all the samples in the training set (batch Gradient Descent). If the weights are instead updated after each individual training sample, the method is called Stochastic Gradient Descent (SGD); updating after a specified subset of samples is the mini-batch variant. The higher the gradient, the steeper the slope and the faster a model can learn. But if the slope is zero, the model stops learning.
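The difference between the two update schedules can be sketched in plain Python; the toy data and function names below are illustrative, not from any framework:

```python
# Minimal framework-free sketch of batch vs. stochastic Gradient Descent,
# fitting y = w * x by minimising the squared error over a toy dataset.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]           # generated with true w = 2

def batch_gradient_descent(lr=0.01, epochs=200):
    """One update per epoch, using the gradient averaged over all samples."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w

def stochastic_gradient_descent(lr=0.01, epochs=200):
    """One update per individual sample (the 'stochastic' variant)."""
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            grad = 2 * (w * x - y) * x
            w -= lr * grad
    return w
```

Both routines converge to w ≈ 2; the stochastic version simply takes several smaller, noisier steps per epoch instead of one averaged step.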

With this basic understanding, let us now take a look at how the popular ML packages like TensorFlow and PyTorch solve Gradient Descent.

Consider the simplest example that illustrates the usage of the GradientDescentOptimizer class.

The highlighted part is where the GradientDescentOptimizer is invoked. GradientDescentOptimizer is called with a step (learning rate) of 0.01, which is a standard value. The minimize() function minimises the value of the variable error, which is defined as the squared difference between the actual and predicted values.

The minimize() function is a combination of two functions:

• compute_gradients() : This is the first part of minimize(). It computes the gradients of the loss with respect to the trainable variables.

• apply_gradients() : This is the second part of minimize(). It returns an Operation that applies the computed gradients.
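A framework-free sketch can make this two-phase split concrete. The function names below deliberately mirror TensorFlow 1.x's compute_gradients()/apply_gradients(), but they are plain-Python illustrations, and the y = 2x + 1 toy data is an assumption:

```python
# Sketch of the two phases behind minimize(): first compute the
# gradients of the loss, then apply them as a descent step.

def compute_gradients(w, b, data):
    """Phase 1: gradients of sum((w*x + b - y)**2) w.r.t. w and b."""
    dw = db = 0.0
    for x, y in data:
        err = w * x + b - y
        dw += 2 * err * x
        db += 2 * err
    return dw, db

def apply_gradients(w, b, grads, lr=0.01):
    """Phase 2: apply one descent step with the standard 0.01 step size."""
    dw, db = grads
    return w - lr * dw, b - lr * db

data = [(1.0, 3.0), (2.0, 5.0)]     # points on y = 2x + 1
w, b = 0.0, 0.0
for _ in range(5000):
    w, b = apply_gradients(w, b, compute_gradients(w, b, data))
```

After the loop, (w, b) sits close to (2, 1). Splitting the step this way is what lets frameworks expose hooks between the two phases, for example to clip or rescale gradients before they are applied.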

PyTorch uses the class torch.optim.SGD to implement Stochastic Gradient Descent.

Consider the following illustration.

The lr parameter is the learning rate, or step size, of the Gradient Descent, and model.parameters() hands the optimizer the parameters to be learned from the data. The gradient buffer is reset by calling optimizer.zero_grad() once per training iteration, clearing the gradients computed from the last data batch.
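Putting these pieces together, a minimal end-to-end loop might look like the following; the toy data and the single Linear(1, 1) layer are illustrative assumptions, not code from the article:

```python
import torch

# Minimal PyTorch sketch: fit y = 2x with one linear layer and
# torch.optim.SGD.

torch.manual_seed(0)
xs = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
ys = 2 * xs                          # target relation y = 2x

model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # lr = step size
loss_fn = torch.nn.MSELoss()

for _ in range(2000):
    optimizer.zero_grad()            # reset the gradient buffer
    loss = loss_fn(model(xs), ys)    # squared error of the predictions
    loss.backward()                  # compute gradients via autograd
    optimizer.step()                 # apply one SGD update
```

Note the order inside the loop: zero the buffers, compute the loss, backpropagate, then step. Because PyTorch accumulates gradients by default, skipping zero_grad() would mix the current batch's gradients with the previous ones.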


A Computer Science Engineer turned Data Scientist who is passionate about AI and all related technologies. Contact: amal.nair@analyticsindiamag.com
