Gradient Descent

Introduction:

Gradient descent is an optimization algorithm used to minimize a cost function in machine learning and other optimization tasks. It is essential for training and fine-tuning machine learning models: the model's parameters are adjusted iteratively so that the error between the predicted output and the actual outcome is as small as possible.

Importance of Gradient Descent:

Gradient descent is an important part of machine learning because it is the mechanism by which a model minimizes its cost function. By repeatedly adjusting the parameters in the direction of the negative gradient of the cost function, it allows models to learn from data and improve their outcomes.

Where Do We Need Gradient Descent in Life:

  1. Finance: Adjusting investment portfolios to maximize returns while minimizing the risk to the invested capital.
  2. Marketing: Tuning advertising strategies so that campaigns are more appealing and deliver a better return on investment.
  3. Manufacturing: Optimizing production processes to minimize costs and maximize efficiency.
  4. Transportation: Refining routes and schedules to reduce fuel consumption and travel time.

Explanation and Summary:

Gradient descent is a foundational optimization algorithm used in many fields of science and technology, especially machine learning. By iteratively adjusting the model's parameters in the direction opposite to the gradient of the cost function, it minimizes the difference between actual and predicted outcomes, driving the error as low as possible. This mechanism is what allows machine learning models to learn from data and achieve better accuracy in their predictions. Demonstrations and analogies of this step-by-step approximation process show why it is fundamental to building models for complex problems.
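The update rule described above can be sketched in a few lines of Python. The cost function f(x) = x², whose derivative is f'(x) = 2x, is an assumption chosen here purely for illustration:

```python
# Minimal gradient descent on f(x) = x^2; the true minimum is at x = 0.

def gradient_descent(x, learning_rate=0.1, iterations=100):
    for _ in range(iterations):
        gradient = 2 * x                   # slope of f at the current point
        x = x - learning_rate * gradient   # step opposite to the gradient
    return x

print(gradient_descent(x=10.0))  # converges close to 0
```

Each iteration moves x a small step downhill; after enough iterations the estimate settles near the minimizer.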

Gradient Descent Demo

Context of the demo

  1. Learning Rate: The learning rate is a hyperparameter that determines the size of the step taken at each iteration of the gradient descent algorithm. At every parameter update, the gradient is multiplied by the learning rate: the larger the learning rate, the larger the steps taken, and the smaller the learning rate, the smaller the steps. Choosing a good learning rate is one of the key factors in making the algorithm work well.
  2. Max Iterations: This is the maximum number of updates the algorithm will perform before stopping. It guarantees that the algorithm terminates even if the minimum has not yet been reached, ensuring it finishes within a reasonable time.
  3. Initial Guess: Gradient descent requires a starting point from which the optimization process can begin. Depending on the problem, the initial guess is commonly chosen at random or based on prior experience that has proven useful. The algorithm then repeatedly revises this guess toward the point where the function takes its smallest value.

How these parameters interact:

  • Learning Rate: If the learning rate is set too high, the algorithm can overshoot the minimum and fail to converge. If it is set too low, the algorithm may converge too slowly.
  • Max Iterations: Specifies the maximum number of iterations the algorithm will run. If this number is chosen too small, the algorithm stops before it reaches the minimum; if it is too large, the run may continue far longer than needed, wasting CPU time.
  • Initial Guess: Determines the point from which the algorithm starts the optimization. A good initial guess paves the way for fast convergence, whereas a poor one leads to slower convergence, or to convergence at a local minimum of the function rather than the global optimum.
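These trade-offs can be observed directly. The sketch below compares a too-small, a reasonable, and a too-large learning rate; the cost function f(x) = x² (gradient 2x) and the specific rates are assumptions chosen for demonstration:

```python
def run(learning_rate, x=10.0, iterations=50):
    # Plain gradient descent on f(x) = x^2, whose gradient is 2x.
    for _ in range(iterations):
        x -= learning_rate * 2 * x
    return x

print(run(0.01))  # too small: after 50 steps, still far from the minimum at 0
print(run(0.3))   # well chosen: converges rapidly toward 0
print(run(1.1))   # too large: each step overshoots and the iterate diverges
```

With the oversized rate each update flips the sign of x and grows its magnitude, which is exactly the overshooting-and-diverging behavior described above.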

Regarding how the iteration result is calculated, here’s a brief explanation:

  • The gradient descent algorithm starts with an initial estimate for the parameter (x).
  • It calculates the gradient of the function at that point (the function's slope with respect to (x)).
  • It then updates the parameter (x) by taking a step in the direction opposite to the gradient, scaled by the learning rate.
  • The process ends after a fixed number of iterations or when a convergence criterion is met.

The final value of (x) is the algorithm's estimate of the minimizer. The algorithm's objective is to minimise the objective function, (f(x)), through this iterative process of updating (x) until it converges to the minimum.
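The steps above can be put together into one loop that uses all three demo parameters (learning rate, max iterations, initial guess) plus a simple convergence test. The objective f(x) = (x - 3)², its derivative 2(x - 3), and the tolerance value are assumptions chosen for illustration:

```python
def f(x):
    return (x - 3) ** 2    # example objective; its minimum is at x = 3

def grad_f(x):
    return 2 * (x - 3)     # derivative of f

def gradient_descent(initial_guess, learning_rate=0.1,
                     max_iterations=1000, tolerance=1e-8):
    x = initial_guess
    for _ in range(max_iterations):
        step = learning_rate * grad_f(x)
        if abs(step) < tolerance:   # convergence criterion: step too small to matter
            break
        x -= step                   # move opposite to the gradient
    return x

print(gradient_descent(initial_guess=-5.0))  # approaches the minimizer x = 3
```

The loop terminates either when the step size drops below the tolerance or when the iteration budget is exhausted, mirroring the two stopping conditions listed above.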

Links and references

Dabbura, I. (2017) ‘Gradient Descent Algorithm and Its Variants’, Towards Data Science. Available at: https://towardsdatascience.com/gradient-descent-algorithm-and-its-variants-10f652806a3 (Accessed: March 8, 2024)

Crypto1 (2024) ‘Gradient Descent Algorithm: How does it Work in Machine Learning?’, Analytics Vidhya. Available at: https://www.analyticsvidhya.com/blog/2020/10/how-does-the-gradient-descent-algorithm-work-in-machine-learning/ (Accessed: March 8, 2024)

IBM, 'Gradient Descent', IBM Topics. Available at: https://www.ibm.com/topics/gradient-descent (Accessed: March 8, 2024)

Zaidi, H. (2022) ‘What is a model learning rate? Is a high learning rate always good?’, Data Science Dojo Forum. Available at: https://discuss.datasciencedojo.com/t/what-is-a-model-learning-rate-is-a-high-learning-rate-always-good/710 (Accessed: March 8, 2024)

