Gradient descent is an optimization algorithm for finding local minima of differentiable functions. (1) While it does not guarantee a global minimum, gradient descent is especially useful for functions that are difficult to minimize analytically, that is, by setting the derivative to zero and solving for a closed-form solution.
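As a minimal sketch of the idea, the code below repeatedly steps against the derivative of a simple convex function. The function, step size, and iteration count are illustrative choices, not anything prescribed by the source.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Follow the negative gradient of a 1-D function from x0.

    grad: derivative of the function being minimized (illustrative choice)
    lr:   learning rate (step size)
    """
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)  # move downhill along the slope
    return x

# Example: f(x) = (x - 3)^2 has derivative f'(x) = 2(x - 3),
# so the iterates should approach the minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Each step shrinks the distance to the minimum by a constant factor here, which is why a fixed learning rate suffices for this toy example; for harder functions the step size typically needs tuning.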

As alluded to above, in the context of neural networks, stochastic gradient descent is used to make informed adjustments to your network's parameters with the goal of minimizing the cost function, thus bringing your network's actual outputs iteratively closer to the expected outputs during the course of training. This iterative minimization employs calculus, namely differentiation. After a training step, the network weights receive updates according to the gradient of the cost function with respect to the network's current weights, so that the next training step's results may be a little closer to correct (as measured by a smaller cost function). Backpropagation (backward propagation of errors) is the method used to compute these gradients and distribute the updates through the network.
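The update rule described above can be sketched on a deliberately tiny model. The snippet below fits a one-parameter linear model with squared-error cost using stochastic gradient descent; the "stochastic" part is visiting training examples one at a time in shuffled order. The dataset, learning rate, and epoch count are all illustrative assumptions.

```python
import random

random.seed(0)
# Toy training set generated by y = 2x, so the weight should learn 2.0.
data = [(x, 2.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]

w = 0.0    # initial weight
lr = 0.05  # learning rate

for epoch in range(200):
    random.shuffle(data)  # stochastic: random example order each epoch
    for x, y_true in data:
        y_pred = w * x
        # Cost C = (y_pred - y_true)^2, so dC/dw = 2 * (y_pred - y_true) * x
        grad = 2.0 * (y_pred - y_true) * x
        w -= lr * grad    # step against the gradient of the cost
```

In a real network the single weight `w` becomes millions of parameters, and backpropagation is what computes each parameter's partial derivative of the cost efficiently; the update itself is the same subtraction shown here.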

**Sources**

Mayo, Matthew. “Deep Learning Key Terms, Explained.” KDnuggets, Oct. 2016, http://www.kdnuggets.com/2016/10/deep-learning-key-terms-explained.html (1)