Abstract
This paper presents a comprehensive analysis of the stability of fast stochastic gradient descent algorithms and their modern modifications. It discusses the theoretical foundations of convergence, stability conditions, and practical aspects of applying these methods to optimizing neural network loss functions.
Keywords
stochastic gradient descent, algorithm stability, loss function optimization, convergence analysis, adaptive optimization methods, Adam optimizer, AdamW, momentum