AdaCB: An Adaptive Gradient Method with Convergence Range Bound of Learning Rate

Xuanzhi Liao

Shahnorbanun Sahran

Azizi Abdullah and Syaimak Abdul Shukor

Resumen

Adaptive gradient descent methods such as Adam, RMSprop, and AdaGrad achieve great success in training deep learning models. These methods adaptively change the learning rates, resulting in a faster convergence speed. Recent studies have shown their problems include extreme learning rates, non-convergence issues, as well as poor generalization. Some enhanced variants have been proposed, such as AMSGrad, and AdaBound. However, the performances of these alternatives are controversial and some drawbacks still occur. In this work, we proposed an optimizer called AdaCB, which limits the learning rates of Adam in a convergence range bound. The bound range is determined by the LR test, and then two bound functions are designed to constrain Adam, and two bound functions tend to a constant value. To evaluate our method, we carry out experiments on the image classification task, three models including Smallnet, Network IN Network, and Resnet are trained on CIFAR10 and CIFAR100 datasets. Experimental results show that our method outperforms other optimizers on CIFAR10 and CIFAR100 datasets with accuracies of (82.76%, 53.29%), (86.24%, 60.19%), and (83.24%, 55.04%) on Smallnet, Network IN Network and Resnet, respectively. The results also indicate that our method maintains a faster learning speed, like adaptive gradient methods, in the early stage and achieves considerable accuracy, like SGD (M), at the end.

Palabras claves

deep learning - adaptive gradient descent methods - stochastic gradient descent - convergence-range bound - image classification

Acceso

PÁGINAS

pp. 0 - 0

NÚMERO

Volumen: 12 Parte: 18 (2022)

MATERIAS

INGENIERÍA Y CONSTRUCCIÓN CIVIL
TECNOLOGÍA

REVISTAS SIMILARES

DOI

https://doi.org/10.3390/app12189389

AdaCB: An Adaptive Gradient Method with Convergence Range Bound of Learning Rate

Artículos similares

Revistas destacadas