AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization
In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic optimization. AdaCN dynamically captures the curvature of the loss landscape using a diagonally approximated Hessian plus the norm of the difference between the previous two estimates. It only requires at most first o...
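Main Authors: | Yan Liu, Maojun Zhang, Zhiwei Zhong, Xiangrong Zeng |
---|---|
Format: | article |
Language: | EN |
Published: | Hindawi Limited 2021 |
Subjects: | Computer applications to medicine. Medical informatics; Neurosciences. Biological psychiatry. Neuropsychiatry |
Online Access: | https://doaj.org/article/09154e3ff5c64b9a8f24e212323ae4d8 |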
id | oai:doaj.org-article:09154e3ff5c64b9a8f24e212323ae4d8 |
---|---|
record_format | dspace |
spelling | oai:doaj.org-article:09154e3ff5c64b9a8f24e212323ae4d8 2021-11-22T01:09:56Z AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization 1687-5273 10.1155/2021/5790608 https://doaj.org/article/09154e3ff5c64b9a8f24e212323ae4d8 2021-01-01T00:00:00Z http://dx.doi.org/10.1155/2021/5790608 https://doaj.org/toc/1687-5273 In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic optimization. AdaCN dynamically captures the curvature of the loss landscape using a diagonally approximated Hessian plus the norm of the difference between the previous two estimates. It requires at most first-order gradients and updates with linear complexity in both time and memory. To reduce the variance introduced by the stochastic nature of the problem, AdaCN uses first and second moments, implemented as exponential moving averages of the iteratively updated stochastic gradients and approximated stochastic Hessians, respectively. We validate AdaCN in extensive experiments, showing that it outperforms other stochastic first-order methods (including SGD, Adam, and AdaBound) and a stochastic quasi-Newton method (Apollo) in terms of both convergence speed and generalization performance. Yan Liu Maojun Zhang Zhiwei Zhong Xiangrong Zeng Hindawi Limited article Computer applications to medicine. Medical informatics R858-859.7 Neurosciences. Biological psychiatry. Neuropsychiatry RC321-571 EN Computational Intelligence and Neuroscience, Vol 2021 (2021) |
institution | DOAJ |
collection | DOAJ |
language | EN |
topic | Computer applications to medicine. Medical informatics R858-859.7 Neurosciences. Biological psychiatry. Neuropsychiatry RC321-571 |
spellingShingle | Computer applications to medicine. Medical informatics R858-859.7 Neurosciences. Biological psychiatry. Neuropsychiatry RC321-571 Yan Liu Maojun Zhang Zhiwei Zhong Xiangrong Zeng AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization |
description | In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic optimization. AdaCN dynamically captures the curvature of the loss landscape using a diagonally approximated Hessian plus the norm of the difference between the previous two estimates. It requires at most first-order gradients and updates with linear complexity in both time and memory. To reduce the variance introduced by the stochastic nature of the problem, AdaCN uses first and second moments, implemented as exponential moving averages of the iteratively updated stochastic gradients and approximated stochastic Hessians, respectively. We validate AdaCN in extensive experiments, showing that it outperforms other stochastic first-order methods (including SGD, Adam, and AdaBound) and a stochastic quasi-Newton method (Apollo) in terms of both convergence speed and generalization performance. |
format | article |
author | Yan Liu Maojun Zhang Zhiwei Zhong Xiangrong Zeng |
author_facet | Yan Liu Maojun Zhang Zhiwei Zhong Xiangrong Zeng |
author_sort | Yan Liu |
title | AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization |
title_short | AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization |
title_full | AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization |
title_fullStr | AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization |
title_full_unstemmed | AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization |
title_sort | adacn: an adaptive cubic newton method for nonconvex stochastic optimization |
publisher | Hindawi Limited |
publishDate | 2021 |
url | https://doaj.org/article/09154e3ff5c64b9a8f24e212323ae4d8 |
work_keys_str_mv | AT yanliu adacnanadaptivecubicnewtonmethodfornonconvexstochasticoptimization AT maojunzhang adacnanadaptivecubicnewtonmethodfornonconvexstochasticoptimization AT zhiweizhong adacnanadaptivecubicnewtonmethodfornonconvexstochasticoptimization AT xiangrongzeng adacnanadaptivecubicnewtonmethodfornonconvexstochasticoptimization |
_version_ | 1718418381344866304 |
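
The description field in this record mentions exponential moving averages over stochastic gradients and over diagonal Hessian approximations as a variance-reduction device. The record does not give the update rules, so the following is only a generic sketch of that one ingredient, not the AdaCN algorithm itself. The function names, the decay rates `beta1` and `beta2`, and the Adam-style bias correction are all assumptions made for illustration.

```python
# Generic sketch (NOT the AdaCN update rule): exponential moving averages
# of per-coordinate gradient and diagonal-curvature estimates.
# All names and default values here are hypothetical.

def ema_update(avg, value, beta):
    """One EMA step per coordinate: avg <- beta * avg + (1 - beta) * value."""
    return [beta * a + (1.0 - beta) * v for a, v in zip(avg, value)]


def bias_corrected(avg, beta, step):
    """Undo the bias toward the zero initialisation (Adam-style correction)."""
    scale = 1.0 - beta ** step
    return [a / scale for a in avg]


def smoothed_moments(grads, curvatures, beta1=0.9, beta2=0.999):
    """Smooth a sequence of gradient and diagonal-curvature estimates.

    grads and curvatures are equal-length, non-empty lists of
    per-coordinate estimates, one entry per iteration.
    """
    if not grads or len(grads) != len(curvatures):
        raise ValueError("need equal-length, non-empty sequences")
    m = [0.0] * len(grads[0])       # smoothed gradient
    b = [0.0] * len(curvatures[0])  # smoothed diagonal curvature
    step = 0
    for g, h in zip(grads, curvatures):
        step += 1
        m = ema_update(m, g, beta1)
        b = ema_update(b, h, beta2)
    return bias_corrected(m, beta1, step), bias_corrected(b, beta2, step)
```

Working per coordinate keeps both time and memory linear in the number of parameters, which matches the linear complexity the description claims for the method. How AdaCN combines the two smoothed quantities into a step is not stated in this record.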