AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization

In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic optimization. AdaCN dynamically captures the curvature of the loss landscape using a diagonally approximated Hessian plus the norm of the difference between the previous two estimates. It only requires at most first o...

Bibliographic Details
Main Authors: Yan Liu, Maojun Zhang, Zhiwei Zhong, Xiangrong Zeng
Format: article
Language: EN
Published: Hindawi Limited 2021
Subjects:
Online Access: https://doaj.org/article/09154e3ff5c64b9a8f24e212323ae4d8
id oai:doaj.org-article:09154e3ff5c64b9a8f24e212323ae4d8
record_format dspace
spelling oai:doaj.org-article:09154e3ff5c64b9a8f24e212323ae4d8
2021-11-22T01:09:56Z
1687-5273
10.1155/2021/5790608
https://doaj.org/article/09154e3ff5c64b9a8f24e212323ae4d8
2021-01-01T00:00:00Z
http://dx.doi.org/10.1155/2021/5790608
https://doaj.org/toc/1687-5273
Computational Intelligence and Neuroscience, Vol 2021 (2021)
institution DOAJ
collection DOAJ
language EN
topic Computer applications to medicine. Medical informatics
R858-859.7
Neurosciences. Biological psychiatry. Neuropsychiatry
RC321-571
spellingShingle Computer applications to medicine. Medical informatics
R858-859.7
Neurosciences. Biological psychiatry. Neuropsychiatry
RC321-571
Yan Liu
Maojun Zhang
Zhiwei Zhong
Xiangrong Zeng
AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization
description In this work, we introduce AdaCN, a novel adaptive cubic Newton method for nonconvex stochastic optimization. AdaCN dynamically captures the curvature of the loss landscape using a diagonally approximated Hessian plus the norm of the difference between the previous two estimates. It requires at most first-order gradients and updates with linear complexity in both time and memory. To reduce the variance introduced by the stochastic nature of the problem, AdaCN uses the first and second moments to implement an exponential moving average on iteratively updated stochastic gradients and approximated stochastic Hessians, respectively. We validate AdaCN in extensive experiments, showing that it outperforms other stochastic first-order methods (including SGD, Adam, and AdaBound) and a stochastic quasi-Newton method (i.e., Apollo) in terms of both convergence speed and generalization performance.
format article
author Yan Liu
Maojun Zhang
Zhiwei Zhong
Xiangrong Zeng
author_facet Yan Liu
Maojun Zhang
Zhiwei Zhong
Xiangrong Zeng
author_sort Yan Liu
title AdaCN: An Adaptive Cubic Newton Method for Nonconvex Stochastic Optimization
title_sort adacn: an adaptive cubic newton method for nonconvex stochastic optimization
publisher Hindawi Limited
publishDate 2021
url https://doaj.org/article/09154e3ff5c64b9a8f24e212323ae4d8
work_keys_str_mv AT yanliu adacnanadaptivecubicnewtonmethodfornonconvexstochasticoptimization
AT maojunzhang adacnanadaptivecubicnewtonmethodfornonconvexstochasticoptimization
AT zhiweizhong adacnanadaptivecubicnewtonmethodfornonconvexstochasticoptimization
AT xiangrongzeng adacnanadaptivecubicnewtonmethodfornonconvexstochasticoptimization
_version_ 1718418381344866304
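
The abstract in this record says that AdaCN reduces variance by applying an exponential moving average to stochastic gradients and to approximated Hessians. The record does not give the update rule itself, so the following is only a minimal sketch of the averaging idea, not the authors' algorithm. The function names, the `beta` value, the sample data, and the Adam-style bias correction are all assumptions made for illustration.

```python
# Illustrative only: an elementwise exponential moving average (EMA) with
# bias correction. This is NOT the AdaCN update rule; names and constants
# here are hypothetical.

def ema_step(avg, value, beta):
    """One EMA update, applied elementwise to two equal-length lists."""
    return [beta * a + (1.0 - beta) * v for a, v in zip(avg, value)]


def bias_corrected(avg, beta, t):
    """Compensate for the zero initialisation after t updates (t >= 1)."""
    scale = 1.0 - beta ** t
    return [a / scale for a in avg]


def smooth(samples, beta=0.9):
    """Return the bias-corrected EMA after each vector in `samples`."""
    avg = [0.0] * len(samples[0])
    history = []
    for t, sample in enumerate(samples, start=1):
        avg = ema_step(avg, sample, beta)
        history.append(bias_corrected(avg, beta, t))
    return history


# Made-up noisy gradient samples scattered around [1.0, -1.0].
noisy_grads = [[1.2, -0.8], [0.8, -1.2], [1.1, -0.9], [0.9, -1.1]]
smoothed = smooth(noisy_grads)
```

Each update touches every coordinate once and stores a single running vector, which is consistent with the linear time and memory cost that the abstract claims for the method.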