Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers
Abstract: Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently, and this has been experimentally investigated on D-Wave hardware. The D-Wave implementation requires selection of an effective inverse temperature, or hyperparameter ($\beta$), within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. We find that both methods improve RBM training, based on experimental validation on D-Wave hardware for an image recognition problem. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which illustrates a more than order-of-magnitude lower image reconstruction error using maximum likelihood than manually optimizing the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction.
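The maximum-likelihood idea for estimating the effective inverse temperature can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it uses a small toy model (random couplings, all states enumerated so the partition function is exact) in place of an RBM on D-Wave hardware, draws "hardware" samples at an unknown true $\beta$, and then recovers $\beta$ by maximizing the Boltzmann log-likelihood over a grid. All names and the toy model are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy energy model on n binary units with random couplings (stand-in for an RBM).
n = 8
J = np.triu(rng.normal(scale=0.5, size=(n, n)), 1)
h = rng.normal(scale=0.2, size=n)

def energy(s):
    return -(s @ J @ s + h @ s)

# Enumerate all 2^n states so the partition function Z(beta) is exact.
states = np.array([[(i >> k) & 1 for k in range(n)] for i in range(2 ** n)], dtype=float)
E = np.array([energy(s) for s in states])

def log_likelihood(beta, sample_E):
    # Mean Boltzmann log-likelihood of observed sample energies at inverse temperature beta.
    logZ = np.log(np.sum(np.exp(-beta * E)))
    return np.mean(-beta * sample_E) - logZ

# Pretend the samples came from hardware at an unknown effective beta_true.
beta_true = 1.7
p = np.exp(-beta_true * E)
p /= p.sum()
sample_E = E[rng.choice(len(E), size=5000, p=p)]

# Estimate beta by a 1-D grid search over the log-likelihood.
betas = np.linspace(0.1, 4.0, 400)
beta_hat = betas[np.argmax([log_likelihood(b, sample_E) for b in betas])]
print(f"estimated beta ~ {beta_hat:.2f}")
```

In this exactly solvable setting, maximizing the likelihood in $\beta$ amounts to matching the model's mean energy to the sample mean energy, so the estimate lands close to the generating value; on hardware, where $Z$ cannot be enumerated, it must be approximated instead.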
Main authors: Guanglei Xu, William S. Oates
Format: article
Language: EN
Published: Nature Portfolio, 2021
Subjects: Medicine (R); Science (Q)
Online access: https://doaj.org/article/ab490becf52a484583c18e292dabe3d1
id |
oai:doaj.org-article:ab490becf52a484583c18e292dabe3d1 |
record_format |
dspace |
spelling |
oai:doaj.org-article:ab490becf52a484583c18e292dabe3d1 | 2021-12-02T14:06:55Z | Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers | 10.1038/s41598-021-82197-1 | 2045-2322 | https://doaj.org/article/ab490becf52a484583c18e292dabe3d1 | 2021-02-01T00:00:00Z | https://doi.org/10.1038/s41598-021-82197-1 | https://doaj.org/toc/2045-2322 | Guanglei Xu; William S. Oates | Nature Portfolio | article | Medicine (R); Science (Q) | EN | Scientific Reports, Vol 11, Iss 1, Pp 1-10 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine (R); Science (Q) |
description |
Abstract: Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently, and this has been experimentally investigated on D-Wave hardware. The D-Wave implementation requires selection of an effective inverse temperature, or hyperparameter ($\beta$), within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. We find that both methods improve RBM training, based on experimental validation on D-Wave hardware for an image recognition problem. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which illustrates a more than order-of-magnitude lower image reconstruction error using maximum likelihood than manually optimizing the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction. |
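The "sampling over a large probability space" step in the abstract refers to block Gibbs sampling inside gradient-based RBM training, where the effective inverse temperature scales the conditional activation probabilities. Below is a minimal, self-contained sketch of one Gibbs sweep; the weights, layer sizes, and variable names are illustrative stand-ins, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative RBM: 6 visible and 4 hidden binary units; weights are random stand-ins.
n_vis, n_hid = 6, 4
W = rng.normal(scale=0.1, size=(n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)
beta = 1.0  # effective inverse temperature scaling the Boltzmann energies

def sample_hidden(v):
    # p(h_j = 1 | v) = sigmoid(beta * (W^T v + b_hid)_j) under the Boltzmann model
    return (rng.random(n_hid) < sigmoid(beta * (v @ W + b_hid))).astype(float)

def sample_visible(h):
    # p(v_i = 1 | h) = sigmoid(beta * (W h + b_vis)_i)
    return (rng.random(n_vis) < sigmoid(beta * (W @ h + b_vis))).astype(float)

# One block-Gibbs sweep: visible -> hidden -> reconstructed visible.
v0 = rng.integers(0, 2, n_vis).astype(float)
h0 = sample_hidden(v0)
v1 = sample_visible(h0)
print(v1)
```

In training, many such sweeps (or, as proposed here, annealer samples) approximate the model-expectation term of the likelihood gradient; the quality of that approximation is what makes the choice of $\beta$ matter.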
format |
article |
author |
Guanglei Xu; William S. Oates |
author_sort |
Guanglei Xu |
title |
Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers |
publisher |
Nature Portfolio |
publishDate |
2021 |
url |
https://doaj.org/article/ab490becf52a484583c18e292dabe3d1 |
work_keys_str_mv |
AT guangleixu adaptivehyperparameterupdatingfortrainingrestrictedboltzmannmachinesonquantumannealers AT williamsoates adaptivehyperparameterupdatingfortrainingrestrictedboltzmannmachinesonquantumannealers |
_version_ |
1718391982509785088 |