Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers
Abstract: Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently, and this has been experimentally investigated on D-Wave hardware. The D-Wave implementation requires selection of an effective inverse temperature, or hyperparameter ($\beta$), within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. We find that both methods improve RBM training, based on experimental validation on D-Wave hardware for an image recognition problem. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which illustrates a more than order-of-magnitude lower image reconstruction error using maximum likelihood than manually optimizing the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction.
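The maximum-likelihood idea for estimating the effective inverse temperature can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it uses a small toy model (random couplings, all states enumerated so the partition function is exact) in place of an RBM on D-Wave hardware, draws "hardware" samples at an unknown true $\beta$, and then recovers $\beta$ by maximizing the Boltzmann log-likelihood over a grid. All names and the toy model are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy energy model on n binary units with random couplings (stand-in for an RBM).
n = 8
J = np.triu(rng.normal(scale=0.5, size=(n, n)), 1)
h = rng.normal(scale=0.2, size=n)

def energy(s):
    return -(s @ J @ s + h @ s)

# Enumerate all 2^n states so the partition function Z(beta) is exact.
states = np.array([[(i >> k) & 1 for k in range(n)] for i in range(2 ** n)], dtype=float)
E = np.array([energy(s) for s in states])

def log_likelihood(beta, sample_E):
    # Mean Boltzmann log-likelihood of observed sample energies at inverse temperature beta.
    logZ = np.log(np.sum(np.exp(-beta * E)))
    return np.mean(-beta * sample_E) - logZ

# Pretend the samples came from hardware at an unknown effective beta_true.
beta_true = 1.7
p = np.exp(-beta_true * E)
p /= p.sum()
sample_E = E[rng.choice(len(E), size=5000, p=p)]

# Estimate beta by a 1-D grid search over the log-likelihood.
betas = np.linspace(0.1, 4.0, 400)
beta_hat = betas[np.argmax([log_likelihood(b, sample_E) for b in betas])]
print(f"estimated beta ~ {beta_hat:.2f}")
```

In this exactly solvable setting, maximizing the likelihood in $\beta$ amounts to matching the model's mean energy to the sample mean energy, so the estimate lands close to the generating value; on hardware, where $Z$ cannot be enumerated, it must be approximated instead.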
Main authors: Guanglei Xu, William S. Oates
Format: article
Language: EN
Published: Nature Portfolio, 2021
Subjects: Medicine (R); Science (Q)
Online access: https://doaj.org/article/ab490becf52a484583c18e292dabe3d1
id |
oai:doaj.org-article:ab490becf52a484583c18e292dabe3d1 |
record_format |
dspace |
spelling |
oai:doaj.org-article:ab490becf52a484583c18e292dabe3d1 | 2021-12-02T14:06:55Z | Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers | 10.1038/s41598-021-82197-1 | 2045-2322 | https://doaj.org/article/ab490becf52a484583c18e292dabe3d1 | 2021-02-01T00:00:00Z | https://doi.org/10.1038/s41598-021-82197-1 | https://doaj.org/toc/2045-2322 | Guanglei Xu; William S. Oates | Nature Portfolio | article | Medicine (R); Science (Q) | EN | Scientific Reports, Vol 11, Iss 1, Pp 1-10 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine (R); Science (Q) |
description |
Abstract: Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently, and this has been experimentally investigated on D-Wave hardware. The D-Wave implementation requires selection of an effective inverse temperature, or hyperparameter ($\beta$), within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. We find that both methods improve RBM training, based on experimental validation on D-Wave hardware for an image recognition problem. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which illustrates a more than order-of-magnitude lower image reconstruction error using maximum likelihood than manually optimizing the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction. |
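The "sampling over a large probability space" step in the abstract refers to block Gibbs sampling inside gradient-based RBM training, where the effective inverse temperature scales the conditional activation probabilities. Below is a minimal, self-contained sketch of one Gibbs sweep; the weights, layer sizes, and variable names are illustrative stand-ins, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative RBM: 6 visible and 4 hidden binary units; weights are random stand-ins.
n_vis, n_hid = 6, 4
W = rng.normal(scale=0.1, size=(n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)
beta = 1.0  # effective inverse temperature scaling the Boltzmann energies

def sample_hidden(v):
    # p(h_j = 1 | v) = sigmoid(beta * (W^T v + b_hid)_j) under the Boltzmann model
    return (rng.random(n_hid) < sigmoid(beta * (v @ W + b_hid))).astype(float)

def sample_visible(h):
    # p(v_i = 1 | h) = sigmoid(beta * (W h + b_vis)_i)
    return (rng.random(n_vis) < sigmoid(beta * (W @ h + b_vis))).astype(float)

# One block-Gibbs sweep: visible -> hidden -> reconstructed visible.
v0 = rng.integers(0, 2, n_vis).astype(float)
h0 = sample_hidden(v0)
v1 = sample_visible(h0)
print(v1)
```

In training, many such sweeps (or, as proposed here, annealer samples) approximate the model-expectation term of the likelihood gradient; the quality of that approximation is what makes the choice of $\beta$ matter.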
format |
article |
author |
Guanglei Xu; William S. Oates |
author_sort |
Guanglei Xu |
title |
Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers |
publisher |
Nature Portfolio |
publishDate |
2021 |
url |
https://doaj.org/article/ab490becf52a484583c18e292dabe3d1 |
work_keys_str_mv |
AT guangleixu adaptivehyperparameterupdatingfortrainingrestrictedboltzmannmachinesonquantumannealers AT williamsoates adaptivehyperparameterupdatingfortrainingrestrictedboltzmannmachinesonquantumannealers |
_version_ |
1718391982509785088 |