Modeling the Influence of Data Structure on Learning in Neural Networks: The Hidden Manifold Model

Understanding the reasons for the success of deep neural networks trained using stochastic gradient-based methods is a key open problem for the nascent theory of deep learning. The types of data where these networks are most successful, such as images or sequences of speech, are characterized by intricate correlations. Yet, most theoretical work on neural networks does not explicitly model training data or assumes that elements of each data sample are drawn independently from some factorized probability distribution. These approaches are, thus, by construction blind to the correlation structure of real-world datasets and their impact on learning in neural networks. Here, we introduce a generative model for structured datasets that we call the hidden manifold model. The idea is to construct high-dimensional inputs that lie on a lower-dimensional manifold, with labels that depend only on their position within this manifold, akin to a single-layer decoder or generator in a generative adversarial network. We demonstrate that learning of the hidden manifold model is amenable to an analytical treatment by proving a “Gaussian equivalence property” (GEP), and we use the GEP to show how the dynamics of two-layer neural networks trained using one-pass stochastic gradient descent is captured by a set of integro-differential equations that track the performance of the network at all times. This approach permits us to analyze in detail how a neural network learns functions of increasing complexity during training, how its performance depends on its size, and how it is impacted by parameters such as the learning rate or the dimension of the hidden manifold.
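To make the construction concrete, here is a minimal NumPy sketch of the data model the abstract describes, together with one-pass SGD on a small two-layer network. The dimensions, the tanh nonlinearities, and the linear "teacher" vector acting on the latent coordinates are illustrative assumptions for this sketch, not the paper's exact experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions for this sketch, not the paper's settings):
N, D, K = 500, 10, 3     # input dim, hidden-manifold dim (D << N), student width
steps, lr = 20_000, 0.1  # one-pass SGD: every step uses a fresh sample, once

# Fixed feature matrix F maps D latent coordinates to N-dimensional inputs,
# playing the role of a single-layer decoder/generator.
F = rng.standard_normal((D, N))
theta = rng.standard_normal(D)  # hypothetical teacher acting on the latent c

def sample():
    """One hidden-manifold sample: the input x lies on a D-dimensional
    manifold embedded in R^N; the label y depends only on the latent c."""
    c = rng.standard_normal(D)
    x = np.tanh(c @ F / np.sqrt(D))      # high-dimensional input on the manifold
    y = np.tanh(c @ theta / np.sqrt(D))  # label is a function of c alone
    return x, y

# Two-layer "student" network: f(x) = v . tanh(W x / sqrt(N))
W = rng.standard_normal((K, N))
v = rng.standard_normal(K) / np.sqrt(K)

def student(x):
    h = np.tanh(W @ x / np.sqrt(N))
    return v @ h, h

# One-pass (online) SGD on the squared loss; since no sample is reused,
# the training error tracks the generalization error.
for _ in range(steps):
    x, y = sample()
    pred, h = student(x)
    err = pred - y
    grad_v = err * h                                           # d(0.5*err^2)/dv
    grad_W = err * np.outer(v * (1 - h**2), x) / np.sqrt(N)    # d(0.5*err^2)/dW
    v -= lr * grad_v
    W -= lr * grad_W

mse = np.mean([(student(x)[0] - y) ** 2
               for x, y in (sample() for _ in range(1_000))])
print(f"test MSE after one-pass SGD: {mse:.4f}")
```

In the paper's analysis, the Gaussian equivalence property lets the high-dimensional dynamics of such a student be summarized by a closed set of equations for a few order parameters, valid at all times; a brute-force simulation like the one above is what those equations would be compared against.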

Bibliographic Details
Main Authors: Sebastian Goldt, Marc Mézard, Florent Krzakala, Lenka Zdeborová
Format: Article
Language: English
Published in: Physical Review X, Vol 10, Iss 4, p 041044 (2020)
Publisher: American Physical Society
DOI: 10.1103/PhysRevX.10.041044
ISSN: 2160-3308
Subjects: Physics (QC1-999)
Online Access: https://doaj.org/article/1a23fabc856f4ecb8c6ed721b923a393