Metadynamics sampling in atomic environment space for collecting training data for machine learning potentials

Abstract The universal mathematical form of machine-learning potentials (MLPs) shifts the core of development of interatomic potentials to collecting proper training data. Ideally, the training set should encompass diverse local atomic environments but conventional approaches are prone to sampling s...

Bibliographic Details
Main authors: Dongsun Yoo, Jisu Jung, Wonseok Jeong, Seungwu Han
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects: Materials of engineering and construction. Mechanics of materials; Computer software
Online access: https://doaj.org/article/4d6bf771dd664f17a0d20898685ce07d
id oai:doaj.org-article:4d6bf771dd664f17a0d20898685ce07d
record_format dspace
spelling oai:doaj.org-article:4d6bf771dd664f17a0d20898685ce07d
2021-12-02T15:10:54Z
10.1038/s41524-021-00595-5
2057-3960
2021-08-01T00:00:00Z
https://doi.org/10.1038/s41524-021-00595-5
https://doaj.org/toc/2057-3960
npj Computational Materials, Vol 7, Iss 1, Pp 1-9 (2021)
institution DOAJ
collection DOAJ
language EN
topic Materials of engineering and construction. Mechanics of materials
TA401-492
Computer software
QA76.75-76.765
description Abstract The universal mathematical form of machine-learning potentials (MLPs) shifts the core of interatomic-potential development to collecting proper training data. Ideally, the training set should encompass diverse local atomic environments, but conventional approaches are prone to sampling similar configurations repeatedly, mainly due to Boltzmann statistics. As a result, practitioners handpick a large pool of distinct configurations manually, which lengthens the development period significantly. To overcome this hurdle, methods that generate training data automatically are being proposed. Herein, we suggest a sampling method optimized for gathering diverse yet relevant configurations semi-automatically. This is achieved by applying metadynamics with the descriptor for the local atomic environment as the collective variable. As a result, the simulation is automatically steered toward unvisited regions of local environment space, such that each atom experiences diverse chemical environments without redundancy. We apply the proposed metadynamics sampling to H:Pt(111), GeTe, and Si systems. Across these examples, a small number of metadynamics trajectories can provide the reference structures necessary for training high-fidelity MLPs. By proposing a semi-automatic sampling method tuned for MLPs, the present work paves the way to wider use of MLPs in many challenging applications.
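The idea in the abstract, biasing the simulation away from local environments that have already been visited, can be illustrated with a minimal sketch. It assumes a standard Gaussian-hill form of the metadynamics bias evaluated directly on a descriptor vector; it is not the authors' implementation. The function names and the `height` and `width` values are hypothetical, and a real simulation would additionally need the derivative of the descriptor with respect to atomic positions to turn this bias into forces on atoms.

```python
import numpy as np


def bias_energy(g, centers, height=0.1, width=0.5):
    """Sum of Gaussian hills centered on previously visited descriptors.

    g       : (d,) descriptor of one atom's current local environment
    centers : sequence of (d,) descriptors where hills were deposited
    height, width : hypothetical hill parameters (not from the paper)
    """
    if len(centers) == 0:
        return 0.0
    centers = np.asarray(centers, dtype=float)
    sq_dist = np.sum((centers - g) ** 2, axis=1)
    return float(np.sum(height * np.exp(-sq_dist / (2.0 * width ** 2))))


def bias_gradient(g, centers, height=0.1, width=0.5):
    """Gradient of the bias energy with respect to the descriptor g."""
    if len(centers) == 0:
        return np.zeros(len(g))
    centers = np.asarray(centers, dtype=float)
    diff = g - centers  # shape (n, d)
    weights = height * np.exp(-np.sum(diff ** 2, axis=1) / (2.0 * width ** 2))
    # d/dg [h * exp(-|g - c|^2 / (2 w^2))] = -h * exp(...) * (g - c) / w^2
    return -(weights[:, None] * diff).sum(axis=0) / width ** 2


# Usage: one hill at the origin of a 2D descriptor space.
visited = [np.array([0.0, 0.0])]
current = np.array([0.2, -0.1])
push = -bias_gradient(current, visited)  # bias "force" in descriptor space
```

Because the negative gradient points away from every deposited hill, environments that have already been sampled become energetically less favorable, which is the mechanism that steers the trajectory toward unvisited regions of descriptor space.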
format article
author Dongsun Yoo
Jisu Jung
Wonseok Jeong
Seungwu Han
title Metadynamics sampling in atomic environment space for collecting training data for machine learning potentials
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/4d6bf771dd664f17a0d20898685ce07d