A Bayesian framework for adsorption energy prediction on bimetallic alloy catalysts


Bibliographic Details
Main authors: Osman Mamun, Kirsten T. Winther, Jacob R. Boes, Thomas Bligaard
Format: article
Language: EN
Published: Nature Portfolio 2020
Online access: https://doaj.org/article/49cb2d1a410e45efa84ac22648868d1b
Description
Summary: For high-throughput screening of materials for heterogeneous catalysis, scaling relations provide an efficient scheme to estimate the chemisorption energies of hydrogenated species. However, conditioning on a single descriptor ignores the model uncertainty and leads to suboptimal prediction of the chemisorption energy. In this article, we extend the single-descriptor linear scaling relation to multi-descriptor linear regression models to leverage the correlation between the adsorption energies of any pair of adsorbates. With a large dataset, we use the Bayesian Information Criterion (BIC) as the model evidence to select the best linear regression model. Furthermore, Gaussian Process Regression (GPR) based on the meaningful convolution of physical properties of the metal-adsorbate complex can be used to predict the baseline residual of the selected model. This integrated Bayesian model selection and Gaussian process regression, dubbed residual learning, can achieve performance comparable to the standard DFT error (0.1 eV) for most adsorbate systems. For sparse and small datasets, we propose an ad hoc Bayesian Model Averaging (BMA) approach to make a robust prediction. With this Bayesian framework, we significantly reduce the model uncertainty and improve the prediction accuracy. The possibilities of the framework for high-throughput catalytic materials exploration in a realistic setting are illustrated using large and small sets of both dense and sparse simulated datasets generated from a public database of bimetallic alloys available in Catalysis-Hub.org.
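
Illustrative sketch (not the authors' code): the three ingredients named in the summary, BIC-based selection among multi-descriptor linear models, BIC-weighted model averaging, and a Gaussian process fitted to the residual of the selected model, can be mocked up with NumPy on synthetic data. Everything here is an assumption made for illustration: the function names, the synthetic descriptors `X` and feature `F`, the RBF kernel and its hyperparameters, and the common approximation of model weights as proportional to exp(-BIC/2). The article's actual descriptors, kernel, and averaging scheme may differ.

```python
import itertools
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with intercept; returns coefficients and BIC."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    n, k = A.shape
    rss = float(resid @ resid)
    # Gaussian-likelihood BIC up to a model-independent constant.
    bic = n * np.log(rss / n) + k * np.log(n)
    return coef, bic


def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]


def rank_models(X, y, max_size=2):
    """Fit every descriptor subset up to max_size and sort by ascending BIC."""
    models = []
    for size in range(1, max_size + 1):
        for cols in itertools.combinations(range(X.shape[1]), size):
            coef, bic = fit_ols(X[:, list(cols)], y)
            models.append((bic, cols, coef))
    return sorted(models, key=lambda m: m[0])


def bma_weights(bics):
    """Approximate posterior model weights, proportional to exp(-BIC / 2)."""
    b = np.asarray(bics, dtype=float)
    w = np.exp(-0.5 * (b - b.min()))  # shift by the minimum for stability
    return w / w.sum()


def gp_residual_mean(F_train, r_train, F_test, length=1.0, noise=1e-2):
    """Posterior mean of a zero-mean GP with an RBF kernel fitted to residuals."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length**2)

    K = rbf(F_train, F_train) + noise * np.eye(len(F_train))
    return rbf(F_test, F_train) @ np.linalg.solve(K, r_train)


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic stand-in data (hypothetical, not from Catalysis-Hub.org).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))              # descriptor adsorption energies
F = rng.uniform(-2.0, 2.0, size=(n, 1))  # one physical feature for the GP
y = (0.8 * X[:, 0] + 0.3 * X[:, 1]
     + 0.2 * np.sin(2.0 * F[:, 0]) + rng.normal(scale=0.02, size=n))
train, test = np.arange(150), np.arange(150, n)

# 1) Model selection by BIC on the training set.
ranked = rank_models(X[train], y[train])
best_bic, best_cols, best_coef = ranked[0]

# 2) Model averaging: weight each candidate's prediction by its BIC weight.
weights = bma_weights([m[0] for m in ranked])
bma_pred = sum(w * predict_ols(c, X[test][:, list(cols)])
               for w, (_, cols, c) in zip(weights, ranked))

# 3) Residual learning: GP on what the selected linear model leaves over.
base_train = predict_ols(best_coef, X[train][:, list(best_cols)])
base_test = predict_ols(best_coef, X[test][:, list(best_cols)])
correction = gp_residual_mean(F[train], y[train] - base_train, F[test], length=0.5)

rmse_base = rmse(base_test, y[test])
rmse_gp = rmse(base_test + correction, y[test])
print("selected descriptors:", best_cols)
print("RMSE linear only:", rmse_base, "with GP residual:", rmse_gp)
```

In this toy setup the linear model captures the descriptor-dependent part of the target and the GP picks up the remaining smooth dependence on `F`. On a small or sparse dataset the BIC gaps between candidates shrink, the weights spread across several models, and `bma_pred` becomes the more robust choice than any single selected model.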