No longer confidential: estimating the confidence of individual regression predictions.

Quantitative predictions in computational life sciences are often based on regression models. The advent of machine learning has led to highly accurate regression models that have gained widespread acceptance. While there are statistical methods available to estimate the global performance of regres...

Bibliographic Details
Main authors: Sebastian Briesemeister, Jörg Rahnenführer, Oliver Kohlbacher
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects:
Medicine (R)
Science (Q)
Online access: https://doaj.org/article/c79f21a6301d43409556a7e2ea034705
id oai:doaj.org-article:c79f21a6301d43409556a7e2ea034705
record_format dspace
spelling oai:doaj.org-article:c79f21a6301d43409556a7e2ea034705
2021-11-18T08:08:36Z
No longer confidential: estimating the confidence of individual regression predictions.
1932-6203
10.1371/journal.pone.0048723
https://doaj.org/article/c79f21a6301d43409556a7e2ea034705
2012-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23166592/?tool=EBI
https://doaj.org/toc/1932-6203
Sebastian Briesemeister
Jörg Rahnenführer
Oliver Kohlbacher
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 7, Iss 11, p e48723 (2012)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Quantitative predictions in computational life sciences are often based on regression models. The advent of machine learning has led to highly accurate regression models that have gained widespread acceptance. While there are statistical methods available to estimate the global performance of regression models on a test or training dataset, it is often not clear how well this performance transfers to other datasets or how reliable an individual prediction is, a fact that often reduces a user's trust in a computational method. In analogy to the concept of an experimental error, we sketch how estimators for individual prediction errors can be used to provide confidence intervals for individual predictions. Two novel statistical methods, named CONFINE and CONFIVE, can estimate the reliability of an individual prediction based on the local properties of nearby training data. The methods can be applied equally to linear and non-linear regression methods with very little computational overhead. We compare our confidence estimators with other existing confidence and applicability domain estimators on two biologically relevant problems (MHC-peptide binding prediction and quantitative structure-activity relationship (QSAR)). Our results suggest that the proposed confidence estimators perform comparably to or better than previously proposed estimation methods. Given a sufficient amount of training data, the estimators exhibit error estimates of high quality. In addition, we observed that the quality of estimated confidence intervals is predictable. We discuss how confidence estimation is influenced by noise, the number of features, and the dataset size. Estimating the confidence in individual predictions in terms of error intervals represents an important step from plain, non-informative predictions towards transparent and interpretable predictions that will help to improve the acceptance of computational methods in the biological community.
format article
author Sebastian Briesemeister
Jörg Rahnenführer
Oliver Kohlbacher
title No longer confidential: estimating the confidence of individual regression predictions.
publisher Public Library of Science (PLoS)
publishDate 2012
url https://doaj.org/article/c79f21a6301d43409556a7e2ea034705