A Model-Agnostic Algorithm for Bayes Error Determination in Binary Classification

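This paper presents the intrinsic limit determination algorithm (ILD Algorithm), a novel technique to determine the best possible performance, measured in terms of the AUC (area under the ROC curve) and accuracy, that can be obtained from a specific dataset in a binary classification problem with categorical features regardless of the model used. This limit, namely, the Bayes error, is completely independent of any model used and describes an intrinsic property of the dataset. The ILD algorithm thus provides important information regarding the prediction limits of any binary classification algorithm when applied to the considered dataset. In this paper, the algorithm is described in detail, its entire mathematical framework is presented and the pseudocode is given to facilitate its implementation. Finally, an example with a real dataset is given.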
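
To make the idea of a model-independent limit concrete, the sketch below computes the best accuracy any classifier could reach on a dataset with categorical features. It is not the ILD algorithm from the paper (whose pseudocode is given in the article itself) and it does not cover the AUC; it only illustrates the standard observation that, when several records share the same feature combination but carry different labels, no model can do better than predicting the majority label for that combination. The function name `empirical_best_accuracy` and the toy data are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def empirical_best_accuracy(rows, labels):
    """Best accuracy any classifier can reach on this dataset.

    rows   -- iterable of categorical feature tuples (hypothetical toy input)
    labels -- iterable of binary class labels, same length as rows
    """
    rows = list(rows)
    labels = list(labels)
    if not labels or len(rows) != len(labels):
        raise ValueError("rows and labels must be non-empty and of equal length")

    # Count how often each label occurs for every distinct feature combination.
    counts = defaultdict(Counter)
    for features, label in zip(rows, labels):
        counts[tuple(features)][label] += 1

    # Within one combination, predicting the majority label is optimal,
    # so the minority records are unavoidable errors.
    correct = sum(max(c.values()) for c in counts.values())
    return correct / len(labels)


# Toy data: two categorical features, binary label (illustrative only).
toy_rows = [("a", "x"), ("a", "x"), ("a", "x"), ("b", "y"), ("b", "y"), ("b", "x")]
toy_labels = [1, 1, 0, 0, 1, 0]

best_acc = empirical_best_accuracy(toy_rows, toy_labels)
print(f"best achievable accuracy: {best_acc:.3f}")
print(f"empirical error floor:    {1 - best_acc:.3f}")
```

On the toy data, the combination ("a", "x") appears with labels 1, 1, 0 and ("b", "y") with labels 0, 1, so two of the six records are misclassified by every possible model. The resulting floor is a property of the data alone, which is the sense in which the abstract calls the limit intrinsic.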

Bibliographic Details
Main authors: Umberto Michelucci, Michela Sperti, Dario Piga, Francesca Venturini, Marco A. Deriu
Format: article
Language: EN
Published: MDPI AG 2021
Subjects: machine learning; intrinsic limits; ROC curve; binary classification; area under the curve; Naïve Bayes classifier
Online access: https://doaj.org/article/add5c84becaa4f53811e8bb6d2edb647
id oai:doaj.org-article:add5c84becaa4f53811e8bb6d2edb647
record_format dspace
spelling oai:doaj.org-article:add5c84becaa4f53811e8bb6d2edb647
2021-11-25T16:12:49Z
A Model-Agnostic Algorithm for Bayes Error Determination in Binary Classification
10.3390/a14110301
1999-4893
https://doaj.org/article/add5c84becaa4f53811e8bb6d2edb647
2021-10-01T00:00:00Z
https://www.mdpi.com/1999-4893/14/11/301
https://doaj.org/toc/1999-4893
MDPI AG
article
EN
Algorithms, Vol 14, Iss 301, p 301 (2021)
institution DOAJ
collection DOAJ
language EN
topic machine learning
intrinsic limits
ROC curve
binary classification
area under the curve
Naïve Bayes classifier
Industrial engineering. Management engineering
T55.4-60.8
Electronic computers. Computer science
QA75.5-76.95
description This paper presents the intrinsic limit determination algorithm (ILD Algorithm), a novel technique to determine the best possible performance, measured in terms of the AUC (area under the ROC curve) and accuracy, that can be obtained from a specific dataset in a binary classification problem with categorical features regardless of the model used. This limit, namely, the Bayes error, is completely independent of any model used and describes an intrinsic property of the dataset. The ILD algorithm thus provides important information regarding the prediction limits of any binary classification algorithm when applied to the considered dataset. In this paper, the algorithm is described in detail, its entire mathematical framework is presented and the pseudocode is given to facilitate its implementation. Finally, an example with a real dataset is given.
format article
author Umberto Michelucci
Michela Sperti
Dario Piga
Francesca Venturini
Marco A. Deriu
title A Model-Agnostic Algorithm for Bayes Error Determination in Binary Classification
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/add5c84becaa4f53811e8bb6d2edb647