Enhancing the pattern recognition capacity of machine learning techniques: The importance of feature positioning

Bibliographic Details
Main authors: Debora Di Caprio, Francisco J. Santos-Arteaga
Format: article
Language: EN
Published: Elsevier 2022
Online access: https://doaj.org/article/befbd61f2c784f028f2de8f0c3ebb23d
Description
Summary: We design several algorithms representing evaluation processes of different complexity, ranging from basic environments based on a predetermined number of features to complex structures involving alternatives defined through decision trees whose number of nodes is determined by the cardinality of the respective power sets. The sequential structure of these evaluation processes builds on the information retrieval behavior of users in online search environments. The algorithms generate two strings of data: the numerical evaluations that determine the retrieval behavior of users, and the choices those users subsequently make. The way the algorithms' output is entered into the vectors summarizing the complexity of the evaluation processes conditions the capacity of machine learning techniques to categorize those processes correctly. The main purpose of the research is to illustrate two results numerically. First, machine learning techniques categorize processes correctly even if their characteristic features are presented in a way that prevents their identification using standard statistical techniques. Second, the categorization accuracy of these techniques can be substantially enhanced by describing the retrieval processes in the way required to implement standard statistical analyses. We perform a battery of tests using machine learning techniques to demonstrate and analyze these results. We emphasize their applicability to classification and prediction problems in medical environments, particularly those constrained by the quality of the available data.