Comprehensive decision tree models in bioinformatics.

Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the cla...

Bibliographic Details
Main authors: Gregor Stiglic, Simon Kocbek, Igor Pernek, Peter Kokol
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects: Medicine (R), Science (Q)
Online access: https://doaj.org/article/782c508894924c779893ef1e028a65c0
id oai:doaj.org-article:782c508894924c779893ef1e028a65c0
record_format dspace
spelling oai:doaj.org-article:782c508894924c779893ef1e028a65c0
2021-11-18T07:23:41Z
Comprehensive decision tree models in bioinformatics.
1932-6203
10.1371/journal.pone.0033812
https://doaj.org/article/782c508894924c779893ef1e028a65c0
2012-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/22479449/?tool=EBI
https://doaj.org/toc/1932-6203
Gregor Stiglic
Simon Kocbek
Igor Pernek
Peter Kokol
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 7, Iss 3, p e33812 (2012)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models that allow knowledge extraction and explanation of the reasoning behind the classification model.
Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The research is motivated by the need to build effective and easily interpretable decision tree models using a so-called one-button data mining approach, in which no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during tuning; the model is constrained exclusively by the dimensions of the produced decision tree.
Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase in accuracy for the less complex, visually tuned decision trees. Accuracy gains were higher on the bioinformatics datasets than on the classical machine learning benchmark datasets. Additionally, a user study was carried out to confirm the assumption that tree tuning times are significantly lower for the proposed method than for manual tuning of the decision tree.
Conclusions: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one achieves not only good comprehensibility but also very good classification performance that does not differ from that of the usually more complex models built using the default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly redundant attributes, which are very common in bioinformatics.
format article
author Gregor Stiglic
Simon Kocbek
Igor Pernek
Peter Kokol
title Comprehensive decision tree models in bioinformatics.
publisher Public Library of Science (PLoS)
publishDate 2012
url https://doaj.org/article/782c508894924c779893ef1e028a65c0
_version_ 1718423541015117824