Automatic peak selection by a Benjamini-Hochberg-based algorithm.

A common issue in bioinformatics is that computational methods often generate a large number of predictions sorted according to certain confidence scores. A key problem is then determining how many predictions must be selected to include most of the true predictions while maintaining reasonably high...

Bibliographic Details
Main authors: Ahmed Abbas, Xin-Bing Kong, Zhi Liu, Bing-Yi Jing, Xin Gao
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2013
Subjects:
R
Q
Online access: https://doaj.org/article/561fad8551d64e1a90c6b633896dbc78
id oai:doaj.org-article:561fad8551d64e1a90c6b633896dbc78
record_format dspace
spelling oai:doaj.org-article:561fad8551d64e1a90c6b633896dbc78
2021-11-18T08:02:33Z
Automatic peak selection by a Benjamini-Hochberg-based algorithm.
1932-6203
10.1371/journal.pone.0053112
https://doaj.org/article/561fad8551d64e1a90c6b633896dbc78
2013-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23308147/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
Ahmed Abbas
Xin-Bing Kong
Zhi Liu
Bing-Yi Jing
Xin Gao
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 8, Iss 1, p e53112 (2013)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description A common issue in bioinformatics is that computational methods often generate a large number of predictions sorted according to certain confidence scores. A key problem is then determining how many predictions must be selected to include most of the true predictions while maintaining reasonably high precision. In nuclear magnetic resonance (NMR)-based protein structure determination, for instance, computational peak picking methods are becoming increasingly common, although expert knowledge remains the method of choice for determining how many peaks among thousands of candidate peaks should be considered to capture the true peaks. Here, we propose a Benjamini-Hochberg (B-H)-based approach that automatically selects the number of peaks. We formulate the peak selection problem as a multiple testing problem. Given a candidate peak list sorted by either volumes or intensities, we first convert the peaks into p-values and then apply the B-H-based algorithm to automatically select the number of peaks. The proposed approach is tested on state-of-the-art peak picking methods, including WaVPeak [1] and PICKY [2]. Compared with the traditional fixed number-based approach, our approach returns significantly more true peaks. For instance, by combining WaVPeak or PICKY with the proposed method, the missing peak rates are on average reduced by 20% and 26%, respectively, in a benchmark set of 32 spectra extracted from eight proteins. The consensus of the B-H-selected peaks from both WaVPeak and PICKY achieves 88% recall and 83% precision, which significantly outperforms each individual method and the consensus method without using the B-H algorithm. The proposed method can be used as a standard procedure for any peak picking method and straightforwardly applied to some other prediction selection problems in bioinformatics. The source code, documentation and example data of the proposed method are available at http://sfb.kaust.edu.sa/pages/software.aspx.
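The selection step described in the abstract can be illustrated with the textbook Benjamini-Hochberg step-up procedure. The sketch below is only that: it assumes the candidate peaks have already been converted into p-values (the record does not say how the authors do this), and the function name `bh_select`, the `alpha` level and the example values are hypothetical. The authors' B-H-based variant may differ from the standard procedure shown here.

```python
from typing import List, Sequence


def bh_select(p_values: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Indices selected by the standard B-H step-up procedure (hypothetical helper)."""
    m = len(p_values)
    # Rank candidates by ascending p-value, i.e. most confident first.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Step-up rule: every candidate ranked at or before the cutoff is selected,
    # even if its own p-value exceeded its own threshold.
    return order[:cutoff]


if __name__ == "__main__":
    # Made-up p-values for six hypothetical candidate peaks.
    example = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
    print(bh_select(example, alpha=0.05))
```

The number of selected peaks is the length of the returned list, so it is determined by the data and the chosen `alpha` rather than fixed in advance.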
format article
author Ahmed Abbas
Xin-Bing Kong
Zhi Liu
Bing-Yi Jing
Xin Gao
title Automatic peak selection by a Benjamini-Hochberg-based algorithm.
publisher Public Library of Science (PLoS)
publishDate 2013
url https://doaj.org/article/561fad8551d64e1a90c6b633896dbc78