High-order SNP combinations associated with complex diseases: efficient discovery, statistical power and functional interactions.

There has been increased interest in discovering combinations of single-nucleotide polymorphisms (SNPs) that are strongly associated with a phenotype even if each SNP has little individual effect. Efficient approaches have been proposed for searching two-locus combinations from genome-wide datasets....

Bibliographic Details
Main authors: Gang Fang, Majda Haznadar, Wen Wang, Haoyu Yu, Michael Steinbach, Timothy R Church, William S Oetting, Brian Van Ness, Vipin Kumar
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects: Medicine (R); Science (Q)
Online access: https://doaj.org/article/d1067285e52140b5a3f12c106ddfc5a6
id oai:doaj.org-article:d1067285e52140b5a3f12c106ddfc5a6
record_format dspace
spelling oai:doaj.org-article:d1067285e52140b5a3f12c106ddfc5a6 2021-11-18T07:21:39Z
ISSN 1932-6203
DOI 10.1371/journal.pone.0033531
https://doaj.org/article/d1067285e52140b5a3f12c106ddfc5a6
2012-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/22536319/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 7, Iss 4, p e33531 (2012)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description There has been increased interest in discovering combinations of single-nucleotide polymorphisms (SNPs) that are strongly associated with a phenotype even if each SNP has little individual effect. Efficient approaches have been proposed for searching two-locus combinations from genome-wide datasets. However, for high-order combinations, existing methods either adopt a brute-force search that handles only a small number of SNPs (up to a few hundred) or use a heuristic search that may miss informative combinations. In addition, existing approaches lack statistical power because they use statistics with many degrees of freedom and test a huge number of hypotheses during combinatorial search. Due to these challenges, functional interactions in high-order combinations have not been systematically explored. We leverage discriminative-pattern-mining algorithms from the data-mining community to search for high-order combinations in case-control datasets. The substantially improved efficiency and scalability, demonstrated on synthetic and real datasets with several thousand SNPs, allow the study of several important mathematical and statistical properties of SNP combinations with order as high as eleven. We further explore functional interactions in high-order combinations and reveal a general connection between the increase in discriminative power of a combination over its subsets and the functional coherence among the genes comprising the combination, supported by multiple datasets. Finally, we study in detail several significant high-order combinations discovered from a lung-cancer dataset and a kidney-transplant-rejection dataset to provide novel insights into these complex diseases. Interestingly, many of these associations involve combinations of common variations that occur in small fractions of the population. Thus, our approach is an alternative methodology for exploring the genetics of rare diseases, for which the current focus is on individually rare variations.
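The abstract's remark about the "huge number of hypotheses tested during combinatorial search" can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the article: the panel size of 1000 SNPs, the function names, and the use of a Bonferroni correction are all assumptions chosen to show how quickly the number of candidate combinations grows with the combination order.

```python
from math import comb


def n_combinations(n_snps: int, order: int) -> int:
    """Number of distinct SNP subsets of size `order` drawn from `n_snps` SNPs."""
    return comb(n_snps, order)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction.

    Bonferroni is used here only as a familiar illustration; the record
    does not say which correction, if any, the authors apply.
    """
    return alpha / n_tests


if __name__ == "__main__":
    n_snps = 1000  # hypothetical panel size, not from the article
    for order in (2, 3, 5):
        tests = n_combinations(n_snps, order)
        print(f"order {order}: {tests} combinations, "
              f"per-test alpha {bonferroni_threshold(0.05, tests):.3e}")
```

Under these assumptions, moving from pairs to triples multiplies the number of tests by several hundred, which is why exhaustive search with a stringent per-test threshold loses power at higher orders.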