Mining pure, strict epistatic interactions from high-dimensional datasets: ameliorating the curse of dimensionality.

Background: The interaction between loci to affect phenotype is called epistasis. It is strict epistasis if no proper subset of the interacting loci exhibits a marginal effect. For many diseases, it is likely that unknown epistatic interactions affect disease susceptibility. A dif...

Bibliographic Details
Main authors: Xia Jiang, Richard E Neapolitan
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects:
Medicine (R)
Science (Q)
Online access: https://doaj.org/article/e585593f398545f6863c98425f5dd319
id oai:doaj.org-article:e585593f398545f6863c98425f5dd319
record_format dspace
spelling oai:doaj.org-article:e585593f398545f6863c98425f5dd319
2021-11-18T08:12:16Z
1932-6203
10.1371/journal.pone.0046771
2012-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23071633/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 7, Iss 10, p e46771 (2012)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Background: The interaction between loci to affect phenotype is called epistasis. It is strict epistasis if no proper subset of the interacting loci exhibits a marginal effect. For many diseases, it is likely that unknown epistatic interactions affect disease susceptibility. A difficulty when mining epistatic interactions from high-dimensional datasets concerns the curse of dimensionality. There are too many combinations of SNPs to perform an exhaustive search. A method that could locate strict epistasis without an exhaustive search can be considered the brass ring of methods for analyzing high-dimensional datasets.
Methodology/findings: A SNP pattern is a Bayesian network representing SNP-disease relationships. The Bayesian score for a SNP pattern is the probability of the data given the pattern, and has been used to learn SNP patterns. We identified a bound for the score of a SNP pattern. The bound provides an upper limit on the Bayesian score of any pattern that could be obtained by expanding a given pattern. We felt that the bound might enable the data to say something about the promise of expanding a 1-SNP pattern even when there are no marginal effects. We tested the bound using simulated datasets and semi-synthetic high-dimensional datasets obtained from GWAS datasets. We found that the bound was able to dramatically reduce the search time for strict epistasis. Using an Alzheimer's dataset, we showed that it is possible to discover an interaction involving the APOE gene based on its score because of its large marginal effect, but that the bound is most effective at discovering interactions without marginal effects.
Conclusions/significance: We conclude that the bound appears to ameliorate the curse of dimensionality in high-dimensional datasets. This is a very consequential result and could be pivotal in our efforts to reveal the dark matter of genetic disease risk from high-dimensional datasets.
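As a rough illustration of the "probability of the data given the pattern" idea in the description above, the following Python sketch scores SNP patterns on a toy dataset. It is only a sketch under stated assumptions: the BDeu-style Dirichlet prior, the `alpha` value, the binary genotype coding, the XOR disease rule, and the function name `log_bdeu_score` are all hypothetical choices made here for illustration. The record does not give the authors' exact score, and the bound they describe is not reproduced.

```python
import math
from collections import Counter
from itertools import product


def log_bdeu_score(snps, disease, alpha=1.0, snp_states=2, disease_states=2):
    """Log marginal likelihood of the disease column given a SNP pattern.

    Assumption: the pattern is a Bayesian network in which the listed SNPs
    are the parents of the disease node, scored with a BDeu-style prior.
    snps    -- one tuple of genotype codes per subject (empty tuples = no SNPs)
    disease -- one disease-status code (0..disease_states-1) per subject
    """
    n_parents = len(snps[0])
    q = snp_states ** n_parents          # number of parent configurations
    a_j = alpha / q                      # prior mass per configuration
    a_jk = alpha / (q * disease_states)  # prior mass per (configuration, status)

    per_config = Counter(snps)
    per_cell = Counter(zip(snps, disease))

    score = 0.0
    # Configurations that never occur contribute exactly zero, so only
    # observed configurations need to be visited.
    for config, n_j in per_config.items():
        score += math.lgamma(a_j) - math.lgamma(a_j + n_j)
        for status in range(disease_states):
            n_jk = per_cell[(config, status)]
            score += math.lgamma(a_jk + n_jk) - math.lgamma(a_jk)
    return score


# Toy data (hypothetical): disease = A XOR B, 50 subjects per genotype pair.
# Neither SNP alone changes the disease frequency, so there is no marginal effect.
subjects = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2) for _ in range(50)]
disease = [d for _, _, d in subjects]

score_a = log_bdeu_score([(a,) for a, _, _ in subjects], disease)
score_b = log_bdeu_score([(b,) for _, b, _ in subjects], disease)
score_ab = log_bdeu_score([(a, b) for a, b, _ in subjects], disease)

print("1-SNP pattern A :", score_a)
print("1-SNP pattern B :", score_b)
print("2-SNP pattern AB:", score_ab)
```

On this toy data the two 1-SNP patterns receive identical, low scores while the 2-SNP pattern scores far higher. A search guided only by single-SNP scores would therefore have no reason to expand either SNP, which is the situation a bound on the score of expanded patterns is meant to address.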
format article
author Xia Jiang
Richard E Neapolitan
author_facet Xia Jiang
Richard E Neapolitan
author_sort Xia Jiang
title Mining pure, strict epistatic interactions from high-dimensional datasets: ameliorating the curse of dimensionality.
publisher Public Library of Science (PLoS)
publishDate 2012
url https://doaj.org/article/e585593f398545f6863c98425f5dd319