Multiple regression methods show great potential for rare variant association tests.

The investigation of associations between rare genetic variants and diseases or phenotypes has two goals. Firstly, the identification of which genes or genomic regions are associated, and secondly, discrimination of associated variants from background noise within each region. Over the last few year...

Bibliographic Details
Main authors: ChangJiang Xu, Martin Ladouceur, Zari Dastani, J Brent Richards, Antonio Ciampi, Celia M T Greenwood
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects:
R
Q
Online access: https://doaj.org/article/ae4e9d724aba43d3a9fdb01059ed9d29
id oai:doaj.org-article:ae4e9d724aba43d3a9fdb01059ed9d29
record_format dspace
spelling oai:doaj.org-article:ae4e9d724aba43d3a9fdb01059ed9d29
2021-11-18T07:09:16Z
Multiple regression methods show great potential for rare variant association tests.
1932-6203
10.1371/journal.pone.0041694
https://doaj.org/article/ae4e9d724aba43d3a9fdb01059ed9d29
2012-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/22916111/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 7, Iss 8, p e41694 (2012)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description The investigation of associations between rare genetic variants and diseases or phenotypes has two goals. Firstly, the identification of which genes or genomic regions are associated, and secondly, discrimination of associated variants from background noise within each region. Over the last few years, many new methods have been developed which associate genomic regions with phenotypes. However, classical methods for high-dimensional data have received little attention. Here we investigate whether several classical statistical methods for high-dimensional data: ridge regression (RR), principal components regression (PCR), partial least squares regression (PLS), a sparse version of PLS (SPLS), and the LASSO are able to detect associations with rare genetic variants. These approaches have been extensively used in statistics to identify the true associations in data sets containing many predictor variables. Using genetic variants identified in three genes that were Sanger sequenced in 1998 individuals, we simulated continuous phenotypes under several different models, and we show that these feature selection and feature extraction methods can substantially outperform several popular methods for rare variant analysis. Furthermore, these approaches can identify which variants are contributing most to the model fit, and therefore both goals of rare variant analysis can be achieved simultaneously with the use of regression regularization methods. These methods are briefly illustrated with an analysis of adiponectin levels and variants in the ADIPOQ gene.
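As a rough illustration of the idea in the description, the sketch below simulates a continuous phenotype from rare-variant genotypes and fits a ridge regression, then ranks variants by the size of their coefficients. It is a minimal sketch, not the authors' code or simulation design: the function names `simulate_rare_variants` and `ridge_coefficients`, the sample size, the allele-frequency range, the effect size, and the penalty value are all assumptions chosen only for the example.

```python
import numpy as np


def simulate_rare_variants(n_individuals=500, n_variants=30, n_causal=3, seed=0):
    """Simulate 0/1/2 minor-allele counts and a continuous phenotype.

    Every setting here is hypothetical and is not taken from the article.
    """
    rng = np.random.default_rng(seed)
    # Rare variants: minor allele frequencies between 0.5% and 3% (assumed range).
    maf = rng.uniform(0.005, 0.03, size=n_variants)
    genotypes = rng.binomial(2, maf, size=(n_individuals, n_variants)).astype(float)
    # Pick a few causal variants and give them a common effect (assumed value).
    causal = rng.choice(n_variants, size=n_causal, replace=False)
    effects = np.zeros(n_variants)
    effects[causal] = 1.5
    phenotype = genotypes @ effects + rng.normal(size=n_individuals)
    return genotypes, phenotype, causal


def ridge_coefficients(X, y, lam):
    """Closed-form ridge estimate on centered data: (X'X + lam*I)^-1 X'y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    n_features = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_features), Xc.T @ yc)


G, y, causal = simulate_rare_variants()
beta = ridge_coefficients(G, y, lam=1.0)
# Rank variants by absolute coefficient, largest first.
ranking = np.argsort(-np.abs(beta))
print("simulated causal variants:", sorted(causal.tolist()))
print("top-ranked variants by |coefficient|:", ranking[:5].tolist())
```

The same fitted coefficients serve both goals named in the description: their joint size reflects whether the region is associated, and their ranking suggests which variants contribute most. A real analysis would choose the penalty by cross-validation and assess significance by permutation or a similar procedure, neither of which is shown here.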
format article
author ChangJiang Xu
Martin Ladouceur
Zari Dastani
J Brent Richards
Antonio Ciampi
Celia M T Greenwood
title Multiple regression methods show great potential for rare variant association tests.
publisher Public Library of Science (PLoS)
publishDate 2012
url https://doaj.org/article/ae4e9d724aba43d3a9fdb01059ed9d29
_version_ 1718423873586724864