PUMA: a unified framework for penalized multiple regression analysis of GWAS data.

Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely appl...

Bibliographic Details
Main authors: Gabriel E Hoffman, Benjamin A Logsdon, Jason G Mezey
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2013
Subjects: Biology (General)
Online access: https://doaj.org/article/df33d74e86dc45a2bc28814d4c499522
id oai:doaj.org-article:df33d74e86dc45a2bc28814d4c499522
record_format dspace
spelling oai:doaj.org-article:df33d74e86dc45a2bc28814d4c499522
record updated 2021-11-18T05:52:04Z
issn 1553-734X
1553-7358
doi 10.1371/journal.pcbi.1003101
published 2013-01-01T00:00:00Z
fulltext https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23825936/pdf/?tool=EBI
journal https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
source PLoS Computational Biology, Vol 9, Iss 6, p e1003101 (2013)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests, and thus these methods are not widely applied. Here, we present a combined algorithmic and heuristic framework for PUMA (Penalized Unified Multiple-locus Association) analysis that solves the problems of previously proposed methods, including computational speed, poor performance on genome-scale simulated data, and identification of too many associations for real data to be biologically plausible. The framework includes a new minorize-maximization (MM) algorithm for generalized linear models (GLM) combined with heuristic model selection and testing methods for identification of robust associations. The PUMA framework implements the penalized maximum likelihood penalties previously proposed for GWAS analysis (i.e. Lasso, Adaptive Lasso, NEG, MCP), as well as a penalty that has not been previously applied to GWAS (i.e. LOG). Using simulations that closely mirror real GWAS data, we show that our framework has high performance and reliably increases power to detect weak associations, while existing PMR methods can perform worse than single marker testing in overall performance. To demonstrate the empirical value of PUMA, we analyzed GWAS data for type 1 diabetes, Crohn's disease, and rheumatoid arthritis, three autoimmune diseases from the original Wellcome Trust Case Control Consortium. Our analysis replicates known associations for these diseases and we discover novel etiologically relevant susceptibility loci that are invisible to standard single marker tests, including six novel associations implicating genes involved in pancreatic function, insulin pathways and immune-cell function in type 1 diabetes; three novel associations implicating genes in pro- and anti-inflammatory pathways in Crohn's disease; and one novel association implicating a gene involved in apoptosis pathways in rheumatoid arthritis. We provide software for applying our PUMA analysis framework.
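note The abstract lists the Lasso among the penalties PUMA supports. As a rough illustration of what a penalized multiple regression fit looks like, the sketch below fits a Lasso-penalized linear model by coordinate descent on simulated genotypes. This is not PUMA's MM algorithm for GLMs and not the authors' software: it is a generic textbook technique, and every name in it (`simulate`, `lasso_coordinate_descent`, the effect sizes, the allele frequency, the penalty value) is an assumption made for illustration only.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator, the closed-form Lasso update for one coordinate."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def simulate(n=60, p=20, causal=(0, 1), effect=0.8, maf=0.3, seed=0):
    """Hypothetical toy data: genotypes coded 0/1/2 and a quantitative phenotype."""
    rng = random.Random(seed)
    geno = [[(rng.random() < maf) + (rng.random() < maf) for _ in range(p)]
            for _ in range(n)]
    pheno = [sum(effect * row[j] for j in causal) + rng.gauss(0.0, 1.0)
             for row in geno]
    # Center each SNP column and the phenotype.
    cols = [center([row[j] for row in geno]) for j in range(p)]
    X = [[cols[j][i] for j in range(p)] for i in range(n)]
    return X, center(pheno)


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - X b||^2 + lam * sum(|b_j|) by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # y - X @ beta, kept up to date incrementally
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic SNP carries no information
            # Correlation of SNP j with the partial residual (SNP j's own effect added back).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new
    return beta


if __name__ == "__main__":
    X, y = simulate()
    beta = lasso_coordinate_descent(X, y, lam=0.15)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    print("SNPs with nonzero coefficients:", selected)
```

All SNPs are modeled jointly, and the penalty `lam` controls how many coefficients are shrunk exactly to zero. Choosing that value and testing the selected markers is the part the abstract attributes to PUMA's heuristic model selection and testing methods, which this sketch does not attempt to reproduce.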
format article
author Gabriel E Hoffman
Benjamin A Logsdon
Jason G Mezey
title PUMA: a unified framework for penalized multiple regression analysis of GWAS data.
publisher Public Library of Science (PLoS)
publishDate 2013
url https://doaj.org/article/df33d74e86dc45a2bc28814d4c499522