PaIRKAT: A pathway integrated regression-based kernel association test with applications to metabolomics and COPD phenotypes.

High-throughput data such as metabolomics, genomics, transcriptomics, and proteomics have become familiar data types within the "-omics" family. For this work, we focus on subsets that interact with one another and represent these "pathways" as graphs. Observed pathways often hav...


Bibliographic Details
Main authors: Charlie M Carpenter, Weiming Zhang, Lucas Gillenwater, Cameron Severn, Tusharkanti Ghosh, Russell Bowler, Katerina Kechris, Debashis Ghosh
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects: Biology (General)
Online access: https://doaj.org/article/5676f381da3d41819c493b41c1b39486
id oai:doaj.org-article:5676f381da3d41819c493b41c1b39486
record_format dspace
spelling oai:doaj.org-article:5676f381da3d41819c493b41c1b39486
2021-12-02T19:57:28Z
1553-734X
1553-7358
10.1371/journal.pcbi.1008986
https://doaj.org/article/5676f381da3d41819c493b41c1b39486
2021-10-01T00:00:00Z
https://doi.org/10.1371/journal.pcbi.1008986
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
PLoS Computational Biology, Vol 17, Iss 10, p e1008986 (2021)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description High-throughput data such as metabolomics, genomics, transcriptomics, and proteomics have become familiar data types within the "-omics" family. For this work, we focus on subsets that interact with one another and represent these "pathways" as graphs. Observed pathways often have disjoint components, i.e., nodes or sets of nodes (metabolites, etc.) not connected to any other within the pathway, which notably lessens testing power. In this paper we propose the Pathway Integrated Regression-based Kernel Association Test (PaIRKAT), a new kernel machine regression method for incorporating known pathway information into the semi-parametric kernel regression framework. This work extends previous kernel machine approaches. This paper also contributes an application of a graph kernel regularization method for overcoming disconnected pathways. By incorporating a regularized or "smoothed" graph into a score test, PaIRKAT can provide more powerful tests for associations between biological pathways and phenotypes of interest and will be helpful in identifying novel pathways for targeted clinical research. We evaluate this method through several simulation studies and an application to real metabolomics data from the COPDGene study. Our simulation studies illustrate the robustness of this method to incorrect and incomplete pathway knowledge, and the real data analysis shows meaningful improvements of testing power in pathways. PaIRKAT was developed for application to metabolomic pathway data, but the techniques are easily generalizable to other data sources with a graph-like structure.
format article
author Charlie M Carpenter
Weiming Zhang
Lucas Gillenwater
Cameron Severn
Tusharkanti Ghosh
Russell Bowler
Katerina Kechris
Debashis Ghosh
title PaIRKAT: A pathway integrated regression-based kernel association test with applications to metabolomics and COPD phenotypes.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/5676f381da3d41819c493b41c1b39486