Selection of higher order regression models in the analysis of multi-factorial transcription data.

<h4>Introduction</h4>Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding....

Bibliographic Details
Main Authors: Olivia Prazeres da Costa, Arthur Hoffman, Johannes W Rey, Ulrich Mansmann, Thorsten Buch, Achim Tresch
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects:
Medicine (R)
Science (Q)
Online Access: https://doaj.org/article/6a8a645ed2eb47bbbc37454bfa88e351
id oai:doaj.org-article:6a8a645ed2eb47bbbc37454bfa88e351
record_format dspace
spelling oai:doaj.org-article:6a8a645ed2eb47bbbc37454bfa88e351
2021-11-18T08:26:42Z
Selection of higher order regression models in the analysis of multi-factorial transcription data.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0091840
https://doaj.org/article/6a8a645ed2eb47bbbc37454bfa88e351
2014-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24658540/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 9, Iss 3, p e91840 (2014)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description <h4>Introduction</h4>Many studies examine gene expression data that has been obtained under the influence of multiple factors, such as genetic background, environmental conditions, or exposure to diseases. The interplay of multiple factors may lead to effect modification and confounding. Higher order linear regression models can account for these effects. We present a new methodology for linear model selection and apply it to microarray data of bone marrow-derived macrophages. This experiment investigates the influence of three variable factors: the genetic background of the mice from which the macrophages were obtained, Yersinia enterocolitica infection (two strains, and a mock control), and treatment/non-treatment with interferon-γ.<h4>Results</h4>We set up four different linear regression models in a hierarchical order. We introduce the eruption plot as a new practical tool for model selection complementary to global testing. It visually compares the size and significance of effect estimates between two nested models. Using this methodology we were able to select the most appropriate model by keeping only relevant factors showing additional explanatory power. Application to experimental data allowed us to qualify the interaction of factors as either neutral (no interaction), alleviating (co-occurring effects are weaker than expected from the single effects), or aggravating (stronger than expected). We find a biologically meaningful gene cluster of putative C2TA target genes that appear to be co-regulated with MHC class II genes.<h4>Conclusions</h4>We introduced the eruption plot as a tool for visual model comparison to identify relevant higher order interactions in the analysis of expression data obtained under the influence of multiple factors. We conclude that model selection in higher order linear regression models should generally be performed for the analysis of multi-factorial microarray data.
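The neutral, alleviating, and aggravating categories named in the abstract can be made concrete with a small numeric sketch. The Python below is not the authors' method (the paper fits hierarchical linear regression models and compares them with eruption plots). It assumes a balanced 2x2 design with log-scale expression values, takes the sum of the two single-factor effects as the expected combined effect, and compares that with the observed combined effect. The function name `classify_interaction`, the tolerance `tol`, and the example values are hypothetical.

```python
from statistics import mean


def classify_interaction(cells, tol=0.1):
    """Classify a two-factor interaction for one gene.

    cells maps (a, b), with a and b in {0, 1}, to a list of replicate
    log-expression values; (0, 0) is the untreated reference condition.
    tol is an illustrative threshold, not a value taken from the paper.
    """
    m = {key: mean(values) for key, values in cells.items()}

    effect_a = m[(1, 0)] - m[(0, 0)]   # factor A alone
    effect_b = m[(0, 1)] - m[(0, 0)]   # factor B alone
    expected = effect_a + effect_b     # additive (no-interaction) expectation
    observed = m[(1, 1)] - m[(0, 0)]   # both factors together
    interaction = observed - expected  # interaction term of the 2x2 model

    if abs(interaction) <= tol:
        label = "neutral"
    elif abs(observed) < abs(expected):
        label = "alleviating"          # weaker than the single effects suggest
    else:
        label = "aggravating"          # simplification: everything else
    return interaction, label


# Hypothetical replicate values for one gene
example = {
    (0, 0): [5.0, 5.2],
    (1, 0): [6.0, 6.2],
    (0, 1): [7.0, 7.2],
    (1, 1): [7.5, 7.7],
}
interaction, label = classify_interaction(example)
print(label, round(interaction, 3))
```

In a real analysis the interaction term would be estimated, together with its significance, from the fitted regression model rather than from raw cell means, and a fixed tolerance would be replaced by a statistical test.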
format article
author Olivia Prazeres da Costa
Arthur Hoffman
Johannes W Rey
Ulrich Mansmann
Thorsten Buch
Achim Tresch
title Selection of higher order regression models in the analysis of multi-factorial transcription data.
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/6a8a645ed2eb47bbbc37454bfa88e351
_version_ 1718421807263907840