Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge.
Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcri...
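Main authors: | Sara Mostafavi, Alexis Battle, Xiaowei Zhu, Alexander E Urban, Douglas Levinson, Stephen B Montgomery, Daphne Koller |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2013 |
Subjects: | Medicine (R); Science (Q) |
Online access: | https://doaj.org/article/b2924a71dadf489e8607ad370e407a38 |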
id |
oai:doaj.org-article:b2924a71dadf489e8607ad370e407a38 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:b2924a71dadf489e8607ad370e407a38; 2021-11-18T07:37:18Z; Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge.; 1932-6203; 10.1371/journal.pone.0068141; https://doaj.org/article/b2924a71dadf489e8607ad370e407a38; 2013-01-01T00:00:00Z; https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23874524/?tool=EBI; https://doaj.org/toc/1932-6203; Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcriptome, including novel transcriptional events. However, as with earlier expression assays, analysis of RNA-sequencing data requires carefully accounting for factors that may introduce systematic, confounding variability in the expression measurements, resulting in spurious correlations. Here, we consider the problem of modeling and removing the effects of known and hidden confounding factors from RNA-sequencing data. We describe a unified residual framework that encapsulates existing approaches, and using this framework, present a novel method, HCP (Hidden Covariates with Prior). HCP uses a more informed assumption about the confounding factors, and performs as well or better than existing approaches while having a much lower computational cost. Our experiments demonstrate that accounting for known and hidden factors with appropriate models improves the quality of RNA-sequencing data in two very different tasks: detecting genetic variations that are associated with nearby expression variations (cis-eQTLs), and constructing accurate co-expression networks.; Sara Mostafavi; Alexis Battle; Xiaowei Zhu; Alexander E Urban; Douglas Levinson; Stephen B Montgomery; Daphne Koller; Public Library of Science (PLoS); article; Medicine; R; Science; Q; EN; PLoS ONE, Vol 8, Iss 7, p e68141 (2013) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine R Science Q |
spellingShingle |
Medicine R Science Q Sara Mostafavi Alexis Battle Xiaowei Zhu Alexander E Urban Douglas Levinson Stephen B Montgomery Daphne Koller Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. |
description |
Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcriptome, including novel transcriptional events. However, as with earlier expression assays, analysis of RNA-sequencing data requires carefully accounting for factors that may introduce systematic, confounding variability in the expression measurements, resulting in spurious correlations. Here, we consider the problem of modeling and removing the effects of known and hidden confounding factors from RNA-sequencing data. We describe a unified residual framework that encapsulates existing approaches, and using this framework, present a novel method, HCP (Hidden Covariates with Prior). HCP uses a more informed assumption about the confounding factors, and performs as well or better than existing approaches while having a much lower computational cost. Our experiments demonstrate that accounting for known and hidden factors with appropriate models improves the quality of RNA-sequencing data in two very different tasks: detecting genetic variations that are associated with nearby expression variations (cis-eQTLs), and constructing accurate co-expression networks. |
format |
article |
author |
Sara Mostafavi Alexis Battle Xiaowei Zhu Alexander E Urban Douglas Levinson Stephen B Montgomery Daphne Koller |
author_facet |
Sara Mostafavi Alexis Battle Xiaowei Zhu Alexander E Urban Douglas Levinson Stephen B Montgomery Daphne Koller |
author_sort |
Sara Mostafavi |
title |
Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. |
title_short |
Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. |
title_full |
Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. |
title_fullStr |
Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. |
title_full_unstemmed |
Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge. |
title_sort |
normalizing rna-sequencing data by modeling hidden covariates with prior knowledge. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2013 |
url |
https://doaj.org/article/b2924a71dadf489e8607ad370e407a38 |
work_keys_str_mv |
AT saramostafavi normalizingrnasequencingdatabymodelinghiddencovariateswithpriorknowledge AT alexisbattle normalizingrnasequencingdatabymodelinghiddencovariateswithpriorknowledge AT xiaoweizhu normalizingrnasequencingdatabymodelinghiddencovariateswithpriorknowledge AT alexandereurban normalizingrnasequencingdatabymodelinghiddencovariateswithpriorknowledge AT douglaslevinson normalizingrnasequencingdatabymodelinghiddencovariateswithpriorknowledge AT stephenbmontgomery normalizingrnasequencingdatabymodelinghiddencovariateswithpriorknowledge AT daphnekoller normalizingrnasequencingdatabymodelinghiddencovariateswithpriorknowledge |
_version_ |
1718423185056071680 |