Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge.

Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcri...

Bibliographic Details
Main Authors:	Sara Mostafavi, Alexis Battle, Xiaowei Zhu, Alexander E Urban, Douglas Levinson, Stephen B Montgomery, Daphne Koller
Format:	article
Language:	EN
Published:	Public Library of Science (PLoS) 2013
Subjects:	Medicine (R); Science (Q)
Online Access:	https://doaj.org/article/b2924a71dadf489e8607ad370e407a38

id	oai:doaj.org-article:b2924a71dadf489e8607ad370e407a38
record_format	dspace
spelling	oai:doaj.org-article:b2924a71dadf489e8607ad370e407a38; 2021-11-18T07:37:18Z; Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge.; ISSN 1932-6203; DOI 10.1371/journal.pone.0068141; https://doaj.org/article/b2924a71dadf489e8607ad370e407a38; 2013-01-01T00:00:00Z; https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/23874524/?tool=EBI; https://doaj.org/toc/1932-6203; Sara Mostafavi; Alexis Battle; Xiaowei Zhu; Alexander E Urban; Douglas Levinson; Stephen B Montgomery; Daphne Koller; Public Library of Science (PLoS); article; Medicine; R; Science; Q; EN; PLoS ONE, Vol 8, Iss 7, p e68141 (2013)
institution	DOAJ
collection	DOAJ
language	EN
topic	Medicine R Science Q
spellingShingle	Medicine R Science Q Sara Mostafavi Alexis Battle Xiaowei Zhu Alexander E Urban Douglas Levinson Stephen B Montgomery Daphne Koller Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge.
description	Transcriptomic assays that measure expression levels are widely used to study the manifestation of environmental or genetic variations in cellular processes. RNA-sequencing in particular has the potential to considerably improve such understanding because of its capacity to assay the entire transcriptome, including novel transcriptional events. However, as with earlier expression assays, analysis of RNA-sequencing data requires carefully accounting for factors that may introduce systematic, confounding variability in the expression measurements, resulting in spurious correlations. Here, we consider the problem of modeling and removing the effects of known and hidden confounding factors from RNA-sequencing data. We describe a unified residual framework that encapsulates existing approaches and, using this framework, present a novel method, HCP (Hidden Covariates with Prior). HCP uses a more informed assumption about the confounding factors, and performs as well as or better than existing approaches while having a much lower computational cost. Our experiments demonstrate that accounting for known and hidden factors with appropriate models improves the quality of RNA-sequencing data in two very different tasks: detecting genetic variations that are associated with nearby expression variations (cis-eQTLs), and constructing accurate co-expression networks.
format	article
author	Sara Mostafavi Alexis Battle Xiaowei Zhu Alexander E Urban Douglas Levinson Stephen B Montgomery Daphne Koller
author_facet	Sara Mostafavi Alexis Battle Xiaowei Zhu Alexander E Urban Douglas Levinson Stephen B Montgomery Daphne Koller
author_sort	Sara Mostafavi
title	Normalizing RNA-sequencing data by modeling hidden covariates with prior knowledge.
publisher	Public Library of Science (PLoS)
publishDate	2013
url	https://doaj.org/article/b2924a71dadf489e8607ad370e407a38
_version_	1718423185056071680
