Data-driven detection of subtype-specific differentially expressed genes

Abstract Among multiple subtypes of tissue or cell, subtype-specific differentially-expressed genes (SDEGs) are defined as being most-upregulated in only one subtype but not in any other. Detecting SDEGs plays a critical role in the molecular characterization and deconvolution of multicellular compl...

Bibliographic Details
Main authors: Lulu Chen, Yingzhou Lu, Chiung-Ting Wu, Robert Clarke, Guoqiang Yu, Jennifer E. Van Eyk, David M. Herrington, Yue Wang
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R (Medicine)
Q (Science)
Online access: https://doaj.org/article/d7b2495d65de403294df9998bc8b969d
id oai:doaj.org-article:d7b2495d65de403294df9998bc8b969d
record_format dspace
spelling oai:doaj.org-article:d7b2495d65de403294df9998bc8b969d
2021-12-02T15:23:02Z
10.1038/s41598-020-79704-1
2045-2322
https://doaj.org/article/d7b2495d65de403294df9998bc8b969d
2021-01-01T00:00:00Z
https://doi.org/10.1038/s41598-020-79704-1
https://doaj.org/toc/2045-2322
Scientific Reports, Vol 11, Iss 1, Pp 1-12 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract Among multiple subtypes of tissue or cell, subtype-specific differentially-expressed genes (SDEGs) are defined as being most-upregulated in only one subtype but not in any other. Detecting SDEGs plays a critical role in the molecular characterization and deconvolution of multicellular complex tissues. Classic differential analysis assumes a null hypothesis whose test statistic is not subtype-specific, thus can produce a high false positive rate and/or lower detection power. Here we first introduce a One-Versus-Everyone Fold Change (OVE-FC) test for detecting SDEGs. We then propose a scaled test statistic (OVE-sFC) for assessing the statistical significance of SDEGs that applies a mixture null distribution model and a tailored permutation test. The OVE-FC/sFC test was validated on both type 1 error rate and detection power using extensive simulation data sets generated from real gene expression profiles of purified subtype samples. The OVE-FC/sFC test was then applied to two benchmark gene expression data sets of purified subtype samples and detected many known or previously unknown SDEGs. Subsequent supervised deconvolution results on synthesized bulk expression data, obtained using the SDEGs detected from the independent purified expression data by the OVE-FC/sFC test, showed superior performance in deconvolution accuracy when compared with popular peer methods.
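The one-versus-everyone idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes log2-scale expression values and per-subtype sample means, and it scores each gene by the gap between its highest subtype mean and the next-highest one, so that a gene scores well only if it exceeds every other subtype. The function name `ove_fold_change` and its arguments are hypothetical, and the scaled statistic (OVE-sFC), the mixture null distribution, and the permutation test are not reproduced here.

```python
import numpy as np


def ove_fold_change(log_expr, labels):
    """Illustrative one-versus-everyone log fold change (assumed form).

    log_expr: array of shape (genes, samples), assumed to be on a log2 scale.
    labels:   one subtype label per sample (at least two distinct subtypes).
    Returns the top subtype per gene and the gap to the runner-up subtype.
    """
    log_expr = np.asarray(log_expr, dtype=float)
    labels = np.asarray(labels)
    subtypes = np.unique(labels)

    # genes x subtypes matrix of mean log2 expression
    means = np.stack(
        [log_expr[:, labels == s].mean(axis=1) for s in subtypes], axis=1
    )

    # Rank subtypes per gene; the last column is the most-upregulated subtype
    order = np.argsort(means, axis=1)
    rows = np.arange(means.shape[0])
    top, runner_up = order[:, -1], order[:, -2]

    # Gap between the top subtype and the best of all the others
    gap = means[rows, top] - means[rows, runner_up]
    return subtypes[top], gap


# Toy data: 3 genes, 3 subtypes (A, B, C), 2 samples per subtype
log_expr = np.array([
    [8, 8, 2, 2, 1, 1],   # high only in A
    [3, 3, 3, 3, 3, 3],   # flat across subtypes
    [1, 1, 5, 5, 9, 9],   # highest in C, B is the runner-up
])
labels = ["A", "A", "B", "B", "C", "C"]
top_subtype, gap = ove_fold_change(log_expr, labels)
```

Comparing against the runner-up subtype, rather than against the pooled remainder, is what makes the score subtype-specific in this sketch: a gene that is high in two subtypes gets a small gap even though it would look upregulated in a one-versus-rest comparison.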
format article
author Lulu Chen
Yingzhou Lu
Chiung-Ting Wu
Robert Clarke
Guoqiang Yu
Jennifer E. Van Eyk
David M. Herrington
Yue Wang
title Data-driven detection of subtype-specific differentially expressed genes
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/d7b2495d65de403294df9998bc8b969d