RgCop-A regularized copula based method for gene selection in single-cell RNA-seq data.

Gene selection in large, unannotated single-cell RNA sequencing (scRNA-seq) data is a crucial preliminary step of downstream analysis. Existing approaches are primarily based on high variation (highly variable genes) or significantly high expression (highly expressed genes)...

Bibliographic Details
Main authors: Snehalika Lall, Sumanta Ray, Sanghamitra Bandyopadhyay
Format: article
Language: EN
Published: Public Library of Science (PLoS), 2021
Subjects: Biology (General)
Online access: https://doaj.org/article/2d9005ad73a34ac7b9a1c6243dcf01ac
id oai:doaj.org-article:2d9005ad73a34ac7b9a1c6243dcf01ac
record_format dspace
spelling oai:doaj.org-article:2d9005ad73a34ac7b9a1c6243dcf01ac
2021-12-02T19:57:28Z
RgCop-A regularized copula based method for gene selection in single-cell RNA-seq data.
1553-734X
1553-7358
10.1371/journal.pcbi.1009464
https://doaj.org/article/2d9005ad73a34ac7b9a1c6243dcf01ac
2021-10-01T00:00:00Z
https://doi.org/10.1371/journal.pcbi.1009464
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Snehalika Lall
Sumanta Ray
Sanghamitra Bandyopadhyay
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 17, Iss 10, p e1009464 (2021)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
spellingShingle Biology (General)
QH301-705.5
Snehalika Lall
Sumanta Ray
Sanghamitra Bandyopadhyay
RgCop-A regularized copula based method for gene selection in single-cell RNA-seq data.
description Gene selection in large, unannotated single-cell RNA sequencing (scRNA-seq) data is a crucial preliminary step of downstream analysis. Existing approaches, which are primarily based on high variation (highly variable genes) or significantly high expression (highly expressed genes), fail to provide a stable and predictive feature set because of technical noise present in the data. Here, we propose RgCop, a novel regularized copula-based method for gene selection from large single-cell RNA-seq data. RgCop utilizes copula correlation (Ccor), a robust, equitable dependence measure that captures multivariate dependency among a set of genes in single-cell expression data. We formulate an objective function by adding an l1 regularization term to Ccor to penalize the redundant coefficients of features/genes, resulting in a non-redundant, effective feature/gene set. Results show a significant improvement in clustering/classification performance on real-life scRNA-seq data over other state-of-the-art methods. RgCop performs extremely well in capturing dependence among the features of noisy data owing to the scale-invariant property of copulas, thereby improving the stability of the method. Moreover, the differentially expressed (DE) genes identified from the clusters of scRNA-seq data are found to provide an accurate annotation of cells. Finally, the features/genes obtained from RgCop are able to annotate unknown cells with high accuracy.
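The description leans on two ideas that a small sketch may make concrete: copula-based dependence is scale invariant, and an l1 penalty drives small coefficients to exactly zero. The Python below is only an illustration under stated assumptions and is not the authors' Ccor measure or RgCop objective, neither of which is specified in this record. The helper names (`pseudo_observations`, `rank_dependence`, `soft_threshold`), the synthetic gamma-distributed "expression" values, and the threshold 0.1 are all hypothetical. The dependence score used here is the ordinary rank correlation computed on empirical-copula pseudo-observations, and the l1 step is the generic soft-thresholding operator.

```python
import numpy as np


def pseudo_observations(x):
    """Empirical copula transform: map a 1-D sample to (0, 1) through its ranks."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1  # ranks 1..n (assumes no tied values)
    return ranks / (len(x) + 1.0)


def rank_dependence(x, y):
    """Pearson correlation of the pseudo-observations (a rank-based score, not Ccor)."""
    u, v = pseudo_observations(x), pseudo_observations(y)
    return float(np.corrcoef(u, v)[0, 1])


def soft_threshold(w, lam):
    """Generic l1 proximal step: shrink each coefficient toward zero by lam."""
    w = np.asarray(w, dtype=float)
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)


# Synthetic stand-ins for two dependent genes (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
gene_a = rng.gamma(shape=2.0, scale=1.0, size=500)
gene_b = gene_a + rng.normal(scale=0.5, size=500)

# Strictly increasing rescalings leave the ranks, and so the score, unchanged.
raw = rank_dependence(gene_a, gene_b)
rescaled = rank_dependence(np.log1p(gene_a), 1000.0 * gene_b + 3.0)

# Plain Pearson correlation on the values themselves does change under log1p.
pearson_raw = float(np.corrcoef(gene_a, gene_b)[0, 1])
pearson_rescaled = float(np.corrcoef(np.log1p(gene_a), 1000.0 * gene_b + 3.0)[0, 1])

# The l1 step zeroes the small middle coefficient and shrinks the others.
shrunk = soft_threshold([0.9, 0.05, -0.3], lam=0.1)

print(raw, rescaled)
print(pearson_raw, pearson_rescaled)
print(shrunk)
```

Because the pseudo-observations depend only on ranks, any strictly increasing transformation of a gene's values (a log transform, a change of units) leaves the score untouched, which is the sense in which copula-based measures are scale invariant. The soft-thresholding step shows why an l1 term tends to yield a sparse, non-redundant selection: coefficients smaller in magnitude than the threshold are set to exactly zero rather than merely reduced.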
format article
author Snehalika Lall
Sumanta Ray
Sanghamitra Bandyopadhyay
author_facet Snehalika Lall
Sumanta Ray
Sanghamitra Bandyopadhyay
author_sort Snehalika Lall
title RgCop-A regularized copula based method for gene selection in single-cell RNA-seq data.
title_short RgCop-A regularized copula based method for gene selection in single-cell RNA-seq data.
title_full RgCop-A regularized copula based method for gene selection in single-cell RNA-seq data.
title_fullStr RgCop-A regularized copula based method for gene selection in single-cell RNA-seq data.
title_full_unstemmed RgCop-A regularized copula based method for gene selection in single-cell RNA-seq data.
title_sort rgcop-a regularized copula based method for gene selection in single-cell rna-seq data.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/2d9005ad73a34ac7b9a1c6243dcf01ac