Accurate Single-Cell Clustering through Ensemble Similarity Learning

Single-cell sequencing provides novel means to interpret the transcriptomic profiles of individual cells. In-depth analysis of single-cell sequencing data requires effective computational methods that accurately predict single-cell clusters, because single-cell sequencing techniques only provi...

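Bibliographic Details
Main authors: Hyundoo Jeong, Sungtae Shin, Hong-Gi Yeom
Format: article
Language: EN
Published: MDPI AG 2021
Subjects: single-cell RNA sequencing; zero-inflated noise reduction; ensemble similarity estimation; correspondence network; visualization and clustering; imputation; Genetics; QH426-470
Online access: https://doaj.org/article/81efa5a7ee3344549d181b43bcc5a857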
id oai:doaj.org-article:81efa5a7ee3344549d181b43bcc5a857
record_format dspace
spelling oai:doaj.org-article:81efa5a7ee3344549d181b43bcc5a857
2021-11-25T17:40:34Z
Accurate Single-Cell Clustering through Ensemble Similarity Learning
10.3390/genes12111670
2073-4425
https://doaj.org/article/81efa5a7ee3344549d181b43bcc5a857
2021-10-01T00:00:00Z
https://www.mdpi.com/2073-4425/12/11/1670
https://doaj.org/toc/2073-4425
Hyundoo Jeong
Sungtae Shin
Hong-Gi Yeom
MDPI AG
article
single-cell RNA sequencing
zero-inflated noise reduction
ensemble similarity estimation
correspondence network
visualization and clustering
imputation
Genetics
QH426-470
EN
Genes, Vol 12, Iss 11, p 1670 (2021)
institution DOAJ
collection DOAJ
language EN
topic single-cell RNA sequencing
zero-inflated noise reduction
ensemble similarity estimation
correspondence network
visualization and clustering
imputation
Genetics
QH426-470
description Single-cell sequencing provides novel means to interpret the transcriptomic profiles of individual cells. In-depth analysis of single-cell sequencing data requires effective computational methods that accurately predict single-cell clusters, because single-cell sequencing techniques only provide the transcriptomic profile of each cell. Although an accurate estimate of the cell-to-cell similarity is an essential first step toward reliable single-cell clustering results, obtaining one is challenging: the measurement depends strongly on which genes are selected for the similarity evaluation, and the optimal set of genes is typically unknown. Moreover, due to technical limitations, single-cell sequencing data include a large number of artificial zeros, and this technical noise makes it difficult to develop effective single-cell clustering algorithms. Here, we describe a novel single-cell clustering algorithm that accurately predicts single-cell clusters in large-scale single-cell sequencing data by effectively reducing the zero-inflated noise and accurately estimating the cell-to-cell similarities. First, we construct an ensemble similarity network based on different similarity estimates and reduce the artificial noise using a random walk with restart framework. Finally, starting from a larger number of small but highly consistent clusters, we iteratively merge the pair of clusters with the maximum similarity until the predicted number of clusters is reached. Extensive performance evaluation shows that the proposed single-cell clustering algorithm yields accurate single-cell clustering results and can help decipher the key messages underlying complex biological mechanisms.
format article
author Hyundoo Jeong
Sungtae Shin
Hong-Gi Yeom
author_sort Hyundoo Jeong
title Accurate Single-Cell Clustering through Ensemble Similarity Learning
title_sort accurate single-cell clustering through ensemble similarity learning
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/81efa5a7ee3344549d181b43bcc5a857
work_keys_str_mv AT hyundoojeong accuratesinglecellclusteringthroughensemblesimilaritylearning
AT sungtaeshin accuratesinglecellclusteringthroughensemblesimilaritylearning
AT honggiyeom accuratesinglecellclusteringthroughensemblesimilaritylearning
_version_ 1718412114397233152
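
The description field in this record outlines three steps: combining several similarity estimates into an ensemble network, smoothing that network with a random walk with restart, and merging the most similar clusters until a target count is reached. The sketch below illustrates those steps on a toy input and is not the authors' implementation. The record does not state how the estimates are combined, what restart probability is used, or how cluster-to-cluster similarity is scored, so the element-wise mean, the value `restart=0.5`, the average-linkage score, the start from single-cell clusters, and all function names are assumptions made for illustration.

```python
def ensemble_similarity(matrices):
    """Element-wise mean of several cell-to-cell similarity matrices (assumed rule)."""
    n = len(matrices[0])
    k = len(matrices)
    return [[sum(m[i][j] for m in matrices) / k for j in range(n)] for i in range(n)]


def row_normalize(matrix):
    """Scale each row to sum to 1 so it can serve as transition probabilities."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def random_walk_with_restart(similarity, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Visiting probabilities of a walker that jumps back to `seed` with probability `restart`."""
    n = len(similarity)
    transition = row_normalize(similarity)
    start = [1.0 if i == seed else 0.0 for i in range(n)]
    prob = start[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(prob[i] * transition[i][j] for i in range(n))
            + restart * start[j]
            for j in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, prob)) < tol:
            return nxt
        prob = nxt
    return prob


def smooth_similarity(similarity):
    """Run one walk per cell and symmetrize the resulting probabilities."""
    n = len(similarity)
    walks = [random_walk_with_restart(similarity, seed) for seed in range(n)]
    return [[(walks[i][j] + walks[j][i]) / 2 for j in range(n)] for i in range(n)]


def merge_clusters(similarity, n_clusters):
    """Greedily merge the most similar pair of clusters (average linkage, assumed)."""
    clusters = [[i] for i in range(len(similarity))]  # simplification: start from single cells

    def linkage(a, b):
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters)))
        p, q = max(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected
    return clusters


# Toy data: two hypothetical similarity estimates for four cells.
sim_a = [[0.0, 0.9, 0.1, 0.1], [0.9, 0.0, 0.1, 0.1], [0.1, 0.1, 0.0, 0.8], [0.1, 0.1, 0.8, 0.0]]
sim_b = [[0.0, 0.7, 0.2, 0.0], [0.7, 0.0, 0.0, 0.2], [0.2, 0.0, 0.0, 0.9], [0.0, 0.2, 0.9, 0.0]]

ensemble = ensemble_similarity([sim_a, sim_b])
smoothed = smooth_similarity(ensemble)
clusters = merge_clusters(smoothed, n_clusters=2)
print(clusters)
```

On this toy input, cells 0 and 1 end up in one cluster and cells 2 and 3 in the other. The pure-Python loops are quadratic or worse in the number of cells, so a large-scale data set would need a vectorized or sparse formulation.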