Identification of Pan-Cancer Biomarkers Based on the Gene Expression Profiles of Cancer Cell Lines

There are many types of cancer. Although they share some hallmarks, such as proliferation and metastasis, they differ in many respects and grow in different organs or tissues. Does each cancer have a unique gene expression pattern that distinguishes it from other cancer t...

Bibliographic Details
Main authors: ShiJian Ding, Hao Li, Yu-Hang Zhang, XianChao Zhou, KaiYan Feng, ZhanDong Li, Lei Chen, Tao Huang, Yu-Dong Cai
Format: article
Language: EN
Published: Frontiers Media S.A. 2021
Subjects: pan-cancer study, feature selection, classification algorithm, decision rule, biomarker, Biology (General), QH301-705.5
Online access: https://doaj.org/article/b6ecdcbcfc4042e28b83231f75ac0449
id oai:doaj.org-article:b6ecdcbcfc4042e28b83231f75ac0449
record_format dspace
spelling oai:doaj.org-article:b6ecdcbcfc4042e28b83231f75ac0449
2021-12-01T18:49:54Z
Identification of Pan-Cancer Biomarkers Based on the Gene Expression Profiles of Cancer Cell Lines
2296-634X
10.3389/fcell.2021.781285
https://doaj.org/article/b6ecdcbcfc4042e28b83231f75ac0449
2021-11-01T00:00:00Z
https://www.frontiersin.org/articles/10.3389/fcell.2021.781285/full
https://doaj.org/toc/2296-634X
Frontiers Media S.A.
article
Frontiers in Cell and Developmental Biology, Vol 9 (2021)
institution DOAJ
collection DOAJ
language EN
topic pan-cancer study
feature selection
classification algorithm
decision rule
biomarker
Biology (General)
QH301-705.5
description There are many types of cancer. Although they share some hallmarks, such as proliferation and metastasis, they differ in many respects and grow in different organs or tissues. Does each cancer have a unique gene expression pattern that distinguishes it from other cancer types? Since The Cancer Genome Atlas (TCGA) project, the number of pan-cancer studies has grown steadily. Researchers want to obtain robust gene expression signatures from pan-cancer patients, but heterogeneity causes large variance among cancer patients, and the sample size needed for robust results would be too large to recruit. In this study, we tried another approach: using cell line data to reduce the variance and obtain robust pan-cancer biomarkers. We applied several advanced computational methods to the Cancer Cell Line Encyclopedia (CCLE) gene expression profiles, which included 988 cell lines from 20 cancer types. Two feature selection methods, Boruta and max-relevance and min-redundancy (mRMR), were applied to the cell line gene expression data in sequence, generating a feature list. This list was fed into the incremental feature selection (IFS) method, incorporating one classification algorithm, to extract biomarkers and to construct optimal classifiers and decision rules. The optimal classifiers performed well and can be useful tools for identifying cell lines from different cancer types, whereas the biomarkers (e.g., NCKAP1, TNFRSF12A, LAMB2, FKBP9, PFN2, TOM1L1) and rules identified in this work may provide a meaningful and precise reference for differentiating multiple types of cancer and contribute to the personalized treatment of tumors.
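The incremental feature selection step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the article: the toy data stands in for the CCLE expression profiles, a simple between-class/within-class variance ratio stands in for the Boruta and mRMR ranking, a nearest-centroid classifier stands in for the unnamed classification algorithm, and all function names are hypothetical. The ranking is computed on the full toy data set for brevity.

```python
import random
from statistics import mean, pvariance


def relevance_ranking(X, y):
    """Rank feature indices by a between-class / within-class variance ratio.

    Simplified stand-in for a Boruta + mRMR ranking; it is NOT either algorithm.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        groups = [[row[j] for row, lab in zip(X, y) if lab == c] for c in classes]
        between = pvariance([mean(g) for g in groups])
        within = mean(pvariance(g) for g in groups) + 1e-12
        scores.append(between / within)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def nearest_centroid_predict(train_X, train_y, test_X, feats):
    """Predict labels with a nearest-centroid rule restricted to `feats`."""
    centroids = {}
    for c in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == c]
        centroids[c] = [mean(r[j] for r in rows) for j in feats]

    def sq_dist(row, c):
        return sum((row[j] - m) ** 2 for j, m in zip(feats, centroids[c]))

    return [min(sorted(centroids), key=lambda c: sq_dist(r, c)) for r in test_X]


def cv_accuracy(X, y, feats, k=5):
    """Overall accuracy from k-fold cross-validation with strided folds."""
    idx = list(range(len(X)))
    correct = 0
    for fold in range(k):
        test = idx[fold::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        preds = nearest_centroid_predict(
            [X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], feats)
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / len(X)


def incremental_feature_selection(X, y, ranked, max_feats):
    """Evaluate the top-1, top-2, ... feature subsets and keep the best one."""
    best_n, best_acc = 0, -1.0
    for n in range(1, max_feats + 1):
        acc = cv_accuracy(X, y, ranked[:n])
        if acc > best_acc:  # strict '>' keeps the smallest best subset
            best_n, best_acc = n, acc
    return ranked[:best_n], best_acc


def make_toy_data(seed=0):
    """Three classes, 20 samples each, 10 features; only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):
        for _ in range(20):
            row = [rng.gauss(0, 1) for _ in range(10)]
            row[0] += 6 * label
            row[1] -= 6 * label
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
ranked = relevance_ranking(X, y)
selected, accuracy = incremental_feature_selection(X, y, ranked, max_feats=5)
print("selected features:", selected, "CV accuracy:", round(accuracy, 3))
```

In this sketch the selected subset plays the role of the biomarker list, and the classifier trained on it plays the role of the optimal classifier. A more careful evaluation would recompute the ranking inside each cross-validation fold to avoid selection bias.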
format article
author ShiJian Ding
Hao Li
Yu-Hang Zhang
XianChao Zhou
KaiYan Feng
ZhanDong Li
Lei Chen
Tao Huang
Yu-Dong Cai
title Identification of Pan-Cancer Biomarkers Based on the Gene Expression Profiles of Cancer Cell Lines
publisher Frontiers Media S.A.
publishDate 2021
url https://doaj.org/article/b6ecdcbcfc4042e28b83231f75ac0449