An epistasis and heterogeneity analysis method based on maximum correlation and maximum consistence criteria

Tumor heterogeneity significantly increases the difficulty of tumor treatment. The same drugs and treatment methods have different effects on different tumor subtypes. Therefore, tumor heterogeneity is one of the main sources of poor prognosis, recurrence and metastasis. At present, there have been...

Bibliographic Details
Main authors: Xia Chen, Yexiong Lin, Qiang Qu, Bin Ning, Haowen Chen, Xiong Li
Format: article
Language: EN
Published: AIMS Press 2021
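Subjects: genetic algorithm; bayesian network; information entropy; tumor subtype classification; genome variation; Biotechnology; Mathematics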
Online access: https://doaj.org/article/9a26eafa31734643a30c0ab3ee83fe35
id oai:doaj.org-article:9a26eafa31734643a30c0ab3ee83fe35
record_format dspace
spelling oai:doaj.org-article:9a26eafa31734643a30c0ab3ee83fe35
2021-11-23T02:34:07Z
10.3934/mbe.2021382
1551-0018
https://doaj.org/article/9a26eafa31734643a30c0ab3ee83fe35
2021-09-01T00:00:00Z
https://www.aimspress.com/article/doi/10.3934/mbe.2021382?viewType=HTML
https://doaj.org/toc/1551-0018
Mathematical Biosciences and Engineering, Vol 18, Iss 6, Pp 7711-7726 (2021)
institution DOAJ
collection DOAJ
language EN
topic genetic algorithm
bayesian network
information entropy
tumor subtype classification
genome variation
Biotechnology
TP248.13-248.65
Mathematics
QA1-939
description Tumor heterogeneity significantly increases the difficulty of tumor treatment. The same drugs and treatment methods have different effects on different tumor subtypes. Therefore, tumor heterogeneity is one of the main sources of poor prognosis, recurrence and metastasis. Several computational methods already study tumor heterogeneity at the genomic, transcriptomic, and histological levels, but these methods still have certain limitations. In this study, we propose an epistasis and heterogeneity analysis method based on genomic single nucleotide polymorphism (SNP) data. First, a maximum correlation and maximum consistence criterion is designed, based on the Bayesian network K2 score and information entropy, for evaluating genomic epistasis. As the number of SNPs increases, the space of epistatic combinations grows sharply, resulting in a combinatorial explosion. Therefore, we next use an improved genetic algorithm to search the SNP epistatic combination space and identify potential feasible epistasis solutions. Multiple epistasis solutions represent different pathogenic gene combinations, which may lead to different tumor subtypes, that is, heterogeneity. Finally, an XGBoost classifier is trained on the feature SNPs that constitute the multiple sets of epistatic solutions, to verify that considering tumor heterogeneity improves the accuracy of tumor subtype prediction. To demonstrate the effectiveness of our method, the power of multiple epistasis recognition and the accuracy of tumor subtype classification are evaluated. Extensive simulation results show that our method has better power and prediction accuracy than previous methods.
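
The description above names two ingredients for scoring a candidate SNP combination: the Bayesian network K2 score and information entropy. As a rough illustration only, the sketch below computes the standard log K2 score and the mutual information between a SNP combination and the class label. It is not the authors' combined criterion, which this record does not specify; the function names, the 0/1/2 genotype coding, and the toy data are all assumptions made for the example.

```python
from collections import Counter
from math import lgamma, log2


def log_k2_score(genotypes, labels, n_classes=2):
    """Log K2 score of a SNP combination against the class label.

    genotypes: one tuple of genotype codes per sample (assumed 0/1/2 coding).
    labels: one class index per sample, each in range(n_classes).
    With this sign convention, a larger value means stronger support.
    """
    cell = Counter(zip(genotypes, labels))  # N_jk: samples per (combination, class)
    row = Counter(genotypes)                # N_j: samples per genotype combination
    score = 0.0
    for combo, n_j in row.items():
        # log[(r-1)! / (N_j + r - 1)!] written with log-gamma for stability
        score += lgamma(n_classes) - lgamma(n_j + n_classes)
        for k in range(n_classes):
            score += lgamma(cell[(combo, k)] + 1)  # log(N_jk!)
    return score


def mutual_information(genotypes, labels):
    """Mutual information (in bits) between genotype combination and label."""
    n = len(labels)
    joint = Counter(zip(genotypes, labels))
    p_geno = Counter(genotypes)
    p_label = Counter(labels)
    mi = 0.0
    for (g, y), c in joint.items():
        mi += (c / n) * log2(c * n / (p_geno[g] * p_label[y]))
    return mi


# Toy data (hypothetical): one SNP that separates the classes, one that does not.
associated = [(0,), (0,), (1,), (1,)]
unrelated = [(0,), (1,), (0,), (1,)]
labels = [0, 0, 1, 1]

print(log_k2_score(associated, labels), mutual_information(associated, labels))
print(log_k2_score(unrelated, labels), mutual_information(unrelated, labels))
```

In a search such as the genetic algorithm mentioned in the description, functions like these would play the role of a fitness evaluation for each candidate combination. How the two quantities are weighted or combined is left open here.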
format article
author Xia Chen
Yexiong Lin
Qiang Qu
Bin Ning
Haowen Chen
Xiong Li
author_sort Xia Chen
title An epistasis and heterogeneity analysis method based on maximum correlation and maximum consistence criteria
publisher AIMS Press
publishDate 2021
url https://doaj.org/article/9a26eafa31734643a30c0ab3ee83fe35