An epistasis and heterogeneity analysis method based on maximum correlation and maximum consistence criteria
Tumor heterogeneity significantly increases the difficulty of tumor treatment. The same drugs and treatment methods have different effects on different tumor subtypes. Therefore, tumor heterogeneity is one of the main sources of poor prognosis, recurrence and metastasis. At present, there have been...
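Main authors: | Xia Chen, Yexiong Lin, Qiang Qu, Bin Ning, Haowen Chen, Xiong Li |
---|---|
Format: | article |
Language: | EN |
Published: | AIMS Press, 2021 |
Subjects: | genetic algorithm; Bayesian network; information entropy; tumor subtype classification; genome variation; Biotechnology; Mathematics |
Online access: | https://doaj.org/article/9a26eafa31734643a30c0ab3ee83fe35 |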
id |
oai:doaj.org-article:9a26eafa31734643a30c0ab3ee83fe35 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:9a26eafa31734643a30c0ab3ee83fe35; 2021-11-23T02:34:07Z; An epistasis and heterogeneity analysis method based on maximum correlation and maximum consistence criteria; 10.3934/mbe.2021382; 1551-0018; https://doaj.org/article/9a26eafa31734643a30c0ab3ee83fe35; 2021-09-01T00:00:00Z; https://www.aimspress.com/article/doi/10.3934/mbe.2021382?viewType=HTML; https://doaj.org/toc/1551-0018; Xia Chen; Yexiong Lin; Qiang Qu; Bin Ning; Haowen Chen; Xiong Li; AIMS Press; article; genetic algorithm; bayesian network; information entropy; tumor subtype classification; genome variation; Biotechnology; TP248.13-248.65; Mathematics; QA1-939; EN; Mathematical Biosciences and Engineering, Vol 18, Iss 6, Pp 7711-7726 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
genetic algorithm; Bayesian network; information entropy; tumor subtype classification; genome variation; Biotechnology (TP248.13-248.65); Mathematics (QA1-939) |
description |
Tumor heterogeneity significantly increases the difficulty of tumor treatment. The same drugs and treatment methods have different effects on different tumor subtypes. Therefore, tumor heterogeneity is one of the main sources of poor prognosis, recurrence and metastasis. At present, there are some computational methods that study tumor heterogeneity at the genomic, transcriptomic, and histological levels, but these methods still have certain limitations. In this study, we propose an epistasis and heterogeneity analysis method based on genomic single nucleotide polymorphism (SNP) data. First, a maximum correlation and maximum consistence criterion was designed, based on the Bayesian network K2 score and information entropy, for evaluating genomic epistasis. As the number of SNPs increases, the space of epistatic combinations grows sharply, resulting in a combinatorial explosion. Therefore, we next use an improved genetic algorithm to search the SNP epistatic combination space to identify potentially feasible epistasis solutions. Multiple epistasis solutions represent different pathogenic gene combinations, which may lead to different tumor subtypes, that is, heterogeneity. Finally, an XGBoost classifier is trained on the feature SNPs that constitute the multiple sets of epistatic solutions, to verify that considering tumor heterogeneity improves the accuracy of tumor subtype prediction. To demonstrate the effectiveness of our method, the power of multiple epistasis recognition and the accuracy of tumor subtype classification are evaluated. Extensive simulation results show that our method has better power and prediction accuracy than previous methods. |
format |
article |
author |
Xia Chen; Yexiong Lin; Qiang Qu; Bin Ning; Haowen Chen; Xiong Li |
author_sort |
Xia Chen |
title |
An epistasis and heterogeneity analysis method based on maximum correlation and maximum consistence criteria |
title_sort |
epistasis and heterogeneity analysis method based on maximum correlation and maximum consistence criteria |
publisher |
AIMS Press |
publishDate |
2021 |
url |
https://doaj.org/article/9a26eafa31734643a30c0ab3ee83fe35 |
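The description in this record says the evaluation criterion builds on the Bayesian network K2 score and that the space of SNP combinations grows sharply with the number of SNPs. As a rough illustration only, the sketch below computes the standard log K2 score for a network in which a SNP combination is the parent of the phenotype, and counts the candidate combinations of a given order. It is not the authors' combined criterion, whose formula this record does not give; the function names `log_k2_score` and `search_space_size` and the toy data are assumptions made for the example.

```python
from collections import Counter
from math import comb, lgamma


def log_k2_score(genotypes, labels, n_classes=2):
    """Log K2 score of the network 'SNP combination -> phenotype'.

    genotypes: one tuple per sample, holding that sample's genotype at each
               SNP in the candidate combination (hypothetical encoding).
    labels:    one integer class label per sample, in range(n_classes).
    A higher (less negative) value means the combination explains the
    labels better.
    """
    cell_counts = Counter(zip(genotypes, labels))  # N_jk: combo j, class k
    combo_counts = Counter(genotypes)              # N_j: samples with combo j
    score = 0.0
    # Unobserved genotype combinations contribute exactly 0 to the log score,
    # so iterating over the observed ones is sufficient.
    for combo, n_j in combo_counts.items():
        # log[(r - 1)! / (N_j + r - 1)!] with r = n_classes
        score += lgamma(n_classes) - lgamma(n_j + n_classes)
        for k in range(n_classes):
            # log(N_jk!)
            score += lgamma(cell_counts[(combo, k)] + 1)
    return score


def search_space_size(n_snps, order):
    """Number of distinct SNP combinations of the given order."""
    return comb(n_snps, order)


# Toy data (assumed): one SNP, four samples.
snp = [(0,), (0,), (1,), (1,)]
print(log_k2_score(snp, [0, 0, 1, 1]))  # genotype tracks the label
print(log_k2_score(snp, [0, 1, 0, 1]))  # genotype unrelated to the label
print(search_space_size(1000, 2), search_space_size(1000, 3))
```

Working in log space with `lgamma` avoids overflowing factorials on realistic sample sizes. The growth of `search_space_size` with the order is what motivates a heuristic search, such as the genetic algorithm mentioned in the description, over exhaustive enumeration.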