Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data

Background: Current diagnostic methods for colorectal cancer (CRC) are colonoscopy and sigmoidoscopy, which are invasive and complex procedures with possible complications. This study aimed to determine models for CRC identification that involve minimally invasive, affordable, portable, and accurate...

Bibliographic Details
Main authors: Hui Li MS, Jianmei Lin BS, Yanhong Xiao MS, Wenwen Zheng MS, Lu Zhao PhD, Xiangling Yang PhD, Minsheng Zhong MS, Huanliang Liu MD, PhD
Format: article
Language: EN
Published: SAGE Publishing 2021
Subjects: Neoplasms. Tumors. Oncology. Including cancer and carcinogens
Online access: https://doaj.org/article/ca2bfa8624df486da2556762203b5033
id oai:doaj.org-article:ca2bfa8624df486da2556762203b5033
record_format dspace
spelling oai:doaj.org-article:ca2bfa8624df486da2556762203b5033
2021-11-21T02:04:56Z
ISSN 1533-0338
DOI 10.1177/15330338211058352
https://doi.org/10.1177/15330338211058352
https://doaj.org/toc/1533-0338
2021-11-01T00:00:00Z
Technology in Cancer Research & Treatment, Vol 20 (2021)
institution DOAJ
collection DOAJ
language EN
topic Neoplasms. Tumors. Oncology. Including cancer and carcinogens
RC254-282
description Background: Current diagnostic methods for colorectal cancer (CRC) are colonoscopy and sigmoidoscopy, which are invasive and complex procedures with possible complications. This study aimed to determine models for CRC identification that involve minimally invasive, affordable, portable, and accurate screening variables. Methods: This was a retrospective study that used data from electronic medical records of patients with CRC and healthy individuals between July 2017 and June 2018. Laboratory data, including liver enzymes, lipid profiles, complete blood counts, and tumor biomarkers, were extracted from the electronic medical records. Five machine learning models (logistic regression, random forest, k-nearest neighbors, support vector machine [SVM], and naïve Bayes) were used to identify CRC. The performances were evaluated using the areas under the curve (AUCs), sensitivity, specificity, positive predictive values (PPV), and negative predictive values (NPV). Results: A total of 1164 electronic medical records (CRC patients: 582; healthy controls: 582) were included. The logistic regression model achieved the highest performance in identifying CRC (AUC: 0.865, sensitivity: 89.5%, specificity: 83.5%, PPV: 84.4%, NPV: 88.9%). The first four weighted features in the model were carcinoembryonic antigen (CEA), hemoglobin (HGB), lipoprotein (a) (Lp(a)), and high-density lipoprotein (HDL). A diagnostic model for CRC was established based on the four indicators, with an AUC of 0.849 (0.840-0.860) for identifying all CRC patients, and it performed best in discriminating patients with late colon cancer from healthy individuals with an AUC of 0.905 (0.889-0.929). Conclusions: The logistic regression model based on CEA, HGB, Lp(a), and HDL might be a powerful, noninvasive, and cost-effective method to identify CRC.
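The reported predictive values can be cross-checked against the reported sensitivity and specificity. The sketch below applies Bayes' rule under one assumption that the abstract does not state: that the evaluation set had the same 1:1 case-to-control ratio as the full dataset (582 CRC patients, 582 healthy controls). The function name `predictive_values` and its `prevalence` parameter are illustrative and do not come from the article.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test with the given sensitivity and
    specificity, applied to a population with the given prevalence."""
    tp = sensitivity * prevalence              # true positives (as a fraction)
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Logistic regression figures from the abstract; prevalence 0.5 is an
# assumption based on the balanced 582:582 dataset.
ppv, npv = predictive_values(sensitivity=0.895, specificity=0.835, prevalence=0.5)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Under that assumption the computed PPV rounds to the reported 84.4%, and the computed NPV (about 88.8%) agrees with the reported 88.9% to within the rounding of the inputs.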
format article
author Hui Li MS
Jianmei Lin BS
Yanhong Xiao MS
Wenwen Zheng MS
Lu Zhao PhD
Xiangling Yang PhD
Minsheng Zhong MS
Huanliang Liu MD, PhD
title Colorectal Cancer Detected by Machine Learning Models Using Conventional Laboratory Test Data
publisher SAGE Publishing
publishDate 2021
url https://doaj.org/article/ca2bfa8624df486da2556762203b5033