Breast cancer recurrence prediction with ensemble methods and cost-sensitive learning

Breast cancer is one of the most common cancers in women worldwide. Thanks to improved medical treatment, most breast cancer patients achieve remission. However, these patients then face the next challenge: recurrence of breast cancer, which may cause more severe effe...

Bibliographic Details
Main authors: Yang Pei-Tse, Wu Wen-Shuo, Wu Chia-Chun, Shih Yi-Nuo, Hsieh Chung-Ho, Hsu Jia-Lien
Format: article
Language: EN
Published: De Gruyter 2021
Subjects:
R
Online access: https://doaj.org/article/b3caf70b408442378144d6d813871a4e
id oai:doaj.org-article:b3caf70b408442378144d6d813871a4e
record_format dspace
spelling oai:doaj.org-article:b3caf70b408442378144d6d813871a4e
2021-12-05T14:10:54Z
Breast cancer recurrence prediction with ensemble methods and cost-sensitive learning
2391-5463
10.1515/med-2021-0282
https://doaj.org/article/b3caf70b408442378144d6d813871a4e
2021-05-01T00:00:00Z
https://doi.org/10.1515/med-2021-0282
https://doaj.org/toc/2391-5463
Open Medicine, Vol 16, Iss 1, Pp 754-768 (2021)
institution DOAJ
collection DOAJ
language EN
topic recurrent breast cancer
machine learning
classification
adaboost
cost-sensitive method
Medicine
R
description Breast cancer is one of the most common cancers in women worldwide. Thanks to improved medical treatment, most breast cancer patients achieve remission. However, these patients then face the next challenge: recurrence of breast cancer, which may cause more severe effects and even death. Predicting breast cancer recurrence is therefore crucial for reducing mortality. This paper proposes a prediction model for the recurrence of breast cancer based on clinical nominal and numeric features. In this study, our data consist of 1,061 patients from the Breast Cancer Registry of Shin Kong Wu Ho-Su Memorial Hospital between 2011 and 2016, of which 37 records are labeled as breast cancer recurrence. Each record has 85 features. Our approach consists of three stages. First, we apply data preprocessing and feature selection techniques to consolidate the dataset; of all features, six are selected for further processing in the following stages. Next, we apply resampling techniques to address class imbalance. Finally, we construct two classifiers, AdaBoost and cost-sensitive learning, to predict the risk of recurrence, and evaluate their performance with three-fold cross-validation. With the AdaBoost method alone, we achieve an accuracy of 0.973 and a sensitivity of 0.675. By combining AdaBoost with the cost-sensitive method in our model, we achieve a reasonable accuracy of 0.468 and a substantially higher sensitivity of 0.947, which means almost no false dismissals (missed recurrences). Our model can be used as a supporting tool for scheduling and evaluating follow-up visits, enabling early intervention and more advanced treatment to lower cancer mortality.
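The trade-off between accuracy and sensitivity described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 9:1 misclassification cost ratio, the toy labels, and the toy risk scores are hypothetical, and a simple cost-derived decision threshold stands in for the paper's cost-sensitive AdaBoost model. Only the cohort size (1,061 patients, 37 recurrences) comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary confusion matrix (1 = recurrence, 0 = no recurrence)."""
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    @property
    def sensitivity(self) -> float:
        positives = self.tp + self.fn
        return self.tp / positives if positives else 0.0


def confusion(y_true, y_pred) -> Confusion:
    """Tally true/false positives and negatives from paired labels."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
    )


def cost_sensitive_threshold(cost_fp: float, cost_fn: float) -> float:
    """Probability threshold that minimises expected cost, assuming
    calibrated scores and zero cost for correct decisions."""
    return cost_fp / (cost_fp + cost_fn)


def predict(scores, threshold):
    """Flag a patient as high risk when the score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Cohort sizes from the abstract: 37 recurrences among 1,061 patients.
# A classifier that never predicts recurrence is already ~96.5% accurate
# but misses every recurrence, which is why accuracy alone is misleading.
cohort_true = [1] * 37 + [0] * 1024
baseline = confusion(cohort_true, [0] * len(cohort_true))

# Hypothetical toy data: 3 recurrences, 7 non-recurrences, made-up scores.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_scores = [0.70, 0.30, 0.15, 0.60, 0.40, 0.20, 0.12, 0.11, 0.05, 0.02]

default_cm = confusion(toy_true, predict(toy_scores, 0.5))

# Assumed costs: a missed recurrence is 9 times as costly as a false alarm.
threshold = cost_sensitive_threshold(cost_fp=1.0, cost_fn=9.0)
cost_cm = confusion(toy_true, predict(toy_scores, threshold))

print(f"baseline:       acc={baseline.accuracy:.3f} sens={baseline.sensitivity:.3f}")
print(f"threshold 0.5:  acc={default_cm.accuracy:.3f} sens={default_cm.sensitivity:.3f}")
print(f"cost-sensitive: acc={cost_cm.accuracy:.3f} sens={cost_cm.sensitivity:.3f}")
```

On the toy data, lowering the threshold catches every recurrence at the price of more false alarms, which mirrors the direction of the trade-off reported in the abstract (higher sensitivity, lower accuracy) without reproducing its numbers.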
format article
author Yang Pei-Tse
Wu Wen-Shuo
Wu Chia-Chun
Shih Yi-Nuo
Hsieh Chung-Ho
Hsu Jia-Lien
author_sort Yang Pei-Tse
title Breast cancer recurrence prediction with ensemble methods and cost-sensitive learning
publisher De Gruyter
publishDate 2021
url https://doaj.org/article/b3caf70b408442378144d6d813871a4e
_version_ 1718371623314128896