Optimized Breast Cancer Classification using Feature Selection and Outliers Detection

Breast cancer is the second most commonly diagnosed cancer in women throughout the world. It is on the rise, especially in developing countries, where the majority of cases are discovered late. Breast cancer develops when cancerous tumors form on the surface of the breast cells. The absence of accu...

Bibliographic Details
Main authors: A. B Yusuf, R. M Dima, S. K Aina
Format: article
Language: EN
Published: Nigerian Society of Physical Sciences 2021
Subjects: Physics
Online access: https://doaj.org/article/6ef25c2f3a4240c4ae37c931d13b9c1f
id oai:doaj.org-article:6ef25c2f3a4240c4ae37c931d13b9c1f
record_format dspace
spelling oai:doaj.org-article:6ef25c2f3a4240c4ae37c931d13b9c1f
2021-11-30T12:19:06Z
Optimized Breast Cancer Classification using Feature Selection and Outliers Detection
10.46481/jnsps.2021.331
2714-2817
2714-4704
https://doaj.org/article/6ef25c2f3a4240c4ae37c931d13b9c1f
2021-11-01T00:00:00Z
https://journal.nsps.org.ng/index.php/jnsps/article/view/331
https://doaj.org/toc/2714-2817
https://doaj.org/toc/2714-4704
A. B Yusuf
R. M Dima
S. K Aina
Nigerian Society of Physical Sciences
article
Physics
QC1-999
EN
Journal of Nigerian Society of Physical Sciences, Vol 3, Iss 4 (2021)
institution DOAJ
collection DOAJ
language EN
topic Physics
QC1-999
description Breast cancer is the second most commonly diagnosed cancer in women worldwide. Its incidence is rising, especially in developing countries, where the majority of cases are discovered late. Breast cancer develops when cancerous tumors form on the surface of the breast cells. The absence of accurate prognostic models to help physicians recognize symptoms early makes it difficult to develop a treatment plan that would help patients live longer. Machine learning techniques have recently been used to improve the accuracy and speed of breast cancer diagnosis, and the more accurate a model is, the more useful it is for diagnosis. The primary difficulty for machine-learning systems that detect breast cancer is attaining the highest classification accuracy and selecting the most predictive features for increasing that accuracy. As a result, breast cancer prognosis remains a challenge. This research seeks to address a flaw in an existing technique that is unable to improve the classification of continuous-valued data, particularly its accuracy and its selection of optimal features for breast cancer prediction. To address these issues, this study examines the impact of outlier removal and feature reduction on the Wisconsin Diagnostic Breast Cancer Dataset, using seven different machine learning algorithms. The results show that Logistic Regression, Random Forest, and AdaBoost classifiers achieved the highest accuracy, 99.12%, after outliers were removed from the dataset. When feature selection was also applied to this filtered dataset, the Random Forest and Gradient Boosting classifiers achieved the highest accuracies, 100% and 99.12%, respectively. Compared with other state-of-the-art approaches, both proposed strategies outperformed the unfiltered data in terms of accuracy. The proposed architecture might be a useful tool for radiologists to reduce the number of false negatives and false positives, which would increase the efficiency of breast cancer diagnosis.
format article
author A. B Yusuf
R. M Dima
S. K Aina
title Optimized Breast Cancer Classification using Feature Selection and Outliers Detection
publisher Nigerian Society of Physical Sciences
publishDate 2021
url https://doaj.org/article/6ef25c2f3a4240c4ae37c931d13b9c1f
_version_ 1718406627295494144
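
The two preprocessing steps named in the description, outlier removal followed by feature selection, can be illustrated with a minimal sketch. The record does not say which outlier rule or which selection criterion the authors used, so everything here is an assumption: Tukey's 1.5 x IQR fences stand in for outlier detection, ranking by absolute Pearson correlation with the label stands in for feature selection, the function names are hypothetical, and the data is a small made-up table rather than the Wisconsin Diagnostic Breast Cancer Dataset.

```python
from statistics import mean, quantiles


def iqr_fences(values, k=1.5):
    """Return the (low, high) Tukey fences for one feature column."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def rows_without_outliers(rows, k=1.5):
    """Return indices of rows whose every feature lies inside its column's fences."""
    fences = [iqr_fences(column, k) for column in zip(*rows)]
    return [
        i for i, row in enumerate(rows)
        if all(low <= value <= high for value, (low, high) in zip(row, fences))
    ]


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov) / (vx * vy) ** 0.5


def top_k_features(rows, labels, k):
    """Return the indices of the k features most correlated with the label."""
    scores = [abs_correlation(column, labels) for column in zip(*rows)]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data (hypothetical, NOT the WDBC dataset): 3 features, binary labels.
rows = [
    [1.0, 5.0, 0.3],
    [1.2, 5.4, 0.9],
    [0.9, 4.6, 0.1],
    [1.1, 5.8, 0.8],
    [3.0, 5.2, 0.2],
    [3.2, 5.6, 0.7],
    [2.9, 4.4, 0.4],
    [3.1, 5.0, 0.6],
    [2.0, 50.0, 0.5],  # extreme value in the second feature
]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]

# Step 1: drop rows flagged as outliers, keeping labels aligned with rows.
kept = rows_without_outliers(rows)
clean_rows = [rows[i] for i in kept]
clean_labels = [labels[i] for i in kept]

# Step 2: select the most label-correlated feature on the filtered data.
selected = top_k_features(clean_rows, clean_labels, k=1)

print("rows kept:", kept)
print("selected feature indices:", selected)
```

Filtering before ranking matters in this sketch because a single extreme row can distort a feature's correlation with the label. The filtered rows and the selected columns would then be passed to whichever classifier is being evaluated.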