A tree based approach for multi-class classification of surgical procedures using structured and unstructured data

Abstract Background In surgical departments, CPT code assignment has been a complicated manual effort that requires significant domain knowledge and experience. While several studies use CPTs to make predictions in surgical services, literature on predicting CPTs in surgical and ot...

Bibliographic Details
Main authors: Tannaz Khaleghi, Alper Murat, Suzan Arslanturk
Format: article
Language: EN
Published: BMC 2021
Subjects:
Online access: https://doaj.org/article/4b2c8060fcd7477b9fd8596b535186ce
id oai:doaj.org-article:4b2c8060fcd7477b9fd8596b535186ce
record_format dspace
spelling oai:doaj.org-article:4b2c8060fcd7477b9fd8596b535186ce
2021-11-28T12:26:11Z
A tree based approach for multi-class classification of surgical procedures using structured and unstructured data
10.1186/s12911-021-01665-w
1472-6947
https://doaj.org/article/4b2c8060fcd7477b9fd8596b535186ce
2021-11-01T00:00:00Z
https://doi.org/10.1186/s12911-021-01665-w
https://doaj.org/toc/1472-6947
Tannaz Khaleghi
Alper Murat
Suzan Arslanturk
BMC
article
Current procedure terminology (CPT) code
Machine learning
Ensemble learning
Importance weight
Random Forest
Multi-class classification
Computer applications to medicine. Medical informatics
R858-859.7
EN
BMC Medical Informatics and Decision Making, Vol 21, Iss 1, Pp 1-12 (2021)
institution DOAJ
collection DOAJ
language EN
topic Current procedure terminology (CPT) code
Machine learning
Ensemble learning
Importance weight
Random Forest
Multi-class classification
Computer applications to medicine. Medical informatics
R858-859.7
description Abstract
Background: In surgical departments, CPT code assignment has been a complicated manual effort that requires significant domain knowledge and experience. While several studies use CPTs to make predictions in surgical services, literature on predicting CPTs in surgical and other services using text features is very sparse. This study improves the prediction of CPTs by means of informative features and a novel re-prioritization algorithm.
Methods: The input data used in this study comprise both structured and unstructured data. The ground truth labels (CPTs) are obtained from medical coding databases using relative value units, which indicate the major operational procedures in each surgery case. In the modeling process, we first use a Random Forest multi-class classification model to predict the CPT codes. Second, we extract key information such as label probabilities, feature importance measures, and medical term frequency. These factors are then used in a novel algorithm that rearranges the alternative CPT codes in the list of potential candidates based on the calculated weights.
Results: To evaluate the performance of both phases, prediction and complementary improvement, we report the accuracy scores of multi-class CPT prediction tasks for datasets of 5 key surgery case specialities. The Random Forest model achieves 74–76% accuracy when predicting the primary CPT (accuracy@1) versus the CPT set (accuracy@2) with respect to two filtering conditions on CPT codes. The complementary algorithm improves the results of the initial step by 8% on average. Furthermore, the incorporated text features enhanced the quality of the output by 20–35%. The model outperforms the state-of-the-art neural network model with respect to accuracy, precision and recall.
Conclusions: We have established a robust framework based on a decision tree predictive model. We predict the surgical codes more accurately and robustly than the state-of-the-art deep neural structures, which can help immensely in both surgery billing and scheduling in such units.
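The two-phase idea in the abstract (rank candidate codes by classifier probability, then re-prioritize them using additional weights, and score the result with accuracy@k) can be illustrated with a small sketch. This is not the authors' algorithm: the linear blend, the `alpha` weight, the `term_scores` input, and the placeholder codes "A", "B", "C" are all assumptions made for illustration, and accuracy@k is read here in the usual top-k sense, which may differ from the paper's exact definition of accuracy@2. The probabilities stand in for what a multi-class classifier such as a Random Forest would output.

```python
from typing import Dict, List, Sequence


def rank_candidates(scores: Dict[str, float]) -> List[str]:
    """Sort candidate codes by descending score; ties are broken by code name."""
    return sorted(scores, key=lambda code: (-scores[code], code))


def rerank(probs: Dict[str, float],
           term_scores: Dict[str, float],
           alpha: float = 0.5) -> List[str]:
    """Hypothetical re-prioritization: blend class probability with a
    text-derived score. The linear blend and alpha are assumptions,
    not the weighting scheme used in the paper."""
    combined = {
        code: (1.0 - alpha) * p + alpha * term_scores.get(code, 0.0)
        for code, p in probs.items()
    }
    return rank_candidates(combined)


def accuracy_at_k(rankings: Sequence[Sequence[str]],
                  truths: Sequence[str],
                  k: int) -> float:
    """Fraction of cases whose true code appears among the top-k candidates."""
    hits = sum(1 for ranked, truth in zip(rankings, truths) if truth in ranked[:k])
    return hits / len(truths)


# Toy inputs (made up): per-case class probabilities and term-based scores.
case_probs = [
    {"A": 0.50, "B": 0.30, "C": 0.20},
    {"A": 0.40, "B": 0.35, "C": 0.25},
    {"A": 0.60, "B": 0.30, "C": 0.10},
]
case_terms = [
    {"A": 0.1, "B": 0.9},
    {"C": 0.8},
    {"A": 0.5},
]
truths = ["B", "C", "A"]

baseline = [rank_candidates(p) for p in case_probs]
reranked = [rerank(p, t) for p, t in zip(case_probs, case_terms)]

baseline_at_1 = accuracy_at_k(baseline, truths, k=1)
baseline_at_2 = accuracy_at_k(baseline, truths, k=2)
reranked_at_1 = accuracy_at_k(reranked, truths, k=1)
```

In practice the probabilities would come from a fitted classifier, and the second score from term frequencies in the free-text fields; the toy numbers above only show how a re-ranking step can move the correct code up the candidate list.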
format article
author Tannaz Khaleghi
Alper Murat
Suzan Arslanturk
title A tree based approach for multi-class classification of surgical procedures using structured and unstructured data
publisher BMC
publishDate 2021
url https://doaj.org/article/4b2c8060fcd7477b9fd8596b535186ce