A tree based approach for multi-class classification of surgical procedures using structured and unstructured data
Abstract. Background: In surgical departments, CPT code assignment has been a complicated manual effort that entails significant related knowledge and experience. While several studies use CPTs to make predictions in surgical services, literature on predicting CPTs in surgical and ot...
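Main authors: | Tannaz Khaleghi, Alper Murat, Suzan Arslanturk |
---|---|
Format: | article |
Language: | EN |
Published: | BMC, 2021 |
Subjects: | Current procedure terminology (CPT) code; Machine learning; Ensemble learning; Importance weight; Random Forest; Multi-class classification; Computer applications to medicine. Medical informatics; R858-859.7 |
Online access: | https://doaj.org/article/4b2c8060fcd7477b9fd8596b535186ce |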
id | oai:doaj.org-article:4b2c8060fcd7477b9fd8596b535186ce |
---|---|
record_format | dspace |
spelling | oai:doaj.org-article:4b2c8060fcd7477b9fd8596b535186ce; 2021-11-28T12:26:11Z; 10.1186/s12911-021-01665-w; 1472-6947; 2021-11-01T00:00:00Z; https://doi.org/10.1186/s12911-021-01665-w; https://doaj.org/toc/1472-6947; BMC Medical Informatics and Decision Making, Vol 21, Iss 1, Pp 1-12 (2021) |
institution | DOAJ |
collection | DOAJ |
language | EN |
topic | Current procedure terminology (CPT) code; Machine learning; Ensemble learning; Importance weight; Random Forest; Multi-class classification; Computer applications to medicine. Medical informatics; R858-859.7 |
description | Abstract. Background: In surgical departments, CPT code assignment has been a complicated manual effort that entails significant related knowledge and experience. While several studies use CPTs to make predictions in surgical services, literature on predicting CPTs in surgical and other services using text features is very sparse. This study improves the prediction of CPTs by means of informative features and a novel re-prioritization algorithm. Methods: The input data used in this study are composed of both structured and unstructured data. The ground truth labels (CPTs) are obtained from medical coding databases using relative value units, which indicate the major operational procedures in each surgery case. In the modeling process, we first use a Random Forest multi-class classification model to predict the CPT codes. Second, we extract key information such as label probabilities, feature importance measures, and medical term frequency. Then, these factors are used in a novel algorithm to rearrange the alternative CPT codes in the list of potential candidates based on the calculated weights. Results: To evaluate the performance of both phases, prediction and complementary improvement, we report the accuracy scores of multi-class CPT prediction tasks for datasets of 5 key surgery case specialities. The Random Forest model performs the classification task with 74–76% accuracy when predicting the primary CPT (accuracy@1) versus the CPT set (accuracy@2) with respect to two filtering conditions on CPT codes. The complementary algorithm improves the results from the initial step by 8% on average. Furthermore, the incorporated text features enhanced the quality of the output by 20–35%. The model outperforms the state-of-the-art neural network model with respect to accuracy, precision and recall. Conclusions: We have established a robust framework based on a decision tree predictive model. We predict the surgical codes more accurately and robustly than state-of-the-art deep neural structures, which can help immensely in both surgery billing and scheduling in such units. |
format | article |
author | Tannaz Khaleghi; Alper Murat; Suzan Arslanturk |
title | A tree based approach for multi-class classification of surgical procedures using structured and unstructured data |
publisher | BMC |
publishDate | 2021 |
url | https://doaj.org/article/4b2c8060fcd7477b9fd8596b535186ce |
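
The abstract's two evaluation measures, accuracy@1 and accuracy@2, and its idea of rearranging candidate CPT codes by calculated weights can be illustrated with a small sketch. This is not the authors' algorithm. The blending rule in `rerank`, the weight `alpha`, the placeholder labels `CPT_A` to `CPT_C`, and every number below are hypothetical assumptions chosen only for illustration. The probability dictionaries stand in for the per-label output of a classifier such as a Random Forest.

```python
from typing import Dict, List, Sequence


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k labels with the highest score, best first (ties broken by label)."""
    return sorted(scores, key=lambda label: (-scores[label], label))[:k]


def accuracy_at_k(score_rows: Sequence[Dict[str, float]],
                  true_labels: Sequence[str], k: int) -> float:
    """Fraction of cases whose true label appears among the top-k candidates."""
    if not true_labels:
        raise ValueError("need at least one labelled case")
    hits = sum(1 for scores, truth in zip(score_rows, true_labels)
               if truth in top_k(scores, k))
    return hits / len(true_labels)


def rerank(probs: Dict[str, float], term_hits: Dict[str, int],
           alpha: float = 0.7) -> Dict[str, float]:
    """Hypothetical re-prioritization (an assumption, not the published method):
    blend each label's probability with its share of matched medical terms."""
    total = sum(term_hits.get(label, 0) for label in probs)
    blended = {}
    for label, p in probs.items():
        share = term_hits.get(label, 0) / total if total else 0.0
        blended[label] = alpha * p + (1 - alpha) * share
    return blended


# Toy, made-up classifier probabilities for three surgery cases.
rows = [
    {"CPT_A": 0.50, "CPT_B": 0.30, "CPT_C": 0.20},
    {"CPT_A": 0.45, "CPT_B": 0.40, "CPT_C": 0.15},
    {"CPT_A": 0.20, "CPT_B": 0.50, "CPT_C": 0.30},
]
truths = ["CPT_A", "CPT_B", "CPT_A"]

# Toy counts of label-specific terms found in each case's free text.
term_hits = [{}, {"CPT_A": 1, "CPT_B": 3}, {}]

reranked = [rerank(p, t) for p, t in zip(rows, term_hits)]

print("accuracy@1 before re-ranking:", accuracy_at_k(rows, truths, 1))
print("accuracy@2 before re-ranking:", accuracy_at_k(rows, truths, 2))
print("accuracy@1 after re-ranking: ", accuracy_at_k(reranked, truths, 1))
```

In this toy setup the text evidence moves the true label of the second case to the top of its candidate list, which is the kind of effect a re-prioritization step aims for. The third case is left unchanged because it has no matched terms.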