Critical evaluation of deep neural networks for wrist fracture detection

Abstract Wrist Fracture is the most common type of fracture with a high incidence rate. Conventional radiography (i.e. X-ray imaging) is used for wrist fracture detection routinely, but occasionally fracture delineation poses issues and an additional confirmation by computed tomography (CT) is neede...

Bibliographic Details
Main authors: Abu Mohammed Raisuddin, Elias Vaattovaara, Mika Nevalainen, Marko Nikki, Elina Järvenpää, Kaisa Makkonen, Pekka Pinola, Tuula Palsio, Arttu Niemensivu, Osmo Tervonen, Aleksei Tiulpin
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online access: https://doaj.org/article/e86336d998a043168ba26f739ea5a6ac
id oai:doaj.org-article:e86336d998a043168ba26f739ea5a6ac
record_format dspace
spelling oai:doaj.org-article:e86336d998a043168ba26f739ea5a6ac
2021-12-02T11:39:20Z
Critical evaluation of deep neural networks for wrist fracture detection
10.1038/s41598-021-85570-2
2045-2322
https://doaj.org/article/e86336d998a043168ba26f739ea5a6ac
2021-03-01T00:00:00Z
https://doi.org/10.1038/s41598-021-85570-2
https://doaj.org/toc/2045-2322
Nature Portfolio
article
Medicine
R
Science
Q
EN
Scientific Reports, Vol 11, Iss 1, Pp 1-11 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
spellingShingle Medicine
R
Science
Q
Abu Mohammed Raisuddin
Elias Vaattovaara
Mika Nevalainen
Marko Nikki
Elina Järvenpää
Kaisa Makkonen
Pekka Pinola
Tuula Palsio
Arttu Niemensivu
Osmo Tervonen
Aleksei Tiulpin
Critical evaluation of deep neural networks for wrist fracture detection
description Wrist fracture is the most common type of fracture and has a high incidence rate. Conventional radiography (i.e. X-ray imaging) is routinely used for wrist fracture detection, but fracture delineation is occasionally difficult, and additional confirmation by computed tomography (CT) is needed for diagnosis. Recent advances in Deep Learning (DL), a subfield of Artificial Intelligence (AI), have shown that wrist fracture detection can be automated using Convolutional Neural Networks. However, previous studies did not pay close attention to the difficult cases that can only be confirmed via CT imaging. In this study, we developed and analyzed DeepWrist, a state-of-the-art DL-based pipeline for wrist (distal radius) fracture detection, and evaluated it on one general population test set and one challenging test set comprising only cases requiring confirmation by CT. Our results reveal that a typical state-of-the-art approach such as DeepWrist, while achieving near-perfect performance on the general independent test set, performs substantially worse on the challenging test set: average precision of 0.99 (0.99–0.99) versus 0.64 (0.46–0.83), respectively. Similarly, the area under the ROC curve was 0.99 (0.98–0.99) versus 0.84 (0.72–0.93), respectively. Our findings highlight the importance of meticulous analysis of DL-based models before clinical use and reveal the need for more challenging settings for testing medical AI systems.
format article
author Abu Mohammed Raisuddin
Elias Vaattovaara
Mika Nevalainen
Marko Nikki
Elina Järvenpää
Kaisa Makkonen
Pekka Pinola
Tuula Palsio
Arttu Niemensivu
Osmo Tervonen
Aleksei Tiulpin
author_facet Abu Mohammed Raisuddin
Elias Vaattovaara
Mika Nevalainen
Marko Nikki
Elina Järvenpää
Kaisa Makkonen
Pekka Pinola
Tuula Palsio
Arttu Niemensivu
Osmo Tervonen
Aleksei Tiulpin
author_sort Abu Mohammed Raisuddin
title Critical evaluation of deep neural networks for wrist fracture detection
title_short Critical evaluation of deep neural networks for wrist fracture detection
title_full Critical evaluation of deep neural networks for wrist fracture detection
title_fullStr Critical evaluation of deep neural networks for wrist fracture detection
title_full_unstemmed Critical evaluation of deep neural networks for wrist fracture detection
title_sort critical evaluation of deep neural networks for wrist fracture detection
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/e86336d998a043168ba26f739ea5a6ac
work_keys_str_mv AT abumohammedraisuddin criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT eliasvaattovaara criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT mikanevalainen criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT markonikki criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT elinajarvenpaa criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT kaisamakkonen criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT pekkapinola criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT tuulapalsio criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT arttuniemensivu criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT osmotervonen criticalevaluationofdeepneuralnetworksforwristfracturedetection
AT alekseitiulpin criticalevaluationofdeepneuralnetworksforwristfracturedetection
_version_ 1718395734663888896
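
The description field reports average precision and area under the ROC curve, each with an interval in parentheses. As a rough illustration of how such figures can be computed from binary labels and model scores, the following is a minimal sketch in plain Python. It assumes a percentile bootstrap for the intervals, which this record does not confirm as the authors' method; the function names (`roc_auc`, `average_precision`, `bootstrap_ci`) and the toy data are hypothetical and are not taken from DeepWrist.

```python
import random


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of precision@k over the ranks k where a positive case appears.

    Simplification: tied scores are ranked in input order, not grouped.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive case is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


def bootstrap_ci(metric, labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a metric (assumed method, see lead-in)."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data for illustration only (1 = fracture, 0 = no fracture).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(toy_labels, toy_scores)
ap = average_precision(toy_labels, toy_scores)
auc_ci = bootstrap_ci(roc_auc, toy_labels, toy_scores)
```

Running the same functions separately on a general test set and on a harder subset would yield the kind of side-by-side comparison the description summarizes; with few cases, as in a small challenging set, the bootstrap interval widens accordingly.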