Extreme Multi-Label ICD Classification: Sensitivity to Hospital Service and Time

This work deals with clinical text mining for automatic classification of Electronic Health Records (EHRs) with respect to the International Classification of Diseases (ICD). ICD is the international standard for the identification of diseases and health conditions in EHRs and the foundation for rep...

Bibliographic Details
Main Authors:	Alberto Blanco, Alicia Perez, Arantza Casillas
Format:	article
Language:	EN
Published:	IEEE 2020
Subjects:	Extreme multi-label classification; electronic health records; international classification of diseases; classification across-time; classification across hospital-services; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
Online Access:	https://doaj.org/article/52be9b4904104e77823556ef20307cbe

id	oai:doaj.org-article:52be9b4904104e77823556ef20307cbe
record_format	dspace
spelling	oai:doaj.org-article:52be9b4904104e77823556ef20307cbe; 2021-11-19T00:05:26Z; Extreme Multi-Label ICD Classification: Sensitivity to Hospital Service and Time; 2169-3536; 10.1109/ACCESS.2020.3029429; https://doaj.org/article/52be9b4904104e77823556ef20307cbe; 2020-01-01T00:00:00Z; https://ieeexplore.ieee.org/document/9215979/; https://doaj.org/toc/2169-3536; This work deals with clinical text mining for automatic classification of Electronic Health Records (EHRs) with respect to the International Classification of Diseases (ICD). ICD is the international standard for the identification of diseases and health conditions in EHRs and the foundation for reporting health statistics. Machine learning-based techniques have proven robust to infer classification models from EHRs. Since each EHR tends to involve multiple diseases, multi-label classification is required. The concern in this work is the versatility of the models inferred and their ability to generalise in two ways: as time goes ahead and across hospital services or health specialties. Indeed, in this work, we show the capabilities of a Bidirectional Recurrent Neural Network (RNN) with GRU units and ELMo embeddings on two corpora (a corpus comprising a set of EHRs within the Basque Health System, namely Osakidetza, and the well-known MIMIC-III corpus). To delve into and assess the versatility of the models, we focus on their resilience across hospital admissions taken over two different years and also across six distinct hospital services. In addition, we paid attention to the classification performance to estimate ICD codes of different granularity (e.g. with or without essential modifiers). Our best results are 39.55% and 47.28% F-Score for the Osakidetza and MIMIC-III datasets respectively, with the original main label-sets. Regarding the models evaluated per specialty, the most remarkable results are 57.00% and 72.74% F-Score, in the Cardiology and Nephrology medical services respectively.; Alberto Blanco; Alicia Perez; Arantza Casillas; IEEE; article; Extreme multi-label classification; electronic health records; international classification of diseases; classification across-time; classification across hospital-services; Electrical engineering. Electronics. Nuclear engineering; TK1-9971; EN; IEEE Access, Vol 8, Pp 183534-183545 (2020)
institution	DOAJ
collection	DOAJ
language	EN
topic	Extreme multi-label classification; electronic health records; international classification of diseases; classification across-time; classification across hospital-services; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
spellingShingle	Extreme multi-label classification; electronic health records; international classification of diseases; classification across-time; classification across hospital-services; Electrical engineering. Electronics. Nuclear engineering; TK1-9971; Alberto Blanco; Alicia Perez; Arantza Casillas; Extreme Multi-Label ICD Classification: Sensitivity to Hospital Service and Time
description	This work addresses clinical text mining for the automatic classification of Electronic Health Records (EHRs) according to the International Classification of Diseases (ICD). ICD is the international standard for identifying diseases and health conditions in EHRs and the foundation for reporting health statistics. Machine learning-based techniques have proven robust for inferring classification models from EHRs. Since each EHR tends to involve multiple diseases, multi-label classification is required. The concern in this work is the versatility of the inferred models and their ability to generalise in two ways: over time and across hospital services or health specialties. We show the capabilities of a bidirectional Recurrent Neural Network (RNN) with GRU units and ELMo embeddings on two corpora: a set of EHRs from the Basque Health System (Osakidetza) and the well-known MIMIC-III corpus. To assess the versatility of the models, we focus on their resilience across hospital admissions from two different years and across six distinct hospital services. In addition, we examine classification performance when estimating ICD codes of different granularity (e.g. with or without essential modifiers). Our best results are 39.55% and 47.28% F-Score for the Osakidetza and MIMIC-III datasets respectively, with the original main label-sets. For the models evaluated per specialty, the most remarkable results are 57.00% and 72.74% F-Score, in the Cardiology and Nephrology medical services respectively.
format	article
author	Alberto Blanco; Alicia Perez; Arantza Casillas
author_facet	Alberto Blanco; Alicia Perez; Arantza Casillas
author_sort	Alberto Blanco
title	Extreme Multi-Label ICD Classification: Sensitivity to Hospital Service and Time
title_short	Extreme Multi-Label ICD Classification: Sensitivity to Hospital Service and Time
title_full	Extreme Multi-Label ICD Classification: Sensitivity to Hospital Service and Time
title_fullStr	Extreme Multi-Label ICD Classification: Sensitivity to Hospital Service and Time
title_full_unstemmed	Extreme Multi-Label ICD Classification: Sensitivity to Hospital Service and Time
title_sort	extreme multi-label icd classification: sensitivity to hospital service and time
publisher	IEEE
publishDate	2020
url	https://doaj.org/article/52be9b4904104e77823556ef20307cbe
work_keys_str_mv	AT albertoblanco extrememultilabelicdclassificationsensitivitytohospitalserviceandtime AT aliciaperez extrememultilabelicdclassificationsensitivitytohospitalserviceandtime AT arantzacasillas extrememultilabelicdclassificationsensitivitytohospitalserviceandtime
_version_	1718420668017541120
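
The description reports results as F-Score over multi-label ICD predictions. As a rough illustration of how such a score can be computed, the sketch below implements a micro-averaged F-score over per-record label sets. The record does not say which averaging the authors used, so micro-averaging is an assumption here, and the function name `micro_f_score` and the example code strings are hypothetical.

```python
from typing import List, Set


def micro_f_score(gold: List[Set[str]], predicted: List[Set[str]]) -> float:
    """Micro-averaged F-score for multi-label predictions.

    Each element of `gold` / `predicted` is the set of labels (e.g. ICD
    codes) assigned to one record. Counts are pooled over all records
    before computing the score. Assumption: micro-averaging; the source
    record does not specify the averaging scheme.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # labels correctly predicted
        fp += len(p - g)   # labels predicted but not in the gold set
        fn += len(g - p)   # gold labels that were missed

    denom = 2 * tp + fp + fn
    # With no gold labels and no predictions, return 0.0 by convention.
    return 2 * tp / denom if denom else 0.0


# Illustrative label sets only; these are not taken from either corpus.
gold = [{"I10", "E11.9"}, {"N18.3"}]
pred = [{"I10"}, {"N18.3", "I10"}]
score = micro_f_score(gold, pred)
print(f"micro F-score: {score:.4f}")
```

Pooling the counts across records means frequent codes dominate the score; a macro-averaged variant would instead compute one F-score per code and average them, which weights rare codes more heavily.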
