An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records
Abstract Electronic health records (EHR) provide an unprecedented opportunity to conduct large, cost-efficient, population-based studies. However, the studies of heterogeneous diseases, such as chronic obstructive pulmonary disease (COPD), often require labor-intensive clinical review and testing, l...
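Main authors: | Su H. Chu, Emily S. Wan, Michael H. Cho, Sergey Goryachev, Vivian Gainer, James Linneman, Erica J. Scotty, Scott J. Hebbring, Shawn Murphy, Jessica Lasky-Su, Scott T. Weiss, Jordan W. Smoller, Elizabeth Karlson |
---|---|
Format: | article |
Language: | EN |
Published: | Nature Portfolio, 2021 |
Subjects: | Medicine R Science Q |
Online access: | https://doaj.org/article/92a3cd4bff984a15bac3e86cb70bc9f3 |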
id |
oai:doaj.org-article:92a3cd4bff984a15bac3e86cb70bc9f3 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:92a3cd4bff984a15bac3e86cb70bc9f3
2021-12-02T18:37:10Z
An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records
10.1038/s41598-021-98719-w
2045-2322
https://doaj.org/article/92a3cd4bff984a15bac3e86cb70bc9f3
2021-10-01T00:00:00Z
https://doi.org/10.1038/s41598-021-98719-w
https://doaj.org/toc/2045-2322
Abstract Electronic health records (EHR) provide an unprecedented opportunity to conduct large, cost-efficient, population-based studies. However, the studies of heterogeneous diseases, such as chronic obstructive pulmonary disease (COPD), often require labor-intensive clinical review and testing, limiting widespread use of these important resources. To develop a generalizable and efficient method for accurate identification of large COPD cohorts in EHRs, a COPD datamart was developed from 3420 participants meeting inclusion criteria in the Mass General Brigham Biobank. Training and test sets were selected and labeled with gold-standard COPD classifications obtained from chart review by pulmonologists. Multiple classes of algorithms were built utilizing both structured (e.g. ICD codes) and unstructured (e.g. medical notes) data via elastic net regression. Models explicitly including and excluding spirometry features were compared. External validation of the final algorithm was conducted in an independent biobank with a different EHR system. The final COPD classification model demonstrated excellent positive predictive value (PPV; 91.7%), sensitivity (71.7%), and specificity (94.4%). This algorithm performed well not only within the MGBB, but also demonstrated similar or improved classification performance in an independent biobank (PPV 93.5%, sensitivity 61.4%, specificity 90%). Ancillary comparisons showed that the classification model built including a binary feature for FEV1/FVC produced substantially higher sensitivity than those excluding. This study fills a gap in COPD research involving population-based EHRs, providing an important resource for the rapid, automated classification of COPD cases that is both cost-efficient and requires minimal information from unstructured medical records.
Su H. Chu, Emily S. Wan, Michael H. Cho, Sergey Goryachev, Vivian Gainer, James Linneman, Erica J. Scotty, Scott J. Hebbring, Shawn Murphy, Jessica Lasky-Su, Scott T. Weiss, Jordan W. Smoller, Elizabeth Karlson
Nature Portfolio
article
Medicine R Science Q
EN
Scientific Reports, Vol 11, Iss 1, Pp 1-9 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine R Science Q |
spellingShingle |
Medicine R Science Q
Su H. Chu, Emily S. Wan, Michael H. Cho, Sergey Goryachev, Vivian Gainer, James Linneman, Erica J. Scotty, Scott J. Hebbring, Shawn Murphy, Jessica Lasky-Su, Scott T. Weiss, Jordan W. Smoller, Elizabeth Karlson
An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records |
description |
Abstract Electronic health records (EHR) provide an unprecedented opportunity to conduct large, cost-efficient, population-based studies. However, the studies of heterogeneous diseases, such as chronic obstructive pulmonary disease (COPD), often require labor-intensive clinical review and testing, limiting widespread use of these important resources. To develop a generalizable and efficient method for accurate identification of large COPD cohorts in EHRs, a COPD datamart was developed from 3420 participants meeting inclusion criteria in the Mass General Brigham Biobank. Training and test sets were selected and labeled with gold-standard COPD classifications obtained from chart review by pulmonologists. Multiple classes of algorithms were built utilizing both structured (e.g. ICD codes) and unstructured (e.g. medical notes) data via elastic net regression. Models explicitly including and excluding spirometry features were compared. External validation of the final algorithm was conducted in an independent biobank with a different EHR system. The final COPD classification model demonstrated excellent positive predictive value (PPV; 91.7%), sensitivity (71.7%), and specificity (94.4%). This algorithm performed well not only within the MGBB, but also demonstrated similar or improved classification performance in an independent biobank (PPV 93.5%, sensitivity 61.4%, specificity 90%). Ancillary comparisons showed that the classification model built including a binary feature for FEV1/FVC produced substantially higher sensitivity than those excluding. This study fills a gap in COPD research involving population-based EHRs, providing an important resource for the rapid, automated classification of COPD cases that is both cost-efficient and requires minimal information from unstructured medical records. |
format |
article |
author |
Su H. Chu, Emily S. Wan, Michael H. Cho, Sergey Goryachev, Vivian Gainer, James Linneman, Erica J. Scotty, Scott J. Hebbring, Shawn Murphy, Jessica Lasky-Su, Scott T. Weiss, Jordan W. Smoller, Elizabeth Karlson |
author_facet |
Su H. Chu, Emily S. Wan, Michael H. Cho, Sergey Goryachev, Vivian Gainer, James Linneman, Erica J. Scotty, Scott J. Hebbring, Shawn Murphy, Jessica Lasky-Su, Scott T. Weiss, Jordan W. Smoller, Elizabeth Karlson |
author_sort |
Su H. Chu |
title |
An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records |
title_short |
An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records |
title_full |
An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records |
title_fullStr |
An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records |
title_full_unstemmed |
An independently validated, portable algorithm for the rapid identification of COPD patients using electronic health records |
title_sort |
independently validated, portable algorithm for the rapid identification of copd patients using electronic health records |
publisher |
Nature Portfolio |
publishDate |
2021 |
url |
https://doaj.org/article/92a3cd4bff984a15bac3e86cb70bc9f3 |
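The abstract in this record reports the classifier's performance as positive predictive value (PPV), sensitivity, and specificity. As a reading aid, the following is a minimal Python sketch of how those three quantities are derived from a comparison of predicted labels against gold-standard chart-review labels. It is not the authors' code. The names `ConfusionCounts` and `tally` are hypothetical, and the example counts are illustrative only: they were chosen so that the ratios round to the percentages quoted in the abstract, since the actual test-set counts are not given in this record.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of predictions against gold-standard labels (hypothetical helper)."""
    tp: int  # predicted COPD, gold standard COPD
    fp: int  # predicted COPD, gold standard not COPD
    tn: int  # predicted not COPD, gold standard not COPD
    fn: int  # predicted not COPD, gold standard COPD


def tally(gold: Sequence[bool], predicted: Sequence[bool]) -> ConfusionCounts:
    """Count true/false positives and negatives from paired labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(g and p for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    tn = sum((not g) and (not p) for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def ppv(c: ConfusionCounts) -> float:
    """Positive predictive value: share of predicted cases that are true cases."""
    return c.tp / (c.tp + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true cases that the classifier identifies."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of true non-cases that the classifier correctly rejects."""
    return c.tn / (c.tn + c.fp)


# Illustrative counts only (an assumption, NOT taken from the study).
example = ConfusionCounts(tp=33, fp=3, tn=51, fn=13)
print(f"PPV {ppv(example):.1%}, "
      f"sensitivity {sensitivity(example):.1%}, "
      f"specificity {specificity(example):.1%}")
```

A high PPV paired with a lower sensitivity, as in the abstract, means that the patients the algorithm flags are very likely to be true COPD cases, while a share of true cases goes unflagged.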