Leveraging graph-based hierarchical medical entity embedding for healthcare applications

Abstract Automatic representation learning of key entities in electronic health record (EHR) data is a critical step for healthcare data mining that turns heterogeneous medical records into structured and actionable information. Here we propose ME2Vec, an algorithmic framework for learning continuou...

Bibliographic Details
Main authors: Tong Wu, Yunlong Wang, Yue Wang, Emily Zhao, Yilian Yuan
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online access: https://doaj.org/article/e0c414c7ba28497f8db9cf05f2a9f717
id oai:doaj.org-article:e0c414c7ba28497f8db9cf05f2a9f717
record_format dspace
spelling oai:doaj.org-article:e0c414c7ba28497f8db9cf05f2a9f717
Record timestamp: 2021-12-02T11:36:15Z
Title: Leveraging graph-based hierarchical medical entity embedding for healthcare applications
DOI: 10.1038/s41598-021-85255-w (https://doi.org/10.1038/s41598-021-85255-w)
ISSN: 2045-2322 (https://doaj.org/toc/2045-2322)
DOAJ record: https://doaj.org/article/e0c414c7ba28497f8db9cf05f2a9f717
Date: 2021-03-01
Authors: Tong Wu, Yunlong Wang, Yue Wang, Emily Zhao, Yilian Yuan
Publisher: Nature Portfolio
Type: article
Subjects: Medicine (R), Science (Q)
Language: EN
Source: Scientific Reports, Vol 11, Iss 1, Pp 1-13 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract: Automatic representation learning of key entities in electronic health record (EHR) data is a critical step for healthcare data mining that turns heterogeneous medical records into structured and actionable information. Here we propose ME2Vec, an algorithmic framework for learning continuous low-dimensional embedding vectors of the most common entities in EHR: medical services, doctors, and patients. ME2Vec features a hierarchical structure that encapsulates different node embedding schemes to cater for the unique characteristics of each medical entity. To embed medical services, we employ a biased-random-walk-based node embedding that leverages the irregular time intervals of medical services in EHR to embody their relative importance. To embed doctors and patients, we adhere to the principle “it’s what you do that defines you” and derive their embeddings based on their interactions with other types of entities through a graph neural network and proximity-preserving network embedding, respectively. Using real-world clinical data, we demonstrate the efficacy of ME2Vec over competitive baselines on diagnosis prediction, readmission prediction, as well as recommending doctors to patients based on their medical conditions. In addition, medical service embeddings pretrained using ME2Vec can substantially improve the performance of sequential models in predicting patients' clinical outcomes. Overall, ME2Vec can serve as a general-purpose representation learning algorithm for EHR data and benefit various downstream tasks in terms of both performance and interpretability.
format article
author Tong Wu
Yunlong Wang
Yue Wang
Emily Zhao
Yilian Yuan
author_sort Tong Wu
title Leveraging graph-based hierarchical medical entity embedding for healthcare applications
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/e0c414c7ba28497f8db9cf05f2a9f717