Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records.

<h4>Background</h4>Self-harm occurring within pregnancy and the postnatal year ("perinatal self-harm") is a clinically important yet under-researched topic. Current research likely under-estimates prevalence due to methodological limitations. Electronic healthcare records (EHRs...

Bibliographic Details
Main authors: Karyn Ayre, André Bittar, Joyce Kam, Somain Verma, Louise M Howard, Rina Dutta
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects:
R
Q
Online access: https://doaj.org/article/f45093e9cdef45e788c4e570f1b5cf32
id oai:doaj.org-article:f45093e9cdef45e788c4e570f1b5cf32
record_format dspace
issn 1932-6203
doi https://doi.org/10.1371/journal.pone.0253809
journal_toc https://doaj.org/toc/1932-6203
source PLoS ONE, Vol 16, Iss 8, p e0253809 (2021)
last_indexed 2021-12-02T20:18:48Z
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description <h4>Background</h4>Self-harm occurring within pregnancy and the postnatal year ("perinatal self-harm") is a clinically important yet under-researched topic. Current research likely underestimates prevalence due to methodological limitations. Electronic healthcare records (EHRs) provide a source of clinically rich data on perinatal self-harm.
<h4>Aims</h4>(1) To create a Natural Language Processing (NLP) tool that can, with acceptable precision and recall, identify mentions of acts of perinatal self-harm within EHRs. (2) To use this tool to identify service-users who have self-harmed perinatally, based on their EHRs.
<h4>Methods</h4>We used the Clinical Record Interactive Search system to extract de-identified EHRs of secondary mental healthcare service-users at South London and Maudsley NHS Foundation Trust. We developed a tool that applied several layers of linguistic processing based on the spaCy NLP library for Python. We evaluated mention-level performance in the following domains: span, status, temporality and polarity. Evaluation was done against a manually coded reference standard. Mention-level performance was reported as precision, recall, F-score and Cohen's kappa for each domain. We also assessed performance at 'service-user' level and explored whether a heuristic rule improved it. We report per-class statistics for service-user performance, as well as likelihood ratios and post-test probabilities.
<h4>Results</h4>Mention-level performance: micro-averaged F-score, precision and recall for span, polarity and temporality >0.8. Kappa for status 0.68, temporality 0.62, polarity 0.91. Service-user level performance with heuristic: F-score, precision and recall of minority class 0.69, macro-averaged F-score 0.81, positive LR 9.4 (4.8-19), post-test probability 69.0% (53-82%). Considering the task difficulty, the tool performs well, although temporality was the attribute with the lowest level of annotator agreement.
<h4>Conclusions</h4>It is feasible to develop an NLP tool that identifies, with acceptable validity, mentions of perinatal self-harm within EHRs, although with limitations regarding temporality. Using a heuristic rule, it can also function at service-user level.
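The evaluation measures named in the abstract (precision, recall, F-score, Cohen's kappa, and the post-test probability derived from a likelihood ratio) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the toy labels and the pre-test probability of 0.2 are all hypothetical and chosen only to show the arithmetic. Only the positive likelihood ratio of 9.4 is taken from the abstract.

```python
from collections import Counter


def precision_recall_f1(gold, pred, positive):
    """Precision, recall and F-score for one class (hypothetical helper)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:  # both sequences use a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Toy labels (hypothetical): 1 = self-harm mention present, 0 = absent.
gold = [1, 1, 0, 0]
pred = [1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(gold, pred, positive=1)
kappa = cohen_kappa(gold, pred)

# Assumed pre-test probability of 0.2 (NOT from the abstract), combined with
# the positive likelihood ratio of 9.4 that the abstract reports.
post = post_test_probability(0.2, 9.4)

print(f"precision={precision:.2f} recall={recall:.2f} F={f1:.2f}")
print(f"kappa={kappa:.2f} post-test probability={post:.3f}")
```

Working in odds rather than probabilities is what lets the likelihood ratio act as a simple multiplier; the conversion back to a probability happens only at the end.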
format article
author Karyn Ayre
André Bittar
Joyce Kam
Somain Verma
Louise M Howard
Rina Dutta
title Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/f45093e9cdef45e788c4e570f1b5cf32