Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records.
<h4>Background</h4>Self-harm occurring within pregnancy and the postnatal year ("perinatal self-harm") is a clinically important yet under-researched topic. Current research likely under-estimates prevalence due to methodological limitations. Electronic healthcare records (EHRs...
Main authors: | Karyn Ayre, André Bittar, Joyce Kam, Somain Verma, Louise M Howard, Rina Dutta |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2021 |
Subjects: | Medicine, Science |
Online access: | https://doaj.org/article/f45093e9cdef45e788c4e570f1b5cf32 |
id |
oai:doaj.org-article:f45093e9cdef45e788c4e570f1b5cf32 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:f45093e9cdef45e788c4e570f1b5cf32 2021-12-02T20:18:48Z Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records. 1932-6203 10.1371/journal.pone.0253809 https://doaj.org/article/f45093e9cdef45e788c4e570f1b5cf32 2021-01-01T00:00:00Z https://doi.org/10.1371/journal.pone.0253809 https://doaj.org/toc/1932-6203 Karyn Ayre André Bittar Joyce Kam Somain Verma Louise M Howard Rina Dutta Public Library of Science (PLoS) article Medicine R Science Q EN PLoS ONE, Vol 16, Iss 8, p e0253809 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine R Science Q |
spellingShingle |
Medicine R Science Q Karyn Ayre André Bittar Joyce Kam Somain Verma Louise M Howard Rina Dutta Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records. |
description |
<h4>Background</h4>Self-harm occurring within pregnancy and the postnatal year ("perinatal self-harm") is a clinically important yet under-researched topic. Current research likely underestimates prevalence due to methodological limitations. Electronic healthcare records (EHRs) provide a source of clinically rich data on perinatal self-harm.<h4>Aims</h4>(1) To create a Natural Language Processing (NLP) tool that can, with acceptable precision and recall, identify mentions of acts of perinatal self-harm within EHRs. (2) To use this tool to identify service-users who have self-harmed perinatally, based on their EHRs.<h4>Methods</h4>We used the Clinical Record Interactive Search system to extract de-identified EHRs of secondary mental healthcare service-users at South London and Maudsley NHS Foundation Trust. We developed a tool that applied several layers of linguistic processing based on the spaCy NLP library for Python. We evaluated mention-level performance in the following domains: span, status, temporality and polarity. Evaluation was done against a manually coded reference standard. Mention-level performance was reported as precision, recall, F-score and Cohen's kappa for each domain. We also assessed performance at 'service-user' level and explored whether a heuristic rule improved it. We report per-class statistics for service-user performance, as well as likelihood ratios and post-test probabilities.<h4>Results</h4>Mention-level performance: micro-averaged F-score, precision and recall for span, polarity and temporality >0.8. Kappa for status 0.68, temporality 0.62, polarity 0.91. Service-user level performance with heuristic: F-score, precision, recall of minority class 0.69, macro-averaged F-score 0.81, positive LR 9.4 (4.8-19), post-test probability 69.0% (53-82%).
Considering the task difficulty, the tool performs well, although temporality was the attribute with the lowest level of annotator agreement.<h4>Conclusions</h4>It is feasible to develop an NLP tool that identifies, with acceptable validity, mentions of perinatal self-harm within EHRs, although with limitations regarding temporality. Using a heuristic rule, it can also function at service-user level. |
format |
article |
author |
Karyn Ayre André Bittar Joyce Kam Somain Verma Louise M Howard Rina Dutta |
author_facet |
Karyn Ayre André Bittar Joyce Kam Somain Verma Louise M Howard Rina Dutta |
author_sort |
Karyn Ayre |
title |
Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records. |
title_short |
Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records. |
title_full |
Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records. |
title_fullStr |
Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records. |
title_full_unstemmed |
Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records. |
title_sort |
developing a natural language processing tool to identify perinatal self-harm in electronic healthcare records. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2021 |
url |
https://doaj.org/article/f45093e9cdef45e788c4e570f1b5cf32 |
work_keys_str_mv |
AT karynayre developinganaturallanguageprocessingtooltoidentifyperinatalselfharminelectronichealthcarerecords AT andrebittar developinganaturallanguageprocessingtooltoidentifyperinatalselfharminelectronichealthcarerecords AT joycekam developinganaturallanguageprocessingtooltoidentifyperinatalselfharminelectronichealthcarerecords AT somainverma developinganaturallanguageprocessingtooltoidentifyperinatalselfharminelectronichealthcarerecords AT louisemhoward developinganaturallanguageprocessingtooltoidentifyperinatalselfharminelectronichealthcarerecords AT rinadutta developinganaturallanguageprocessingtooltoidentifyperinatalselfharminelectronichealthcarerecords |
_version_ |
1718374261641445376 |
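
The evaluation measures named in the abstract above (precision, recall, F-score, Cohen's kappa, positive likelihood ratio and post-test probability) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the `Confusion` class, the function names and the counts in the demo are hypothetical, chosen only to show how the measures relate to a 2x2 table of tool output against a reference standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 counts for a binary tool judged against a reference standard.

    Hypothetical helper for illustration; not taken from the article.
    """
    tp: int  # tool positive, reference positive
    fp: int  # tool positive, reference negative
    fn: int  # tool negative, reference positive
    tn: int  # tool negative, reference negative

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def precision(c: Confusion) -> float:
    return c.tp / (c.tp + c.fp)


def recall(c: Confusion) -> float:
    # Recall is the same quantity as sensitivity.
    return c.tp / (c.tp + c.fn)


def f_score(c: Confusion) -> float:
    # Harmonic mean of precision and recall (F1).
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r)


def cohen_kappa(c: Confusion) -> float:
    # Observed agreement corrected for the agreement expected by chance,
    # where chance agreement comes from the row and column marginals.
    n = c.total
    observed = (c.tp + c.tn) / n
    expected = (
        (c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.fp + c.tn)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


def positive_likelihood_ratio(c: Confusion) -> float:
    # LR+ = sensitivity / (1 - specificity).
    specificity = c.tn / (c.tn + c.fp)
    return recall(c) / (1 - specificity)


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    # Convert probability to odds, scale by the likelihood ratio, convert back.
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


if __name__ == "__main__":
    # Hypothetical counts, NOT the study's data.
    counts = Confusion(tp=40, fp=10, fn=10, tn=140)
    prevalence = (counts.tp + counts.fn) / counts.total
    lr_pos = positive_likelihood_ratio(counts)
    print("precision:", precision(counts))
    print("recall:", recall(counts))
    print("F-score:", f_score(counts))
    print("kappa:", cohen_kappa(counts))
    print("LR+:", lr_pos)
    print("post-test probability:", post_test_probability(prevalence, lr_pos))
```

When the pre-test probability is set to the prevalence in the evaluated sample, the post-test probability for a positive result coincides with precision (positive predictive value). The sketch does not guard against empty rows or columns, which would cause division by zero.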