Developing a Natural Language Processing tool to identify perinatal self-harm in electronic healthcare records.

Bibliographic Details
Main authors: Karyn Ayre, André Bittar, Joyce Kam, Somain Verma, Louise M Howard, Rina Dutta
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects:
R
Q
Online access: https://doaj.org/article/f45093e9cdef45e788c4e570f1b5cf32
Description
Summary:<h4>Background</h4>Self-harm occurring within pregnancy and the postnatal year ("perinatal self-harm") is a clinically important yet under-researched topic. Current research likely under-estimates prevalence due to methodological limitations. Electronic healthcare records (EHRs) provide a source of clinically rich data on perinatal self-harm.<h4>Aims</h4>(1) To create a Natural Language Processing (NLP) tool that can, with acceptable precision and recall, identify mentions of acts of perinatal self-harm within EHRs. (2) To use this tool to identify service-users who have self-harmed perinatally, based on their EHRs.<h4>Methods</h4>We used the Clinical Record Interactive Search system to extract de-identified EHRs of secondary mental healthcare service-users at South London and Maudsley NHS Foundation Trust. We developed a tool that applied several layers of linguistic processing based on the spaCy NLP library for Python. We evaluated mention-level performance in the following domains: span, status, temporality and polarity. Evaluation was done against a manually coded reference standard. Mention-level performance was reported as precision, recall, F-score and Cohen's kappa for each domain. We also assessed performance at 'service-user' level and explored whether a heuristic rule improved it. We report per-class statistics for service-user performance, as well as likelihood ratios and post-test probabilities.<h4>Results</h4>Mention-level performance: micro-averaged F-score, precision and recall for span, polarity and temporality >0.8. Kappa for status 0.68, temporality 0.62, polarity 0.91. Service-user level performance with heuristic: F-score, precision, recall of minority class 0.69, macro-averaged F-score 0.81, positive likelihood ratio (LR) 9.4 (4.8-19), post-test probability 69.0% (53-82%). Considering the task difficulty, the tool performs well, although temporality was the attribute with the lowest level of annotator agreement.<h4>Conclusions</h4>It is feasible to develop an NLP tool that identifies, with acceptable validity, mentions of perinatal self-harm within EHRs, although with limitations regarding temporality. Using a heuristic rule, it can also function at service-user level.
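
As a rough illustration of the evaluation measures named in the abstract, the Python sketch below computes precision, recall and F-score for one class, Cohen's kappa between two label sequences, a positive likelihood ratio, and a post-test probability. It is not the authors' evaluation code. The function names, the toy labels and the 20% pre-test probability are all assumptions made for illustration; the abstract does not state the pre-test probability used in the study.

```python
from collections import Counter


def precision_recall_f1(gold, pred, positive):
    """Precision, recall and F-score for one class (here called `positive`)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equally long label sequences."""
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:  # both sequences use a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)


def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    if specificity == 1.0:
        return float("inf")
    return sensitivity / (1 - specificity)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Toy labels (hypothetical, not data from the study).
gold = ["yes", "yes", "no", "no", "no", "yes", "no", "no"]
pred = ["yes", "no", "no", "no", "yes", "yes", "no", "no"]

precision, recall, f1 = precision_recall_f1(gold, pred, positive="yes")
kappa = cohen_kappa(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} F={f1:.2f} kappa={kappa:.2f}")

# LR 9.4 is the value reported in the abstract; 0.20 is an assumed pre-test
# probability chosen only to show the odds conversion.
print(f"post-test probability={post_test_probability(0.20, 9.4):.2f}")
```

Working in odds rather than probabilities keeps the likelihood-ratio step a single multiplication, which is why the last function converts to odds and back.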