Generation and evaluation of artificial mental health records for Natural Language Processing

Abstract A serious obstacle to the development of Natural Language Processing (NLP) methods in the clinical domain is the accessibility of textual data. The mental health domain is particularly challenging, partly because clinical documentation relies heavily on free text that is difficult to de-ide...

Bibliographic Details
Main authors:	Julia Ive, Natalia Viani, Joyce Kam, Lucia Yin, Somain Verma, Stephen Puntis, Rudolf N. Cardinal, Angus Roberts, Robert Stewart, Sumithra Velupillai
Format:	article
Language:	EN
Published:	Nature Portfolio 2020
Subjects:	Computer applications to medicine. Medical informatics R858-859.7
Online access:	https://doaj.org/article/70bae84263554a11a15139b29889876f

id	oai:doaj.org-article:70bae84263554a11a15139b29889876f
record_format	dspace
datestamp	2021-12-02T15:42:59Z
doi	https://doi.org/10.1038/s41746-020-0267-x
issn	2398-6352
journal_toc	https://doaj.org/toc/2398-6352
date	2020-05-01T00:00:00Z
source	npj Digital Medicine, Vol 3, Iss 1, Pp 1-9 (2020)
institution	DOAJ
collection	DOAJ
language	EN
topic	Computer applications to medicine. Medical informatics R858-859.7
description	Abstract A serious obstacle to the development of Natural Language Processing (NLP) methods in the clinical domain is the accessibility of textual data. The mental health domain is particularly challenging, partly because clinical documentation relies heavily on free text that is difficult to de-identify completely. This problem could be tackled by using artificial medical data. In this work, we present an approach to generate artificial clinical documents. We apply this approach to discharge summaries from a large mental healthcare provider and discharge summaries from an intensive care unit. We perform an extensive intrinsic evaluation where we (1) apply several measures of text preservation; (2) measure how much the model memorises training data; and (3) estimate clinical validity of the generated text based on a human evaluation task. Furthermore, we perform an extrinsic evaluation by studying the impact of using artificial text in a downstream NLP text classification task. We found that using this artificial data as training data can lead to classification results that are comparable to the original results. Additionally, using only a small amount of information from the original data to condition the generation of the artificial data is successful, which holds promise for reducing the risk of these artificial data retaining rare information from the original data. This is an important finding for our long-term goal of being able to generate artificial clinical data that can be released to the wider research community and accelerate advances in developing computational methods that use healthcare data.
format	article
author	Julia Ive, Natalia Viani, Joyce Kam, Lucia Yin, Somain Verma, Stephen Puntis, Rudolf N. Cardinal, Angus Roberts, Robert Stewart, Sumithra Velupillai
title	Generation and evaluation of artificial mental health records for Natural Language Processing
publisher	Nature Portfolio
publishDate	2020
url	https://doaj.org/article/70bae84263554a11a15139b29889876f
