Obtención automática de palabras clave en textos clínicos: una aplicación de procesamiento del lenguaje natural a datos masivos de sospecha diagnóstica en Chile

Background: Free text poses a challenge in health data analysis because its lack of structure makes the extraction and integration of information difficult, particularly in the case of massive data. Appropriate machine interpretation of electronic health records in Chile can unleash knowledge co...

Bibliographic Details
Main authors: Villena, Fabián; Dunstan, Jocelyn
Language: Spanish / Castilian
Published: Sociedad Médica de Santiago 2019
Subjects: Data Mining; Information Storage and Retrieval; Machine Learning; Medical Informatics; Natural Language Processing
Online access: http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0034-98872019001001229
id oai:scielo:S0034-98872019001001229
record_format dspace
spelling oai:scielo:S0034-98872019001001229
2020-01-16
Obtención automática de palabras clave en textos clínicos: una aplicación de procesamiento del lenguaje natural a datos masivos de sospecha diagnóstica en Chile
Villena,Fabián
Dunstan,Jocelyn
Data Mining
Information Storage and Retrieval
Machine Learning
Medical Informatics
Natural Language Processing
Background: Free text poses a challenge in health data analysis because its lack of structure makes the extraction and integration of information difficult, particularly in the case of massive data. Appropriate machine interpretation of electronic health records in Chile can unleash knowledge contained in large volumes of clinical texts, expanding clinical management and national research capabilities. Aim: To illustrate the use of a weighted frequency algorithm to find keywords. The search was carried out on the diagnostic suspicion field of the Chilean specialty consultation waiting list, for diseases not covered by the Chilean Explicit Health Guarantees plan. Material and Methods: The waiting lists for a first specialty consultation for the period 2008-2018 were obtained from 17 out of 29 Chilean health services, and a total of 2,592,925 diagnostic suspicions were identified. A natural language processing technique called Term Frequency–Inverse Document Frequency was used for the retrieval of diagnostic suspicion keywords. Results: For each specialty, the four keywords with the highest weighted frequency were determined. Word clouds showing words weighted by their importance were created to obtain a visual representation. These are available at cimt.uchile.cl/lechile/. Conclusions: The algorithm made it possible to summarize unstructured clinical free-text data, improving its usefulness and accessibility.
info:eu-repo/semantics/openAccess
Sociedad Médica de Santiago
Revista médica de Chile v.147 n.10 2019
2019-10-01
text/html
http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0034-98872019001001229
es
10.4067/s0034-98872019001001229
institution Scielo Chile
collection Scielo Chile
language Spanish / Castilian
topic Data Mining
Information Storage and Retrieval
Machine Learning
Medical Informatics
Natural Language Processing
description Background: Free text poses a challenge in health data analysis because its lack of structure makes the extraction and integration of information difficult, particularly in the case of massive data. Appropriate machine interpretation of electronic health records in Chile can unleash knowledge contained in large volumes of clinical texts, expanding clinical management and national research capabilities. Aim: To illustrate the use of a weighted frequency algorithm to find keywords. The search was carried out on the diagnostic suspicion field of the Chilean specialty consultation waiting list, for diseases not covered by the Chilean Explicit Health Guarantees plan. Material and Methods: The waiting lists for a first specialty consultation for the period 2008-2018 were obtained from 17 out of 29 Chilean health services, and a total of 2,592,925 diagnostic suspicions were identified. A natural language processing technique called Term Frequency–Inverse Document Frequency was used for the retrieval of diagnostic suspicion keywords. Results: For each specialty, the four keywords with the highest weighted frequency were determined. Word clouds showing words weighted by their importance were created to obtain a visual representation. These are available at cimt.uchile.cl/lechile/. Conclusions: The algorithm made it possible to summarize unstructured clinical free-text data, improving its usefulness and accessibility.
author Villena,Fabián
Dunstan,Jocelyn
author_facet Villena,Fabián
Dunstan,Jocelyn
author_sort Villena,Fabián
title Obtención automática de palabras clave en textos clínicos: una aplicación de procesamiento del lenguaje natural a datos masivos de sospecha diagnóstica en Chile
title_sort obtención automática de palabras clave en textos clínicos: una aplicación de procesamiento del lenguaje natural a datos masivos de sospecha diagnóstica en chile
publisher Sociedad Médica de Santiago
publishDate 2019
url http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0034-98872019001001229
work_keys_str_mv AT villenafabian obtencionautomaticadepalabrasclaveentextosclinicosunaaplicaciondeprocesamientodellenguajenaturaladatosmasivosdesospechadiagnosticaenchile
AT dunstanjocelyn obtencionautomaticadepalabrasclaveentextosclinicosunaaplicaciondeprocesamientodellenguajenaturaladatosmasivosdesospechadiagnosticaenchile
_version_ 1718437085554147328
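
The abstract in this record describes ranking diagnostic-suspicion keywords per specialty with Term Frequency–Inverse Document Frequency. The following is a minimal sketch of that general idea, not the authors' implementation. Several choices are assumptions that the record does not state: all texts of one specialty are pooled into a single document, tokens are split on whitespace after lowercasing, and the weight is relative term frequency times log(N / document frequency). The function name `top_keywords` and the toy texts are hypothetical and invented for illustration.

```python
import math
from collections import Counter


def top_keywords(docs_by_group, k=4):
    """Return the k highest TF-IDF terms for each group.

    docs_by_group maps a group name (here, a specialty) to a list of
    free-text strings. All texts of a group are pooled into one document
    (an assumption of this sketch).
    """
    # Term counts per group, using naive lowercase whitespace tokenization.
    counts = {
        group: Counter(word for text in texts for word in text.lower().split())
        for group, texts in docs_by_group.items()
    }
    n_groups = len(counts)
    # Document frequency: the number of groups in which each term appears.
    df = Counter(term for c in counts.values() for term in c)

    result = {}
    for group, c in counts.items():
        total = sum(c.values())
        scores = {
            term: (n / total) * math.log(n_groups / df[term])
            for term, n in c.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result[group] = ranked[:k]
    return result


# Invented toy data, not taken from the waiting-list records.
toy = {
    "dermatology": ["sospecha de psoriasis", "sospecha de dermatitis", "psoriasis"],
    "cardiology": ["sospecha de arritmia", "arritmia", "sospecha de soplo"],
}
print(top_keywords(toy, k=2))
```

In the toy input, "sospecha" and "de" occur in both groups, so their inverse document frequency is log(2/2) = 0 and they drop to the bottom of each ranking. This is the property that lets the weighting surface specialty-specific terms without a hand-built stop-word list.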