Assessing the difficulty of annotating medical data in crowdworking with help of experiments.

<h4>Background</h4>As healthcare-related data proliferate, there is a need to annotate them expertly for the purposes of personalized medicine. Crowdworking is an alternative to expensive expert labour. Annotation corresponds to diagnosis, so comparing unlabeled records to labeled ones seems more appropriate for crowdworkers without medical expertise. We modeled the comparison of a record to two other records as a triplet annotation task, and we conducted an experiment to investigate to what extent sensor-measured stress, task duration, annotator uncertainty, and inter-annotator agreement could predict annotation correctness.<h4>Materials and methods</h4>We conducted an annotation experiment on health data from a population-based study. The triplet annotation task was to decide whether an individual was more similar to a healthy individual or to one with a given disorder. We used hepatic steatosis as the example disorder and described the individuals with 10 pre-selected characteristics related to this disorder. We recorded task duration, electro-dermal activity as a stress indicator, and uncertainty as stated by the experiment participants (n = 29 non-experts and three experts) for 30 triplets. We built an Artificial Similarity-Based Annotator (ASBA) and compared its correctness and uncertainty to those of the experiment participants.<h4>Results</h4>We found no correlation between correctness and any of stated uncertainty, stress, or task duration. Annotator agreement was not predictive either. Notably, for some tasks, annotators agreed unanimously on an incorrect annotation. When controlling for triplet ID, we identified significant correlations, indicating that correctness, stress levels and annotation duration depend on the task itself. Average correctness among the experiment participants was slightly lower than that achieved by ASBA. Triplet annotation turned out to be similarly difficult for experts and non-experts.<h4>Conclusion</h4>Our lab experiment indicates that the task of triplet annotation must be prepared cautiously if delegated to crowdworkers. Neither certainty nor agreement among annotators should be assumed to imply correct annotation, because annotators may misjudge difficult tasks as easy and agree on incorrect annotations. Further research is needed to improve visualizations for complex tasks and to judiciously decide how much information to provide. Out-of-the-lab experiments in a crowdworker setting are needed to identify appropriate designs for a human annotation task and to assess under what circumstances non-human annotation should be preferred.


Bibliographic Details
Main Authors: Anne Rother, Uli Niemann, Tommy Hielscher, Henry Völzke, Till Ittermann, Myra Spiliopoulou
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects: R (Medicine), Q (Science)
Online Access: https://doaj.org/article/2d783f0e320e43568325ef391c1b3ac0
id oai:doaj.org-article:2d783f0e320e43568325ef391c1b3ac0
record_format dspace
spelling oai:doaj.org-article:2d783f0e320e43568325ef391c1b3ac0
issn 1932-6203
doi https://doi.org/10.1371/journal.pone.0254764
url https://doaj.org/article/2d783f0e320e43568325ef391c1b3ac0
published 2021-01-01
source PLoS ONE, Vol 16, Iss 7, p e0254764 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
Assessing the difficulty of annotating medical data in crowdworking with help of experiments.
description <h4>Background</h4>As healthcare-related data proliferate, there is a need to annotate them expertly for the purposes of personalized medicine. Crowdworking is an alternative to expensive expert labour. Annotation corresponds to diagnosis, so comparing unlabeled records to labeled ones seems more appropriate for crowdworkers without medical expertise. We modeled the comparison of a record to two other records as a triplet annotation task, and we conducted an experiment to investigate to what extent sensor-measured stress, task duration, annotator uncertainty, and inter-annotator agreement could predict annotation correctness.<h4>Materials and methods</h4>We conducted an annotation experiment on health data from a population-based study. The triplet annotation task was to decide whether an individual was more similar to a healthy individual or to one with a given disorder. We used hepatic steatosis as the example disorder and described the individuals with 10 pre-selected characteristics related to this disorder. We recorded task duration, electro-dermal activity as a stress indicator, and uncertainty as stated by the experiment participants (n = 29 non-experts and three experts) for 30 triplets. We built an Artificial Similarity-Based Annotator (ASBA) and compared its correctness and uncertainty to those of the experiment participants.<h4>Results</h4>We found no correlation between correctness and any of stated uncertainty, stress, or task duration. Annotator agreement was not predictive either. Notably, for some tasks, annotators agreed unanimously on an incorrect annotation. When controlling for triplet ID, we identified significant correlations, indicating that correctness, stress levels and annotation duration depend on the task itself. Average correctness among the experiment participants was slightly lower than that achieved by ASBA. Triplet annotation turned out to be similarly difficult for experts and non-experts.<h4>Conclusion</h4>Our lab experiment indicates that the task of triplet annotation must be prepared cautiously if delegated to crowdworkers. Neither certainty nor agreement among annotators should be assumed to imply correct annotation, because annotators may misjudge difficult tasks as easy and agree on incorrect annotations. Further research is needed to improve visualizations for complex tasks and to judiciously decide how much information to provide. Out-of-the-lab experiments in a crowdworker setting are needed to identify appropriate designs for a human annotation task and to assess under what circumstances non-human annotation should be preferred.
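The abstract describes an Artificial Similarity-Based Annotator (ASBA) that labels a record by comparing it to the healthy and disordered reference records of a triplet. The record here does not state the study's actual similarity measure or uncertainty definition, so the sketch below is only an illustrative assumption: Euclidean distance over the 10 standardized characteristics, with a distance-ratio uncertainty. The function name `asba_annotate` and the example feature vectors are hypothetical.

```python
import numpy as np

def asba_annotate(query, healthy_ref, disorder_ref):
    """Minimal sketch of a similarity-based triplet annotator.

    Labels `query` with whichever reference record it is closer to.
    Euclidean distance and the distance-ratio uncertainty are
    illustrative assumptions, not the study's implementation.
    """
    d_healthy = float(np.linalg.norm(query - healthy_ref))
    d_disorder = float(np.linalg.norm(query - disorder_ref))
    label = "healthy" if d_healthy < d_disorder else "disorder"
    # Uncertainty approaches 1 when both references are equally far
    # away and 0 when one reference is clearly closer.
    uncertainty = 1.0 - abs(d_healthy - d_disorder) / (d_healthy + d_disorder)
    return label, uncertainty

# Hypothetical 10-characteristic feature vectors (standardized values).
query = np.array([0.2, 0.1, 0.0, 0.3, -0.1, 0.0, 0.2, 0.1, 0.0, 0.1])
healthy = np.zeros(10)
disorder = np.ones(10)
label, u = asba_annotate(query, healthy, disorder)  # label == "healthy"
```

Such an annotator also yields a machine-side uncertainty score, which is what allows the study's comparison between ASBA's uncertainty and the uncertainty stated by human participants.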
format article
author Anne Rother
Uli Niemann
Tommy Hielscher
Henry Völzke
Till Ittermann
Myra Spiliopoulou
author_sort Anne Rother
title Assessing the difficulty of annotating medical data in crowdworking with help of experiments.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/2d783f0e320e43568325ef391c1b3ac0