Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking.

The identification of phenotype descriptions in the scientific literature, case reports and patient records is a rewarding task for bio-medical text mining. Any progress will support knowledge discovery and linkage to other resources. However, because of their wide variation, a number of challenges st...

Bibliographic Details
Main authors:	Nigel Collier, Mai-vu Tran, Hoang-quynh Le, Quang-Thuy Ha, Anika Oellrich, Dietrich Rebholz-Schuhmann
Format:	article
Language:	EN
Published:	Public Library of Science (PLoS) 2013
Subjects:	Medicine R Science Q
Online access:	https://doaj.org/article/e2f76355001a45e794b7c989a9e7d346

id	oai:doaj.org-article:e2f76355001a45e794b7c989a9e7d346
record_format	dspace
spelling	oai:doaj.org-article:e2f76355001a45e794b7c989a9e7d346 2021-11-18T08:51:18Z Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking. 1932-6203 10.1371/journal.pone.0072965 https://doaj.org/article/e2f76355001a45e794b7c989a9e7d346 2013-01-01T00:00:00Z https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24155869/?tool=EBI https://doaj.org/toc/1932-6203 The identification of phenotype descriptions in the scientific literature, case reports and patient records is a rewarding task for bio-medical text mining. Any progress will support knowledge discovery and linkage to other resources. However, because of their wide variation, a number of challenges remain in terms of their identification and semantic normalisation before they can be fully exploited for research purposes. This paper presents novel techniques for identifying potential complex phenotype mentions by exploiting a hybrid model based on machine learning, rules and dictionary matching. A systematic study is made of how to combine sequence labels from these modules as well as the merits of various ontological resources. We evaluated our approach on a subset of Medline abstracts cited by the Online Mendelian Inheritance in Man database related to auto-immune diseases. Using partial matching, the best micro-averaged F-score for phenotypes and five other entity classes was 79.9%. A best performance of 75.3% was achieved for phenotype candidates using all semantic resources. We observed the advantage of using SVM-based learn-to-rank for sequence label combination over maximum entropy and a priority list approach. The results indicate that the identification of simple entity types such as chemicals and genes is robustly supported by single semantic resources, whereas phenotypes require combinations. Altogether, we conclude that our approach coped well with the compositional structure of phenotypes in the auto-immune domain. Nigel Collier Mai-vu Tran Hoang-quynh Le Quang-Thuy Ha Anika Oellrich Dietrich Rebholz-Schuhmann Public Library of Science (PLoS) article Medicine R Science Q EN PLoS ONE, Vol 8, Iss 10, p e72965 (2013)
institution	DOAJ
collection	DOAJ
language	EN
topic	Medicine R Science Q
spellingShingle	Medicine R Science Q Nigel Collier Mai-vu Tran Hoang-quynh Le Quang-Thuy Ha Anika Oellrich Dietrich Rebholz-Schuhmann Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking.
description	The identification of phenotype descriptions in the scientific literature, case reports and patient records is a rewarding task for bio-medical text mining. Any progress will support knowledge discovery and linkage to other resources. However, because of their wide variation, a number of challenges remain in terms of their identification and semantic normalisation before they can be fully exploited for research purposes. This paper presents novel techniques for identifying potential complex phenotype mentions by exploiting a hybrid model based on machine learning, rules and dictionary matching. A systematic study is made of how to combine sequence labels from these modules as well as the merits of various ontological resources. We evaluated our approach on a subset of Medline abstracts cited by the Online Mendelian Inheritance in Man database related to auto-immune diseases. Using partial matching, the best micro-averaged F-score for phenotypes and five other entity classes was 79.9%. A best performance of 75.3% was achieved for phenotype candidates using all semantic resources. We observed the advantage of using SVM-based learn-to-rank for sequence label combination over maximum entropy and a priority list approach. The results indicate that the identification of simple entity types such as chemicals and genes is robustly supported by single semantic resources, whereas phenotypes require combinations. Altogether, we conclude that our approach coped well with the compositional structure of phenotypes in the auto-immune domain.
format	article
author	Nigel Collier Mai-vu Tran Hoang-quynh Le Quang-Thuy Ha Anika Oellrich Dietrich Rebholz-Schuhmann
author_facet	Nigel Collier Mai-vu Tran Hoang-quynh Le Quang-Thuy Ha Anika Oellrich Dietrich Rebholz-Schuhmann
author_sort	Nigel Collier
title	Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking.
title_short	Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking.
title_full	Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking.
title_fullStr	Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking.
title_full_unstemmed	Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking.
title_sort	learning to recognize phenotype candidates in the auto-immune literature using svm re-ranking.
publisher	Public Library of Science (PLoS)
publishDate	2013
url	https://doaj.org/article/e2f76355001a45e794b7c989a9e7d346
work_keys_str_mv	AT nigelcollier learningtorecognizephenotypecandidatesintheautoimmuneliteratureusingsvmreranking AT maivutran learningtorecognizephenotypecandidatesintheautoimmuneliteratureusingsvmreranking AT hoangquynhle learningtorecognizephenotypecandidatesintheautoimmuneliteratureusingsvmreranking AT quangthuyha learningtorecognizephenotypecandidatesintheautoimmuneliteratureusingsvmreranking AT anikaoellrich learningtorecognizephenotypecandidatesintheautoimmuneliteratureusingsvmreranking AT dietrichrebholzschuhmann learningtorecognizephenotypecandidatesintheautoimmuneliteratureusingsvmreranking
_version_	1718421293073694720
