Evaluation and cross-comparison of lexical entities of biological interest (LexEBI).

Motivation: Biomedical entities, their identifiers and names, are essential in the representation of biomedical facts and knowledge. In the same way, the complete set of biomedical and chemical terms, i.e. the biomedical "term space" (the "Lexeome"), forms a ke...

Bibliographic Details
Main authors: Dietrich Rebholz-Schuhmann, Jee-Hyub Kim, Ying Yan, Abhishek Dixit, Caroline Friteyre, Robert Hoehndorf, Rolf Backofen, Ian Lewin
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2013
Subjects:
R
Q
Online access: https://doaj.org/article/8aad4c2a66b24b7e8c99b607729749ed
id oai:doaj.org-article:8aad4c2a66b24b7e8c99b607729749ed
record_format dspace
spelling oai:doaj.org-article:8aad4c2a66b24b7e8c99b607729749ed
2021-11-18T08:52:32Z
Evaluation and cross-comparison of lexical entities of biological interest (LexEBI).
1932-6203
10.1371/journal.pone.0075185
https://doaj.org/article/8aad4c2a66b24b7e8c99b607729749ed
2013-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24124474/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
PLoS ONE, Vol 8, Iss 10, p e75185 (2013)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Motivation: Biomedical entities, their identifiers and names, are essential in the representation of biomedical facts and knowledge. In the same way, the complete set of biomedical and chemical terms, i.e. the biomedical "term space" (the "Lexeome"), forms a key resource to achieve the full integration of the scientific literature with biomedical data resources: any identified named entity can immediately be normalized to the correct database entry. This goal not only requires that we are aware of all existing terms, but would also profit from knowing all their senses and their semantic interpretation (ambiguities, nestedness).
Result: This study compiles a resource for lexical terms of biomedical interest in a standard format (called "LexEBI"), and determines the overall number of terms, their reuse in different resources and the nestedness of terms. LexEBI comprises references for protein and gene entries and their term variants, and chemical entities, amongst other terms. In addition, disease terms have been identified from Medline and PubmedCentral and added to LexEBI. Our analysis demonstrates that the baseforms of terms from the different semantic types show little polysemous use. Nonetheless, the term variants of protein and gene names (PGNs) frequently contain species mentions, which should have been avoided according to protein annotation guidelines. Furthermore, both the protein and gene entities and the chemical entities comprise enzymes, leading to hierarchical polysemy, and a large portion of PGNs make reference to a chemical entity. Altogether, according to our analysis based on the Medline distribution, 401,869 unique PGNs in the documents contain a reference to 25,022 chemical entities, 3,125 disease terms or 1,576 species mentions.
Conclusion: LexEBI delivers the complete biomedical and chemical Lexeome in a standardized representation (http://www.ebi.ac.uk/Rebholz-srv/LexEBI/). The resource provides the disease terms as open source content, and fully interlinks terms across resources.
format article
author Dietrich Rebholz-Schuhmann
Jee-Hyub Kim
Ying Yan
Abhishek Dixit
Caroline Friteyre
Robert Hoehndorf
Rolf Backofen
Ian Lewin
author_sort Dietrich Rebholz-Schuhmann
title Evaluation and cross-comparison of lexical entities of biological interest (LexEBI).
publisher Public Library of Science (PLoS)
publishDate 2013
url https://doaj.org/article/8aad4c2a66b24b7e8c99b607729749ed