Text mining of gene–phenotype associations reveals new phenotypic profiles of autism-associated genes

Abstract Autism is a spectrum disorder with wide variation in type and severity of symptoms. Understanding gene–phenotype associations is vital to unravel the disease mechanisms and advance its diagnosis and treatment. To date, several databases have stored a large portion of gene–phenotype associat...

Bibliographic Details
Main Authors: Sijie Li, Ziqi Guo, Jacob B. Ioffe, Yunfei Hu, Yi Zhen, Xin Zhou
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online Access: https://doaj.org/article/cf105b54b2c14a6e844db2f624cc83a9
id oai:doaj.org-article:cf105b54b2c14a6e844db2f624cc83a9
record_format dspace
spelling oai:doaj.org-article:cf105b54b2c14a6e844db2f624cc83a9
2021-12-02T16:06:43Z
Text mining of gene–phenotype associations reveals new phenotypic profiles of autism-associated genes
10.1038/s41598-021-94742-z
2045-2322
https://doaj.org/article/cf105b54b2c14a6e844db2f624cc83a9
2021-07-01T00:00:00Z
https://doi.org/10.1038/s41598-021-94742-z
https://doaj.org/toc/2045-2322
Abstract Autism is a spectrum disorder with wide variation in type and severity of symptoms. Understanding gene–phenotype associations is vital to unravel the disease mechanisms and advance its diagnosis and treatment. To date, several databases have stored a large portion of gene–phenotype associations which are mainly obtained from genetic experiments. However, a large proportion of gene–phenotype associations are still buried in the autism-related literature and there are limited resources to investigate autism-associated gene–phenotype associations. Given the abundance of the autism-related literature, we were thus motivated to develop Autism_genepheno, a text mining pipeline to identify sentence-level mentions of autism-associated genes and phenotypes in literature through natural language processing methods. We have generated a comprehensive database of gene–phenotype associations in the last five years’ autism-related literature that can be easily updated as new literature becomes available. We have evaluated our pipeline through several different approaches, and we are able to rank and select top autism-associated genes through their unique and wide spectrum of phenotypic profiles, which could provide a unique resource for the diagnosis and treatment of autism. The data resources and the Autism_genepheno pipeline are available at: https://github.com/maiziezhoulab/Autism_genepheno.
Sijie Li
Ziqi Guo
Jacob B. Ioffe
Yunfei Hu
Yi Zhen
Xin Zhou
Nature Portfolio
article
Medicine
R
Science
Q
EN
Scientific Reports, Vol 11, Iss 1, Pp 1-12 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract Autism is a spectrum disorder with wide variation in type and severity of symptoms. Understanding gene–phenotype associations is vital to unravel the disease mechanisms and advance its diagnosis and treatment. To date, several databases have stored a large portion of gene–phenotype associations which are mainly obtained from genetic experiments. However, a large proportion of gene–phenotype associations are still buried in the autism-related literature and there are limited resources to investigate autism-associated gene–phenotype associations. Given the abundance of the autism-related literature, we were thus motivated to develop Autism_genepheno, a text mining pipeline to identify sentence-level mentions of autism-associated genes and phenotypes in literature through natural language processing methods. We have generated a comprehensive database of gene–phenotype associations in the last five years’ autism-related literature that can be easily updated as new literature becomes available. We have evaluated our pipeline through several different approaches, and we are able to rank and select top autism-associated genes through their unique and wide spectrum of phenotypic profiles, which could provide a unique resource for the diagnosis and treatment of autism. The data resources and the Autism_genepheno pipeline are available at: https://github.com/maiziezhoulab/Autism_genepheno.
format article
author Sijie Li
Ziqi Guo
Jacob B. Ioffe
Yunfei Hu
Yi Zhen
Xin Zhou
author_sort Sijie Li
title Text mining of gene–phenotype associations reveals new phenotypic profiles of autism-associated genes
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/cf105b54b2c14a6e844db2f624cc83a9