Integrative data mining highlights candidate genes for monogenic myopathies.

Inherited myopathies are a heterogeneous group of disabling disorders with still barely understood pathological mechanisms. Around 40% of afflicted patients remain without a molecular diagnosis after exclusion of known genes. The advent of high-throughput sequencing has opened avenues to the discove...

Bibliographic Details
Main authors: Osorio Abath Neto, Olivier Tassy, Valérie Biancalana, Edmar Zanoteli, Olivier Pourquié, Jocelyn Laporte
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects:
R
Q
Online access: https://doaj.org/article/fe83ca2ca40748258b75b18a02cad7b0
id oai:doaj.org-article:fe83ca2ca40748258b75b18a02cad7b0
record_format dspace
spelling oai:doaj.org-article:fe83ca2ca40748258b75b18a02cad7b0
2021-11-25T05:55:09Z
Integrative data mining highlights candidate genes for monogenic myopathies.
1932-6203
10.1371/journal.pone.0110888
https://doaj.org/article/fe83ca2ca40748258b75b18a02cad7b0
2014-01-01T00:00:00Z
https://doi.org/10.1371/journal.pone.0110888
https://doaj.org/toc/1932-6203
Osorio Abath Neto
Olivier Tassy
Valérie Biancalana
Edmar Zanoteli
Olivier Pourquié
Jocelyn Laporte
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 9, Iss 10, p e110888 (2014)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Inherited myopathies are a heterogeneous group of disabling disorders whose pathological mechanisms remain poorly understood. Around 40% of afflicted patients remain without a molecular diagnosis after exclusion of known genes. The advent of high-throughput sequencing has opened avenues for discovering new implicated genes, but a working list of prioritized candidate genes is necessary to deal with the complexity of analyzing large-scale sequencing data. Here we used an integrative data mining strategy to analyze the genetic network linked to myopathies, derive specific signatures for inherited myopathy and related disorders, and identify and rank candidate genes for these groups. Training sets of genes were selected after literature review and used in Manteia, a public web-based data mining system, to extract disease group signatures in the form of enriched descriptor terms, which include functional annotation, human and mouse phenotypes, as well as biological pathways and protein interactions. These specific signatures were then used as input to mine and rank candidate genes, followed by filtering against skeletal muscle expression and association with known diseases. Signatures and identified candidate genes highlight both potential common pathological mechanisms and allelic disease groups. Genes recently associated with disease, such as B3GALNT2, GMPPB and B3GNT1 in congenital muscular dystrophies, were prioritized in the ranked lists, suggesting a posteriori validation of our approach and predictions. We show an example of how the ranked lists can be used to help analyze high-throughput sequencing data to identify candidate genes, and highlight the best candidate genes matching genomic regions linked to myopathies without known causative genes. This strategy can be automated to generate fresh candidate gene lists, which helps cope with database annotation updates as new knowledge is incorporated.
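The workflow outlined in the description (training set, enriched descriptor terms, ranked and filtered candidates) can be illustrated with a minimal sketch. This is not Manteia's interface or the authors' scoring scheme: the hypergeometric test, the 0.05 cutoff, the count-based score, the reading of the filter step (keep genes expressed in skeletal muscle and not yet linked to a known disease), and every gene and term name below are assumptions made purely for illustration.

```python
from math import comb

def hypergeom_pvalue(k, K, n, N):
    """P(X >= k) when n genes are drawn from N and K of the N carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def extract_signature(annotations, training, alpha=0.05):
    """Return the terms over-represented in the training set, with p-values."""
    N, n = len(annotations), len(training)
    terms = set().union(*(annotations[g] for g in training))
    signature = {}
    for term in terms:
        K = sum(term in t for t in annotations.values())    # genes carrying the term
        k = sum(term in annotations[g] for g in training)   # training genes carrying it
        p = hypergeom_pvalue(k, K, n, N)
        if p < alpha:
            signature[term] = p
    return signature

def rank_candidates(annotations, training, signature, muscle_expressed, known_disease):
    """Rank non-training genes by the number of signature terms they carry."""
    scores = {}
    for gene, terms in annotations.items():
        # Assumed filter: expressed in skeletal muscle, no known disease link.
        if gene in training or gene not in muscle_expressed or gene in known_disease:
            continue
        score = len(terms & set(signature))
        if score:
            scores[gene] = score
    return sorted(scores, key=lambda g: (-scores[g], g))

# Toy data: hypothetical gene and term names, not real annotations.
annotations = {
    "T1": {"contraction", "z_disc"},
    "T2": {"contraction", "z_disc"},
    "T3": {"contraction", "z_disc", "housekeeping"},
    "C1": {"contraction", "z_disc"},
    "C2": {"contraction"},
    "C3": {"z_disc", "housekeeping"},
    "C4": {"contraction", "z_disc"},
}
annotations.update({f"B{i}": {"housekeeping"} for i in range(1, 14)})

training = {"T1", "T2", "T3"}
muscle_expressed = {"T1", "T2", "T3", "C1", "C2", "C4"}
known_disease = {"C4"}

signature = extract_signature(annotations, training)
ranked = rank_candidates(annotations, training, signature,
                         muscle_expressed, known_disease)
```

In this toy setup the widely shared "housekeeping" term is not enriched and stays out of the signature, while C3 and C4 are dropped by the expression and known-disease filters respectively. Rerunning the same functions after an annotation update is one way to picture the regenerated candidate lists mentioned in the description.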
format article
author Osorio Abath Neto
Olivier Tassy
Valérie Biancalana
Edmar Zanoteli
Olivier Pourquié
Jocelyn Laporte
author_sort Osorio Abath Neto
title Integrative data mining highlights candidate genes for monogenic myopathies.
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/fe83ca2ca40748258b75b18a02cad7b0