MitoScape: A big-data, machine-learning platform for obtaining mitochondrial DNA from next-generation sequencing data.
The growing volume of next-generation sequencing (NGS) data presents a unique opportunity to study the combined impact of mitochondrial and nuclear-encoded genetic variation in complex disease. Mitochondrial DNA variants, and in particular heteroplasmic variants, are critical for determining human d...
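Saved in:
Main authors: | Larry N Singh, Brian Ennis, Bryn Loneragan, Noah L Tsao, M Isabel G Lopez Sanchez, Jianping Li, Patrick Acheampong, Oanh Tran, Ian A Trounce, Yuankun Zhu, Prasanth Potluri, Regeneron Genetics Center, Beverly S Emanuel, Daniel J Rader, Zoltan Arany, Scott M Damrauer, Adam C Resnick, Stewart A Anderson, Douglas C Wallace |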
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2021 |
Subjects: | Biology (General); QH301-705.5 |
Online access: | https://doaj.org/article/117d8a4df8de4858b795a6db63fbde98 |
id |
oai:doaj.org-article:117d8a4df8de4858b795a6db63fbde98 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:117d8a4df8de4858b795a6db63fbde98; 2021-12-02T19:58:12Z; MitoScape: A big-data, machine-learning platform for obtaining mitochondrial DNA from next-generation sequencing data.; 1553-734X; 1553-7358; 10.1371/journal.pcbi.1009594; https://doaj.org/article/117d8a4df8de4858b795a6db63fbde98; 2021-11-01T00:00:00Z; https://doi.org/10.1371/journal.pcbi.1009594; https://doaj.org/toc/1553-734X; https://doaj.org/toc/1553-7358; Public Library of Science (PLoS); article; Biology (General); QH301-705.5; EN; PLoS Computational Biology, Vol 17, Iss 11, p e1009594 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Biology (General) QH301-705.5 |
description |
The growing volume of next-generation sequencing (NGS) data presents a unique opportunity to study the combined impact of mitochondrial and nuclear-encoded genetic variation in complex disease. Mitochondrial DNA variants, and in particular heteroplasmic variants, are critical for determining human disease severity. While there are approaches for obtaining mitochondrial DNA variants from NGS data, these tools do not account for the unique characteristics of mitochondrial genetics and can be inaccurate even for homoplasmic variants. We introduce MitoScape, a novel big-data software platform for extracting mitochondrial DNA sequences from NGS data. MitoScape departs from other algorithms by using machine learning to model the unique characteristics of mitochondrial genetics. We also employ a novel approach of using rho-zero (mitochondrial DNA-depleted) data to model nuclear-encoded mitochondrial sequences. We show, using gold-standard mitochondrial DNA data, that MitoScape produces accurate heteroplasmy estimates. We provide a comprehensive comparison of the most common tools for obtaining mtDNA variants from NGS data and show that MitoScape outperformed the compared tools in every statistical category we examined, including false positives and false negatives. By applying MitoScape to common disease examples, we illustrate how MitoScape facilitates important heteroplasmy-disease association discoveries by expanding upon a reported association between hypertrophic cardiomyopathy and mitochondrial haplogroup T in men (adjusted p-value = 0.003). The improved accuracy of mitochondrial DNA variants produced by MitoScape will be instrumental in diagnosing disease in the context of personalized medicine and clinical diagnostics. |
format |
article |
author |
Larry N Singh; Brian Ennis; Bryn Loneragan; Noah L Tsao; M Isabel G Lopez Sanchez; Jianping Li; Patrick Acheampong; Oanh Tran; Ian A Trounce; Yuankun Zhu; Prasanth Potluri; Regeneron Genetics Center; Beverly S Emanuel; Daniel J Rader; Zoltan Arany; Scott M Damrauer; Adam C Resnick; Stewart A Anderson; Douglas C Wallace |
author_sort |
Larry N Singh |
title |
MitoScape: A big-data, machine-learning platform for obtaining mitochondrial DNA from next-generation sequencing data. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2021 |
url |
https://doaj.org/article/117d8a4df8de4858b795a6db63fbde98 |
_version_ |
1718375764456374272 |