Feasibility of predicting allele specific expression from DNA sequencing using machine learning
Abstract Allele specific expression (ASE) concerns divergent expression quantity of alternative alleles and is measured by RNA sequencing. Multiple studies show that ASE plays a role in hereditary diseases by modulating penetrance or phenotype severity. However, genome diagnostics is based on DNA se...
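Main authors: | Zhenhua Zhang, Freerk van Dijk, Niek de Klein, Mariëlle E van Gijn, Lude H Franke, Richard J Sinke, Morris A Swertz, K Joeri van der Velde |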
---|---|
Format: | article |
Language: | EN |
Published: | Nature Portfolio, 2021 |
Subjects: | Medicine R Science Q |
Online access: | https://doaj.org/article/046bc28e1cc24a58a64d6f222d461d04 |
id |
oai:doaj.org-article:046bc28e1cc24a58a64d6f222d461d04 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:046bc28e1cc24a58a64d6f222d461d04; 2021-12-02T15:45:31Z; Feasibility of predicting allele specific expression from DNA sequencing using machine learning; 10.1038/s41598-021-89904-y; 2045-2322; https://doaj.org/article/046bc28e1cc24a58a64d6f222d461d04; 2021-05-01T00:00:00Z; https://doi.org/10.1038/s41598-021-89904-y; https://doaj.org/toc/2045-2322; Abstract Allele specific expression (ASE) concerns divergent expression quantity of alternative alleles and is measured by RNA sequencing. Multiple studies show that ASE plays a role in hereditary diseases by modulating penetrance or phenotype severity. However, genome diagnostics is based on DNA sequencing and therefore neglects gene expression regulation such as ASE. To take advantage of ASE in absence of RNA sequencing, it must be predicted using only DNA variation. We have constructed ASE models from BIOS (n = 3432) and GTEx (n = 369) that predict ASE using DNA features. These models are highly reproducible and comprise many different feature types, highlighting the complex regulation that underlies ASE. We applied the BIOS-trained model to population variants in three genes in which ASE plays a clinically relevant role: BRCA2, RET and NF1. This resulted in predicted ASE effects for 27 variants, of which 10 were known pathogenic variants. We demonstrated that ASE can be predicted from DNA features using machine learning. Future efforts may improve sensitivity and translate these models into a new type of genome diagnostic tool that prioritizes candidate pathogenic variants or regulators thereof for follow-up validation by RNA sequencing. All used code and machine learning models are available at GitHub and Zenodo.; Zhenhua Zhang; Freerk van Dijk; Niek de Klein; Mariëlle E van Gijn; Lude H Franke; Richard J Sinke; Morris A Swertz; K Joeri van der Velde; Nature Portfolio; article; Medicine; R; Science; Q; EN; Scientific Reports, Vol 11, Iss 1, Pp 1-11 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine R Science Q |
spellingShingle |
Medicine R Science Q; Zhenhua Zhang; Freerk van Dijk; Niek de Klein; Mariëlle E van Gijn; Lude H Franke; Richard J Sinke; Morris A Swertz; K Joeri van der Velde; Feasibility of predicting allele specific expression from DNA sequencing using machine learning |
description |
Abstract Allele specific expression (ASE) concerns divergent expression quantity of alternative alleles and is measured by RNA sequencing. Multiple studies show that ASE plays a role in hereditary diseases by modulating penetrance or phenotype severity. However, genome diagnostics is based on DNA sequencing and therefore neglects gene expression regulation such as ASE. To take advantage of ASE in absence of RNA sequencing, it must be predicted using only DNA variation. We have constructed ASE models from BIOS (n = 3432) and GTEx (n = 369) that predict ASE using DNA features. These models are highly reproducible and comprise many different feature types, highlighting the complex regulation that underlies ASE. We applied the BIOS-trained model to population variants in three genes in which ASE plays a clinically relevant role: BRCA2, RET and NF1. This resulted in predicted ASE effects for 27 variants, of which 10 were known pathogenic variants. We demonstrated that ASE can be predicted from DNA features using machine learning. Future efforts may improve sensitivity and translate these models into a new type of genome diagnostic tool that prioritizes candidate pathogenic variants or regulators thereof for follow-up validation by RNA sequencing. All used code and machine learning models are available at GitHub and Zenodo. |
format |
article |
author |
Zhenhua Zhang; Freerk van Dijk; Niek de Klein; Mariëlle E van Gijn; Lude H Franke; Richard J Sinke; Morris A Swertz; K Joeri van der Velde |
author_facet |
Zhenhua Zhang; Freerk van Dijk; Niek de Klein; Mariëlle E van Gijn; Lude H Franke; Richard J Sinke; Morris A Swertz; K Joeri van der Velde |
author_sort |
Zhenhua Zhang |
title |
Feasibility of predicting allele specific expression from DNA sequencing using machine learning |
title_short |
Feasibility of predicting allele specific expression from DNA sequencing using machine learning |
title_full |
Feasibility of predicting allele specific expression from DNA sequencing using machine learning |
title_fullStr |
Feasibility of predicting allele specific expression from DNA sequencing using machine learning |
title_full_unstemmed |
Feasibility of predicting allele specific expression from DNA sequencing using machine learning |
title_sort |
feasibility of predicting allele specific expression from dna sequencing using machine learning |
publisher |
Nature Portfolio |
publishDate |
2021 |
url |
https://doaj.org/article/046bc28e1cc24a58a64d6f222d461d04 |
work_keys_str_mv |
AT zhenhuazhang feasibilityofpredictingallelespecificexpressionfromdnasequencingusingmachinelearning AT freerkvandijk feasibilityofpredictingallelespecificexpressionfromdnasequencingusingmachinelearning AT niekdeklein feasibilityofpredictingallelespecificexpressionfromdnasequencingusingmachinelearning AT marielleevangijn feasibilityofpredictingallelespecificexpressionfromdnasequencingusingmachinelearning AT ludehfranke feasibilityofpredictingallelespecificexpressionfromdnasequencingusingmachinelearning AT richardjsinke feasibilityofpredictingallelespecificexpressionfromdnasequencingusingmachinelearning AT morrisaswertz feasibilityofpredictingallelespecificexpressionfromdnasequencingusingmachinelearning AT kjoerivandervelde feasibilityofpredictingallelespecificexpressionfromdnasequencingusingmachinelearning |
_version_ |
1718385731204808704 |