Identification of natural selection in genomic data with deep convolutional neural network


Bibliographic Details
Main authors: Arnaud Nguembang Fadja, Fabrizio Riguzzi, Giorgio Bertorelle, Emiliano Trucchi
Format: article
Language: EN
Published: BMC 2021
Online access: https://doaj.org/article/5e540e5ba9c846429f8779d37d55e3f3
id oai:doaj.org-article:5e540e5ba9c846429f8779d37d55e3f3
record_format dspace
spelling oai:doaj.org-article:5e540e5ba9c846429f8779d37d55e3f3
2021-12-05T12:03:54Z
Identification of natural selection in genomic data with deep convolutional neural network
10.1186/s13040-021-00280-9
1756-0381
https://doaj.org/article/5e540e5ba9c846429f8779d37d55e3f3
2021-12-01T00:00:00Z
https://doi.org/10.1186/s13040-021-00280-9
https://doaj.org/toc/1756-0381
Arnaud Nguembang Fadja
Fabrizio Riguzzi
Giorgio Bertorelle
Emiliano Trucchi
BMC
article
Genomic data
Inference of natural selection
Deep Learning
Convolutional Neural Networks
Computer applications to medicine. Medical informatics
R858-859.7
Analysis
QA299.6-433
EN
BioData Mining, Vol 14, Iss 1, Pp 1-18 (2021)
institution DOAJ
collection DOAJ
language EN
topic Genomic data
Inference of natural selection
Deep Learning
Convolutional Neural Networks
Computer applications to medicine. Medical informatics
R858-859.7
Analysis
QA299.6-433
description Abstract. Background: As genomic datasets describing variability in populations grow in size, extracting relevant information becomes both more useful and more complex. Recently, computational methodologies such as Supervised Machine Learning, and specifically Convolutional Neural Networks, have been proposed to make inferences on demographic and adaptive processes using genomic data. Although it has already been shown to be powerful and efficient in different fields of investigation, Supervised Machine Learning has yet to be fully explored to unfold its enormous potential in evolutionary genomics. Results: The paper proposes a method based on Supervised Machine Learning for classifying genomic data, represented as windows of genomic sequences from a sample of individuals belonging to the same population. A Convolutional Neural Network is used to test whether a genomic window shows the signature of natural selection. Training performed on simulated data shows that the proposed model can accurately predict neutral and selection processes on portions of genomes taken from real populations with almost 90% accuracy.
format article
author Arnaud Nguembang Fadja
Fabrizio Riguzzi
Giorgio Bertorelle
Emiliano Trucchi
author_sort Arnaud Nguembang Fadja
title Identification of natural selection in genomic data with deep convolutional neural network
publisher BMC
publishDate 2021
url https://doaj.org/article/5e540e5ba9c846429f8779d37d55e3f3
_version_ 1718372255865503744
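
The classification step summarized in the description (a Convolutional Neural Network deciding whether a genomic window shows the signature of natural selection) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the window is encoded as a 0/1 matrix (rows as individuals, columns as variable sites), the network has a single convolutional layer with four 3x3 filters followed by global max pooling and a logistic output, and the weights are random placeholders rather than trained values. The names `conv2d_valid`, `selection_probability`, `N_INDIVIDUALS` and `N_SITES` are hypothetical.

```python
import math
import random


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Element-wise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_max_pool(feature_map):
    """Reduce a feature map to its single largest activation."""
    return max(max(row) for row in feature_map)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def selection_probability(window, kernels, weights, bias):
    """Forward pass: conv -> ReLU -> global max pool -> logistic output."""
    pooled = [global_max_pool(relu(conv2d_valid(window, k))) for k in kernels]
    logit = sum(w * p for w, p in zip(weights, pooled)) + bias
    return sigmoid(logit)


# Assumed toy input: 8 individuals x 20 sites, 0 = one allele, 1 = the other.
rng = random.Random(0)
N_INDIVIDUALS, N_SITES = 8, 20
window = [[rng.randint(0, 1) for _ in range(N_SITES)] for _ in range(N_INDIVIDUALS)]

# Placeholder (untrained) parameters: four 3x3 filters and one output unit.
kernels = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)] for _ in range(4)]
weights = [rng.uniform(-1, 1) for _ in range(4)]
bias = 0.0

p = selection_probability(window, kernels, weights, bias)
label = "selection" if p >= 0.5 else "neutral"
print(f"P(selection) = {p:.3f} -> {label}")
```

Because the weights are untrained, the printed probability carries no biological meaning; the sketch only shows the shape of the computation. In the setting the description outlines, the parameters would be fitted on simulated neutral and selection windows before the model is applied to windows from real populations.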