Learning, visualizing and exploring 16S rRNA structure using an attention-based deep neural network.

Recurrent neural networks with memory and attention mechanisms are widely used in natural language processing because they can capture short- and long-term sequential information for diverse tasks. We propose an integrated deep learning model for microbial DNA sequence data, which exploits convolutio...

Bibliographic Details
Main authors: Zhengqiao Zhao, Stephen Woloszynek, Felix Agbavor, Joshua Chang Mell, Bahrad A Sokhansanj, Gail L Rosen
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects: Biology (General)
Online access: https://doaj.org/article/f942b69fd3b3444fabf60265b9f20db6
id oai:doaj.org-article:f942b69fd3b3444fabf60265b9f20db6
record_format dspace
spelling oai:doaj.org-article:f942b69fd3b3444fabf60265b9f20db6
2021-12-02T19:57:45Z
Learning, visualizing and exploring 16S rRNA structure using an attention-based deep neural network.
1553-734X
1553-7358
10.1371/journal.pcbi.1009345
https://doaj.org/article/f942b69fd3b3444fabf60265b9f20db6
2021-09-01T00:00:00Z
https://doi.org/10.1371/journal.pcbi.1009345
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Zhengqiao Zhao
Stephen Woloszynek
Felix Agbavor
Joshua Chang Mell
Bahrad A Sokhansanj
Gail L Rosen
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 17, Iss 9, p e1009345 (2021)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Recurrent neural networks with memory and attention mechanisms are widely used in natural language processing because they can capture short- and long-term sequential information for diverse tasks. We propose an integrated deep learning model for microbial DNA sequence data, which exploits convolutional neural networks, recurrent neural networks, and attention mechanisms to predict taxonomic classifications and sample-associated attributes, such as the relationship between the microbiome and host phenotype, on the read/sequence level. In this paper, we develop this novel deep learning approach and evaluate its application to amplicon sequences. We apply our approach to short DNA reads and full sequences of 16S ribosomal RNA (rRNA) marker genes, which identify the heterogeneity of a microbial community sample. We demonstrate that our implementation of a novel attention-based deep network architecture, Read2Pheno, achieves read-level phenotypic prediction. Training Read2Pheno models will encode sequences (reads) into dense, meaningful representations: learned embedded vectors output from the intermediate layer of the network model, which can provide biological insight when visualized. The attention layer of Read2Pheno models can also automatically identify nucleotide regions in reads/sequences that are particularly informative for classification. As such, this novel approach can avoid the pre/post-processing and manual interpretation required with conventional approaches to microbiome sequence classification. We further show, as proof-of-concept, that aggregating read-level information can robustly predict microbial community properties, host phenotype, and taxonomic classification, with performance at least comparable to conventional approaches. An implementation of the attention-based deep learning network is available at https://github.com/EESI/sequence_attention (a Python package) and https://github.com/EESI/seq2att (a command-line tool).
format article
author Zhengqiao Zhao
Stephen Woloszynek
Felix Agbavor
Joshua Chang Mell
Bahrad A Sokhansanj
Gail L Rosen
title Learning, visualizing and exploring 16S rRNA structure using an attention-based deep neural network.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/f942b69fd3b3444fabf60265b9f20db6
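
The abstract in this record describes two ideas that a small example can make concrete: an attention layer that weights nucleotide positions within a read, and the aggregation of read-level predictions into a sample-level prediction. The following is a minimal sketch in plain Python, not the authors' Read2Pheno implementation. The names `attention_pool` and `aggregate_reads`, the dot-product scoring vector `w`, and the choice of averaging read probabilities are all assumptions made for illustration; the record does not specify how the published model scores positions or aggregates reads.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden, w):
    """Attention-weighted pooling over per-position vectors.

    hidden: list of per-position feature vectors (e.g. recurrent-layer outputs).
    w:      scoring vector (hypothetical stand-in for learned parameters).
    Returns (weights, embedding): one weight per position and the
    weighted sum of the position vectors.
    """
    # Score each position by its dot product with w (an assumed scoring rule).
    scores = [sum(h_d * w_d for h_d, w_d in zip(h, w)) for h in hidden]
    weights = softmax(scores)
    dim = len(hidden[0])
    embedding = [sum(a * h[d] for a, h in zip(weights, hidden)) for d in range(dim)]
    return weights, embedding


def aggregate_reads(read_probs):
    """Average per-read class probabilities into one sample-level call.

    Averaging is one simple aggregation choice, assumed here for illustration.
    Returns (predicted_class_index, mean_probabilities).
    """
    n = len(read_probs)
    k = len(read_probs[0])
    mean = [sum(p[c] for p in read_probs) / n for c in range(k)]
    return mean.index(max(mean)), mean


# Toy inputs: three positions with 2-dimensional features, three reads with two classes.
hidden = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]
weights, embedding = attention_pool(hidden, w=[1.0, 0.0])
label, mean = aggregate_reads([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]])
```

In this sketch the position with the largest score receives the largest attention weight, which is the sense in which an attention layer can flag informative regions of a read; the pooled `embedding` plays the role of the dense read representation mentioned in the abstract.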