Artificial neural networks trained to detect viral and phage structural proteins.

Phages play critical roles in the survival and pathogenicity of their hosts, via lysogenic conversion factors, and in nutrient redistribution, via cell lysis. Analyses of phage- and viral-encoded genes in environmental samples provide insights into the physiological impact of viruses on microbial co...

Bibliographic Details
Main authors: Victor Seguritan, Nelson Alves, Michael Arnoult, Amy Raymond, Don Lorimer, Alex B Burgin, Peter Salamon, Anca M Segall
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2012
Subjects:
Online access: https://doaj.org/article/3fcd19c29c714c9c8afaeb6dc4345047
id oai:doaj.org-article:3fcd19c29c714c9c8afaeb6dc4345047
record_format dspace
spelling oai:doaj.org-article:3fcd19c29c714c9c8afaeb6dc4345047
2021-11-18T05:51:03Z
Artificial neural networks trained to detect viral and phage structural proteins.
1553-734X
1553-7358
10.1371/journal.pcbi.1002657
https://doaj.org/article/3fcd19c29c714c9c8afaeb6dc4345047
2012-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/22927809/?tool=EBI
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Victor Seguritan
Nelson Alves
Michael Arnoult
Amy Raymond
Don Lorimer
Alex B Burgin
Peter Salamon
Anca M Segall
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 8, Iss 8, p e1002657 (2012)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Phages play critical roles in the survival and pathogenicity of their hosts, via lysogenic conversion factors, and in nutrient redistribution, via cell lysis. Analyses of phage- and viral-encoded genes in environmental samples provide insights into the physiological impact of viruses on microbial communities and human health. However, phage ORFs are extremely diverse, and over 70% of them are dissimilar to any genes with annotated functions in GenBank. Better identification of viruses would also aid in the detection and diagnosis of disease, in vaccine development, and more generally in understanding the physiological potential of any environment. In contrast to enzymes, viral structural protein function can be much more challenging to detect from sequence data because of low sequence conservation, few known conserved catalytic sites or sequence domains, and relatively limited experimental data. We have designed a method of predicting phage structural protein sequences that uses Artificial Neural Networks (ANNs). First, we trained ANNs to classify viral structural proteins using amino acid frequency; these correctly classify a large fraction of test cases with a high degree of specificity and sensitivity. Subsequently, we added estimates of protein isoelectric points as a feature to ANNs that classify specialized families of proteins, namely major capsid and tail proteins. As expected, these more specialized ANNs are more accurate than the structural ANNs. To experimentally validate the ANN predictions, several ORFs that are ANN-predicted structural proteins but have no significant similarity to known sequences were examined by transmission electron microscopy. Some of these self-assembled into structures strongly resembling virion structures. Thus, our ANNs are new tools for identifying phage and potential prophage structural proteins that are difficult or impossible to detect by other bioinformatic analyses. The networks will be valuable when sequence is available but in vitro propagation of the phage may not be practical or possible.
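The description names amino acid frequency as the input feature and reports results in terms of sensitivity and specificity, without giving implementation details. The following is a minimal illustrative sketch of those two ideas only, not the authors' code: the function names `amino_acid_frequencies` and `sensitivity_specificity`, the fixed ordering of the 20 standard amino acids, and the convention that label 1 means "structural" are all assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids in a fixed order, so that every
# feature vector has the same layout (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(sequence):
    """Return the fraction of each standard amino acid in a protein sequence.

    Hypothetical helper: non-standard characters (e.g. X, *) are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels.

    Assumed convention: 1 = structural protein, 0 = non-structural.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)
```

A 20-element vector from `amino_acid_frequencies` could serve as the input layer of a classifier; an isoelectric point estimate, as mentioned for the specialized capsid and tail networks, would be appended as one further feature.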
format article
author Victor Seguritan
Nelson Alves
Michael Arnoult
Amy Raymond
Don Lorimer
Alex B Burgin
Peter Salamon
Anca M Segall
author_sort Victor Seguritan
title Artificial neural networks trained to detect viral and phage structural proteins.
publisher Public Library of Science (PLoS)
publishDate 2012
url https://doaj.org/article/3fcd19c29c714c9c8afaeb6dc4345047