Lost and Found: Re-searching and Re-scoring Proteomics Data Aids Genome Annotation and Improves Proteome Coverage

ABSTRACT Prokaryotic genome annotation is heavily dependent on automated gene annotation pipelines that are prone to propagate errors and underestimate genome complexity. We describe an optimized proteogenomic workflow that uses ribosome profiling (ribo-seq) and proteomic data for Salmonella enteric...

Bibliographic Details
Main Authors: Patrick Willems, Igor Fijalkowski, Petra Van Damme
Format: article
Language: EN
Published: American Society for Microbiology 2020
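Subjects: Deinococcus radiodurans; Salmonella; alternative translation initiation; bacterial genome (re)annotation; chimeric spectra; riboproteogenomics; Microbiology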
Online Access: https://doaj.org/article/d37629f750554a13b702882925d5bd97
id oai:doaj.org-article:d37629f750554a13b702882925d5bd97
record_format dspace
spelling oai:doaj.org-article:d37629f750554a13b702882925d5bd97
2021-12-02T18:44:44Z
Lost and Found: Re-searching and Re-scoring Proteomics Data Aids Genome Annotation and Improves Proteome Coverage
10.1128/mSystems.00833-20
2379-5077
https://doaj.org/article/d37629f750554a13b702882925d5bd97
2020-10-01T00:00:00Z
https://journals.asm.org/doi/10.1128/mSystems.00833-20
https://doaj.org/toc/2379-5077
ABSTRACT Prokaryotic genome annotation is heavily dependent on automated gene annotation pipelines that are prone to propagate errors and underestimate genome complexity. We describe an optimized proteogenomic workflow that uses ribosome profiling (ribo-seq) and proteomic data for Salmonella enterica serovar Typhimurium to identify unannotated proteins or alternative protein forms. This data analysis encompasses the searching of cofragmenting peptides and postprocessing with extended peptide-to-spectrum quality features, including comparison to predicted fragment ion intensities. When this strategy is applied, an enhanced proteome depth is achieved, as well as greater confidence for unannotated peptide hits. We demonstrate the general applicability of our pipeline by reanalyzing public Deinococcus radiodurans data sets. Taken together, our results show that systematic reanalysis using available prokaryotic (proteome) data sets holds great promise to assist in experimentally based genome annotation. IMPORTANCE Delineation of open reading frames (ORFs) causes persistent inconsistencies in prokaryote genome annotation. We demonstrate that by advanced (re)analysis of omics data, a higher proteome coverage and sensitive detection of unannotated ORFs can be achieved, which can be exploited for conditional bacterial genome (re)annotation, which is especially relevant in view of annotating the wealth of sequenced prokaryotic genomes obtained in recent years.
Patrick Willems
Igor Fijalkowski
Petra Van Damme
American Society for Microbiology
article
Deinococcus radiodurans
Salmonella
alternative translation initiation
bacterial genome (re)annotation
chimeric spectra
riboproteogenomics
Microbiology
QR1-502
EN
mSystems, Vol 5, Iss 5 (2020)
institution DOAJ
collection DOAJ
language EN
topic Deinococcus radiodurans
Salmonella
alternative translation initiation
bacterial genome (re)annotation
chimeric spectra
riboproteogenomics
Microbiology
QR1-502
description ABSTRACT Prokaryotic genome annotation is heavily dependent on automated gene annotation pipelines that are prone to propagate errors and underestimate genome complexity. We describe an optimized proteogenomic workflow that uses ribosome profiling (ribo-seq) and proteomic data for Salmonella enterica serovar Typhimurium to identify unannotated proteins or alternative protein forms. This data analysis encompasses the searching of cofragmenting peptides and postprocessing with extended peptide-to-spectrum quality features, including comparison to predicted fragment ion intensities. When this strategy is applied, an enhanced proteome depth is achieved, as well as greater confidence for unannotated peptide hits. We demonstrate the general applicability of our pipeline by reanalyzing public Deinococcus radiodurans data sets. Taken together, our results show that systematic reanalysis using available prokaryotic (proteome) data sets holds great promise to assist in experimentally based genome annotation. IMPORTANCE Delineation of open reading frames (ORFs) causes persistent inconsistencies in prokaryote genome annotation. We demonstrate that by advanced (re)analysis of omics data, a higher proteome coverage and sensitive detection of unannotated ORFs can be achieved, which can be exploited for conditional bacterial genome (re)annotation, which is especially relevant in view of annotating the wealth of sequenced prokaryotic genomes obtained in recent years.
format article
author Patrick Willems
Igor Fijalkowski
Petra Van Damme
title Lost and Found: Re-searching and Re-scoring Proteomics Data Aids Genome Annotation and Improves Proteome Coverage
publisher American Society for Microbiology
publishDate 2020
url https://doaj.org/article/d37629f750554a13b702882925d5bd97
_version_ 1718377708676710400