A Genome-Based Model to Predict the Virulence of Pseudomonas aeruginosa Isolates

ABSTRACT Variation in the genome of Pseudomonas aeruginosa, an important pathogen, can have dramatic impacts on the bacterium’s ability to cause disease. We therefore asked whether it was possible to predict the virulence of P. aeruginosa isolates based on their genomic content. We applied a machine...

Bibliographic Details
Main authors: Nathan B. Pincus, Egon A. Ozer, Jonathan P. Allen, Marcus Nguyen, James J. Davis, Deborah R. Winter, Chih-Hsien Chuang, Cheng-Hsun Chiu, Laura Zamorano, Antonio Oliver, Alan R. Hauser
Format: article
Language: EN
Published: American Society for Microbiology 2020
Online access: https://doaj.org/article/1908f46e3086477ca33102afad2a73da
id oai:doaj.org-article:1908f46e3086477ca33102afad2a73da
record_format dspace
spelling oai:doaj.org-article:1908f46e3086477ca33102afad2a73da
2021-11-15T15:56:45Z
A Genome-Based Model to Predict the Virulence of Pseudomonas aeruginosa Isolates
10.1128/mBio.01527-20
2150-7511
https://doaj.org/article/1908f46e3086477ca33102afad2a73da
2020-08-01T00:00:00Z
https://journals.asm.org/doi/10.1128/mBio.01527-20
https://doaj.org/toc/2150-7511
mBio, Vol 11, Iss 4 (2020)
institution DOAJ
collection DOAJ
language EN
topic Pseudomonas aeruginosa
genome analysis
machine learning
modeling
prediction
virulence
Microbiology
QR1-502
description ABSTRACT Variation in the genome of Pseudomonas aeruginosa, an important pathogen, can have dramatic impacts on the bacterium’s ability to cause disease. We therefore asked whether it was possible to predict the virulence of P. aeruginosa isolates based on their genomic content. We applied a machine learning approach to a genetically and phenotypically diverse collection of 115 clinical P. aeruginosa isolates using genomic information and corresponding virulence phenotypes in a mouse model of bacteremia. We defined the accessory genome of these isolates through the presence or absence of accessory genomic elements (AGEs), sequences present in some strains but not others. Machine learning models trained using AGEs were predictive of virulence, with a mean nested cross-validation accuracy of 75% using the random forest algorithm. However, individual AGEs did not have a large influence on the algorithm’s performance, suggesting instead that virulence predictions are derived from a diffuse genomic signature. These results were validated with an independent test set of 25 P. aeruginosa isolates whose virulence was predicted with 72% accuracy. Machine learning models trained using core genome single-nucleotide variants and whole-genome k-mers also predicted virulence. Our findings are a proof of concept for the use of bacterial genomes to predict pathogenicity in P. aeruginosa and highlight the potential of this approach for predicting patient outcomes.
IMPORTANCE Pseudomonas aeruginosa is a clinically important Gram-negative opportunistic pathogen. P. aeruginosa shows a large degree of genomic heterogeneity both through variation in sequences found throughout the species (core genome) and through the presence or absence of sequences in different isolates (accessory genome). P. aeruginosa isolates also differ markedly in their ability to cause disease. In this study, we used machine learning to predict the virulence level of P. aeruginosa isolates in a mouse bacteremia model based on genomic content. We show that both the accessory and core genomes are predictive of virulence. This study provides a machine learning framework to investigate relationships between bacterial genomes and complex phenotypes such as virulence.
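Illustrative note (not part of the published record): the abstract reports a mean nested cross-validation accuracy for a random forest trained on AGE presence/absence. A minimal sketch of that general technique, using scikit-learn, might look like the following. Everything specific here is an assumption made for illustration: the data are synthetic, the binary high/low virulence label, the 200-feature matrix, the fold counts, and the hyperparameter grid are hypothetical, and none of it is taken from the authors' code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

# Synthetic stand-in data (NOT the study's data): 115 isolates x 200 AGEs,
# where 1 means the element is present in the isolate and 0 means absent.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(115, 200))

# Hypothetical binary virulence label (1 = high, 0 = low) that depends weakly
# on many features at once, loosely mimicking a diffuse signature.
signal = X[:, :40].sum(axis=1)
y = (signal + rng.normal(0, 3, size=115) > np.median(signal)).astype(int)

# Inner folds tune hyperparameters; outer folds estimate accuracy on data
# that the tuning step never saw.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=2)

# Assumed, deliberately small hyperparameter grid.
param_grid = {"n_estimators": [50, 100], "max_features": ["sqrt", 0.2]}
tuned_forest = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=inner_cv,
    scoring="accuracy",
)

# cross_val_score refits the whole grid search inside each outer training
# split, which is what makes the procedure "nested".
outer_scores = cross_val_score(tuned_forest, X, y, cv=outer_cv, scoring="accuracy")
mean_accuracy = outer_scores.mean()
print(f"Mean nested CV accuracy on synthetic data: {mean_accuracy:.2f}")
```

The printed accuracy describes only the synthetic data and says nothing about the study's reported 75%. The point of the sketch is the structure: hyperparameters are chosen on inner folds, so the outer-fold accuracy is not inflated by tuning on the same isolates used for evaluation.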
format article
author Nathan B. Pincus
Egon A. Ozer
Jonathan P. Allen
Marcus Nguyen
James J. Davis
Deborah R. Winter
Chih-Hsien Chuang
Cheng-Hsun Chiu
Laura Zamorano
Antonio Oliver
Alan R. Hauser
author_sort Nathan B. Pincus
title A Genome-Based Model to Predict the Virulence of Pseudomonas aeruginosa Isolates
publisher American Society for Microbiology
publishDate 2020
url https://doaj.org/article/1908f46e3086477ca33102afad2a73da