Machine learning approaches to predict the Plant-associated phenotype of Xanthomonas strains
Abstract Background The genus Xanthomonas has long been considered to consist predominantly of plant pathogens, but over the last decade there has been an increasing number of reports on non-pathogenic and endophytic members. As Xanthomonas species are prevalent pathogens on a wide variety of import...
Main authors: | Dennie te Molder, Wasin Poncheewin, Peter J. Schaap, Jasper J. Koehorst |
---|---|
Format: | article |
Language: | EN |
Published: | BMC, 2021 |
Subjects: | Pathogenicity; Protein domains; Machine learning; Xanthomonas; Biotechnology; Genetics |
Online access: | https://doaj.org/article/95ed90908a38453e8622ba0e6b17c599 |
id |
oai:doaj.org-article:95ed90908a38453e8622ba0e6b17c599 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:95ed90908a38453e8622ba0e6b17c599; 2021-11-28T12:23:04Z; Machine learning approaches to predict the Plant-associated phenotype of Xanthomonas strains; 10.1186/s12864-021-08093-0; 1471-2164; https://doaj.org/article/95ed90908a38453e8622ba0e6b17c599; 2021-11-01T00:00:00Z; https://doi.org/10.1186/s12864-021-08093-0; https://doaj.org/toc/1471-2164; Dennie te Molder; Wasin Poncheewin; Peter J. Schaap; Jasper J. Koehorst; BMC; article; Pathogenicity; Protein domains; Machine learning; Xanthomonas; Biotechnology; TP248.13-248.65; Genetics; QH426-470; EN; BMC Genomics, Vol 22, Iss 1, Pp 1-14 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Pathogenicity; Protein domains; Machine learning; Xanthomonas; Biotechnology; TP248.13-248.65; Genetics; QH426-470 |
description |
Abstract Background: The genus Xanthomonas has long been considered to consist predominantly of plant pathogens, but over the last decade there has been an increasing number of reports on non-pathogenic and endophytic members. As Xanthomonas species are prevalent pathogens on a wide variety of important crops around the world, there is a need to distinguish between these plant-associated phenotypes. To date, a large number of Xanthomonas genomes have been sequenced, which enables the application of machine learning (ML) approaches to the genome content to predict this phenotype. Until now, such approaches to the pathogenomics of Xanthomonas strains have been hampered by the fragmentation of information regarding the pathogenicity of individual strains over many studies. Unification of this information into a single resource was therefore considered to be an essential step. Results: Mining of 39 papers considering both plant-associated phenotypes allowed for a phenotypic classification of 578 Xanthomonas strains. For 65 plant-pathogenic and 53 non-pathogenic strains, the corresponding genomes were available and de novo annotated for the presence of Pfam protein domains, which were used as features to train and compare three ML classification algorithms: CART, Lasso and Random Forest. Conclusion: The literature resource, in combination with recursive feature extraction used in the ML classification algorithms, provided further insights into the virulence-enabling factors, but also highlighted domains linked to traits not present in pathogenic strains. |
format |
article |
author |
Dennie te Molder; Wasin Poncheewin; Peter J. Schaap; Jasper J. Koehorst |
author_sort |
Dennie te Molder |
title |
Machine learning approaches to predict the Plant-associated phenotype of Xanthomonas strains |
publisher |
BMC |
publishDate |
2021 |
url |
https://doaj.org/article/95ed90908a38453e8622ba0e6b17c599 |
_version_ |
1718408030558617600 |
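
The description field of this record mentions that genomes were annotated for the presence of Pfam protein domains, which then served as features for classifiers such as CART. The following is a minimal sketch of that idea, not the authors' pipeline: it builds a binary presence/absence matrix and selects the single domain whose split gives the lowest weighted Gini impurity, which is the criterion a CART-style tree applies at each node. The strain names, Pfam accessions, labels, and function names are all hypothetical placeholders and do not come from the study.

```python
from collections import Counter

# Hypothetical toy annotations: strain -> set of Pfam accessions (placeholders).
annotations = {
    "strain_A": {"PF00001", "PF00002"},
    "strain_B": {"PF00001", "PF00003"},
    "strain_C": {"PF00003"},
    "strain_D": {"PF00002", "PF00003"},
}
# Hypothetical labels: 1 = plant-pathogenic, 0 = non-pathogenic.
labels = {"strain_A": 1, "strain_B": 1, "strain_C": 0, "strain_D": 0}


def presence_matrix(annotations):
    """Return (strains, domains, rows); rows[i][j] is 1 if strain i carries domain j."""
    strains = sorted(annotations)
    domains = sorted(set().union(*annotations.values()))
    rows = [[int(d in annotations[s]) for d in domains] for s in strains]
    return strains, domains, rows


def gini(ys):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(ys)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(ys).values())


def best_split(rows, ys, domains):
    """Return (domain, impurity) for the presence/absence split with the lowest weighted Gini."""
    best = None
    for j, dom in enumerate(domains):
        present = [y for r, y in zip(rows, ys) if r[j] == 1]
        absent = [y for r, y in zip(rows, ys) if r[j] == 0]
        score = (len(present) * gini(present) + len(absent) * gini(absent)) / len(ys)
        if best is None or score < best[1]:
            best = (dom, score)
    return best


strains, domains, X = presence_matrix(annotations)
y = [labels[s] for s in strains]
split_domain, split_impurity = best_split(X, y, domains)
print(split_domain, split_impurity)
```

In this toy data one domain separates the two classes perfectly, so the chosen split has zero impurity. A full tree would repeat the same selection recursively on each resulting subset, and real data would call for cross-validation, which this sketch omits.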