Genes and regulatory mechanisms associated with experimentally-induced bovine respiratory disease identified using supervised machine learning methodology

Abstract Bovine respiratory disease (BRD) is a multifactorial disease involving complex host immune interactions shaped by pathogenic agents and environmental factors. Advancements in RNA sequencing and associated analytical methods are improving our understanding of host response related to BRD pat...

Bibliographic Details
Main authors: Matthew A. Scott, Amelia R. Woolums, Cyprianna E. Swiderski, Andy D. Perkins, Bindu Nanduri
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
R
Q
Online access: https://doaj.org/article/129c74af2e0e42edacc1e1ed6e71e16c
id oai:doaj.org-article:129c74af2e0e42edacc1e1ed6e71e16c
record_format dspace
spelling oai:doaj.org-article:129c74af2e0e42edacc1e1ed6e71e16c
2021-11-28T12:16:41Z
10.1038/s41598-021-02343-7
2045-2322
2021-11-01T00:00:00Z
https://doi.org/10.1038/s41598-021-02343-7
https://doaj.org/toc/2045-2322
Scientific Reports, Vol 11, Iss 1, Pp 1-11 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract Bovine respiratory disease (BRD) is a multifactorial disease involving complex host immune interactions shaped by pathogenic agents and environmental factors. Advancements in RNA sequencing and associated analytical methods are improving our understanding of host response related to BRD pathophysiology. Supervised machine learning (ML) approaches present one such method for analyzing new and previously published transcriptome data to identify novel disease-associated genes and mechanisms. Our objective was to apply ML models to lung and immunological tissue datasets acquired from previous clinical BRD experiments to identify genes that classify disease with high accuracy. Raw mRNA sequencing reads from 151 bovine datasets (n = 123 BRD, n = 28 control) were downloaded from NCBI-GEO. Quality filtered reads were assembled in a HISAT2/Stringtie2 pipeline. Raw gene counts for ML analysis were normalized, transformed, and analyzed with MLSeq, utilizing six ML models. Cross-validation parameters (fivefold, repeated 10 times) were applied to 70% of the compiled datasets for ML model training and parameter tuning; optimized ML models were tested with the remaining 30%. Downstream analysis of significant genes identified by the top ML models, based on classification accuracy for each etiological association, was performed within WebGestalt and Reactome (FDR ≤ 0.05). Nearest shrunken centroid and Poisson linear discriminant analysis with power transformation models identified 154 and 195 significant genes for IBR and BRSV, respectively; from these genes, the two ML models discriminated IBR and BRSV with 100% accuracy compared to sham controls. Significant genes classified by the top ML models in IBR (154) and BRSV (195), but not BVDV (74), were related to type I interferon production and IL-8 secretion, specifically in lymphoid tissue and not homogenized lung tissue. 
Genes identified in Mannheimia haemolytica infections (97) were involved in activating classical and alternative pathways of complement. Novel findings, including expression of genes related to reduced mitochondrial oxygenation and ATP synthesis in consolidated lung tissue, were discovered. Genes identified in each analysis represent distinct genomic events relevant to understanding and predicting clinical BRD. Our analysis demonstrates the utility of ML with published datasets for discovering functional information to support the prediction and understanding of clinical BRD.
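The validation scheme described in the abstract (a 70%/30% train/test split, with fivefold cross-validation repeated 10 times on the training portion) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' MLSeq workflow: the helper names `stratified_split` and `repeated_kfold`, the random seeds, the per-class stratification, and the use of the pooled sample counts (123 BRD, 28 control) are all assumptions made for the example. The abstract indicates that models were evaluated per etiological association, so the actual splits were likely made on subsets of these data.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split sample indices into train/test, class by class (assumed scheme)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))  # samples of this class kept for training
        train += idx[:cut]
        test += idx[cut:]
    return sorted(train), sorted(test)


def repeated_kfold(indices, k=5, repeats=10, seed=0):
    """Yield (fit_idx, val_idx) pairs: k folds, reshuffled on every repeat.

    Folds here are not stratified by class; that is a simplification.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        folds = [shuffled[f::k] for f in range(k)]
        for f in range(k):
            val = folds[f]
            fit = [i for g in range(k) if g != f for i in folds[g]]
            yield fit, val


# Pooled counts from the abstract, used only to make the example concrete.
labels = ["BRD"] * 123 + ["control"] * 28
train_idx, test_idx = stratified_split(labels)
splits = list(repeated_kfold(train_idx))
```

Each `(fit_idx, val_idx)` pair would be used to fit a candidate model and score it during parameter tuning; `test_idx` stays untouched until the tuned model is evaluated once at the end.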