Epidemiological associations with genomic variation in SARS-CoV-2
Abstract SARS-CoV-2 (CoV) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. We divided the CoV genome into 29 constituent regions and applied novel analytical approaches to identify associations between CoV genomic features a...
Main Authors: | Ali Rahnavard, Tyson Dawson, Rebecca Clement, Nathaniel Stearrett, Marcos Pérez-Losada, Keith A. Crandall |
---|---|
Format: | article |
Language: | EN |
Published: | Nature Portfolio, 2021 |
Subjects: | Medicine (R); Science (Q) |
Online Access: | https://doaj.org/article/f38b7a7ac21d46d6bbf4eef2b9aa3065 |
id |
oai:doaj.org-article:f38b7a7ac21d46d6bbf4eef2b9aa3065 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:f38b7a7ac21d46d6bbf4eef2b9aa3065; 2021-11-28T12:15:32Z; Epidemiological associations with genomic variation in SARS-CoV-2; 10.1038/s41598-021-02548-w; 2045-2322; https://doaj.org/article/f38b7a7ac21d46d6bbf4eef2b9aa3065; 2021-11-01T00:00:00Z; https://doi.org/10.1038/s41598-021-02548-w; https://doaj.org/toc/2045-2322; Abstract SARS-CoV-2 (CoV) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. We divided the CoV genome into 29 constituent regions and applied novel analytical approaches to identify associations between CoV genomic features and epidemiological metadata. Our results show that nonstructural protein 3 (nsp3) and Spike protein (S) have the highest variation and greatest correlation with the viral whole-genome variation. S protein variation is correlated with nsp3, nsp6, and 3′-to-5′ exonuclease variation. Country of origin and time since the start of the pandemic were the most influential metadata associated with genomic variation, while host sex and age were the least influential. We define a novel statistic—coherence—and show its utility in identifying geographic regions (populations) with unusually high (many new variants) or low (isolated) viral phylogenetic diversity. Interestingly, at both global and regional scales, we identify geographic locations with high coherence neighboring regions of low coherence; this emphasizes the utility of this metric to inform public health measures for disease spread. Our results provide a direction to prioritize genes associated with outcome predictors (e.g., health, therapeutic, and vaccine outcomes) and to improve DNA tests for predicting disease status.; Ali Rahnavard; Tyson Dawson; Rebecca Clement; Nathaniel Stearrett; Marcos Pérez-Losada; Keith A. Crandall; Nature Portfolio; article; Medicine; R; Science; Q; EN; Scientific Reports, Vol 11, Iss 1, Pp 1-10 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine R Science Q |
spellingShingle |
Medicine R Science Q Ali Rahnavard Tyson Dawson Rebecca Clement Nathaniel Stearrett Marcos Pérez-Losada Keith A. Crandall Epidemiological associations with genomic variation in SARS-CoV-2 |
description |
Abstract SARS-CoV-2 (CoV) is the etiological agent of the COVID-19 pandemic and evolves to evade both host immune systems and intervention strategies. We divided the CoV genome into 29 constituent regions and applied novel analytical approaches to identify associations between CoV genomic features and epidemiological metadata. Our results show that nonstructural protein 3 (nsp3) and Spike protein (S) have the highest variation and greatest correlation with the viral whole-genome variation. S protein variation is correlated with nsp3, nsp6, and 3′-to-5′ exonuclease variation. Country of origin and time since the start of the pandemic were the most influential metadata associated with genomic variation, while host sex and age were the least influential. We define a novel statistic—coherence—and show its utility in identifying geographic regions (populations) with unusually high (many new variants) or low (isolated) viral phylogenetic diversity. Interestingly, at both global and regional scales, we identify geographic locations with high coherence neighboring regions of low coherence; this emphasizes the utility of this metric to inform public health measures for disease spread. Our results provide a direction to prioritize genes associated with outcome predictors (e.g., health, therapeutic, and vaccine outcomes) and to improve DNA tests for predicting disease status. |
format |
article |
author |
Ali Rahnavard Tyson Dawson Rebecca Clement Nathaniel Stearrett Marcos Pérez-Losada Keith A. Crandall |
author_facet |
Ali Rahnavard Tyson Dawson Rebecca Clement Nathaniel Stearrett Marcos Pérez-Losada Keith A. Crandall |
author_sort |
Ali Rahnavard |
title |
Epidemiological associations with genomic variation in SARS-CoV-2 |
title_short |
Epidemiological associations with genomic variation in SARS-CoV-2 |
title_full |
Epidemiological associations with genomic variation in SARS-CoV-2 |
title_fullStr |
Epidemiological associations with genomic variation in SARS-CoV-2 |
title_full_unstemmed |
Epidemiological associations with genomic variation in SARS-CoV-2 |
title_sort |
epidemiological associations with genomic variation in sars-cov-2 |
publisher |
Nature Portfolio |
publishDate |
2021 |
url |
https://doaj.org/article/f38b7a7ac21d46d6bbf4eef2b9aa3065 |
_version_ |
1718408113138171904 |