The CRISPR Spacer Space Is Dominated by Sequences from Species-Specific Mobilomes

ABSTRACT Clustered regularly interspaced short palindromic repeats and CRISPR-associated protein (CRISPR-Cas) systems store the memory of past encounters with foreign DNA in unique spacers that are inserted between direct repeats in CRISPR arrays. For only a small fraction of the spacers, homologous...

Bibliographic Details
Main authors: Sergey A. Shmakov, Vassilii Sitnik, Kira S. Makarova, Yuri I. Wolf, Konstantin V. Severinov, Eugene V. Koonin
Format: article
Language: EN
Published: American Society for Microbiology 2017
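Subjects: CRISPR-Cas; bacteriophages; mobilome; oligonucleotide composition; spacer acquisition; Microbiology; QR1-502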
Online access: https://doaj.org/article/8c876cbf1aa040bc86fed2699abe75c8
id oai:doaj.org-article:8c876cbf1aa040bc86fed2699abe75c8
record_format dspace
spelling oai:doaj.org-article:8c876cbf1aa040bc86fed2699abe75c8
2021-11-15T15:51:51Z
The CRISPR Spacer Space Is Dominated by Sequences from Species-Specific Mobilomes
10.1128/mBio.01397-17
2150-7511
https://doaj.org/article/8c876cbf1aa040bc86fed2699abe75c8
2017-11-01T00:00:00Z
https://journals.asm.org/doi/10.1128/mBio.01397-17
https://doaj.org/toc/2150-7511
ABSTRACT Clustered regularly interspaced short palindromic repeats and CRISPR-associated protein (CRISPR-Cas) systems store the memory of past encounters with foreign DNA in unique spacers that are inserted between direct repeats in CRISPR arrays. For only a small fraction of the spacers, homologous sequences, called protospacers, are detectable in viral, plasmid, and microbial genomes. The rest of the spacers remain the CRISPR “dark matter.” We performed a comprehensive analysis of the spacers from all CRISPR-cas loci identified in bacterial and archaeal genomes, and we found that, depending on the CRISPR-Cas subtype and the prokaryotic phylum, protospacers were detectable for 1% to about 19% of the spacers (~7% global average). Among the detected protospacers, the majority, typically 80 to 90%, originated from viral genomes, including proviruses, and among the rest, the most common source was genes that are integrated into microbial chromosomes but are involved in plasmid conjugation or replication. Thus, almost all spacers with identifiable protospacers target mobile genetic elements (MGE). The GC content, as well as dinucleotide and tetranucleotide compositions, of microbial genomes, their spacer complements, and the cognate viral genomes showed a nearly perfect correlation and were almost identical. Given the near absence of self-targeting spacers, these findings are most compatible with the possibility that the spacers, including the dark matter, are derived almost completely from the species-specific microbial mobilomes.
IMPORTANCE The principal function of CRISPR-Cas systems is thought to be protection of bacteria and archaea against viruses and other parasitic genetic elements. The CRISPR defense function is mediated by sequences from parasitic elements, known as spacers, that are inserted into CRISPR arrays and then transcribed and employed as guides to identify and inactivate the cognate parasitic genomes. However, only a small fraction of the CRISPR spacers match any sequences in the current databases, and of these, only a minority correspond to known parasitic elements. We show that nearly all spacers with matches originate from viral or plasmid genomes that are either free or have been integrated into the host genome. We further demonstrate that spacers with no matches have the same properties as those of identifiable origins, strongly suggesting that all spacers originate from mobile elements.
Sergey A. Shmakov
Vassilii Sitnik
Kira S. Makarova
Yuri I. Wolf
Konstantin V. Severinov
Eugene V. Koonin
American Society for Microbiology
article
CRISPR-Cas
bacteriophages
mobilome
oligonucleotide composition
spacer acquisition
Microbiology
QR1-502
EN
mBio, Vol 8, Iss 5 (2017)
institution DOAJ
collection DOAJ
language EN
topic CRISPR-Cas
bacteriophages
mobilome
oligonucleotide composition
spacer acquisition
Microbiology
QR1-502
description ABSTRACT Clustered regularly interspaced short palindromic repeats and CRISPR-associated protein (CRISPR-Cas) systems store the memory of past encounters with foreign DNA in unique spacers that are inserted between direct repeats in CRISPR arrays. For only a small fraction of the spacers, homologous sequences, called protospacers, are detectable in viral, plasmid, and microbial genomes. The rest of the spacers remain the CRISPR “dark matter.” We performed a comprehensive analysis of the spacers from all CRISPR-cas loci identified in bacterial and archaeal genomes, and we found that, depending on the CRISPR-Cas subtype and the prokaryotic phylum, protospacers were detectable for 1% to about 19% of the spacers (~7% global average). Among the detected protospacers, the majority, typically 80 to 90%, originated from viral genomes, including proviruses, and among the rest, the most common source was genes that are integrated into microbial chromosomes but are involved in plasmid conjugation or replication. Thus, almost all spacers with identifiable protospacers target mobile genetic elements (MGE). The GC content, as well as dinucleotide and tetranucleotide compositions, of microbial genomes, their spacer complements, and the cognate viral genomes showed a nearly perfect correlation and were almost identical. Given the near absence of self-targeting spacers, these findings are most compatible with the possibility that the spacers, including the dark matter, are derived almost completely from the species-specific microbial mobilomes.
IMPORTANCE The principal function of CRISPR-Cas systems is thought to be protection of bacteria and archaea against viruses and other parasitic genetic elements. The CRISPR defense function is mediated by sequences from parasitic elements, known as spacers, that are inserted into CRISPR arrays and then transcribed and employed as guides to identify and inactivate the cognate parasitic genomes. However, only a small fraction of the CRISPR spacers match any sequences in the current databases, and of these, only a minority correspond to known parasitic elements. We show that nearly all spacers with matches originate from viral or plasmid genomes that are either free or have been integrated into the host genome. We further demonstrate that spacers with no matches have the same properties as those of identifiable origins, strongly suggesting that all spacers originate from mobile elements.
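The abstract compares GC content and dinucleotide and tetranucleotide compositions across host genomes, spacer sets, and viral genomes. The sketch below illustrates one generic way such a comparison could be set up; it is not the authors' pipeline. The function names (`gc_content`, `kmer_frequencies`, `pearson`) and the toy sequences are hypothetical, and the choice of Pearson correlation is an assumption, since the record does not say which correlation measure was used.

```python
from collections import Counter
from itertools import product
from math import sqrt


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def kmer_frequencies(seq, k):
    """Relative frequency of every possible k-mer over ACGT, in a fixed order.

    k=2 gives dinucleotide composition, k=4 tetranucleotide composition.
    Windows containing ambiguous bases are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


# Toy sequences for illustration only; real input would be a genome,
# the concatenated spacers of that genome, and a cognate viral genome.
host = "ATGCGCGATATCGCGCATATGCGC"
spacers = "GCGCATATCGCGATATGCGC"
r = pearson(kmer_frequencies(host, 2), kmer_frequencies(spacers, 2))
```

In practice the frequency vectors would be computed per genome and per spacer set, and the correlation taken across many genomes rather than between two short strings.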
format article
author Sergey A. Shmakov
Vassilii Sitnik
Kira S. Makarova
Yuri I. Wolf
Konstantin V. Severinov
Eugene V. Koonin
title The CRISPR Spacer Space Is Dominated by Sequences from Species-Specific Mobilomes
publisher American Society for Microbiology
publishDate 2017
url https://doaj.org/article/8c876cbf1aa040bc86fed2699abe75c8
_version_ 1718427351640965120