A survey of genomic traces reveals a common sequencing error, RNA editing, and DNA editing.

While it is widely held that an organism's genomic information should remain constant, several protein families are known to modify it. Members of the AID/APOBEC protein family can deaminate DNA. Similarly, members of the ADAR family can deaminate RNA. Characterizing the scope of these events i...

Bibliographic Details
Main authors: Alexander Wait Zaranek, Erez Y Levanon, Tomer Zecharia, Tom Clegg, George M Church
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2010
Subjects: Genetics
Online access: https://doaj.org/article/a85d2d4d5e13428f9f52c85eb9e27268
id oai:doaj.org-article:a85d2d4d5e13428f9f52c85eb9e27268
record_format dspace
spelling oai:doaj.org-article:a85d2d4d5e13428f9f52c85eb9e27268
2021-12-02T20:03:41Z
A survey of genomic traces reveals a common sequencing error, RNA editing, and DNA editing.
1553-7390
1553-7404
10.1371/journal.pgen.1000954
https://doaj.org/article/a85d2d4d5e13428f9f52c85eb9e27268
2010-05-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/20531933/pdf/?tool=EBI
https://doaj.org/toc/1553-7390
https://doaj.org/toc/1553-7404
Alexander Wait Zaranek
Erez Y Levanon
Tomer Zecharia
Tom Clegg
George M Church
Public Library of Science (PLoS)
article
Genetics
QH426-470
EN
PLoS Genetics, Vol 6, Iss 5, p e1000954 (2010)
institution DOAJ
collection DOAJ
language EN
topic Genetics
QH426-470
description While it is widely held that an organism's genomic information should remain constant, several protein families are known to modify it. Members of the AID/APOBEC protein family can deaminate DNA. Similarly, members of the ADAR family can deaminate RNA. Characterizing the scope of these events is challenging. Here we use large genomic data sets, such as the two billion sequences in the NCBI Trace Archive, to look for clusters of mismatches of the same type, which are a hallmark of editing events caused by APOBEC3 and ADAR. We align 603,249,815 traces from the NCBI trace archive to their reference genomes. In clusters of mismatches of increasing size, at least one systematic sequencing error dominates the results (G-to-A). It is still present in mismatches with 99% accuracy and only vanishes in mismatches at 99.99% accuracy or higher. The error appears to have entered into about 1% of the HapMap, possibly affecting other users that rely on this resource. Further investigation, using stringent quality thresholds, uncovers thousands of mismatch clusters with no apparent defects in their chromatograms. These traces provide the first reported candidates of endogenous DNA editing in human, further elucidating RNA editing in human and mouse and also revealing, for the first time, extensive RNA editing in Xenopus tropicalis. We show that the NCBI Trace Archive provides a valuable resource for the investigation of the phenomena of DNA and RNA editing, as well as setting the stage for a comprehensive mapping of editing events in large-scale genomic datasets.
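The description above looks for clusters of mismatches of the same type and filters them by base-call accuracy. The following is only a minimal sketch of that idea, not the authors' pipeline: it assumes an ungapped alignment in which the reference and the read have equal length, and it assumes per-base Phred quality scores, where 99% accuracy corresponds to Q20 and 99.99% to Q40. The function names, the `min_cluster` parameter, and the example sequences are hypothetical.

```python
from collections import defaultdict


def phred_to_accuracy(q):
    """Convert a Phred quality score to base-call accuracy (Q20 -> 0.99)."""
    return 1.0 - 10.0 ** (-q / 10.0)


def mismatch_clusters(reference, read, qualities, min_quality=40, min_cluster=3):
    """Group high-quality mismatches by substitution type.

    Assumes an ungapped alignment: reference[i] is aligned to read[i].
    Returns only the substitution types seen at least `min_cluster` times,
    mapped to the 0-based positions where they occur.
    """
    by_type = defaultdict(list)
    for pos, (ref_base, read_base, q) in enumerate(zip(reference, read, qualities)):
        # Keep a mismatch only if its base call meets the quality threshold.
        if ref_base != read_base and q >= min_quality:
            by_type[(ref_base, read_base)].append(pos)
    return {
        f"{ref_base}-to-{read_base}": positions
        for (ref_base, read_base), positions in by_type.items()
        if len(positions) >= min_cluster
    }


# Hypothetical ten-base example with three G-to-A mismatches.
reference = "GGAGTGCAGG"
read = "AGAATGCAAG"
strict = mismatch_clusters(reference, read, [40] * 10)
# Lowering one base call to Q20 removes it under the Q40 threshold,
# leaving too few mismatches to form a cluster.
mixed = mismatch_clusters(reference, read, [40, 40, 40, 20, 40, 40, 40, 40, 40, 40])
```

In this sketch the threshold is applied per mismatch, so a cluster survives only if enough of its individual base calls are high quality. A real analysis would also need gapped alignment and inspection of the chromatograms, as the description notes.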
format article
author Alexander Wait Zaranek
Erez Y Levanon
Tomer Zecharia
Tom Clegg
George M Church
title A survey of genomic traces reveals a common sequencing error, RNA editing, and DNA editing.
publisher Public Library of Science (PLoS)
publishDate 2010
url https://doaj.org/article/a85d2d4d5e13428f9f52c85eb9e27268
_version_ 1718375677121527808