Insights into the loblolly pine genome: characterization of BAC and fosmid sequences.

Despite their prevalence and importance, the genome sequences of loblolly pine, Norway spruce, and white spruce, three ecologically and economically important conifer species, are just becoming available to the research community. Following the completion of these large assemblies, annotation effort...

Bibliographic Details
Main authors: Jill L Wegrzyn, Brian Y Lin, Jacob J Zieve, William M Dougherty, Pedro J Martínez-García, Maxim Koriabine, Ann Holtz-Morris, Pieter deJong, Marc Crepeau, Charles H Langley, Daniela Puiu, Steven L Salzberg, David B Neale, Kristian A Stevens
Format: article
Language: EN
Published: Public Library of Science (PLoS), 2013
Subjects: Medicine (R), Science (Q)
Online access: https://doaj.org/article/f784f8de9a7547ad9e78ce260e448d09
id oai:doaj.org-article:f784f8de9a7547ad9e78ce260e448d09
record_format dspace
spelling oai:doaj.org-article:f784f8de9a7547ad9e78ce260e448d09
2021-11-18T08:57:04Z
Insights into the loblolly pine genome: characterization of BAC and fosmid sequences.
1932-6203
10.1371/journal.pone.0072439
https://doaj.org/article/f784f8de9a7547ad9e78ce260e448d09
2013-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24023741/?tool=EBI
https://doaj.org/toc/1932-6203
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 8, Iss 9, p e72439 (2013)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Despite their prevalence and importance, the genome sequences of loblolly pine, Norway spruce, and white spruce, three ecologically and economically important conifer species, are just becoming available to the research community. Following the completion of these large assemblies, annotation efforts will be undertaken to characterize the reference sequences. Accurate annotation of these ancient genomes would be aided by a comprehensive repeat library; however, few studies have generated enough sequence to fully evaluate and catalog their non-genic content. In this paper, two sets of loblolly pine genomic sequence, 103 previously assembled BACs and 90,954 newly sequenced and assembled fosmid scaffolds, were analyzed. Together, this sequence represents 280 Mbp (roughly 1% of the loblolly pine genome) and one of the most comprehensive studies of repetitive elements and genes in a gymnosperm species. A combination of homology and de novo methodologies were applied to identify both conserved and novel repeats. Similarity analysis estimated a repetitive content of 27% that included both full and partial elements. When combined with the de novo investigation, the estimate increased to almost 86%. Over 60% of the repetitive sequence consists of full or partial LTR (long terminal repeat) retrotransposons. Through de novo approaches, 6,270 novel, full-length transposable element families and 9,415 sub-families were identified. Among those 6,270 families, 82% were annotated as single-copy. Several of the novel, high-copy families are described here, with the largest, PtPiedmont, comprising 133 full-length copies. In addition to repeats, analysis of the coding region reported 23 full-length eukaryotic orthologous proteins (KOGS) and another 29 novel or orthologous genes. These discoveries, along with other genomic resources, will be used to annotate conifer genomes and address long-standing questions about gymnosperm evolution.
format article
author Jill L Wegrzyn
Brian Y Lin
Jacob J Zieve
William M Dougherty
Pedro J Martínez-García
Maxim Koriabine
Ann Holtz-Morris
Pieter deJong
Marc Crepeau
Charles H Langley
Daniela Puiu
Steven L Salzberg
David B Neale
Kristian A Stevens
author_sort Jill L Wegrzyn
title Insights into the loblolly pine genome: characterization of BAC and fosmid sequences.
publisher Public Library of Science (PLoS)
publishDate 2013
url https://doaj.org/article/f784f8de9a7547ad9e78ce260e448d09
_version_ 1718421180460826624