Dissecting a hidden gene duplication: the Arabidopsis thaliana SEC10 locus.

Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a...

Bibliographic Details
Main Authors: Nemanja Vukašinović, Fatima Cvrčková, Marek Eliáš, Rex Cole, John E Fowler, Viktor Žárský, Lukáš Synek
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects:
R
Q
Online Access: https://doaj.org/article/84d67cbbf7b74340951bb17d21ce6493
id oai:doaj.org-article:84d67cbbf7b74340951bb17d21ce6493
record_format dspace
spelling oai:doaj.org-article:84d67cbbf7b74340951bb17d21ce6493
2021-11-18T08:23:39Z
Dissecting a hidden gene duplication: the Arabidopsis thaliana SEC10 locus.
1932-6203
10.1371/journal.pone.0094077
https://doaj.org/article/84d67cbbf7b74340951bb17d21ce6493
2014-01-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24728280/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a gene encoding a core subunit of the exocyst tethering complex, we examined whether this could be explained by a hidden gene duplication. Re-sequencing and manual assembly of the Arabidopsis thaliana SEC10 (At5g12370) locus revealed that this locus, comprising a single gene in the reference genome assembly, indeed contains two paralogous genes in tandem, SEC10a and SEC10b, and that a sequence segment of 7 kb in length is missing from the reference genome sequence. Differences between the two paralogs are concentrated in non-coding regions, while the predicted protein sequences exhibit 99% identity, differing only by substitution of five amino acid residues and an indel of four residues. Both SEC10 genes are expressed, although varying transcript levels suggest differential regulation. Homozygous T-DNA insertion mutants in either paralog exhibit a wild-type phenotype, consistent with proposed extensive functional redundancy of the two genes. By these observations we demonstrate that recently duplicated genes may remain hidden even in well-characterized genomes, such as that of A. thaliana. Moreover, we show that the use of the existing A. thaliana reference genome sequence as a guide for sequence assembly of new Arabidopsis accessions or related species has at least in some cases led to error propagation.
Nemanja Vukašinović
Fatima Cvrčková
Marek Eliáš
Rex Cole
John E Fowler
Viktor Žárský
Lukáš Synek
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 9, Iss 4, p e94077 (2014)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Repetitive sequences present a challenge for genome sequence assembly, and highly similar segmental duplications may disappear from assembled genome sequences. Having found a surprising lack of observable phenotypic deviations and non-Mendelian segregation in Arabidopsis thaliana mutants in SEC10, a gene encoding a core subunit of the exocyst tethering complex, we examined whether this could be explained by a hidden gene duplication. Re-sequencing and manual assembly of the Arabidopsis thaliana SEC10 (At5g12370) locus revealed that this locus, comprising a single gene in the reference genome assembly, indeed contains two paralogous genes in tandem, SEC10a and SEC10b, and that a sequence segment of 7 kb in length is missing from the reference genome sequence. Differences between the two paralogs are concentrated in non-coding regions, while the predicted protein sequences exhibit 99% identity, differing only by substitution of five amino acid residues and an indel of four residues. Both SEC10 genes are expressed, although varying transcript levels suggest differential regulation. Homozygous T-DNA insertion mutants in either paralog exhibit a wild-type phenotype, consistent with proposed extensive functional redundancy of the two genes. By these observations we demonstrate that recently duplicated genes may remain hidden even in well-characterized genomes, such as that of A. thaliana. Moreover, we show that the use of the existing A. thaliana reference genome sequence as a guide for sequence assembly of new Arabidopsis accessions or related species has at least in some cases led to error propagation.
format article
author Nemanja Vukašinović
Fatima Cvrčková
Marek Eliáš
Rex Cole
John E Fowler
Viktor Žárský
Lukáš Synek
title Dissecting a hidden gene duplication: the Arabidopsis thaliana SEC10 locus.
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/84d67cbbf7b74340951bb17d21ce6493