Baiting out a full length sequence from unmapped RNA-seq data

Abstract Background: RNA-Seq is a powerful tool that has been widely used in various studies. Unmapped RNA-seq reads have usually been considered useless and discarded or ignored. Results: We develop a strategy for mining a full-length sequence from unmapped reads by combining specific reverse tra...

Bibliographic Details
Main Authors: Dongwei Li, Qitong Huang, Lei Huang, Jikai Wen, Jing Luo, Qing Li, Yanling Peng, Yubo Zhang
Format: article
Language: EN
Published: BMC 2021
Subjects: Unmapped reads; Full length sequence; Statistical model; RNA-seq; Biotechnology; Genetics
Online Access: https://doaj.org/article/64905360506f4e71be749e7023442c74
id oai:doaj.org-article:64905360506f4e71be749e7023442c74
record_format dspace
spelling oai:doaj.org-article:64905360506f4e71be749e7023442c74
2021-11-28T12:23:16Z
Baiting out a full length sequence from unmapped RNA-seq data
10.1186/s12864-021-08146-4
1471-2164
https://doaj.org/article/64905360506f4e71be749e7023442c74
2021-11-01T00:00:00Z
https://doi.org/10.1186/s12864-021-08146-4
https://doaj.org/toc/1471-2164
BMC Genomics, Vol 22, Iss 1, Pp 1-8 (2021)
institution DOAJ
collection DOAJ
language EN
topic Unmapped reads
Full length sequence
Statistical model
RNA-seq
Biotechnology
TP248.13-248.65
Genetics
QH426-470
description Abstract Background: RNA-Seq is a powerful tool that has been widely used in various studies. Unmapped RNA-seq reads have usually been considered useless and discarded or ignored. Results: We develop a strategy for mining a full-length sequence from unmapped reads by combining specific reverse transcription primer design with high-throughput sequencing. In this study, we salvage 36 unmapped reads from standard RNA-Seq data and randomly select one 149 bp read as a model. Specific reverse transcription primers are designed to amplify both of its ends, followed by next-generation sequencing. We then design a statistical model based on a power-law distribution to estimate the completeness and significance of the extended sequence. We further validate it by Sanger sequencing. The result shows that the full length is 1556 bp, with insertion mutations in a microsatellite structure. Conclusion: We believe this method is a useful strategy for extracting sequence information from unmapped RNA-seq data. It also offers an alternative way to obtain the full-length sequence of an unknown cDNA.
format article
author Dongwei Li
Qitong Huang
Lei Huang
Jikai Wen
Jing Luo
Qing Li
Yanling Peng
Yubo Zhang
author_sort Dongwei Li
title Baiting out a full length sequence from unmapped RNA-seq data
publisher BMC
publishDate 2021
url https://doaj.org/article/64905360506f4e71be749e7023442c74
_version_ 1718408019246579712