Sequence variations, flanking region mutations, and allele frequency at 31 autosomal STRs in the central Indian population by next generation sequencing (NGS)

Abstract Capillary electrophoresis-based analysis does not reflect the exact allele number variation at the STR loci due to the non-availability of the data on sequence variation in the repeat region and the SNPs in flanking regions. Herein, this study reports the length-based and sequence-based all...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Hirak Ranjan Dash, Kamlesh Kaitholia, R. K. Kumawat, Anil Kumar Singh, Pankaj Shrivastava, Gyaneshwer Chaubey, Surajit Das
Formato: article
Lenguaje:EN
Publicado: Nature Portfolio 2021
Materias:
R
Q
Acceso en línea:https://doaj.org/article/02ba23130885482b982d75bdefef4906
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Sumario:Abstract Capillary electrophoresis-based analysis does not reflect the exact allele number variation at the STR loci due to the non-availability of the data on sequence variation in the repeat region and the SNPs in flanking regions. Herein, this study reports the length-based and sequence-based allelic data of 138 central Indian individuals at 31 autosomal STR loci by NGS. The sequence data at each allele was compared to the reference hg19 sequence. The length-based allelic results were found in concordance with the CE-based results. 20 out of 31 autosomal STR loci showed an increase in the number of alleles by the presence of sequence variation and/or SNPs in the flanking regions. The highest gain in the heterozygosity and allele numbers was observed in D5S2800, D1S1656, D16S539, D5S818, and vWA. rs25768 (A/G) at D5S818 was found to be the most frequent SNP in the studied population. Allele no. 15 of D3S1358, allele no. 19 of D2S1338, and allele no. 22 of D12S391 showed 5 isoalleles each with the same size and with different intervening sequences. Length-based determination of the alleles showed Penta E to be the most useful marker in the central Indian population among 31 STRs studied; however, sequence-based analysis advocated D2S1338 to be the most useful marker in terms of various forensic parameters. Population genetics analysis showed a shared genetic ancestry of the studied population with other Indian populations. This first-ever study to the best of our knowledge on sequence-based STR analysis in the central Indian population is expected to prove the use of NGS in forensic case-work and in forensic DNA laboratories.