Fine population structure analysis method for genomes of many

Abstract Fine population structure can be examined through the clustering of individuals into subpopulations. The clustering of individuals in large sequence datasets into subpopulations makes the calculation of subpopulation specific allele frequency possible, which may shed light on selection of c...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Xuedong Pan, Yi Wang, Emily H. M. Wong, Amalio Telenti, J. Craig Venter, Li Jin
Formato: article
Lenguaje:EN
Publicado: Nature Portfolio 2017
Materias:
R
Q
Acceso en línea:https://doaj.org/article/c200df0cc40b4ab69b6fd5028613b674
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Sumario:Abstract Fine population structure can be examined through the clustering of individuals into subpopulations. The clustering of individuals in large sequence datasets into subpopulations makes the calculation of subpopulation specific allele frequency possible, which may shed light on selection of candidate variants for rare diseases. However, as the magnitude of the data increases, computational burden becomes a challenge in fine population structure analysis. To address this issue, we propose fine population structure analysis (FIPSA), which is an individual-based non-parametric method for dissecting fine population structure. FIPSA maximizes the likelihood ratio of the contingency table of the allele counts multiplied by the group. We demonstrated that its speed and accuracy were superior to existing non-parametric methods when the simulated sample size was up to 5,000 individuals. When applied to real data, the method showed high resolution on the Human Genome Diversity Project (HGDP) East Asian dataset. FIPSA was independently validated on 11,257 human genomes. The group assignment given by FIPSA was 99.1% similar to those assigned based on supervised learning. Thus, FIPSA provides high resolution and is compatible with a real dataset of more than ten thousand individuals.