Ancestral informative marker selection and population structure visualization using sparse Laplacian eigenfunctions.

Identification of a small panel of population structure informative markers can reduce genotyping cost and is useful in various applications, such as ancestry inference in association mapping, forensics and evolutionary theory in population genetics. Traditional methods to ascertain ancestral inform...

Bibliographic Details
Main author: Jun Zhang
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2010
Subjects:
R
Q
Online access: https://doaj.org/article/08e01f08b5504315b98de80a2334801c
id oai:doaj.org-article:08e01f08b5504315b98de80a2334801c
record_format dspace
spelling oai:doaj.org-article:08e01f08b5504315b98de80a2334801c
2021-11-18T07:02:30Z
Ancestral informative marker selection and population structure visualization using sparse Laplacian eigenfunctions.
1932-6203
10.1371/journal.pone.0013734
https://doaj.org/article/08e01f08b5504315b98de80a2334801c
2010-11-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21079796/pdf/?tool=EBI
https://doaj.org/toc/1932-6203
Jun Zhang
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 5, Iss 11, p e13734 (2010)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Identification of a small panel of population structure informative markers can reduce genotyping cost and is useful in various applications, such as ancestry inference in association mapping, forensics and evolutionary theory in population genetics. Traditional methods to ascertain ancestral informative markers usually require prior knowledge of individual ancestry and have difficulty with admixed populations. Recently, Principal Components Analysis (PCA) has been employed with success to select SNPs which are highly correlated with the top significant principal components (PCs) without use of individual ancestral information. The approach is also applicable to admixed populations. Here we propose a novel approach based on our recent result on summarizing population structure by graph Laplacian eigenfunctions, which differs from PCA in that it is geometric and robust to outliers. Our approach also takes advantage of the a priori sparseness of informative markers in the genome. Through simulation of a ring population and the real global population sample HGDP of 650K SNPs genotyped in 940 unrelated individuals, we validate the proposed algorithm at selecting the most informative markers, a small fraction of which can efficiently recover a similar underlying population structure. Employing a standard Support Vector Machine (SVM) to predict individuals' continental memberships on the HGDP dataset of seven continents, we demonstrate that the SNPs selected by our method are more informative and less redundant than those selected by PCA. Our algorithm is a promising tool in genome-wide association studies and population genetics, facilitating the selection of structure informative markers, efficient detection of population substructure and ancestral inference.
format article
author Jun Zhang
title Ancestral informative marker selection and population structure visualization using sparse Laplacian eigenfunctions.
publisher Public Library of Science (PLoS)
publishDate 2010
url https://doaj.org/article/08e01f08b5504315b98de80a2334801c
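
Illustrative sketch (not part of the catalogued record): the description above refers to summarizing population structure with graph Laplacian eigenfunctions and then picking markers that track that structure. The Python below is a minimal sketch of that general idea under stated assumptions, not the article's algorithm. It builds a Gaussian-weighted similarity graph over simulated individuals, takes the first nontrivial eigenvector of the unnormalized graph Laplacian, and ranks SNPs by absolute correlation with it. The correlation ranking is a simple stand-in for the sparse selection step named in the title, and the function names (`simulate`, `fiedler_vector`, `rank_snps`), the two-population simulation, the allele frequencies, and the median-distance bandwidth are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate(seed=0, n_per_pop=30, n_inf=40, n_noise=60):
    """Toy genotypes (0/1/2) for two populations; the first n_inf SNPs differ in frequency."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n_per_pop)
    freq = np.full((2, n_inf + n_noise), 0.5)   # uninformative SNPs: same frequency in both
    freq[0, :n_inf] = 0.1                       # informative SNPs: assumed frequencies
    freq[1, :n_inf] = 0.9
    G = rng.binomial(2, freq[labels]).astype(float)
    return G, labels

def fiedler_vector(G):
    """First nontrivial eigenvector of a Gaussian-weighted graph Laplacian over individuals."""
    X = G - G.mean(axis=0)                      # center each SNP column
    sq = (X ** 2).sum(axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)  # squared distances
    n = G.shape[0]
    sigma2 = np.median(D2[np.triu_indices(n, k=1)])  # bandwidth heuristic (assumption)
    W = np.exp(-D2 / sigma2)                    # dense weights keep the graph connected
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W              # unnormalized Laplacian L = D - W
    _, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    return vecs[:, 1]                           # skip the constant eigenvector

def rank_snps(G, v):
    """SNP indices sorted by decreasing absolute correlation with the vector v."""
    Xc = G - G.mean(axis=0)
    vc = v - v.mean()
    corr = (Xc.T @ vc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(vc) + 1e-12)
    return np.argsort(-np.abs(corr))

G, labels = simulate()
v = fiedler_vector(G)
top = rank_snps(G, v)[:20]                      # candidate structure-informative SNPs
```

In this toy setting the informative SNPs are columns 0 to 39 by construction, so inspecting `top` shows how many of the highest-ranked markers fall in that range. A dense Gaussian graph is used rather than a nearest-neighbor graph so that the zero eigenvalue of the Laplacian is simple and `vecs[:, 1]` is well defined.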