Genomic data and disease forecasting: application to type 2 diabetes (T2D).

A general approach is presented for the extraction of a classifier of disease risk that is latent in large-scale disease/control databases. Novel features are the following: (1) a data reorganization into a regularized standard form that emphasizes individual alleles instead of the single nucleotide...

Bibliographic Details
Main Author: Lawrence Sirovich
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2014
Subjects:
R
Q
Online Access: https://doaj.org/article/ee6d0247e21f49c1ae3713971e3ad0f4
id oai:doaj.org-article:ee6d0247e21f49c1ae3713971e3ad0f4
record_format dspace
spelling oai:doaj.org-article:ee6d0247e21f49c1ae3713971e3ad0f4 2021-11-18T08:37:27Z Genomic data and disease forecasting: application to type 2 diabetes (T2D). 1932-6203 10.1371/journal.pone.0085684 https://doaj.org/article/ee6d0247e21f49c1ae3713971e3ad0f4 2014-01-01T00:00:00Z https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24465649/pdf/?tool=EBI https://doaj.org/toc/1932-6203 Lawrence Sirovich Public Library of Science (PLoS) article Medicine R Science Q EN PLoS ONE, Vol 9, Iss 1, p e85684 (2014)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description A general approach is presented for the extraction of a classifier of disease risk that is latent in large-scale disease/control databases. Novel features are the following: (1) a data reorganization into a regularized standard form that emphasizes individual alleles instead of the single nucleotide polymorphism (SNP) allele pair to which they belong; (2) from this, a procedure that significantly enhances the discovery of high-value genomic loci; (3) an investigative analysis based on the hypothesis that disease represents a very small signal (small signal-to-noise) that is latent in the data. The resulting analyses, applied to the FUSION T2D database, lead to the polling of thousands of genomic loci to classify disease. This large genomic kernel of loci is shared by non-diabetics at nearly the same high level, but a small, well-defined separation exists, and it is speculated that this might be due to unconventional disease mechanisms. Another analysis demonstrates that the FUSION database size limits its disease predictability, and only one-third of the resulting classifier loci are estimated to relate to T2D. The remainder is associated with hidden features that might contrast the disease and control populations and that more data would eliminate.
format article
author Lawrence Sirovich
title Genomic data and disease forecasting: application to type 2 diabetes (T2D).
publisher Public Library of Science (PLoS)
publishDate 2014
url https://doaj.org/article/ee6d0247e21f49c1ae3713971e3ad0f4