Ranking of a wide multidomain set of predictor variables of children obesity by machine learning variable importance techniques

Abstract The increased prevalence of childhood obesity is expected to translate in the near future into a concomitant soaring of multiple cardio-metabolic diseases. Obesity has a complex, multifactorial etiology, that includes multiple and multidomain potential risk factors: genetics, dietary and ph...

Bibliographic Details
Main authors: Helena Marcos-Pasero, Gonzalo Colmenarejo, Elena Aguilar-Aguilar, Ana Ramírez de Molina, Guillermo Reglero, Viviana Loria-Kohen
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
Medicine (R)
Science (Q)
Online access: https://doaj.org/article/3137e6569f2743b28bb18e71bab182c5
id oai:doaj.org-article:3137e6569f2743b28bb18e71bab182c5
record_format dspace
spelling oai:doaj.org-article:3137e6569f2743b28bb18e71bab182c5
2021-12-02T10:49:29Z
Ranking of a wide multidomain set of predictor variables of children obesity by machine learning variable importance techniques
10.1038/s41598-021-81205-8
2045-2322
https://doaj.org/article/3137e6569f2743b28bb18e71bab182c5
2021-01-01T00:00:00Z
https://doi.org/10.1038/s41598-021-81205-8
https://doaj.org/toc/2045-2322
Nature Portfolio
article
EN
Scientific Reports, Vol 11, Iss 1, Pp 1-14 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Abstract The increased prevalence of childhood obesity is expected to translate in the near future into a concomitant soaring of multiple cardio-metabolic diseases. Obesity has a complex, multifactorial etiology, that includes multiple and multidomain potential risk factors: genetics, dietary and physical activity habits, socio-economic environment, lifestyle, etc. In addition, all these factors are expected to exert their influence through a specific and especially convoluted way during childhood, given the fast growth along this period. Machine Learning methods are the appropriate tools to model this complexity, given their ability to cope with high-dimensional, non-linear data. Here, we have analyzed by Machine Learning a sample of 221 children (6–9 years) from Madrid, Spain. Both Random Forest and Gradient Boosting Machine models have been derived to predict the body mass index from a wide set of 190 multidomain variables (including age, sex, genetic polymorphisms, lifestyle, socio-economic, diet, exercise, and gestation ones). A consensus relative importance of the predictors has been estimated through variable importance measures, implemented robustly through an iterative process that included permutation and multiple imputation. We expect this analysis will help to shed light on the most important variables associated to childhood obesity, in order to choose better treatments for its prevention.
format article
author Helena Marcos-Pasero
Gonzalo Colmenarejo
Elena Aguilar-Aguilar
Ana Ramírez de Molina
Guillermo Reglero
Viviana Loria-Kohen
title Ranking of a wide multidomain set of predictor variables of children obesity by machine learning variable importance techniques
publisher Nature Portfolio
publishDate 2021
url https://doaj.org/article/3137e6569f2743b28bb18e71bab182c5
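
The abstract describes estimating a consensus relative importance of the predictors from two models through permutation-based variable importance. The following is a minimal, model-agnostic sketch of that idea, not the authors' implementation. It assumes that "permutation" refers to permutation importance (the increase in prediction error when one predictor is shuffled) and that the consensus is a plain average of rescaled importances; both are assumptions. The data are synthetic, the two predictor functions `model_a` and `model_b` are hypothetical stand-ins for a fitted Random Forest and Gradient Boosting Machine, and the multiple-imputation step is omitted.

```python
import random
from statistics import mean


def mse(predict, X, y):
    """Mean squared error of `predict` over rows X with targets y."""
    return mean((predict(row) - target) ** 2 for row, target in zip(X, y))


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Average increase in MSE when one column at a time is shuffled."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(predict, X_perm, y) - baseline)
        importances.append(mean(increases))
    return importances


def consensus(importance_lists):
    """Rescale each model's importances to sum to 1, then average them."""
    scaled = []
    for imps in importance_lists:
        clipped = [max(v, 0.0) for v in imps]  # negative values mean "no signal"
        total = sum(clipped) or 1.0
        scaled.append([v / total for v in clipped])
    return [mean(col) for col in zip(*scaled)]


# Synthetic data (not from the study): the target depends strongly on
# feature 0, weakly on feature 1, and not at all on feature 2.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [3.0 * a + 0.5 * b + data_rng.gauss(0, 0.1) for a, b, _ in X]


def model_a(row):
    """Hypothetical stand-in for one fitted model."""
    return 3.0 * row[0] + 0.5 * row[1]


def model_b(row):
    """Hypothetical stand-in for a second fitted model."""
    return 2.8 * row[0] + 0.6 * row[1]


scores = consensus([
    permutation_importance(model_a, X, y),
    permutation_importance(model_b, X, y),
])
# Feature indices ordered from most to least important.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

In a real analysis the two stand-in functions would be replaced by the prediction methods of the fitted models, and the whole procedure would presumably be repeated over each imputed copy of the data before averaging.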