Multiple imputation with compatibility for high-dimensional data.

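Multiple Imputation (MI) is always challenging in high-dimensional settings. An imputation model built on a selected subset of predictors can be incompatible with the analysis model, leading to inconsistent and biased estimates. Although compatibility may not be achievable in such cases, one can obtain consistent and unbiased estimates using a semi-compatible imputation model. We propose to relax the lasso penalty for selecting a large set of variables (at most n). The substantive model, which also uses some formal variable selection procedure in high-dimensional structures, is then expected to be nested in this imputation model. The resulting imputation model will be semi-compatible with high probability. The likelihood estimates can be unstable and can face convergence issues as the number of variables becomes nearly as large as the sample size. To address these issues, we further propose to use a ridge penalty for obtaining the posterior distribution of the parameters based on the observed data. The proposed technique is compared with the standard MI software and MI techniques available for high-dimensional data in simulation studies and a real-life dataset. Our results exhibit the superiority of the proposed approach to the existing MI approaches while addressing the compatibility issue.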
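The ridge-penalized posterior step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a single continuous incomplete variable `y`, a predictor matrix `X` whose columns have already been chosen by some selection step, a normal linear imputation model, and a crude residual degrees of freedom of `n - p`. The function names, the penalty value `lam`, and the number of imputations `m` are all hypothetical choices made for illustration.

```python
import numpy as np

def ridge_posterior_draw(X_obs, y_obs, lam, rng):
    """Draw (beta, sigma2) from an approximate posterior with a ridge penalty."""
    n, p = X_obs.shape
    # The ridge term keeps the cross-product matrix invertible when p is close to n.
    A = X_obs.T @ X_obs + lam * np.eye(p)
    A_inv = np.linalg.inv(A)
    A_inv = (A_inv + A_inv.T) / 2.0  # enforce symmetry before the Cholesky step
    beta_hat = A_inv @ X_obs.T @ y_obs
    resid = y_obs - X_obs @ beta_hat
    df = max(n - p, 1)  # assumption: crude residual degrees of freedom
    sigma2 = float(resid @ resid) / rng.chisquare(df)
    # beta ~ N(beta_hat, sigma2 * A_inv), drawn through a Cholesky factor
    chol = np.linalg.cholesky(A_inv)
    beta = beta_hat + np.sqrt(sigma2) * (chol @ rng.standard_normal(p))
    return beta, sigma2

def impute_once(X, y, lam, rng):
    """Fill the missing entries of y (coded as NaN) with one posterior predictive draw."""
    miss = np.isnan(y)
    beta, sigma2 = ridge_posterior_draw(X[~miss], y[~miss], lam, rng)
    y_imp = y.copy()
    noise = np.sqrt(sigma2) * rng.standard_normal(int(miss.sum()))
    y_imp[miss] = X[miss] @ beta + noise
    return y_imp

def multiple_impute(X, y, m=5, lam=1.0, seed=0):
    """Return m completed copies of y, each based on a fresh parameter draw."""
    rng = np.random.default_rng(seed)
    return [impute_once(X, y, lam, rng) for _ in range(m)]
```

Each completed copy of `y` would then be analysed separately and the results pooled in the usual MI fashion. Drawing the parameters afresh for every imputation, rather than reusing one point estimate, is what lets the imputations reflect parameter uncertainty.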

Bibliographic Details
Main authors: Faisal Maqbool Zahid, Shahla Faisal, Christian Heumann
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects: Medicine (R), Science (Q)
Online access: https://doaj.org/article/cc55db66f7de4a9db934f51c1ba44619
id oai:doaj.org-article:cc55db66f7de4a9db934f51c1ba44619
record_format dspace
spelling oai:doaj.org-article:cc55db66f7de4a9db934f51c1ba44619
2021-12-02T20:19:40Z
Multiple imputation with compatibility for high-dimensional data.
1932-6203
10.1371/journal.pone.0254112
https://doaj.org/article/cc55db66f7de4a9db934f51c1ba44619
2021-01-01T00:00:00Z
https://doi.org/10.1371/journal.pone.0254112
https://doaj.org/toc/1932-6203
Faisal Maqbool Zahid
Shahla Faisal
Christian Heumann
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 16, Iss 7, p e0254112 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Multiple Imputation (MI) is always challenging in high-dimensional settings. An imputation model built on a selected subset of predictors can be incompatible with the analysis model, leading to inconsistent and biased estimates. Although compatibility may not be achievable in such cases, one can obtain consistent and unbiased estimates using a semi-compatible imputation model. We propose to relax the lasso penalty for selecting a large set of variables (at most n). The substantive model, which also uses some formal variable selection procedure in high-dimensional structures, is then expected to be nested in this imputation model. The resulting imputation model will be semi-compatible with high probability. The likelihood estimates can be unstable and can face convergence issues as the number of variables becomes nearly as large as the sample size. To address these issues, we further propose to use a ridge penalty for obtaining the posterior distribution of the parameters based on the observed data. The proposed technique is compared with the standard MI software and MI techniques available for high-dimensional data in simulation studies and a real-life dataset. Our results exhibit the superiority of the proposed approach to the existing MI approaches while addressing the compatibility issue.
format article
author Faisal Maqbool Zahid
Shahla Faisal
Christian Heumann
title Multiple imputation with compatibility for high-dimensional data.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/cc55db66f7de4a9db934f51c1ba44619