Multiple imputation with compatibility for high-dimensional data.
Multiple Imputation (MI) is always challenging in high-dimensional settings. The imputation model with some selected number of predictors can be incompatible with the analysis model, leading to inconsistent and biased estimates. Although compatibility in such cases may not be achieved, one can ob...
Main authors: | Faisal Maqbool Zahid, Shahla Faisal, Christian Heumann |
---|---|
Format: | article |
Language: | EN |
Published: |
Public Library of Science (PLoS)
2021
|
Subjects: | |
Online access: | https://doaj.org/article/cc55db66f7de4a9db934f51c1ba44619 |
id |
oai:doaj.org-article:cc55db66f7de4a9db934f51c1ba44619 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:cc55db66f7de4a9db934f51c1ba44619 2021-12-02T20:19:40Z Multiple imputation with compatibility for high-dimensional data. 1932-6203 10.1371/journal.pone.0254112 https://doaj.org/article/cc55db66f7de4a9db934f51c1ba44619 2021-01-01T00:00:00Z https://doi.org/10.1371/journal.pone.0254112 https://doaj.org/toc/1932-6203 Multiple Imputation (MI) is always challenging in high-dimensional settings. The imputation model with some selected number of predictors can be incompatible with the analysis model, leading to inconsistent and biased estimates. Although compatibility in such cases may not be achieved, one can obtain consistent and unbiased estimates using a semi-compatible imputation model. We propose to relax the lasso penalty for selecting a large set of variables (at most n). The substantive model that also uses some formal variable selection procedure in high-dimensional structures is then expected to be nested in this imputation model. The resulting imputation model will be semi-compatible with high probability. The likelihood estimates can be unstable and can face convergence issues as the number of variables becomes nearly as large as the sample size. To address these issues, we further propose to use a ridge penalty for obtaining the posterior distribution of the parameters based on the observed data. The proposed technique is compared with the standard MI software and MI techniques available for high-dimensional data in simulation studies and a real-life dataset. Our results exhibit the superiority of the proposed approach over the existing MI approaches while addressing the compatibility issue. Faisal Maqbool Zahid Shahla Faisal Christian Heumann Public Library of Science (PLoS) article Medicine R Science Q EN PLoS ONE, Vol 16, Iss 7, p e0254112 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine R Science Q |
spellingShingle |
Medicine R Science Q Faisal Maqbool Zahid Shahla Faisal Christian Heumann Multiple imputation with compatibility for high-dimensional data. |
description |
Multiple Imputation (MI) is always challenging in high-dimensional settings. The imputation model with some selected number of predictors can be incompatible with the analysis model, leading to inconsistent and biased estimates. Although compatibility in such cases may not be achieved, one can obtain consistent and unbiased estimates using a semi-compatible imputation model. We propose to relax the lasso penalty for selecting a large set of variables (at most n). The substantive model that also uses some formal variable selection procedure in high-dimensional structures is then expected to be nested in this imputation model. The resulting imputation model will be semi-compatible with high probability. The likelihood estimates can be unstable and can face convergence issues as the number of variables becomes nearly as large as the sample size. To address these issues, we further propose to use a ridge penalty for obtaining the posterior distribution of the parameters based on the observed data. The proposed technique is compared with the standard MI software and MI techniques available for high-dimensional data in simulation studies and a real-life dataset. Our results exhibit the superiority of the proposed approach over the existing MI approaches while addressing the compatibility issue. |
format |
article |
author |
Faisal Maqbool Zahid Shahla Faisal Christian Heumann |
author_facet |
Faisal Maqbool Zahid Shahla Faisal Christian Heumann |
author_sort |
Faisal Maqbool Zahid |
title |
Multiple imputation with compatibility for high-dimensional data. |
title_short |
Multiple imputation with compatibility for high-dimensional data. |
title_full |
Multiple imputation with compatibility for high-dimensional data. |
title_fullStr |
Multiple imputation with compatibility for high-dimensional data. |
title_full_unstemmed |
Multiple imputation with compatibility for high-dimensional data. |
title_sort |
multiple imputation with compatibility for high-dimensional data. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2021 |
url |
https://doaj.org/article/cc55db66f7de4a9db934f51c1ba44619 |
work_keys_str_mv |
AT faisalmaqboolzahid multipleimputationwithcompatibilityforhighdimensionaldata AT shahlafaisal multipleimputationwithcompatibilityforhighdimensionaldata AT christianheumann multipleimputationwithcompatibilityforhighdimensionaldata |
_version_ |
1718374192307503104 |
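
Illustrative note on the abstract: the two-stage idea it describes (a relaxed lasso penalty to select a generous predictor set of at most n variables, then a ridge penalty when drawing parameters for imputation) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it imputes a single incomplete continuous variable from fully observed predictors, holds the residual variance fixed instead of drawing it, and every function name, the simulated data, and the values `lam=0.05` and `ridge=1.0` are hypothetical choices made for illustration.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]  # add back predictor j's contribution
            rho = X[:, j] @ resid / n
            # Soft-thresholding update for coordinate j.
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]
    return beta


def select_predictors(X, y, lam, max_vars):
    """Keep the lasso-selected predictors, capped at max_vars (assumed cap: n)."""
    beta = lasso_cd(X, y, lam)
    active = np.flatnonzero(beta)
    if active.size == 0:
        raise ValueError("no predictor selected; try a smaller lam")
    if active.size > max_vars:
        order = np.argsort(-np.abs(beta[active]))
        active = np.sort(active[order[:max_vars]])
    return active


def ridge_impute(X, y, active, ridge, rng):
    """Draw one imputed copy of y using a ridge-penalized normal model."""
    obs = ~np.isnan(y)
    mu_x = X[obs][:, active].mean(axis=0)
    mu_y = y[obs].mean()
    Xo = X[obs][:, active] - mu_x
    yo = y[obs] - mu_y
    A = Xo.T @ Xo + ridge * np.eye(active.size)  # ridge-stabilized cross-product
    b_hat = np.linalg.solve(A, Xo.T @ yo)
    resid = yo - Xo @ b_hat
    sigma2 = resid @ resid / max(int(obs.sum()) - active.size, 1)
    # Draw b* ~ N(b_hat, sigma2 * A^{-1}) via the Cholesky factor A = L L^T.
    L = np.linalg.cholesky(A)
    z = rng.standard_normal(active.size)
    b_star = b_hat + np.sqrt(sigma2) * np.linalg.solve(L.T, z)
    Xm = X[~obs][:, active] - mu_x
    y_imp = y.copy()
    noise = rng.normal(0.0, np.sqrt(sigma2), size=Xm.shape[0])
    y_imp[~obs] = mu_y + Xm @ b_star + noise
    return y_imp


# Simulated high-dimensional example (p > n); all values are illustrative.
rng = np.random.default_rng(0)
n, p = 60, 100
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 0.5, size=n)
y_missing = y.copy()
y_missing[:15] = np.nan

obs = ~np.isnan(y_missing)
Xo = X[obs] - X[obs].mean(axis=0)
yo = y_missing[obs] - y_missing[obs].mean()
active = select_predictors(Xo, yo, lam=0.05, max_vars=int(obs.sum()))
imputations = [ridge_impute(X, y_missing, active, ridge=1.0, rng=rng) for _ in range(5)]
```

A small penalty keeps many predictors, so an analysis model chosen later by its own variable selection is more likely to be nested in the imputation model; the ridge term keeps the linear system solvable when the number of selected predictors approaches the number of observed cases.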