A Bayesian network approach incorporating imputation of missing data enables exploratory analysis of complex causal biological relationships.

Bayesian networks can be used to identify possible causal relationships between variables based on their conditional dependencies and independencies, which can be particularly useful in complex biological scenarios with many measured variables. Here we propose two improvements to an existing method...

Bibliographic Details
Main authors: Richard Howey, Alexander D Clark, Najib Naamane, Louise N Reynard, Arthur G Pratt, Heather J Cordell
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects: Genetics
Online access: https://doaj.org/article/0f1187fea9e34709af36a64f09f121e9
id oai:doaj.org-article:0f1187fea9e34709af36a64f09f121e9
record_format dspace
spelling oai:doaj.org-article:0f1187fea9e34709af36a64f09f121e9
record datestamp 2021-12-02T20:03:01Z
ISSN 1553-7390, 1553-7404
DOI 10.1371/journal.pgen.1009811 (https://doi.org/10.1371/journal.pgen.1009811)
date 2021-09-01T00:00:00Z
journal https://doaj.org/toc/1553-7390, https://doaj.org/toc/1553-7404
source PLoS Genetics, Vol 17, Iss 9, p e1009811 (2021)
institution DOAJ
collection DOAJ
language EN
topic Genetics
QH426-470
description Bayesian networks can be used to identify possible causal relationships between variables based on their conditional dependencies and independencies, which can be particularly useful in complex biological scenarios with many measured variables. Here we propose two improvements to an existing method for Bayesian network analysis, designed to increase the power to detect potential causal relationships between variables (including potentially a mixture of both discrete and continuous variables). Our first improvement relates to the treatment of missing data. When data are missing, the standard approach is to remove every individual with any missing data before performing analysis. This can be wasteful and undesirable when there are many individuals with missing data, perhaps with only one or a few variables missing. This motivates the use of imputation. We present a new imputation method that uses a version of nearest neighbour imputation, whereby missing data from one individual is replaced with data from another individual, their nearest neighbour. For each individual with missing data, the subsets of variables to be used to select the nearest neighbour are chosen by sampling the complete data without replacement and estimating a best fit Bayesian network. We show that this approach leads to marked improvements in the recall and precision of directed edges in the final network identified, and we illustrate the approach through application to data from a recent study investigating the causal relationship between methylation and gene expression in early inflammatory arthritis patients. We also describe a second improvement in the form of a pseudo-Bayesian approach for upweighting certain network edges, which can be useful when there is prior evidence concerning their directions.
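The nearest neighbour step mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `nearest_neighbour_impute` and the `match_vars` argument are hypothetical, and the variables used for matching are simply supplied by the caller (or default to all observed variables) rather than being chosen from a Bayesian network fitted to resampled complete data, as the abstract describes. It only shows the basic idea of copying missing values from the most similar complete individual.

```python
import math


def nearest_neighbour_impute(rows, match_vars=None):
    """Fill None entries in each row with values from the closest complete row.

    rows: list of dicts mapping variable name -> float or None (None = missing).
    match_vars: hypothetical argument naming the variables used to measure
        distance; defaults to every variable observed in the incomplete row.
    Returns a new list of dicts; the input rows are not modified.
    """
    complete = [r for r in rows if all(v is not None for v in r.values())]
    if not complete:
        raise ValueError("need at least one complete individual as a donor")

    # Scale each variable by its standard deviation among complete rows so
    # that no single variable dominates the distance.
    names = list(complete[0].keys())
    scale = {}
    for name in names:
        values = [r[name] for r in complete]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale[name] = math.sqrt(var) or 1.0  # avoid dividing by zero

    imputed = []
    for row in rows:
        missing = [n for n in names if row[n] is None]
        if not missing:
            imputed.append(dict(row))
            continue

        # Only variables observed in this individual can be used for matching.
        observed = [n for n in names if row[n] is not None]
        use = [n for n in (match_vars or observed) if n in observed]

        def distance(donor, row=row, use=use):
            return math.sqrt(
                sum(((row[n] - donor[n]) / scale[n]) ** 2 for n in use)
            )

        donor = min(complete, key=distance)
        filled = dict(row)
        for n in missing:
            filled[n] = donor[n]  # copy the donor's value for each gap
        imputed.append(filled)
    return imputed


# Toy data: the third individual is missing its expression value.
data = [
    {"methylation": 0.10, "expression": 1.0},
    {"methylation": 0.90, "expression": 5.0},
    {"methylation": 0.85, "expression": None},
]
result = nearest_neighbour_impute(data)
print(result[2])
```

In this toy example the third individual is closest to the second on methylation, so it receives that individual's expression value. The sketch handles continuous variables only; a mixture of discrete and continuous variables would need a different distance measure.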
format article
author Richard Howey
Alexander D Clark
Najib Naamane
Louise N Reynard
Arthur G Pratt
Heather J Cordell
title A Bayesian network approach incorporating imputation of missing data enables exploratory analysis of complex causal biological relationships.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/0f1187fea9e34709af36a64f09f121e9