Synthetic data generation with probabilistic Bayesian Networks

Bayesian Network (BN) modeling is a prominent and increasingly popular method in computational systems biology. It aims to construct, from large heterogeneous biological datasets, network graphs that reflect the underlying biological relationships. Currently, a variety of strategies exist for evaluat...

Bibliographic Details
Main authors: Grigoriy Gogoshin, Sergio Branciamore, Andrei S. Rodin
Format: article
Language: EN
Published: AIMS Press 2021
Online access: https://doaj.org/article/02b13fa56cb2452892c55bbe03f653a3
id oai:doaj.org-article:02b13fa56cb2452892c55bbe03f653a3
record_format dspace
spelling oai:doaj.org-article:02b13fa56cb2452892c55bbe03f653a3
2021-11-29T01:23:02Z
Synthetic data generation with probabilistic Bayesian Networks
10.3934/mbe.2021426
1551-0018
https://doaj.org/article/02b13fa56cb2452892c55bbe03f653a3
2021-10-01T00:00:00Z
https://www.aimspress.com/article/doi/10.3934/mbe.2021426?viewType=HTML
https://doaj.org/toc/1551-0018
Grigoriy Gogoshin
Sergio Branciamore
Andrei S. Rodin
AIMS Press
article
bayesian networks
synthetic data generation
directed acyclic graph
probabilistic graphical models
markov blanket
central limit
Biotechnology
TP248.13-248.65
Mathematics
QA1-939
EN
Mathematical Biosciences and Engineering, Vol 18, Iss 6, Pp 8603-8621 (2021)
institution DOAJ
collection DOAJ
language EN
topic bayesian networks
synthetic data generation
directed acyclic graph
probabilistic graphical models
markov blanket
central limit
Biotechnology
TP248.13-248.65
Mathematics
QA1-939
description Bayesian Network (BN) modeling is a prominent and increasingly popular method in computational systems biology. It aims to construct, from large heterogeneous biological datasets, network graphs that reflect the underlying biological relationships. Currently, a variety of strategies exist for evaluating the performance of BN methodology, ranging from artificial benchmark datasets and models, to specialized biological benchmark datasets, to simulation studies that generate synthetic data from predefined network models. The last is arguably the most comprehensive approach; however, existing implementations often rely on explicit and implicit assumptions that may be unrealistic in a typical biological data analysis scenario, or are poorly equipped for automated generation of arbitrary models. In this study, we develop a purely probabilistic simulation framework that addresses the demands of statistically sound simulation studies in an unbiased fashion. Additionally, we expand on our current understanding of the theoretical notions of causality and dependence / conditional independence in BNs and the Markov blankets within them.
format article
author Grigoriy Gogoshin
Sergio Branciamore
Andrei S. Rodin
title Synthetic data generation with probabilistic Bayesian Networks
publisher AIMS Press
publishDate 2021
url https://doaj.org/article/02b13fa56cb2452892c55bbe03f653a3
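
The abstract refers to simulation studies that generate synthetic data from predefined network models. As a rough illustration of that general idea only, the following is a minimal sketch of ancestral (forward) sampling from a small discrete Bayesian network. It is not the authors' framework: the three-node structure, the probability values, and every function name here are hypothetical and chosen purely for illustration.

```python
import random

# Hypothetical three-node network: A -> B and A -> C.
# Node names and probabilities are invented for illustration,
# not taken from the paper.
PARENTS = {"A": (), "B": ("A",), "C": ("A",)}

# CPT[node][parent_values] = P(node = 1 | parents = parent_values)
CPT = {
    "A": {(): 0.3},
    "B": {(0,): 0.1, (1,): 0.8},
    "C": {(0,): 0.5, (1,): 0.2},
}


def topological_order(parents):
    """Return nodes ordered so that every node follows all of its parents."""
    order, placed = [], set()
    while len(order) < len(parents):
        progressed = False
        for node, pars in parents.items():
            if node not in placed and all(p in placed for p in pars):
                order.append(node)
                placed.add(node)
                progressed = True
        if not progressed:
            raise ValueError("graph has a cycle, so it is not a DAG")
    return order


def sample_record(rng, parents, cpt, order):
    """Draw one joint sample by visiting nodes parents-first."""
    record = {}
    for node in order:
        key = tuple(record[p] for p in parents[node])
        record[node] = 1 if rng.random() < cpt[node][key] else 0
    return record


def generate(n, seed=0):
    """Generate n synthetic records from the hypothetical network."""
    rng = random.Random(seed)
    order = topological_order(PARENTS)
    return [sample_record(rng, PARENTS, CPT, order) for _ in range(n)]


if __name__ == "__main__":
    for row in generate(5, seed=42):
        print(row)
```

Because each node is sampled only after its parents, every record is a draw from the joint distribution that the graph and its conditional probability tables define. A structure-learning method can then be scored against the known graph, which is the kind of evaluation the abstract describes.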