Big data integration enhancement based on attributes conditional dependency and similarity index method

Big data has attracted considerable attention across many domains. The volume of digital data generated today in every domain is enormous, and at the same time the demand for acquiring such information for analysis and decision-making is growing in every field. So, it is significant to integrate the related i...

Bibliographic Details
Main Authors:	Vishnu Vandana Kolisetty, Dharmendra Singh Rajput
Format:	article
Language:	EN
Published:	AIMS Press 2021
Subjects:	integration attributes dependency; similarity index; big data; Biotechnology (TP248.13-248.65); Mathematics (QA1-939)
Online Access:	https://doaj.org/article/9c84f8a1e0924436a610e6aa515bfd66

id	oai:doaj.org-article:9c84f8a1e0924436a610e6aa515bfd66
record_format	dspace
spelling	oai:doaj.org-article:9c84f8a1e0924436a610e6aa515bfd66; 2021-11-29T01:33:46Z; Big data integration enhancement based on attributes conditional dependency and similarity index method; 10.3934/mbe.2021429; 1551-0018; https://doaj.org/article/9c84f8a1e0924436a610e6aa515bfd66; 2021-10-01T00:00:00Z; https://www.aimspress.com/article/doi/10.3934/mbe.2021429?viewType=HTML; https://doaj.org/toc/1551-0018; Vishnu Vandana Kolisetty; Dharmendra Singh Rajput; AIMS Press; article; integration attributes dependency; similarity index; big data; Biotechnology; TP248.13-248.65; Mathematics; QA1-939; EN; Mathematical Biosciences and Engineering, Vol 18, Iss 6, Pp 8661-8682 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	integration attributes dependency; similarity index; big data; Biotechnology; TP248.13-248.65; Mathematics; QA1-939
spellingShingle	integration attributes dependency similarity index big data Biotechnology TP248.13-248.65 Mathematics QA1-939 Vishnu Vandana Kolisetty Dharmendra Singh Rajput Big data integration enhancement based on attributes conditional dependency and similarity index method
description	Big data has attracted considerable attention across many domains. The volume of digital data generated today in every domain is enormous, and at the same time the demand for acquiring such information for analysis and decision-making is growing in every field. It is therefore important to integrate related information based on its similarity. Existing integration techniques, however, usually suffer from high processing and time complexity and are constrained in how they interconnect multiple data sources. Much of this information comes from a variety of sources. Because these data sources are numerous and their distribution is complex, it is difficult to determine the relationships among the data and to identify common data structures for integration, so that data can be accessed or retrieved effectively to meet the needs of different analyses. This paper proposes a big data integration approach, termed ACD-SI, that combines the computation of attribute conditional dependency (ACD) with a similarity index (SI). ACD-SI uses an improved Bayesian mechanism to analyze the distribution of attributes in a document in terms of their dependence on other possible attributes. It also uses attribute conversion and selection mechanisms to map and group data for integration, and uses methods such as LSA (latent semantic analysis) to analyze the content of data attributes and extract relevant and accurate data. A series of experiments on a large dataset of bibliographic data from various publications measures the overall purity and normalization of the data integration. The purity and NMI values obtained confirm the relevance of the clustered data, and the precision, recall, and accuracy measures demonstrate the improvement of the proposal over existing approaches.
format	article
author	Vishnu Vandana Kolisetty Dharmendra Singh Rajput
author_facet	Vishnu Vandana Kolisetty Dharmendra Singh Rajput
author_sort	Vishnu Vandana Kolisetty
title	Big data integration enhancement based on attributes conditional dependency and similarity index method
title_short	Big data integration enhancement based on attributes conditional dependency and similarity index method
title_full	Big data integration enhancement based on attributes conditional dependency and similarity index method
title_fullStr	Big data integration enhancement based on attributes conditional dependency and similarity index method
title_full_unstemmed	Big data integration enhancement based on attributes conditional dependency and similarity index method
title_sort	big data integration enhancement based on attributes conditional dependency and similarity index method
publisher	AIMS Press
publishDate	2021
url	https://doaj.org/article/9c84f8a1e0924436a610e6aa515bfd66
work_keys_str_mv	AT vishnuvandanakolisetty bigdataintegrationenhancementbasedonattributesconditionaldependencyandsimilarityindexmethod AT dharmendrasinghrajput bigdataintegrationenhancementbasedonattributesconditionaldependencyandsimilarityindexmethod
_version_	1718407662333329408
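
The description field above reports clustering quality in terms of purity and NMI, but this record does not give the formulas the authors used. As a rough illustration only, the sketch below computes both metrics from their common textbook definitions. The function names, the toy labels, and the choice to normalize mutual information by the arithmetic mean of the two entropies are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def purity(true_labels, cluster_ids):
    """Fraction of items that belong to the majority class of their cluster."""
    per_cluster = {}
    for label, cid in zip(true_labels, cluster_ids):
        per_cluster.setdefault(cid, Counter())[label] += 1
    majority_total = sum(max(c.values()) for c in per_cluster.values())
    return majority_total / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, cluster_ids):
    """Mutual information divided by the mean of the two entropies.

    Assumption: arithmetic-mean normalization; other variants exist.
    """
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_ids))
    count_t = Counter(true_labels)
    count_c = Counter(cluster_ids)
    mi = sum(
        (nij / n) * math.log(nij * n / (count_t[t] * count_c[c]))
        for (t, c), nij in joint.items()
    )
    denom = (entropy(true_labels) + entropy(cluster_ids)) / 2
    # Both entropies are zero only when there is one class and one cluster,
    # which is treated here as perfect agreement.
    return mi / denom if denom > 0 else 1.0


if __name__ == "__main__":
    # Hypothetical toy data: six records, two true classes, two clusters.
    truth = ["a", "a", "a", "b", "b", "b"]
    clusters = [0, 0, 1, 1, 1, 1]
    print("purity:", purity(truth, clusters))
    print("NMI:", nmi(truth, clusters))
```

Both scores lie in [0, 1]. Purity rewards clusters dominated by a single class but can be inflated by producing many small clusters, which is one reason it is usually reported alongside NMI.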
