The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm.

Since water supply association analysis plays an important role in attribution analysis of water supply fluctuation, how to carry out effective association analysis has become a critical problem. However, the current techniques and methods used for association analysis are not very effective because...

Bibliographic Details
Main authors: Xin Liu, Xuefeng Sang, Jiaxuan Chang, Yang Zheng, Yuping Han
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects:
R
Q
Online access: https://doaj.org/article/9d812cd399de4814bfb0ddf2d3768c56
id oai:doaj.org-article:9d812cd399de4814bfb0ddf2d3768c56
record_format dspace
spelling oai:doaj.org-article:9d812cd399de4814bfb0ddf2d3768c56
2021-12-02T20:15:12Z
The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm.
1932-6203
10.1371/journal.pone.0255684
https://doaj.org/article/9d812cd399de4814bfb0ddf2d3768c56
2021-01-01T00:00:00Z
https://doi.org/10.1371/journal.pone.0255684
https://doaj.org/toc/1932-6203
Xin Liu
Xuefeng Sang
Jiaxuan Chang
Yang Zheng
Yuping Han
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 16, Iss 8, p e0255684 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Water supply association analysis plays an important role in attributing water supply fluctuations, so carrying it out effectively has become a critical problem. However, current association analysis techniques are not very effective because they are based on continuous data. Continuous variables generally exhibit varying degrees of monotonic relationship, which easily affects the analysis results. Multicollinearity among continuous variables distorts these methods and may produce incorrect results. Moreover, these methods do not reveal the association rules and value intervals linking features to water supply. The lack of an effective analysis method therefore hinders water supply association analysis. Association rules and feature value intervals obtained from association analysis help to identify the causes of water supply fluctuation and its fluctuation interval, and so provide better support for water supply dispatching, but they are not yet fully understood. In this study, a data mining method coupling k-means clustering discretization with the Apriori algorithm is proposed. K-means is used to discretize the data into a one-hot encoding that Apriori can process; discretization also avoids the influence of monotonic relationships and multicollinearity on the results. All rules are then validated to filter out spurious ones. The results show that the method is effective: it obtains valid strong association rules between features and water supply and indicates whether each association is direct or indirect. It also yields the value intervals of features, the degree of association between features, and the confidence of the rules.
format article
author Xin Liu
Xuefeng Sang
Jiaxuan Chang
Yang Zheng
Yuping Han
title The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/9d812cd399de4814bfb0ddf2d3768c56
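
The pipeline named in the abstract (k-means discretization, one-hot style items, then Apriori) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy readings, the feature names `temperature` and `supply`, the choice of k = 2, the deterministic centre initialisation, and both thresholds are assumptions made for the example, and the rule validation step mentioned in the abstract is left out.

```python
from itertools import combinations


def nearest(value, centres):
    """Index of the centre closest to value."""
    return min(range(len(centres)), key=lambda c: abs(value - centres[c]))


def kmeans_1d(values, k, iters=100):
    """Cluster one feature with Lloyd's algorithm; label 0 is the lowest-valued cluster."""
    srt = sorted(values)
    # Deterministic start: centres spread across the sorted values (an assumption).
    centres = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = [nearest(v, centres) for v in values]
        groups = [[v for v, l in zip(values, labels) if l == c] for c in range(k)]
        new = [sum(g) / len(g) if g else centres[c] for c, g in enumerate(groups)]
        if new == centres:
            break
        centres = new
    centres = sorted(centres)
    return [nearest(v, centres) for v in values]


def discretize(features, k):
    """Turn {name: readings} into one transaction per row, with items like 'name=label'.

    Each distinct item corresponds to one column of a one-hot encoding.
    """
    labelled = {name: kmeans_1d(vals, k) for name, vals in features.items()}
    rows = len(next(iter(labelled.values())))
    return [frozenset(f"{n}={labelled[n][r]}" for n in labelled) for r in range(rows)]


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    support = {}
    level = {frozenset([item]) for t in transactions for item in t}
    while level:
        counts = {c: sum(c <= t for t in transactions) / n for c in level}
        frequent = {c: s for c, s in counts.items() if s >= min_support}
        support.update(frequent)
        # Join: merge frequent k-itemsets that differ by exactly one item.
        level = {a | b for a, b in combinations(list(frequent), 2)
                 if len(a | b) == len(a) + 1}
        # Prune: keep a candidate only if all of its k-subsets are frequent.
        level = {c for c in level if all(c - {i} in frequent for i in c)}
    return support


def rules(support, min_confidence):
    """Derive (antecedent, consequent, support, confidence) from frequent itemsets."""
    out = []
    for itemset, s in support.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                conf = s / support[ante]
                if conf >= min_confidence:
                    out.append((ante, itemset - ante, s, conf))
    return out


# Hypothetical readings, invented purely for illustration.
data = {
    "temperature": [10, 12, 11, 30, 31, 29, 30, 11],
    "supply": [100, 105, 102, 200, 210, 205, 198, 101],
}
transactions = discretize(data, k=2)
found = rules(apriori(transactions, min_support=0.4), min_confidence=0.9)
for ante, cons, sup, conf in found:
    print(sorted(ante), "->", sorted(cons), f"support={sup:.2f} confidence={conf:.2f}")
```

Because every continuous reading is replaced by a cluster label before mining, the rules relate value intervals (clusters) rather than raw magnitudes, which is the property the abstract relies on to sidestep monotone relationships and multicollinearity.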