Hierarchical clustering methods in a task to find abnormal observations based on groups with broken symmetry

The work addresses the pressing problem of identifying and interpreting anomalous observations in the study of socio-economic processes. The proposed method takes a cluster-based approach to anomaly detection. Clustering is performed using hierarchical metho...

Bibliographic Details
Main authors: A. N. Kislyakov, S. V. Polyakov
Format: article
Language: EN, RU
Published: North-West institute of management of the Russian Presidential Academy of National Economy and Public Administration 2020
Subjects: cluster analysis; network graphs; symmetry breaking; anomalous observations; decision trees; Political institutions and public administration (General)
Online access: https://doaj.org/article/c231fd1be269415e93b2ca53769f9a57
id oai:doaj.org-article:c231fd1be269415e93b2ca53769f9a57
record_format dspace
spelling oai:doaj.org-article:c231fd1be269415e93b2ca53769f9a57
2021-11-12T10:46:12Z
Hierarchical clustering methods in a task to find abnormal observations based on groups with broken symmetry
1726-1139
1816-8590
10.22394/1726-1139-2020-5-116-127
https://doaj.org/article/c231fd1be269415e93b2ca53769f9a57
2020-06-01T00:00:00Z
https://www.acjournal.ru/jour/article/view/1423
https://doaj.org/toc/1726-1139
https://doaj.org/toc/1816-8590
A. N. Kislyakov
S. V. Polyakov
North-West institute of management of the Russian Presidential Academy of National Economy and Public Administration
article
cluster analysis
network graphs
symmetry breaking
anomalous observations
decision trees
Political institutions and public administration (General)
JF20-2112
EN
RU
Управленческое консультирование, Vol 0, Iss 5, Pp 116-127 (2020)
institution DOAJ
collection DOAJ
language EN
RU
topic cluster analysis
network graphs
symmetry breaking
anomalous observations
decision trees
Political institutions and public administration (General)
JF20-2112
description The work addresses the pressing problem of identifying and interpreting anomalous observations in the study of socio-economic processes. The proposed method takes a cluster-based approach to anomaly detection. Clustering is performed with hierarchical methods, a family of data-ordering algorithms that build dendrograms from groups of observed points. For mixed data consisting of numeric and categorical variables, the Gower distance is proposed as the metric for distances between elements. Clustering quality is evaluated by the sum of squared metric distances between objects within a cluster and by the average silhouette width. These indicators make it possible to select the optimal number of clusters and to assess the quality of the resulting partition. The dendrogram can also be used to study the symmetry groups of cluster systems and the causes of symmetry breaking. Anomalies are detected by analyzing the results of hierarchical clustering and identifying branches of the dendrogram that appear at the initial levels of tree construction and have no sub-branches. The implemented method allows a more accurate interpretation of the clustering results with respect to determining errors of the first and second kind (type I and type II errors) in the form of anomalous observations in the data set. The described method can be used to investigate socio-economic systems effectively and to manage their development.
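The pipeline outlined in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes NumPy, SciPy and scikit-learn, average linkage, selection of the number of clusters by average silhouette width only (the within-cluster sum of squares is left out), and it treats single-member clusters in the chosen cut as candidate anomalies. The function names `gower_matrix` and `find_anomalies` and the toy data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.metrics import silhouette_score


def gower_matrix(numeric, categorical):
    """Pairwise Gower distance for mixed data (equal feature weights assumed)."""
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical)
    # Encode each categorical column as integer codes so comparison is numeric.
    codes = np.column_stack(
        [np.unique(col, return_inverse=True)[1] for col in categorical.T]
    )
    ranges = numeric.max(axis=0) - numeric.min(axis=0)
    ranges[ranges == 0] = 1.0  # constant columns contribute zero distance
    # Numeric features: range-normalised absolute difference.
    num_part = np.abs(numeric[:, None, :] - numeric[None, :, :]) / ranges
    # Categorical features: 0 if equal, 1 if different.
    cat_part = (codes[:, None, :] != codes[None, :, :]).astype(float)
    n_features = numeric.shape[1] + codes.shape[1]
    return (num_part.sum(axis=2) + cat_part.sum(axis=2)) / n_features


def find_anomalies(dist, max_k=5, method="average"):
    """Cut the dendrogram at the k with the best average silhouette width and
    return (k, score, indices of observations that form singleton clusters)."""
    n = dist.shape[0]
    tree = linkage(squareform(dist, checks=False), method=method)
    best = (None, -1.0, None)
    for k in range(2, max_k + 1):
        labels = fcluster(tree, t=k, criterion="maxclust")
        n_labels = len(np.unique(labels))
        if n_labels < 2 or n_labels > n - 1:
            continue  # silhouette is undefined outside this range
        score = silhouette_score(dist, labels, metric="precomputed")
        if score > best[1]:
            best = (k, score, labels)
    k, score, labels = best
    sizes = np.bincount(labels)  # cluster sizes indexed by label
    anomalies = np.flatnonzero(sizes[labels] == 1).tolist()
    return k, score, anomalies


# Toy data (made up for illustration): two compact groups and one distant point.
numeric = np.array([[1.0], [1.3], [0.8], [1.1],
                    [5.0], [5.4], [4.7], [5.1],
                    [20.0]])
categorical = np.array([["a"]] * 4 + [["b"]] * 4 + [["c"]])

dist = gower_matrix(numeric, categorical)
best_k, best_score, anomalies = find_anomalies(dist)
print(best_k, round(best_score, 3), anomalies)
```

In this sketch the distant observation joins the tree only at the last merge, so it remains a branch without sub-branches in the selected cut. Other linkage methods or selection criteria may flag different observations.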
format article
author A. N. Kislyakov
S. V. Polyakov
title Hierarchical clustering methods in a task to find abnormal observations based on groups with broken symmetry
publisher North-West institute of management of the Russian Presidential Academy of National Economy and Public Administration
publishDate 2020
url https://doaj.org/article/c231fd1be269415e93b2ca53769f9a57