Summarizing Finite Mixture Model with Overlapping Quantification

Finite mixture models are widely used for modeling and clustering data. When they are used for clustering, they are often interpreted by regarding each component as one cluster. However, this assumption may be invalid when the components overlap. It leads to the issue of analyzing such overlaps to c...
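To make the overlap issue concrete, here is a minimal sketch of one generic way to measure how much mixture components overlap: the mutual information I(Z;X) = H(Z) - H(Z|X) between the component label Z and the observation X. It is not the merging criteria or the clustering summarization proposed by the authors. The function names `normal_pdf` and `overlap_summary`, the restriction to one-dimensional Gaussian components, and the grid-based integration are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def overlap_summary(weights, means, stds, grid_points=20001):
    """Return (H(Z), H(Z|X), I(Z;X)) in nats for a 1-D Gaussian mixture.

    H(Z|X) is approximated with a Riemann sum over a grid that covers
    every component out to 8 standard deviations (an illustrative choice).
    """
    lo = min(m - 8.0 * s for m, s in zip(means, stds))
    hi = max(m + 8.0 * s for m, s in zip(means, stds))
    step = (hi - lo) / (grid_points - 1)

    cond_entropy = 0.0
    for i in range(grid_points):
        x = lo + i * step
        # Joint density p(z, x) for each component z.
        joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
        density = sum(joint)
        if density <= 0.0:
            continue
        # Entropy of the posterior p(z | x) at this grid point.
        h = -sum((j / density) * math.log(j / density) for j in joint if j > 0.0)
        cond_entropy += density * h * step

    prior_entropy = -sum(w * math.log(w) for w in weights if w > 0.0)
    return prior_entropy, cond_entropy, prior_entropy - cond_entropy


# Two equally weighted components: far apart, partially overlapping, identical.
separated = overlap_summary([0.5, 0.5], [0.0, 20.0], [1.0, 1.0])
partial = overlap_summary([0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
identical = overlap_summary([0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

Under these assumptions, the mutual information is close to H(Z) when the components are well separated, so each component can reasonably be read as one cluster. It falls toward zero as the components coincide, which is the situation in which a one-component-per-cluster reading breaks down.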

Bibliographic Details
Main authors: Shunki Kyoya, Kenji Yamanishi
Format: article
Language: EN
Published: MDPI AG 2021
Subjects:
Q
Online access: https://doaj.org/article/0f3d687cacf84dcba36b1732b8a41ab5
id oai:doaj.org-article:0f3d687cacf84dcba36b1732b8a41ab5
record_format dspace
spelling oai:doaj.org-article:0f3d687cacf84dcba36b1732b8a41ab5
2021-11-25T17:30:18Z
Summarizing Finite Mixture Model with Overlapping Quantification
10.3390/e23111503
1099-4300
https://doaj.org/article/0f3d687cacf84dcba36b1732b8a41ab5
2021-11-01T00:00:00Z
https://www.mdpi.com/1099-4300/23/11/1503
https://doaj.org/toc/1099-4300
Shunki Kyoya
Kenji Yamanishi
MDPI AG
article
model-based clustering
merging mixture components
component overlap
interpretability
Science
Q
Astrophysics
QB460-466
Physics
QC1-999
EN
Entropy, Vol 23, Iss 1503, p 1503 (2021)
institution DOAJ
collection DOAJ
language EN
topic model-based clustering
merging mixture components
component overlap
interpretability
Science
Q
Astrophysics
QB460-466
Physics
QC1-999
description Finite mixture models are widely used for modeling and clustering data. When they are used for clustering, they are often interpreted by regarding each component as one cluster. However, this assumption may be invalid when the components overlap. It leads to the issue of analyzing such overlaps to correctly understand the models. The primary purpose of this paper is to establish a theoretical framework for interpreting the overlapping mixture models by estimating how they overlap, using measures of information such as entropy and mutual information. This is achieved by merging components to regard multiple components as one cluster and summarizing the merging results. First, we propose three conditions that any merging criterion should satisfy. Then, we investigate whether several existing merging criteria satisfy the conditions and modify them to fulfill more conditions. Second, we propose a novel concept named clustering summarization to evaluate the merging results. In it, we can quantify how overlapped and biased the clusters are, using mutual information-based criteria. Using artificial and real datasets, we empirically demonstrate that our methods of modifying criteria and summarizing results are effective for understanding the cluster structures. We therefore give a new view of interpretability/explainability for model-based clustering.
format article
author Shunki Kyoya
Kenji Yamanishi
title Summarizing Finite Mixture Model with Overlapping Quantification
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/0f3d687cacf84dcba36b1732b8a41ab5
_version_ 1718412269232062464