Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.

Bibliographic details
Main authors: Romeo Rizzi, Pritha Mahata, Luke Mathieson, Pablo Moscato
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2010
Subjects:
R
Q
Online access: https://doaj.org/article/3d1c06c6ca0b43f78298ded6d8916924
Description
Abstract: Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains, with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial-time algorithm (even for approximation), exact parameterized and local-search-based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets, including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcomes difficulties that other hierarchical methods have in representing both intercluster differences and intracluster similarities.