Evaluation of jackknife and bootstrap for defining confidence intervals for pairwise agreement measures.

Several research fields frequently deal with the analysis of diverse classification results of the same entities. This should imply an objective detection of overlaps and divergences between the formed clusters. The congruence between classifications can be quantified by clustering agreement measure...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Ana Severiano, João A Carriço, D Ashley Robinson, Mário Ramirez, Francisco R Pinto
Formato: article
Lenguaje:EN
Publicado: Public Library of Science (PLoS) 2011
Materias:
R
Q
Acceso en línea:https://doaj.org/article/5c93b05bf72d47a583b594db0b626a43
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Sumario:Several research fields frequently deal with the analysis of diverse classification results of the same entities. This should imply an objective detection of overlaps and divergences between the formed clusters. The congruence between classifications can be quantified by clustering agreement measures, including pairwise agreement measures. Several measures have been proposed and the importance of obtaining confidence intervals for the point estimate in the comparison of these measures has been highlighted. A broad range of methods can be used for the estimation of confidence intervals. However, evidence is lacking about what are the appropriate methods for the calculation of confidence intervals for most clustering agreement measures. Here we evaluate the resampling techniques of bootstrap and jackknife for the calculation of the confidence intervals for clustering agreement measures. Contrary to what has been shown for some statistics, simulations showed that the jackknife performs better than the bootstrap at accurately estimating confidence intervals for pairwise agreement measures, especially when the agreement between partitions is low. The coverage of the jackknife confidence interval is robust to changes in cluster number and cluster size distribution.