Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/55194
Title: A split-merge framework for comparing clusterings
Authors: Xiang, Qiaoliang
Keywords: DRNTU::Engineering::Computer science and engineering
Issue Date: 2013
Source: Xiang, Q. (2013). A split-merge framework for comparing clusterings. Master's thesis, Nanyang Technological University, Singapore.
Abstract: External clustering evaluation measures are often used to evaluate the performance of different clustering algorithms on a collection of data sets. The traditional normalization property is no longer suitable for this task, and a conditional normalization property is proposed based on the fact that one clustering is the ground truth. Even though existing measures have been proposed from different points of view, we study them from the normalization point of view. In addition, we propose a new category of cluster counting measures and further group set matching measures into two subcategories according to how the matching is performed. Furthermore, we propose a generative model to study how existing measures are generated and to produce new measures according to application requirements. To understand the intrinsic properties of a measure, a graph-based model is presented that models two clusterings as a directed bipartite graph, which can be decomposed into weakly connected components. A measure can be expressed as a conic combination of scores on components, and different weights are assigned to components when aggregating their scores. Based on the graph-based model, we propose a split-merge framework that breaks components into subcomponents and combines the scores of any two related subcomponents. The framework is conditionally normalized, while existing measures are not. It also has many nice properties compared to other existing frameworks. We give some examples of the framework and compare one example with a few representative measures, theoretically and empirically, on a coreference resolution data set.
URI: https://hdl.handle.net/10356/55194
DOI: 10.32657/10356/55194
Schools: School of Computer Engineering
Research Centres: Centre for Computational Intelligence
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses
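The graph-based model mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `overlap_components` and the label-list input format are hypothetical, and edges are treated as undirected because weak connectivity ignores edge direction (the abstract does not say how edges are oriented). Each cluster of either clustering becomes a node, two clusters are linked when they share at least one item, and the connected components are found with a union-find structure.

```python
from collections import defaultdict


def overlap_components(truth, predicted):
    """Group the clusters of two clusterings into connected components.

    truth, predicted: equal-length sequences of cluster labels, one per item.
    Returns a list of (truth_labels, predicted_labels) pairs of sets.
    Assumption: an edge joins two clusters whenever they share an item.
    """
    if len(truth) != len(predicted):
        raise ValueError("both clusterings must label the same items")

    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Every item contributes one edge between its two clusters.
    for t, p in zip(truth, predicted):
        union(("truth", t), ("pred", p))

    # Collect the cluster labels of each side per component.
    groups = defaultdict(lambda: (set(), set()))
    for node in list(parent):
        side, label = node
        groups[find(node)][0 if side == "truth" else 1].add(label)
    return list(groups.values())


components = overlap_components(
    [0, 0, 1, 1, 2, 2],
    ["a", "a", "a", "b", "c", "c"],
)
```

In this toy input, ground-truth clusters 0 and 1 are tied together through predicted cluster "a", so they fall into one component with "a" and "b", while cluster 2 and "c" form a component of their own. How each component is scored and weighted is specific to the thesis and is not attempted here.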
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.