Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/104862
Title: | SparseHC : a memory-efficient online hierarchical clustering algorithm | Authors: | Nguyen, Thuy-Diem Schmidt, Bertil Kwoh, Chee-Keong |
Keywords: | DRNTU::Engineering::Computer science and engineering | Issue Date: | 2014 | Source: | Nguyen, T.- D., Schmidt, B., & Kwoh, C.- K. (2014). SparseHC: A Memory-efficient Online Hierarchical Clustering Algorithm. Procedia Computer Science, 29, 8-19. | Series/Report no.: | Procedia computer science | Abstract: | Computing a hierarchical clustering of objects from a pairwise distance matrix is an important algorithmic kernel in computational science. Since the storage of this matrix requires quadratic space with respect to the number of objects, the design of memory-efficient approaches is of high importance to this research area. In this paper, we address this problem by presenting a memory-efficient online hierarchical clustering algorithm called SparseHC. SparseHC scans a sorted and possibly sparse distance matrix chunk-by-chunk. Meanwhile, a dendrogram is built by merging cluster pairs as and when the distance between them is determined to be the smallest among all remaining cluster pairs. The key insight used is that for finding the cluster pair with the smallest distance, it is unnecessary to complete the computation of all cluster pairwise distances. Partial information can be utilized to calculate a lower bound on cluster pairwise distances that are subsequently used for cluster distance comparison. Our experimental results show that SparseHC achieves a linear empirical memory complexity, which is a significant improvement compared to existing algorithms. | URI: | https://hdl.handle.net/10356/104862 http://hdl.handle.net/10220/20325 |
ISSN: | 1877-0509 | DOI: | 10.1016/j.procs.2014.05.001 | Rights: | © 2014 The Author(s). This paper was published in Procedia Computer Science and is made available as an electronic reprint (preprint) with permission of the Author(s). The paper can be found at the following official DOI: http://dx.doi.org/10.1016/j.procs.2014.05.001. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law. | Fulltext Permission: | open | Fulltext Availability: | With Fulltext |
Appears in Collections: | SCSE Journal Articles |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
SparseHC A Memory-efficient Online Hierarchical Clustering Algorithm.pdf | 463.16 kB | Adobe PDF | ![]() View/Open |
SCOPUSTM
Citations
20
18
Updated on Jan 24, 2023
Web of ScienceTM
Citations
10
23
Updated on Feb 5, 2023
Page view(s) 50
525
Updated on Feb 7, 2023
Download(s) 10
316
Updated on Feb 7, 2023
Google ScholarTM
Check
Altmetric
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.