Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/36246
Title: Data stream mining
Authors: Huang, Lelun.
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems::Database management
Issue Date: 2010
Abstract: Data stream mining is an area of data mining that has been studied extensively. One problem in stream clustering is detecting noise and clusters of arbitrary shape, where basic K-Means usually fails. Some researchers have proposed density-based clustering governed by a decay function; a typical example is D-Stream. However, its universal decay factor and its clustering at a fixed interval are not optimal with regard to space and time complexity. In this report, we attempt to improve both the space and time complexity of D-Stream. Our integrated work, DCC-Stream, follows the conventional online-offline approach in stream mining. The algorithm has two parts, online and offline. The online part accumulates historical data as synopsis information and uses two sentinels to detect whether the offline part should be invoked. The offline part consists of two separate components: one updates densities and the other performs clustering. The experimental evaluation shows that our algorithm achieves significant improvements in both time and space complexity: time usage is greatly reduced while similar purity is maintained, and space usage is also better.
URI: http://hdl.handle.net/10356/36246
Schools: School of Computer Engineering 
Research Centres: Centre for Advanced Information Systems 
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)
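
Illustrative note: the abstract refers to density-based stream clustering governed by a decay function, with an online part that keeps synopsis information. The Python sketch below shows one common way such a decayed grid density can be maintained, in the general style of D-Stream. It is not the DCC-Stream algorithm from the report, whose details (including its two sentinels) are not given in the abstract. The names `CellSynopsis`, `DecayedGrid`, `cell_width`, and `decay` are hypothetical, and the update rule (multiply the stored density by `decay ** elapsed`, then add 1 for the new point) is an assumption about the decay function.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CellSynopsis:
    """Summary kept per grid cell instead of the raw points (hypothetical)."""
    density: float = 0.0   # decayed count of points that fell in this cell
    last_update: int = 0   # time step at which density was last refreshed


class DecayedGrid:
    """Online part only: accumulates decayed densities per grid cell."""

    def __init__(self, cell_width: float, decay: float):
        if not 0.0 < decay < 1.0:
            raise ValueError("decay must lie strictly between 0 and 1")
        self.cell_width = cell_width
        self.decay = decay  # assumed single decay factor shared by all cells
        self.cells = defaultdict(CellSynopsis)

    def cell_of(self, point):
        """Map a point to the integer coordinates of its grid cell."""
        return tuple(int(x // self.cell_width) for x in point)

    def insert(self, point, t: int) -> None:
        """Decay the cell's stored density up to time t, then count the point."""
        cell = self.cells[self.cell_of(point)]
        elapsed = t - cell.last_update
        cell.density = cell.density * self.decay ** elapsed + 1.0
        cell.last_update = t

    def density_at(self, key, t: int) -> float:
        """Density of a cell as seen at time t, without modifying the cell."""
        cell = self.cells.get(key)
        if cell is None:
            return 0.0
        return cell.density * self.decay ** (t - cell.last_update)


grid = DecayedGrid(cell_width=1.0, decay=0.5)
grid.insert((0.2, 0.7), t=0)
grid.insert((0.9, 0.1), t=2)
print(grid.density_at((0, 0), t=3))
```

Decay is applied lazily here: a cell is touched only when a point arrives in it or when it is queried, so only the last update time has to be stored per cell. An offline step, not shown, would then group neighbouring dense cells into clusters.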

Files in This Item:
File: SCE0178.pdf
Description: Restricted Access
Size: 898.56 kB
Format: Adobe PDF


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.