Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/158048
|Title:||Clustering together with learning representations|
|Authors:||Yu, Shuaiqi|
|Keywords:||Engineering::Electrical and electronic engineering|
|Issue Date:||2022|
|Publisher:||Nanyang Technological University|
|Source:||Yu, S. (2022). Clustering together with learning representations. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/158048|
|Abstract:||Document clustering is a useful and practical machine learning methodology with various real-world applications, such as search optimization, document recommendation, and tag generation for papers and records. It arranges a batch of PDF documents into separate subgroups. To achieve more efficient clustering, we introduce representation learning, an unsupervised learning approach that learns features from unlabeled data. In this project, we aim to implement and study a series of representation learning methods that are better suited to clustering web documents, such as those in the Reuters-10k dataset. Specifically, the deep fuzzy clustering method GrDNFCS has been implemented and explored to reproduce the automatic categorization of web documents reported in the original paper. A new approach named CLDFC, in which a contrastive loss is introduced into GrDNFCS, is proposed and designed to improve clustering accuracy. Based on our preliminary study, CLDFC shows a 2.5% improvement in accuracy and reduces running time by an average of 60 s per epoch compared with GrDNFCS. Experiments on several other clustering models will be included for comparison.|
|URI:||https://hdl.handle.net/10356/158048|
|Fulltext Permission:||restricted|
|Fulltext Availability:||With Fulltext|
|Appears in Collections:||EEE Student Reports (FYP/IA/PA/PI)|
Updated on Jun 24, 2022
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.