Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/158461
Title: Addressing the cold start problem in active learning using self-supervised learning
Authors: Chen, Liangyu
Keywords: Engineering::Electrical and electronic engineering
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Chen, L. (2022). Addressing the cold start problem in active learning using self-supervised learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/158461
Project: A3282-211
Abstract: Active learning promises to improve annotation efficiency by iteratively selecting the most important data to be annotated first. However, we uncover a striking contrast to this promise: at the first query, active querying strategies fail to select data as effectively as random selection. We identify this as the cold start problem in vision active learning. Systematic ablation experiments and qualitative visualizations reveal that the level of label uniformity (the uniform distribution of categories in a query) is an explicit criterion for determining annotation importance. However, computing label uniformity requires manual annotation, which is not available at the outset of active learning. In this paper, we find that, without manual annotation, contrastive learning can approximate label uniformity based on pseudo-labeled features generated from image feature clustering. Moreover, within each cluster, selecting hard-to-contrast data (low confidence in instance discrimination with low variability along the contrastive learning trajectory) is preferable to selecting ambiguous or easy-to-contrast data; contrastive learning can approximate both criteria without manual annotation. Extensive benchmark experiments show that our initial query sheds light on surpassing random sampling on medical imaging datasets (e.g., Colon Pathology, Dermatoscope, and Blood Cell Microscope). In summary, this study (1) illustrates the cold start problem in vision active learning, (2) investigates the underlying causes of the problem with rigorous analysis and visualization, and (3) determines effective initial queries to start the “human-in-the-loop” procedure. We hope our potential solution to the cold start problem can be used as a simple yet strong baseline to sample the initial query for active learning in image classification.
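The two selection criteria described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: it assumes that image features and a per-sample instance-discrimination confidence have already been computed by some contrastive model, uses a plain NumPy k-means as a stand-in for the clustering step, and omits the variability criterion for brevity. The names `kmeans_labels`, `select_initial_query`, `features`, and `confidence` are hypothetical.

```python
import numpy as np


def kmeans_labels(features, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one pseudo-label per row of `features`."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (copy so updates are safe).
    centers = features[rng.choice(len(features), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance of every sample to every center, shape (n_samples, k).
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = features[labels == c]
            if len(members):  # leave an empty cluster's center where it is
                centers[c] = members.mean(axis=0)
    return labels


def select_initial_query(features, confidence, k, budget, seed=0):
    """Pick `budget` samples spread evenly over k pseudo-label clusters,
    taking the lowest-confidence (assumed hard-to-contrast) samples first."""
    labels = kmeans_labels(features, k, seed=seed)
    queues = []
    for c in range(k):
        members = np.flatnonzero(labels == c)
        order = np.argsort(confidence[members], kind="stable")
        queues.append(members[order].tolist())
    selected = []
    # Round-robin over clusters so the query approximates label uniformity.
    while len(selected) < budget and any(queues):
        for queue in queues:
            if queue and len(selected) < budget:
                selected.append(queue.pop(0))
    return np.array(selected, dtype=int), labels
```

Calling `select_initial_query(features, confidence, k=10, budget=100)` would return the indices to annotate first together with the pseudo-labels. The round-robin loop is one simple way to keep the per-cluster counts balanced when the budget is not a multiple of `k`.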
URI: https://hdl.handle.net/10356/158461
Fulltext Permission: embargo_restricted_20240518
Fulltext Availability: With Fulltext
Appears in Collections: EEE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: ChenLiangyu_FYP_report_revised.pdf
Size: 12 MB
Format: Adobe PDF
Access: Under embargo until May 18, 2024

Page view(s): 81 (updated on Dec 6, 2022)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.