Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/183918
Title: Deep learning methods with less supervision
Authors: Lam, Kai Yi
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Lam, K. Y. (2025). Deep learning methods with less supervision. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/183918
Abstract: This study examines weakly supervised semantic segmentation in video datasets, evaluating the effectiveness of various prompting strategies on the initial video frame. The prompting approaches explored include segmentation masks from initial frames, sparse point-based prompting, augmented point-based techniques, text-based object grounding, and a combination of these methods. The findings indicate that the underlying models effectively propagate key image features across frames, mitigating challenges such as motion blur, perspective shifts, and occlusion. Augmenting point-based prompts enhances segmentation accuracy by reducing semantic ambiguity, while text-based prompting with a grounded object detection model offers a low-annotation alternative for object segmentation and tracking. The integration of Grounding DINO and SAM2 for text-based prompting also shows strong results but is heavily reliant on the precision of image-level semantic labels. Overall, the results validate the efficacy of prompt-based segmentation in weakly supervised settings and underscore its potential for generating accurate semantic masks in video analysis.
URI: https://hdl.handle.net/10356/183918
Schools: College of Computing and Data Science
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
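The text-prompted pipeline the abstract describes (Grounding DINO turning an image-level label into a box on the first frame, SAM2 propagating the resulting mask through the video) can be sketched roughly as below. This is a minimal sketch, not the report's code: it assumes the reference implementations from IDEA-Research/GroundingDINO and facebookresearch/sam2, and all file paths, the caption text, and the thresholds are illustrative placeholders; exact entry points may differ by package version.

```python
# Hedged sketch: text-prompted video segmentation by chaining Grounding DINO
# (open-vocabulary detection) into SAM2's video predictor. Assumes the
# reference packages from IDEA-Research/GroundingDINO and facebookresearch/sam2;
# all paths, prompts, and thresholds below are illustrative, not from the report.
import numpy as np
from groundingdino.util.inference import load_model, load_image, predict
from sam2.build_sam import build_sam2_video_predictor

# 1) Ground the image-level label on the first video frame to get a box prompt.
dino = load_model("groundingdino_config.py", "groundingdino_weights.pth")  # placeholder paths
image_source, image = load_image("frames/00000.jpg")                       # first video frame
boxes, logits, phrases = predict(
    model=dino, image=image,
    caption="a person riding a bicycle",  # illustrative image-level semantic label
    box_threshold=0.35, text_threshold=0.25,
)

# Grounding DINO returns normalized (cx, cy, w, h); convert the top box to
# absolute (x1, y1, x2, y2) pixel coordinates for SAM2.
h, w, _ = image_source.shape
cx, cy, bw, bh = boxes[0].tolist()
box_xyxy = np.array(
    [(cx - bw / 2) * w, (cy - bh / 2) * h, (cx + bw / 2) * w, (cy + bh / 2) * h],
    dtype=np.float32,
)

# 2) Prompt SAM2 on frame 0 with the grounded box, then propagate the mask
#    through the remaining frames.
predictor = build_sam2_video_predictor("sam2_hiera_l.yaml", "sam2_weights.pt")  # placeholder paths
state = predictor.init_state(video_path="frames")  # directory of JPEG frames
predictor.add_new_points_or_box(inference_state=state, frame_idx=0, obj_id=1, box=box_xyxy)

video_masks = {}
for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
    # Threshold logits to a binary mask per tracked object id.
    video_masks[frame_idx] = {
        oid: (mask_logits[i] > 0.0).cpu().numpy() for i, oid in enumerate(obj_ids)
    }
```

The sparse point-based variant the abstract compares against uses the same `add_new_points_or_box` call, passing `points=` and `labels=` arrays instead of a box; the mask-based variant initializes frame 0 with a full segmentation mask rather than a prompt.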
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)
Files in This Item:

File | Description | Size | Format
---|---|---|---
Lam_KaiYi_FYP.pdf (Restricted Access) | | 20.25 MB | Adobe PDF
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.