Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/175326
Title: Language-guided object segmentation
Authors: John Benedict, Remelia Shirlley
Keywords: Computer and Information Science
Issue Date: 2024
Publisher: Nanyang Technological University
Source: John Benedict, R. S. (2024). Language-guided object segmentation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175326
Project: SCSE23-0379 
Abstract: Language-guided Video Object Segmentation (LVOS) is a multi-modal AI task that segments objects in videos based on natural language expressions. Although there has been significant research on Referring-Video Object Segmentation (R-VOS), which enables LVOS, these methods still face limitations that prevent accurate LVOS performance in real-life scenarios. Current R-VOS methods often rely on datasets featuring predominantly static attributes like object colour and category names or focus on singular objects identifiable in a single frame. This approach undermines the importance of tracking the target object's motion over time, leading to the failure of R-VOS models in capturing fleeting movements and long-term actions. The Motion expressions Video Segmentation (MeViS) dataset, which prioritizes the temporal dynamics in videos, is used to overcome this challenge. This approach requires LVOS models to recognize temporal context and have attention to the target object, a capability lacking in existing R-VOS methods. This report expands on the Language-guided Motion Perception and Matching (LMPM) model, a baseline model developed using the MeViS dataset and seeks to improve the robustness of the LMPM model, specifically by addressing the challenges posed by uncertain user text input.
URI: https://hdl.handle.net/10356/175326
Schools: School of Computer Science and Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections:SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File Description SizeFormat 
FYP_report.pdf
  Restricted Access
3.68 MBAdobe PDFView/Open

Page view(s)

120
Updated on May 7, 2025

Download(s)

9
Updated on May 7, 2025

Google ScholarTM

Check

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.