Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/181703
Title: Enhancing performance in video grounding tasks through the use of attention module
Authors: Do Duc Anh
Keywords: Computer and Information Science
Issue Date: 2024
Publisher: Nanyang Technological University
Source: Do Duc Anh (2024). Enhancing performance in video grounding tasks through the use of attention module. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181703
Abstract: This report investigates how attention mechanisms can improve video grounding, addressing the problem of sparse annotations in video datasets. Drawing inspiration from the MMN model (Wang et al., 2021), we developed a modified model based on the open-source MMN codebase and evaluated it on several widely used datasets, including Charades-STA and ActivityNet Captions. Our approach shows improvements over certain benchmarks. Additionally, we conducted an in-depth analysis to assess the role of attention in enhancing the multimodal framework's ability to comprehend the complex structure of videos.
URI: https://hdl.handle.net/10356/181703
Schools: College of Computing and Data Science 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: FYP_DoDucAnh.pdf
Access: Restricted Access
Size: 810.8 kB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.