Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/181703
Title: | Enhancing performance in video grounding tasks through the use of attention module |
Authors: | Do Duc Anh |
Keywords: | Computer and Information Science |
Issue Date: | 2024 |
Publisher: | Nanyang Technological University |
Source: | Do Duc Anh (2024). Enhancing performance in video grounding tasks through the use of attention module. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181703 |
Abstract: | This report investigates improving video grounding tasks through the use of attention mechanisms, tackling the issue of sparse annotations in video datasets. Drawing inspiration from the MMN model \cite{wang2021_negative_2dmap}, we developed a modified model based on the open-source MMN codebase and evaluated it on several widely-used datasets, including Charades-STA and ActivityNet Captions. Our approach shows improvements over certain benchmarks. Additionally, we conducted an in-depth analysis to assess the role of attention in enhancing the multimodal framework's ability to comprehend the complex structure of videos. |
URI: | https://hdl.handle.net/10356/181703 |
Schools: | College of Computing and Data Science |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | CCDS Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
FYP_DoDucAnh.pdf | Restricted Access | 810.8 kB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.