Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/175286
Title: Neural image and video captioning
Authors: Lam, Ting En
Keywords: Computer and Information Science
Issue Date: 2024
Publisher: Nanyang Technological University
Source: Lam, T. E. (2024). Neural image and video captioning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175286
Project: SCSE23-0211 
Abstract: In today’s digital age, the proliferation of visual content has underscored the critical importance of multimedia comprehension and interpretation. Video combines images and sound to convey information. This project introduces a novel approach to video captioning, leveraging the synergies between Machine Learning, Computer Vision and Natural Language Processing to bridge the gap between human and computer understanding of visual content by generating descriptive captions. In this project, the effectiveness of various image captioning models is evaluated to identify optimal frameworks for textual description generation. Subsequently, a video captioning model capable of generating multimodal captions for video content is developed. The proposed image and video captioning models are evaluated using standard metrics, and a human evaluation study is conducted. Additionally, the models are deployed in a user-friendly application. Overall, this study seeks to improve video captioning performance and foster further advancements in this field.
URI: https://hdl.handle.net/10356/175286
Schools: School of Computer Science and Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: SCSE23-0211_Lam Ting En_Final Report.pdf
Description: Restricted Access
Size: 13.66 MB
Format: Adobe PDF

Page view(s): 112 (updated on May 7, 2025)
Download(s): 15 (updated on May 7, 2025)


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.