Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/175732
Title: AimigoTutor - tutoring application using multi-modal capabilities
Authors: Nguyen, Viet Hoang
Keywords: Computer and Information Science
Issue Date: 2024
Publisher: Nanyang Technological University
Source: Nguyen, V. H. (2024). AimigoTutor - tutoring application using multi-modal capabilities. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175732
Project: SCSE23-0209 
Abstract: Video captioning is an emerging research topic. Thanks to recent advances in deep neural networks, especially transformers, video captioning stands to improve substantially in accuracy and versatility. Most state-of-the-art video captioning models take a multi-modal approach, using both the visual information in the video frames and the audio track to extract the semantic meaning of the video. This project explores the capability of multi-modal video captioning in a much-needed context: building a video tutoring application for students, called AimigoTutor. This report discusses the requirements, design, implementation and evaluation of the application.
URI: https://hdl.handle.net/10356/175732
Schools: School of Computer Science and Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
NguyenVietHoang_FYP_AmendedFinalReport.pdf (Restricted Access), 3.41 MB, Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.