Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/156521
Title: | From an image to a text description of the image |
Authors: | Liu, Yanli |
Keywords: | Engineering::Computer science and engineering |
Issue Date: | 2022 |
Publisher: | Nanyang Technological University |
Source: | Liu, Y. (2022). From an image to a text description of the image. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156521 |
Project: | SCSE21-0061 |
Abstract: | Information technology is changing rapidly. Multimedia video, with its rich information content, diverse presentation, and convenient transmission and storage, is rapidly replacing traditional paper text, and the amount of video data is growing sharply. Given the vast volume of news video, quickly and accurately retrieving and storing video information has become a pressing problem. Video conveys information through images and sound. To address this problem, a visual summary of a broadcast news video can first be recovered by extracting the video's important frames, yielding a collection of images that represents the video's visual content well. Image captioning is then used to assign relevant descriptions to the extracted keyframes. Meanwhile, the audio of the video is extracted for processing, since both the speech content and the background sound indicate the news content. This project implements a fully automated video captioning system designed specifically for broadcast news video. The proposed system uses shot-based boundary detection to extract important frames and a CLIP prefix + GPT-2 model for image captioning. The system's accuracy is measured on the MS COCO dataset and compared with the current state of the art in image captioning. A method for evaluating the generated video captions against a set of annotated keyframes is also presented. |
URI: | https://hdl.handle.net/10356/156521 |
Schools: | School of Computer Science and Engineering |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | SCSE Student Reports (FYP/IA/PA/PI) |
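The abstract names shot-based boundary detection as the keyframe extraction step. The full text is restricted, so the report's actual method is not visible here; the following is only a minimal sketch of one common approach (thresholding the histogram difference between consecutive frames), with frames modeled as flat lists of grayscale values. The names `histogram`, `shot_boundaries`, `keyframes`, the bin count, and the threshold are all illustrative assumptions, not taken from the report.

```python
from typing import List, Sequence

# Assumption: a frame is a flat sequence of grayscale pixel values in 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized intensity histogram of one grayscale frame."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame) or 1  # avoid division by zero on an empty frame
    return [c / total for c in counts]


def shot_boundaries(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return every index i at which frame i starts a new shot.

    A cut is declared when half the L1 distance between consecutive
    histograms (a value in [0, 1]) exceeds `threshold`.
    """
    boundaries: List[int] = []
    prev = None
    for i, frame in enumerate(frames):
        hist = histogram(frame)
        if prev is not None:
            distance = 0.5 * sum(abs(a - b) for a, b in zip(prev, hist))
            if distance > threshold:
                boundaries.append(i)
        prev = hist
    return boundaries


def keyframes(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Pick the middle frame index of every detected shot (hypothetical policy)."""
    if not frames:
        return []
    starts = [0] + shot_boundaries(frames, threshold)
    ends = starts[1:] + [len(frames)]
    return [(s + e - 1) // 2 for s, e in zip(starts, ends)]


# Toy input: three dark frames followed by four bright frames.
dark, bright = [10] * 64, [240] * 64
toy_video = [dark] * 3 + [bright] * 4
print(shot_boundaries(toy_video))  # index where the second shot begins
print(keyframes(toy_video))        # one representative frame per shot
```

In a pipeline like the one the abstract describes, each selected keyframe would then be passed to the captioning model; that step is not sketched here.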
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
SCSE21_0061_Liu_Yanli.pdf (Restricted Access) | | 4.68 MB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.