Please use this identifier to cite or link to this item:
Title: From an image to a text description of an image
Authors: Peter
Keywords: DRNTU::Engineering::Computer science and engineering
Issue Date: 2017
Abstract: This project presents an implementation of a search feature that allows a user to look for a particular object of interest in a video. The main idea is to train a very deep neural network architecture that outputs a sequence of words describing an image. The network consists of a convolutional neural network (CNN) that learns features found in an image, and a long short-term memory (LSTM) unit that predicts the sequence of words from the learnt features of the image. This project does not perform real-time object detection; instead, a video has to be preprocessed before a user can search for an object that appears in it.
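The preprocess-then-search workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the report's implementation: `caption_frame` is a hypothetical stand-in for the trained CNN+LSTM captioner, and the frame names, timestamps, and captions are made-up sample data.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def build_index(
    frames: Iterable[Tuple[float, str]],
    caption_frame: Callable[[str], str],
) -> Dict[str, List[float]]:
    """Preprocessing step: caption every frame once and map each word
    of the caption to the timestamps at which it occurs."""
    index: Dict[str, List[float]] = defaultdict(list)
    for timestamp, frame in frames:
        caption = caption_frame(frame)
        # Use a set so a word repeated in one caption is indexed once.
        for word in set(caption.lower().split()):
            index[word].append(timestamp)
    return dict(index)


def search(index: Dict[str, List[float]], query: str) -> List[float]:
    """Search step: return the timestamps whose caption mentions the query word."""
    return sorted(index.get(query.lower(), []))


# Hypothetical captions standing in for the output of the CNN+LSTM model.
FAKE_CAPTIONS = {
    "frame_a": "a dog runs on the grass",
    "frame_b": "a red car on the road",
    "frame_c": "a man walks a dog",
}

# Hypothetical (timestamp in seconds, frame) pairs sampled from a video.
frames = [(0.0, "frame_a"), (1.0, "frame_b"), (2.0, "frame_c")]

index = build_index(frames, FAKE_CAPTIONS.__getitem__)
print(search(index, "dog"))  # [0.0, 2.0]
```

Because captions are generated once during preprocessing, a query only needs a dictionary lookup, which is consistent with the abstract's point that the system is not a real-time detector.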
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Restricted Access
Size: 4.78 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.