Title: Determining human intention in videos I
Authors: Hoong, Jia Qi
Keywords: Engineering::Computer science and engineering
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Hoong, J. Q. (2022). Determining human intention in videos I. Final Year Project (FYP), Nanyang Technological University, Singapore.
Project: SCSE21-0253 
Abstract: Human intention is a temporal sequence of human actions carried out to achieve a goal. Determining human intentions is useful in many situations. It can enable better human-robot collaboration in settings where robots are required to help human users. It is also useful for analysing human behaviour in dynamic environments, such as monitoring mobile patients in hospitals or athletes in tournaments. In this work, we focus on predicting future actions from past observations in egocentric videos, a task known as egocentric action anticipation. Egocentric videos are videos that record human actions from a first-person perspective. This work analyses a deep learning framework proposed by Furnari and Farinella [1]. The framework is a multimodal network consisting of (1) Rolling-Unrolling LSTM models for anticipating actions from egocentric videos using multimodal features and (2) a Modality ATTention (MATT) mechanism for fusing multimodal predictions. In addition, the multimodal network is extended to other modalities, specifically monocular depth, for egocentric action anticipation.
Schools: School of Computer Science and Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)
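
Illustrative note on the fusion step mentioned in the abstract: the sketch below shows one common way attention-based late fusion of per-modality predictions can work, assuming each modality branch outputs a score per candidate action and the fusion is a softmax-weighted sum of those scores. It is not taken from the report or from the code of Furnari and Farinella. The function names, the modality names ("rgb", "flow", "depth"), and all numbers are hypothetical, and in a trained model the attention logits would come from a learned network rather than being supplied by hand.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(modality_scores, attention_logits):
    """Fuse per-modality action scores with attention weights.

    modality_scores: dict mapping modality name -> list of per-action scores.
    attention_logits: dict mapping modality name -> unnormalised relevance.
        (Hypothetical input: a real model would predict these values.)
    Returns (weights per modality, fused per-action scores).
    """
    names = sorted(modality_scores)
    weights = softmax([attention_logits[name] for name in names])
    num_actions = len(modality_scores[names[0]])
    fused = [0.0] * num_actions
    for name, weight in zip(names, weights):
        for i, score in enumerate(modality_scores[name]):
            fused[i] += weight * score
    return dict(zip(names, weights)), fused


# Toy example: three modalities, three candidate actions (made-up values).
scores = {
    "rgb": [2.0, 0.5, 0.1],
    "flow": [0.3, 1.5, 0.2],
    "depth": [0.4, 0.4, 1.0],
}
logits = {"rgb": 1.0, "flow": 0.0, "depth": 0.0}

weights, fused = fuse_predictions(scores, logits)
predicted_action = max(range(len(fused)), key=fused.__getitem__)
```

In this toy setting the "rgb" branch receives the largest weight, so its preferred action dominates the fused prediction. Adding a modality such as monocular depth amounts to adding one more entry to each dictionary.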

Files in This Item:
File: Restricted Access
Size: 3.3 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.