Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/139262
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lee, Zheng Han | en_US |
dc.date.accessioned | 2020-05-18T07:26:07Z | - |
dc.date.available | 2020-05-18T07:26:07Z | - |
dc.date.issued | 2020 | - |
dc.identifier.uri | https://hdl.handle.net/10356/139262 | - |
dc.description.abstract | In recent years, video analytics has become a popular topic in the field of Artificial Intelligence. With advances in high-speed connectivity, machine learning algorithms and IoT technologies, applications of video analytics using multiple modalities and information fusion technologies are becoming a commodity for everyone in the Information Age and beyond. Most previous studies on this topic focused on pushing the boundaries of algorithms for applications of information fusion, such as the audio-visual correspondence (AVC) task and video-scene segmentation. This study explores the optimization of video analytics based on information fusion technologies, using C3D-based action recognition as the benchmark for video analytics performance. By scrutinizing and testing the mechanisms and architectures of the C3D-based action model, the best-performing elements and the reasons behind their performance are explored. The types of pooling, optimizer and scheduler, and their respective accuracies on the dataset used, are recorded. Different methods of fusing visual and audio information and introducing them into the action recognition model are explored, and their implementations and respective accuracies are studied to gain insight into how they affect the model’s performance. The feature extraction methods for the audio modality and their respective performance are also studied. Different self-attention mechanisms involving the modalities and channels are implemented in the model and the resulting accuracies studied. These explorations provide an understanding of how these choices affect the performance of video analytics based on information fusion and subsequently help to unleash its full potential. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Nanyang Technological University | en_US |
dc.relation | A1121-191 | en_US |
dc.subject | Engineering::Electrical and electronic engineering | en_US |
dc.title | Video analytics based on deep learning and information fusion technologies | en_US |
dc.type | Final Year Project (FYP) | en_US |
dc.contributor.supervisor | Mao Kezhi | en_US |
dc.contributor.school | School of Electrical and Electronic Engineering | en_US |
dc.description.degree | Bachelor of Engineering (Electrical and Electronic Engineering) | en_US |
dc.contributor.supervisoremail | ekzmao@ntu.edu.sg | en_US |
item.grantfulltext | restricted | - |
item.fulltext | With Fulltext | - |
Appears in Collections: | EEE Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Lee Zheng Han FYP Final Report.pdf (Restricted Access) | | 2.64 MB | Adobe PDF | View/Open |
Page view(s): 186 (updated on Jun 26, 2022)
Download(s) 50: 24 (updated on Jun 26, 2022)
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.