Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/76976
Title: Deep learning for video segmentation
Authors: Tan, Clement Xian Ren
Keywords: DRNTU::Engineering::Computer science and engineering
Issue Date: 2019
Abstract: In recent years, convolutional neural networks (CNNs) have become the standard approach for many computer vision tasks. Although CNNs are state of the art in image classification and object detection, they have several limitations when applied to semantic segmentation. The feature maps produced by such models are usually coarse. Moreover, the state-of-the-art DeepLab model runs at only approximately seven to eight frames per second (FPS) during evaluation, which is not suitable for real-time applications such as self-driving cars. This final year project evaluates the effectiveness of atrous convolutions and the atrous spatial pyramid pooling module on CNNs for semantic segmentation. Before the CNN architectures were trained, the feature extractors and semantic segmentation architectures used in the project were analysed. Next, the DeepLabV2, DeepLabV3 and dilated MobileNetV2 architectures were trained and evaluated on the Computer Vision and Pattern Recognition (CVPR) Workshop on Autonomous Driving (WAD) 2018 Berkeley DeepDrive dataset. In addition, the Cityscapes dataset and a Singapore video were used to visualize the drivable road segmentations. The DeepLabV3 and DeepLabV2 models used in this project achieved 84.30% and 78.83% validation mean intersection-over-union (mIOU) respectively. These findings suggest that atrous convolution and the atrous spatial pyramid pooling module boost mIOU substantially and may be reused in several other image classification architectures. These methods were incorporated into MobileNetV2, which then achieved 76.10% validation mIOU, and the trade-off between accuracy and efficiency for the DeepLabV2 and MobileNetV2 architectures is discussed.
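
The abstract reports its results as validation mIOU. As a rough illustration of how that metric is commonly computed, here is a minimal sketch in plain Python. It is not the project's evaluation code: the function names `confusion_matrix` and `mean_iou` are hypothetical, labels are assumed to be flattened lists of integer class ids, and classes that appear in neither the ground truth nor the predictions are skipped, which is one common convention among several.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(y_true, y_pred, num_classes):
    """Average per-class IoU = TP / (TP + FP + FN) over the classes present."""
    matrix = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # true class c, predicted otherwise
        fp = sum(row[c] for row in matrix) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union == 0:
            continue  # assumption: skip classes absent from both label sets
        ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example: six pixels, three classes.
    truth = [0, 0, 1, 1, 2, 2]
    preds = [0, 1, 1, 1, 2, 0]
    print(f"mIOU = {mean_iou(truth, preds, 3):.4f}")
```

In the toy example the per-class IoU values are 1/3, 2/3 and 1/2, so the mean is 0.5. A figure such as 84.30% corresponds to this quantity expressed as a percentage over a full validation set.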
URI: http://hdl.handle.net/10356/76976
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: FYP Final Report_SCSE18-0060.pdf
Description: Restricted Access
Size: 3.67 MB
Format: Adobe PDF

Page view(s): 146 (updated on Jun 24, 2021)
Download(s): 52 (updated on Jun 24, 2021)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.