Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/78780
Title: Real time semantic segmentation by fully convolutional network for UAV localization in urban environment
Authors: Jiang, Muyun
Keywords: Engineering::Electrical and electronic engineering
Issue Date: 2019
Abstract: Deep learning extends neural networks with more layers of computation, allowing higher levels of abstraction and prediction from data. It has become the leading machine learning tool for general imaging and computer vision, and current research trends indicate that deep convolutional neural networks (DCNNs) are very effective for automatic image analysis. Machine learning therefore has a wide range of uses in the perception-based positioning of robots. This project aims to localize a camera-equipped UAV flying autonomously among urban streets. UAV navigation suffers from day-night luminance and weather changes, so the UAV should remember only the invariant features of the environment, such as road boundaries and building structures, and ignore those that may change over time, such as cars and trees. Because the UAV on-board computer has limited computational power, the network design must balance speed and accuracy. This work proposes an effective and efficient end-to-end encoder-decoder architecture based on a fully convolutional network (FCN) for automatic semantic segmentation in urban environments. Given the computational limits of the UAV platform, the network is designed as a 12-layer FCN that uses dilated convolution to enlarge the field of view and extract multiscale information, and depthwise and pointwise separable convolution to reduce the number of parameters and the computation cost without degrading performance. To compensate for the limited number of dataset images, we use data augmentation and DropBlock to provide sufficient training data and improve generalization capability. The proposed model achieves real-time operation at an image processing frequency of 11 Hz on the NVIDIA TX2 platform with a mean IoU of over 86.5%.
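As a rough illustration of the two convolution variants named in the abstract (not the network from the thesis, whose layer sizes are not given in this record), the sketch below counts weights for a standard convolution versus a depthwise plus pointwise separable one, and computes the effective kernel size of a dilated convolution. The function names and the example sizes (3x3 kernel, 64 input channels, 128 output channels) are assumptions chosen for illustration; bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k: int, dilation: int) -> int:
    """Spatial extent covered by a k-tap kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer sizes, for illustration only.
    k, c_in, c_out = 3, 64, 128
    std = standard_conv_params(k, c_in, c_out)
    sep = separable_conv_params(k, c_in, c_out)
    print(f"standard:  {std} weights")
    print(f"separable: {sep} weights ({std / sep:.1f}x fewer)")
    for d in (1, 2, 4):
        print(f"dilation {d}: 3-tap kernel spans {effective_kernel_size(3, d)} pixels")
```

Under these assumed sizes the separable form needs far fewer weights for the same input and output shape, while dilation widens the area a kernel covers without adding any weights, which is the trade-off the abstract describes for a compute-limited platform.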
URI: http://hdl.handle.net/10356/78780
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: JIANG_MUYUN_Master_Dissertation.pdf (Restricted Access)
Description: New version, modified as suggested by mail.
Size: 9.09 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.