Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/163793
Title: Object-aware vision and language navigation for domestic robots
Authors: Zhao, Weiyi
Keywords: Engineering::Electrical and electronic engineering
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Zhao, W. (2022). Object-aware vision and language navigation for domestic robots. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/163793
Project: ISM-DISS-02859
Abstract: The Vision and Language Navigation (VLN) problem requires a robot to navigate accurately by combining natural language instructions with visual perception of the surrounding environment. Seamlessly combining and matching textual instructions with visual features is challenging because of the various entity clues, such as scene, object and direction, contained in both modalities. Building on previous work \cite{entity}, we enrich the input features of the LSTM network by adding object features under different strategies to infer the robot's state, and propose the OVLN (Object-aware Vision and Language Navigation) model. In OVLN, the added object features allow the robot to be object-aware and minimize the loss of visual information. An attention mechanism extracts the specialized contexts and relational contexts of object, scene and direction from the language. A visual attention graph is then constructed to obtain the corresponding entity aspects from vision and derive the navigation action. The model is trained on the Room-to-Room (R2R) dataset with a hierarchical training scheme: after first-stage training with imitation and reinforcement learning, augmented data is leveraged to fine-tune the model in the second stage to improve generalizability. Experimental results show that OVLN improves both the success rate (SR) and the success rate weighted by path length (SPL) over previous methods. Meanwhile, benefiting from object awareness, OVLN alleviates the overshoot problem observed in existing works.
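The entity-aware fusion sketched in the abstract (language-side object/scene/direction contexts attending over visual features to score navigation actions) can be illustrated with a minimal numpy sketch. All shapes, names, and the averaging fusion below are illustrative assumptions, not the thesis's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8                  # feature dimension (assumed)
N_OBJ, N_CAND = 5, 4   # detected objects, candidate directions (assumed)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys):
    """Scaled dot-product attention of one query vector over a key set."""
    weights = softmax(keys @ query / np.sqrt(D))
    return weights @ keys          # attended context, shape (D,)

# Language-side entity contexts (object / scene / direction),
# e.g. produced by an LSTM over the instruction (stand-ins here).
ctx = {name: rng.standard_normal(D)
       for name in ("object", "scene", "direction")}

# Visual features: per-object features and per-candidate-view features.
obj_feats = rng.standard_normal((N_OBJ, D))
view_feats = rng.standard_normal((N_CAND, D))

# Object-aware fusion: each language context attends over the object
# features, the contexts are averaged into a state, and the state
# scores each candidate view to select the navigation action.
state = sum(attend(ctx[k], obj_feats) for k in ctx) / len(ctx)
action_probs = softmax(view_feats @ state / np.sqrt(D))
action = int(np.argmax(action_probs))
```

In the actual model the fusion is learned rather than a fixed average, but the sketch shows how object features enter the state estimate so the agent stays object-aware when deriving an action.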
URI: https://hdl.handle.net/10356/163793
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections:EEE Theses

Files in This Item:
[ZHAO WEIYI]-Revision of Amended Dissertation_1.pdf (Restricted Access, 11.28 MB, Adobe PDF)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.