Title: Transformers for computer vision
Authors: Deng, Yaojun
Keywords: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Issue Date: 2021
Publisher: Nanyang Technological University
Source: Deng, Y. (2021). Transformers for computer vision. Master's thesis, Nanyang Technological University, Singapore.
Project: ISM-DISS-02493
Abstract: Transformer models, which are based on the self-attention mechanism, were initially introduced for natural language tasks. They require minimal inductive biases in their design and can be applied as individual processing layers in network design. In recent years, transformer models have been applied to popular Computer Vision (CV) tasks and have led to significant progress. Previous surveys introduced applications of transformers to different tasks (e.g., object detection, activity recognition, and image enhancement). In this dissertation, we focus on image classification and introduce several outstanding and representative improved vision transformer models. We conduct comparisons and simulations of transformer models and several representative convolutional neural network (CNN) models to illustrate the advantages and limitations of vision transformers in CV tasks.
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Theses

Files in This Item:
File: Restricted Access (1.65 MB, Adobe PDF)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.