Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/172661
Full metadata record
DC Field | Value | Language
dc.contributor.author | Tang, Zhe Jun | en_US
dc.contributor.author | Cham, Tat-Jen | en_US
dc.date.accessioned | 2023-12-19T05:23:12Z | -
dc.date.available | 2023-12-19T05:23:12Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Tang, Z. J. & Cham, T. (2022). MPT-Net: mask point transformer network for large scale point cloud semantic segmentation. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 10611-10618. https://dx.doi.org/10.1109/IROS47612.2022.9981809 | en_US
dc.identifier.isbn | 9781665479271 | -
dc.identifier.uri | https://hdl.handle.net/10356/172661 | -
dc.description.abstract | Point cloud semantic segmentation is important for road scene perception, a task driverless vehicles must perform to achieve full-fledged autonomy. In this work, we introduce Mask Point Transformer Network (MPT-Net), a novel architecture for point cloud segmentation that is simple to implement. MPT-Net consists of a local and global feature encoder and a transformer-based decoder: a 3D Point-Voxel Convolution encoder backbone with voxel self-attention to encode features, and a Mask Point Transformer (MPT) module to decode point features and segment the point cloud. First, we introduce the novel MPT, designed specifically for point cloud segmentation. MPT offers two benefits: it attends to every point in the point cloud using mask tokens to extract class-specific features globally with cross-attention, and it provides inter-class feature information exchange using self-attention on the learned mask tokens. Second, we design a backbone that uses sparse point-voxel convolutional blocks and a transformer self-attention block to learn local and global contextual features. We evaluate MPT-Net on two large-scale outdoor driving scene point cloud datasets, SemanticKITTI and nuScenes. Our experiments show that by replacing the standard segmentation head with MPT, MPT-Net improves over our baseline approach by 3.8% on SemanticKITTI, achieving state-of-the-art performance, and is highly effective in detecting 'stuff' in point clouds. | en_US
dc.language.iso | en | en_US
dc.rights | © 2022 IEEE. All rights reserved. | en_US
dc.subject | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision | en_US
dc.title | MPT-Net: mask point transformer network for large scale point cloud semantic segmentation | en_US
dc.type | Conference Paper | en
dc.contributor.school | School of Computer Science and Engineering | en_US
dc.contributor.conference | 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) | en_US
dc.identifier.doi | 10.1109/IROS47612.2022.9981809 | -
dc.identifier.scopus | 2-s2.0-85146358620 | -
dc.identifier.spage | 10611 | en_US
dc.identifier.epage | 10618 | en_US
dc.subject.keywords | Point Cloud Compression | en_US
dc.subject.keywords | Representation Learning | en_US
dc.citation.conferencelocation | Kyoto, Japan | en_US
item.grantfulltext | none | -
item.fulltext | No Fulltext | -
Appears in Collections: SCSE Conference Papers
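
Illustrative note on the abstract: the decoding idea it describes (mask tokens that cross-attend to every point, then exchange information through self-attention) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `attend` and `mask_token_decode` are hypothetical, there are no learned projections, normalisation layers, or multiple heads, keys double as values, and labelling each point by its dot product with the tokens is an assumption made for the example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys):
    """Single-head scaled dot-product attention with a residual connection.

    Assumption for this sketch: keys are reused as values and no learned
    projection matrices are applied.
    """
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        mixed = [sum(w * k[d] for w, k in zip(weights, keys))
                 for d in range(len(q))]
        out.append([qi + mi for qi, mi in zip(q, mixed)])
    return out


def mask_token_decode(point_feats, mask_tokens):
    """Hypothetical decoder: one mask token per class.

    1. Cross-attention: each token gathers features from every point.
    2. Self-attention: tokens exchange information across classes.
    3. Each point takes the class of the token it matches best.
    """
    tokens = attend(mask_tokens, point_feats)
    tokens = attend(tokens, tokens)
    logits = [[dot(p, t) for t in tokens] for p in point_feats]
    labels = [max(range(len(row)), key=row.__getitem__) for row in logits]
    return labels, logits


# Toy input: four 2-D point features and two class tokens (made-up values).
points = [[4.0, 0.0], [3.5, 0.2], [0.0, 4.0], [0.1, 3.8]]
tokens = [[1.0, 0.0], [0.0, 1.0]]
labels, _ = mask_token_decode(points, tokens)
print(labels)  # [0, 0, 1, 1]
```

In a trained network the tokens and projections would be learned and the point features would come from the encoder backbone; the toy values here only show the data flow.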

Scopus citations: 2 (updated on Oct 11, 2024)

Page views: 164 (updated on Oct 9, 2024)


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.