Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/180181
Title: Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation
Authors: Guo, Xiaodong; Zhou, Wujie; Liu, Tong
Keywords: Computer and Information Science
Issue Date: 2024
Source: Guo, X., Zhou, W. & Liu, T. (2024). Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation. Knowledge-Based Systems, 292, 111588. https://dx.doi.org/10.1016/j.knosys.2024.111588
Journal: Knowledge-Based Systems
Abstract: RGB-thermal semantic segmentation helps unmanned platforms perceive and characterize their surrounding environment, which is critical for autonomous driving tasks. Deep-learning-based algorithms have achieved dominance in terms of accuracy and robustness. However, their large parameter sizes and significant computational demands impede their application in terminal devices. To address this challenge, we propose a novel strategy for achieving a balance between effectiveness and compactness. It includes a robust teacher network, CLNet-T, and a streamlined student network, CLNet-S. Using knowledge distillation (KD), we obtained an optimized model called CLNet-S*. Specifically, CLNet-T and CLNet-S were identical in all aspects except for the feature extraction component. They included a multi-attribute hierarchical feature interaction module (MHFI) and a detail-guided semantic decoder (DGSD). The MHFI initially filters features by considering the characteristics of the low- and high-level features. It gradually combines complementary and common features from various modalities in distinct receptive fields. DGSD uses edge and distribution information to guide semantic decoding, thereby improving the segmentation accuracy at class boundaries. To enhance the performance of the compact student model, our KD strategy includes detail and semantic response distillation (DSRD) and contrastive learning-based feature distillation (CLFD). Practically, DSRD enables the student model to gain knowledge from the teacher model at both the detailed and semantic levels. At the same time, CLFD increases the similarity of features within the same categories and emphasizes the distinctiveness of features between different categories in both the student and teacher models. Extensive experiments conducted on two standard datasets have consistently demonstrated that both CLNet-T and CLNet-S* outperform other state-of-the-art methods. The code and results are available at https://github.com/xiaodonguo/CLNet.
URI: https://hdl.handle.net/10356/180181
ISSN: 0950-7051
DOI: 10.1016/j.knosys.2024.111588
Schools: School of Computer Science and Engineering
Rights: © 2024 Elsevier B.V. All rights reserved.
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections: SCSE Journal Articles
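Illustrative note: the abstract names two distillation ideas, a response-level term (DSRD) and a class-aware contrastive feature term (CLFD), without giving their formulas. The sketch below shows one common way such losses can be written, and it is not the authors' implementation (see the linked repository for that). The function names `response_kd_loss` and `class_contrastive_kd_loss`, the temperature values, and the use of cosine similarity with an InfoNCE-style objective are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of class scores."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def response_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Hypothetical response-level term: mean per-pixel KL(teacher || student)
    on temperature-softened class distributions, scaled by T^2."""
    per_pixel = []
    for s_row, t_row in zip(student_logits, teacher_logits):
        p_s = softmax(s_row, temperature)
        p_t = softmax(t_row, temperature)
        per_pixel.append(sum(t * math.log(t / s) for t, s in zip(p_t, p_s)))
    return temperature ** 2 * sum(per_pixel) / len(per_pixel)


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-12)


def class_contrastive_kd_loss(student_feats, teacher_feats, labels, temperature=0.1):
    """Hypothetical contrastive feature term (assumed InfoNCE-style form).

    Each student feature is an anchor. Teacher features with the same class
    label are positives; teacher features with other labels are negatives.
    Minimizing the loss pulls same-class features together and pushes
    different-class features apart.
    """
    losses = []
    for s_feat, s_label in zip(student_feats, labels):
        logits = [cosine(s_feat, t_feat) / temperature for t_feat in teacher_feats]
        peak = max(logits)
        exps = [math.exp(x - peak) for x in logits]
        positives = sum(e for e, t_label in zip(exps, labels) if t_label == s_label)
        losses.append(-math.log(positives / sum(exps)))
    return sum(losses) / len(losses)


# Toy data: four "pixels", two classes, 2-D features (all values made up).
labels = [0, 0, 1, 1]
teacher = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
aligned_student = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
swapped_student = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]

aligned_loss = class_contrastive_kd_loss(aligned_student, teacher, labels)
swapped_loss = class_contrastive_kd_loss(swapped_student, teacher, labels)
```

On this toy data the contrastive term is close to zero when the student's features match the teacher's class structure and much larger when the classes are swapped. In a full training setup such terms would typically be added, with weights, to the ordinary segmentation loss; the weighting used in the paper is not stated in this record.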
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.