Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/184309
Title: On robustness and interpretability in convolutional neural networks
Authors: Weng, Pei He
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Weng, P. H. (2025). On robustness and interpretability in convolutional neural networks. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184309
Project: CCDS24-0496
Abstract: Convolutional Neural Networks achieve good performance on vision-related tasks, but often suffer from limited interpretability and are vulnerable to adversarial attacks. Past studies have shown a connection between these issues, and this work aims to quantify in greater detail how bolstering robustness may influence interpretability. We train models on the CIFAR-10 and ImageNet classification tasks to varying degrees of robustness, measured as adversarial accuracy under a fixed threat model, and evaluate their interpretability using metrics that assess various desirable aspects of model explanations. Our findings indicate that adversarial training generally yields feature attributions that are more concise and stable, though not necessarily more relevant. Finally, we propose that even modest adversarial training can meaningfully improve the quality of gradient-based explanations, but the trade-off with standard accuracy necessitates careful balancing to avoid misleading but plausible interpretations.
URI: https://hdl.handle.net/10356/184309
Schools: College of Computing and Data Science 
Research Centres: Hardware & Embedded Systems Lab (HESL) 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: WengPeiHe_FYP_FinalAmended.pdf (Restricted Access)
Size: 3.94 MB
Format: Adobe PDF

Page view(s): 30 (updated on May 7, 2025)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.