Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/184309
Title: On robustness and interpretability in convolutional neural networks
Authors: Weng, Pei He
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Weng, P. H. (2025). On robustness and interpretability in convolutional neural networks. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184309
Project: CCDS24-0496
Abstract: Convolutional Neural Networks achieve strong performance on vision tasks, but they often suffer from limited interpretability and are vulnerable to adversarial attacks. Past studies have shown a connection between these issues, and this work quantifies in greater detail how bolstering robustness influences interpretability. We train models on the CIFAR-10 and ImageNet classification tasks to varying degrees of robustness (measured as adversarial accuracy under a fixed threat model) and evaluate their interpretability using metrics that assess desirable properties of model explanations. Our findings indicate that adversarial training generally yields feature attributions that are more concise and stable, though not necessarily more relevant. Finally, we argue that even modest adversarial training can meaningfully improve the quality of gradient-based explanations, but the trade-off with standard accuracy requires careful balancing to avoid plausible yet misleading interpretations.
URI: https://hdl.handle.net/10356/184309
Schools: College of Computing and Data Science
Research Centres: Hardware & Embedded Systems Lab (HESL)
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
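The abstract's notion of "adversarial accuracy under a fixed threat model" can be illustrated with a minimal sketch. This is a hypothetical example, not taken from the thesis: it applies an L-infinity FGSM-style attack to a toy logistic classifier and reports the fraction of examples still classified correctly after the perturbation. The model, data, and epsilon values are all invented for illustration.

```python
# Hypothetical sketch (not from the thesis): adversarial accuracy of a
# toy logistic classifier under an L-infinity FGSM threat model.
import math

def predict(w, b, x):
    # Logistic score; class 1 if the score exceeds 0.5.
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

def fgsm(w, b, x, y, eps):
    # FGSM: move each feature by eps in the direction that increases the
    # loss; for logistic loss the input-gradient sign is sign((p - y) * w_i).
    p = predict(w, b, x)
    return [xi + eps * math.copysign(1.0, (p - y) * wi)
            for xi, wi in zip(x, w)]

def adversarial_accuracy(w, b, data, eps):
    # Fraction of examples still classified correctly after the attack;
    # eps = 0 recovers standard (clean) accuracy.
    correct = 0
    for x, y in data:
        x_adv = fgsm(w, b, x, y, eps)
        correct += int((predict(w, b, x_adv) > 0.5) == bool(y))
    return correct / len(data)

# Toy model and data: the weight vector points along x[0], so the attack
# pushes x[0] toward the decision boundary.
w, b = [2.0, 0.0], 0.0
data = [([1.0, 0.0], 1), ([-1.0, 0.0], 0), ([0.3, 0.0], 1), ([-0.3, 0.0], 0)]
print(adversarial_accuracy(w, b, data, eps=0.0))  # clean accuracy: 1.0
print(adversarial_accuracy(w, b, data, eps=0.5))  # adversarial accuracy: 0.5
```

Sweeping `eps` over a range of budgets, as the thesis does for its fixed threat model, traces out how quickly accuracy degrades under attack, which is the robustness measure the abstract refers to.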
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)
Files in This Item:
File | Description | Size | Format
---|---|---|---
WengPeiHe_FYP_FinalAmended.pdf | Restricted Access | 3.94 MB | Adobe PDF
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.