Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/165285
Title: | Can normalization methods allow escape from the doppelgänger effect in biomedical data? |
Authors: | Guo, Zexi |
Keywords: | Science::Biological sciences |
Issue Date: | 2023 |
Publisher: | Nanyang Technological University |
Source: | Guo, Z. (2023). Can normalization methods allow escape from the doppelgänger effect in biomedical data?. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/165285 |
Abstract: | The Doppelganger Effect (DE) describes the situation when an AI/ML model performs well on a validation set regardless of whether it has truly learned. DE may exaggerate the reported performance of the AI/ML model on real-world data, complicate model selection processes and lead towards false domain explanations. Here, we explore interactions between data normalization and DE. Although each normalization method produces different data distributions, they ultimately preserve rank orderings within each sample. It turns out that rank information alone is sufficient to induce high mutual correlations between samples. The only exception is the Gene Fuzzy Scoring (GFS) approach which impacts both scale and rank. Although GFS reduces mutual correlations, it does not provide an escape from DE, leading us to suspect that current approaches of identifying Data Doppelganger lack sensitivity. Contrary to previous reports, we find that GFS has reduced feature selection stability. However, GFS produces highly stable ML models which are also phenotypically relevant. We believe that combining GFS with current doppelganger mitigation measures may be a compelling synergistic approach towards biomedical data modeling. |
URI: | https://hdl.handle.net/10356/165285 |
DOI: | 10.32657/10356/165285 |
Schools: | School of Biological Sciences |
Rights: | This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). |
Fulltext Permission: | open |
Fulltext Availability: | With Fulltext |
Appears in Collections: | SBS Theses |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
NTU_Master_Dissertation_Zexi_Guo_20230318_signed.pdf | | 9.68 MB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.