Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/165285
Title: Can normalization methods allow escape from the doppelgänger effect in biomedical data?
Authors: Guo, Zexi
Keywords: Science::Biological sciences
Issue Date: 2023
Publisher: Nanyang Technological University
Source: Guo, Z. (2023). Can normalization methods allow escape from the doppelgänger effect in biomedical data? Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/165285
Abstract: The doppelgänger effect (DE) describes the situation in which an AI/ML model performs well on a validation set regardless of whether it has truly learned. DE may exaggerate the reported performance of an AI/ML model on real-world data, complicate model selection, and lead to false domain explanations. Here, we explore interactions between data normalization and DE. Although normalization methods produce different data distributions, they ultimately preserve rank orderings within each sample. It turns out that rank information alone is sufficient to induce high mutual correlations between samples. The only exception is the Gene Fuzzy Scoring (GFS) approach, which affects both scale and rank. Although GFS reduces mutual correlations, it does not provide an escape from DE, leading us to suspect that current approaches for identifying data doppelgängers lack sensitivity. Contrary to previous reports, we find that GFS shows reduced feature selection stability. However, GFS produces highly stable ML models that are also phenotypically relevant. We believe that combining GFS with current doppelgänger mitigation measures may be a compelling synergistic approach to biomedical data modeling.
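The rank-preservation point in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: the simulated data, the three normalizations chosen (log2, per-sample z-score, quantile normalization), and all function names are assumptions made for illustration. GFS is left out because the abstract does not define it. The sketch checks that each rank-preserving normalization leaves the within-sample ranks, and therefore the rank-based (Spearman) correlations between samples, unchanged.

```python
import numpy as np


def rank_within_sample(x):
    """Rank each column (one sample) separately; 0 marks the smallest value."""
    return np.argsort(np.argsort(x, axis=0), axis=0)


def zscore(x):
    """Center and scale each sample (column) to mean 0 and unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)


def quantile_normalize(x):
    """Replace each value with the cross-sample mean at its within-sample rank."""
    reference = np.sort(x, axis=0).mean(axis=1)
    return reference[rank_within_sample(x)]


def spearman_between_samples(x):
    """Spearman correlation matrix between samples: Pearson on the ranks."""
    ranks = rank_within_sample(x).astype(float)
    return np.corrcoef(ranks, rowvar=False)


# Simulated data (an assumption): rows are features, columns are samples,
# and every sample shares a common feature-level signal plus noise.
rng = np.random.default_rng(0)
n_features, n_samples = 500, 6
shared = rng.normal(size=(n_features, 1))
data = np.exp(shared + 0.5 * rng.normal(size=(n_features, n_samples)))

normalized = {
    "log2": np.log2(data),
    "zscore": zscore(data),
    "quantile": quantile_normalize(data),
}

baseline_ranks = rank_within_sample(data)
baseline_corr = spearman_between_samples(data)

results = {}
for name, values in normalized.items():
    same_ranks = np.array_equal(rank_within_sample(values), baseline_ranks)
    same_corr = np.allclose(spearman_between_samples(values), baseline_corr)
    results[name] = (same_ranks, same_corr)
    print(f"{name}: ranks preserved={same_ranks}, Spearman unchanged={same_corr}")
```

Each method here changes the value distribution but is monotone within a sample, so any between-sample similarity that is carried by rank order survives normalization. A method that reorders values within a sample, as the abstract reports for GFS, would not pass this check.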
URI: https://hdl.handle.net/10356/165285
DOI: 10.32657/10356/165285
Schools: School of Biological Sciences 
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections: SBS Theses

Files in This Item:
File: NTU_Master_Dissertation_Zexi_Guo_20230318_signed.pdf
Size: 9.68 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.