Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/184194
Title: Artificial intelligence foundation models: understanding decision-making and mitigating hallucinations in AI-driven medical diagnostics with retrieval-augmented generation
Authors: Poh, Shi Qian
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Poh, S. Q. (2025). Artificial intelligence foundation models: understanding decision-making and mitigating hallucinations in AI-driven medical diagnostics with retrieval-augmented generation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184194
Abstract: The rapid evolution of Artificial Intelligence (AI), particularly Foundation Models, has significantly transformed natural language processing in healthcare. However, a critical challenge remains: hallucinations in Large Language Models (LLMs), where the model generates inaccurate or misleading information. These hallucinations pose substantial risks in medical applications, where precision and reliability are paramount. This report explores the integration of Retrieval-Augmented Generation (RAG) as a potential solution to mitigate hallucinations in AI-driven medical diagnostics. The primary objective of this study is to understand the decision-making processes of state-of-the-art AI models and assess the impact of RAG in enhancing diagnostic accuracy and transparency. By incorporating external, trusted medical sources into the generation process, RAG aims to reduce misinformation and improve clinical decision support. The study involves a comparative analysis of AI foundation models, an evaluation of RAG’s effectiveness, and an exploration of potential enhancements for its application in healthcare. Additionally, Synthea's synthetic patient records, which use Fast Healthcare Interoperability Resources (FHIR), and real-world medical datasets are examined to further strengthen AI-driven diagnostic tools. The results indicate that RAG improves accuracy, most notably with the fine-tuned MedLlama2 model, whose accuracy rose from 0.1659 (without RAG) to 0.4600 (with RAG). In non-fine-tuned models, although accuracy remained lower than in fine-tuned models, the additional context supplied by RAG made the answer-generation process easier to understand and ensured that each generated answer was supported by credible sources from that context.
Overall, the project highlights how RAG can serve as a key enabler in addressing AI hallucinations, ultimately improving patient safety, medical accuracy, and trust in AI-powered healthcare solutions.
URI: https://hdl.handle.net/10356/184194
Schools: College of Computing and Data Science 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Poh_Shi_Qian_FYP_CCDS24-0763.pdf
Description: Restricted Access
Size: 5.22 MB
Format: Adobe PDF

Page view(s): 102 (updated on May 7, 2025)
Download(s): 2 (updated on May 7, 2025)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.