Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/184194
Title: | Artificial intelligence foundation models: understanding decision-making and mitigating hallucinations in AI-driven medical diagnostics with retrieval-augmented generation |
Authors: | Poh, Shi Qian |
Keywords: | Computer and Information Science |
Issue Date: | 2025 |
Publisher: | Nanyang Technological University |
Source: | Poh, S. Q. (2025). Artificial intelligence foundation models: understanding decision-making and mitigating hallucinations in AI-driven medical diagnostics with retrieval-augmented generation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184194 |
Abstract: | The rapid evolution of Artificial Intelligence (AI), particularly Foundation Models, has significantly transformed natural language processing in healthcare. However, a critical challenge remains: hallucinations in Large Language Models (LLMs), where the model generates inaccurate or misleading information. These hallucinations pose substantial risks in medical applications, where precision and reliability are paramount. This report explores the integration of Retrieval-Augmented Generation (RAG) as a potential solution to mitigate hallucinations in AI-driven medical diagnostics. The primary objective of this study is to understand the decision-making processes of state-of-the-art AI models and assess the impact of RAG in enhancing diagnostic accuracy and transparency. By incorporating external, trusted medical sources into the generation process, RAG aims to reduce misinformation and improve clinical decision support. The study involves a comparative analysis of AI foundation models, an evaluation of RAG’s effectiveness, and an exploration of potential enhancements for its application in healthcare. Additionally, Synthea's synthetic patient records, which use Fast Healthcare Interoperability Resources (FHIR), and real-world medical datasets are examined to further strengthen AI-driven diagnostic tools.
The results indicate that RAG improves accuracy, most notably with the fine-tuned MedLlama2 model, whose accuracy rose from 0.1659 (without RAG) to 0.4600 (with RAG). In non-fine-tuned models, accuracy remained lower than in fine-tuned models, but the additional context supplied by RAG made the answer-generation process easier to understand and ensured that each generated answer was supported by credible sources from its context. Overall, the project highlights how RAG can serve as a key enabler in addressing AI hallucinations, ultimately improving patient safety, medical accuracy, and trust in AI-powered healthcare solutions. |
URI: | https://hdl.handle.net/10356/184194 |
Schools: | College of Computing and Data Science |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | CCDS Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Poh_Shi_Qian_FYP_CCDS24-0763.pdf (Restricted Access) | | 5.22 MB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.