Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/183839
Title: Prompt injection reviews, analysis, & evaluation for chatbot
Authors: Ong, Samson Qi Xuan
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Ong, S. Q. X. (2025). Prompt injection reviews, analysis, & evaluation for chatbot. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/183839
Abstract: Prompt injection attacks pose a significant threat to chatbot applications, allowing adversaries to manipulate the model’s behaviour and bypass security mechanisms. This project addresses this issue by developing a Python package that detects and classifies prompt injection attempts, providing chatbot developers, particularly those with limited cybersecurity expertise, with an accessible and effective security solution. A key aspect of this project involved creating a dataset of prompt injection attempts along with their respective attack classifications. Llama 3.1 was used to classify the prompt injection attempts found in the selected dataset, while GPT-4o served as an evaluator to refine and validate the initial classifications. The refined dataset was then used to train machine learning models using XGBClassifier in combination with SMOTE to improve the models’ performance on imbalanced data. The Python package also includes a logging mechanism that records detected prompt injection attempts, offering developers insights into attack patterns and potential vulnerabilities. At the end of development, user feedback was collected to evaluate the effectiveness and usability of the package. While the project successfully met its primary objectives, areas for improvement were identified, particularly in detecting multi-step and contextual prompt injection attempts. Future work will focus on developing a benchmark dataset with human-verified classifications, improving the package’s ability to detect multi-step conversational attacks, and expanding detection capabilities to include multimodal inputs, such as images. This study concluded that the developed Python package provides a practical and effective solution for safeguarding chatbot applications against prompt injection attacks.
URI: https://hdl.handle.net/10356/183839
Schools: College of Computing and Data Science 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Final_FYP_Report_OngQiXuanSamson.pdf (Restricted Access)
Size: 2.1 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.