Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/175994
Title: Developing a graphene Q&A chatbot using retrieval augmented generation (RAG)
Authors: Sara Johari
Keywords: Engineering
Issue Date: 2024
Publisher: Nanyang Technological University
Source: Sara Johari (2024). Developing a graphene Q&A chatbot using retrieval augmented generation (RAG). Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175994
Abstract: Graphene synthesis is a rapidly growing market with various methods for different applications. However, the mass production of high-quality graphene that is cost-effective and environmentally sustainable has not been established commercially. Current graphene synthesis techniques also face issues related to reproducibility. Recently, the proliferation of artificial intelligence (AI) with ever-evolving large language models (LLMs), along with the emergence of the Retrieval Augmented Generation (RAG) approach, has demonstrated a significant ability to produce natural responses that draw on a vast amount of knowledge. There is therefore interest in combining a database of graphene synthesis with AI to assist the research process. This experimental study tested the use of UMAP visualizations to determine the optimal chunk size and overlap. Subsequently, two LLMs, the DRAGON Deci-7B LLM and the DRAGON Mistral-7B LLM, were tested within a RAG question-answering chatbot architecture. The chatbots were then further evaluated with two advanced retrieval methods: the parent document retriever and the ensemble retriever. The chatbots were evaluated using RAGAs, a performance metric framework, with ChatGPT as a benchmark and a synthetic dataset of 10 questions and corresponding ground truths. Human evaluation was also conducted by manually inputting a user prompt into the chatbots and analysing the response generated. In summary, based on the LLM evaluations with ChatGPT, the optimal chatbot developed in this study used the DRAGON Mistral-7B LLM with the parent document retrieval method, an embedded chunk size of 256 tokens, and a 10% overlap. However, human evaluations raised concerns regarding the actual usability of the chatbot. Further troubleshooting and refinement would be necessary, but this was constrained by the costs associated with the project.
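Note: The abstract reports a chunk size of 256 tokens with a 10% overlap but does not describe the splitting code. The following is a minimal sketch of fixed-size chunking with overlap, not the author's implementation. The function name `chunk_tokens`, the whitespace tokenization in the usage example, and the rounding of the overlap down to a whole token are all assumptions made for illustration; the actual tokenizer and splitter used in the project are not stated in this record.

```python
def chunk_tokens(tokens, chunk_size=256, overlap_ratio=0.10):
    """Split a token list into fixed-size chunks that share an overlap.

    Hypothetical helper: the defaults mirror the values reported in the
    abstract (256 tokens, 10% overlap); everything else is an assumption.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")

    # Assumption: the overlap is rounded down to a whole number of tokens.
    overlap = int(chunk_size * overlap_ratio)
    # Each new chunk starts this many tokens after the previous one.
    stride = chunk_size - overlap

    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current chunk already reaches the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    # Whitespace splitting stands in for a real tokenizer (assumption).
    text = "graphene " * 600
    pieces = chunk_tokens(text.split())
    print(len(pieces), [len(p) for p in pieces])
```

With these defaults, consecutive chunks share their boundary tokens, so a sentence that straddles a chunk boundary is still fully present in at least one chunk, at the cost of embedding some tokens twice.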
URI: https://hdl.handle.net/10356/175994
Schools: School of Materials Science and Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: MSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: AY23S1 and AY23S2 final report_Sara Johari.pdf (Restricted Access)
Size: 1.4 MB
Format: Adobe PDF

Page view(s): 233 (updated on Mar 16, 2025)
Download(s): 23 (updated on Mar 16, 2025)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.