Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/184179
Title: | Data science in Python of large language models for AI-assisted programming |
Authors: | Wang, Davis Shang An |
Keywords: | Other |
Issue Date: | 2025 |
Publisher: | Nanyang Technological University |
Source: | Wang, D. S. A. (2025). Data science in Python of large language models for AI-assisted programming. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184179 |
Project: | CCDS24-0778 |
Abstract: | This research examines the use of large language models (LLMs) for AI-assisted programming in the Python ecosystem. The study assesses and contrasts the performance of several code generation models, including open-source alternatives such as CodeLLaMA and StarCoderBase and proprietary ones such as OpenAI's GPT-4. The StarCoderBase-7B model was fine-tuned through parameter-efficient fine-tuning (PEFT) with LoRA, using an instruction-based Stack Exchange dataset curated with a structured data science approach. To enable natural-language-to-code generation directly from the browser, the optimised adapter was then combined with the base model and made available through a Flask web application. The model's performance was evaluated on the HumanEval benchmark, and pass@k metrics were calculated. According to the experimental data, the fine-tuned StarCoder model shows promising functional correctness in comparison to other open-source models, although GPT-4 remains the best performer. The work shows that efficient code generation assistants can be created and deployed on accessible hardware by using lightweight quantisation and fine-tuning techniques. In addition to providing insights into efficient model tuning, evaluation, and deployment procedures for practical use cases, this research demonstrates the expanding potential of open LLMs in software development automation. |
URI: | https://hdl.handle.net/10356/184179 |
Schools: | College of Computing and Data Science |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | CCDS Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Final_Year_Project_Amended_Report_Wang_Shang_An_Davis_CCDS24-0778.pdf (Restricted Access) | | 1.19 MB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.