Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/181182
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Seah, Kai Heng | en_US |
dc.date.accessioned | 2024-11-28T08:39:05Z | - |
dc.date.available | 2024-11-28T08:39:05Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Seah, K. H. (2024). Event extraction for cybersecurity using large language models. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181182 | en_US |
dc.identifier.uri | https://hdl.handle.net/10356/181182 | - |
dc.description.abstract | This project studies and compares the efficiency of different Large Language Models (LLMs) for the extraction of cybersecurity events. Cybersecurity event extraction is a critical task in Cyber Threat Intelligence: it aims to identify and categorize incidents such as data breaches, malware attacks, and vulnerabilities from unstructured text sources like news articles, threat reports, and social media. Traditional methods for cybersecurity event extraction often rely on rule-based systems or supervised machine learning models, which require extensive labelled data and are limited in adaptability. Cybersecurity, by nature, is ever-changing. One method of acquiring Cyber Threat Intelligence is through Open-Source Intelligence, where articles across the web are sourced and analysed. As LLMs have a good understanding of semantics and context, it is possible to leverage them for cybersecurity event extraction. In this study, the focus is on the conversational LLMs that many are familiar with, such as ChatGPT3.5, ChatGPT-4, LLAMA and Cohere. We investigate the efficacy of these conversational LLMs in extracting cybersecurity events without further fine-tuning but with the help of prompting techniques as well as Retrieval Augmented Generation. The effectiveness of our approach is evaluated through experiments on the CASIE dataset, comparing the performance of the different LLMs across zero-shot, prompting-technique, and retrieval-augmented-generation settings. The results demonstrate that base LLMs in their current state are unable to fulfil the task of cybersecurity event extraction. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Nanyang Technological University | en_US |
dc.subject | Computer and Information Science | en_US |
dc.title | Event extraction for cybersecurity using large language models | en_US |
dc.type | Final Year Project (FYP) | en_US |
dc.contributor.supervisor | Hui Siu Cheung | en_US |
dc.contributor.school | College of Computing and Data Science | en_US |
dc.description.degree | Bachelor's degree | en_US |
dc.contributor.supervisoremail | ASSCHUI@ntu.edu.sg | en_US |
item.grantfulltext | restricted | - |
item.fulltext | With Fulltext | - |
Appears in Collections: | CCDS Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
FYP_Final_Report_Seah_Kai_Heng.pdf (Restricted Access) | - | 729.73 kB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.