Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/166098
Title: | Development of an API for EDU segmentation |
Authors: | Liu, Qingyi |
Keywords: | Engineering::Computer science and engineering |
Issue Date: | 2023 |
Publisher: | Nanyang Technological University |
Source: | Liu, Q. (2023). Development of an API for EDU segmentation. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166098 |
Project: | SCSE22-0190 |
Abstract: | EDU stands for elementary discourse unit, a clause-like structure within a sentence. EDU segmentation refers to determining the boundaries at which sentences are split into multiple EDUs. This project experiments with and develops EDU segmentation models. The experiments are conducted on the Rhetorical Structure Theory (RST) dataset, and model performance is evaluated using the F1-score on token-level EDU boundaries. The existing research model, Segbot, has a Seq2seq architecture that uses a bi-GRU encoder and a GRU decoder with a pointer network to select the boundaries for EDU segmentation. To improve Segbot, we propose replacing its bi-GRU encoder with the generative pretrained BART encoder. This model achieved a 94.5% F1-score. Token classification for EDU segmentation based on the boundaries is also explored, by fine-tuning pretrained models such as BERT and by using POS tag embeddings as additional input features. Segbot with the BART encoder yielded the highest performance; hence, its model weights would be used to develop a Python API library in the future. This library would make EDU segmentation easier to use in downstream NLP tasks, such as sentiment analysis and question answering. |
URI: | https://hdl.handle.net/10356/166098 |
Schools: | School of Computer Science and Engineering |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | SCSE Student Reports (FYP/IA/PA/PI) |
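The abstract states that models are scored with an F1-score over token-level EDU boundaries. As a rough illustration of what such a metric can look like, here is a minimal sketch. The full report is restricted, so the exact protocol is not known: the function names `boundaries_from_edus` and `boundary_prf`, the use of the last token index of each EDU as its boundary, and the micro-averaging over sentences are all assumptions made for this example, not the author's implementation.

```python
from typing import Iterable, List, Tuple


def boundaries_from_edus(edus: List[List[str]]) -> List[int]:
    """Return the index of the last token of each EDU in one sentence.

    Hypothetical helper: treats the final token of every EDU as its boundary.
    """
    ends: List[int] = []
    position = -1
    for edu in edus:
        position += len(edu)
        ends.append(position)
    return ends


def boundary_prf(
    gold: Iterable[Iterable[int]],
    pred: Iterable[Iterable[int]],
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over boundary indices.

    `gold` and `pred` hold one collection of boundary indices per sentence.
    Micro-averaging is an assumption of this sketch.
    """
    true_pos = n_pred = n_gold = 0
    for gold_sent, pred_sent in zip(gold, pred):
        gold_set, pred_set = set(gold_sent), set(pred_sent)
        true_pos += len(gold_set & pred_set)
        n_pred += len(pred_set)
        n_gold += len(gold_set)
    precision = true_pos / n_pred if n_pred else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy sentence split into two EDUs (illustrative data, not from the RST dataset).
gold_edus = [["Although", "it", "rained", ","], ["we", "went", "out", "."]]
gold_boundaries = boundaries_from_edus(gold_edus)

# A hypothetical prediction with one spurious boundary after token 5.
pred_boundaries = [3, 5, 7]

precision, recall, f1 = boundary_prf([gold_boundaries], [pred_boundaries])
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Whether the sentence-final boundary is counted affects the score, since it is trivially correct for every sentence. This sketch counts whatever indices it is given and leaves that choice to the caller.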
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
FYP Report_Liu Qingyi.pdf (Restricted Access) | | 925.89 kB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.