Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/157271
Full metadata record
DC Field | Value | Language
dc.contributor.author | Ma, Shuting | en_US
dc.date.accessioned | 2022-05-11T13:48:19Z | -
dc.date.available | 2022-05-11T13:48:19Z | -
dc.date.issued | 2022 | -
dc.identifier.citation | Ma, S. (2022). Detecting novel and interested topics from open sources based on deep neural network and natural language processing techniques. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/157271 | en_US
dc.identifier.uri | https://hdl.handle.net/10356/157271 | -
dc.description.abstract | Piracy is one of the factors threatening the security of coastal countries. During the COVID-19 pandemic, piracy incidents have become more frequent than usual, posing a challenge to the safety of residents and to social stability. At the same time, news reports on piracy incidents published in open sources are a valuable resource for piracy research. With the maturing of artificial intelligence technology and the continuous development of Natural Language Processing (NLP), how to make reasonable use of these open-source text materials for analysis has become an important research direction. This project first introduces the possible applications of NLP to piracy news materials. Relevant piracy news materials were collected from open sources, labelled, and cleaned to form a new dataset on this topic. Four mainstream text classification models (TextCNN, Bi-LSTM, Transformer, and BERT) are introduced theoretically and tested in practice, and BERT is finally selected as the base model. To address the imbalanced data classification problem, this project proposes and explores a variety of methods that combine deep learning and machine learning. On the one hand, data resampling is used to improve the balance of the dataset. On the other hand, with BERT chosen for classification, Costive-SVM is constructed in a fully connected layer with Triplet Loss to separate the labels of positive and negative samples. After fine-tuning, the performance of the model improves, and the over-fitting problem encountered in the optimization process is solved as well. Finally, the F1 score improves from 0.46 to 0.87. | en_US
dc.language.iso | en | en_US
dc.publisher | Nanyang Technological University | en_US
dc.subject | Engineering::Electrical and electronic engineering | en_US
dc.title | Detecting novel and interested topics from open sources based on deep neural network and natural language processing techniques | en_US
dc.type | Thesis-Master by Coursework | en_US
dc.contributor.supervisor | Mao Kezhi | en_US
dc.contributor.school | School of Electrical and Electronic Engineering | en_US
dc.description.degree | Master of Science (Computer Control and Automation) | en_US
dc.contributor.supervisoremail | EKZMao@ntu.edu.sg | en_US
item.grantfulltext | restricted | -
item.fulltext | With Fulltext | -
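Note on the metrics named in the abstract: the abstract reports an F1 score and mentions Triplet Loss without defining either. The following is a minimal sketch of the standard textbook definitions only, not the thesis's implementation. The function names `f1_score` and `triplet_loss`, the Euclidean distance, and the default margin of 1.0 are assumptions made for illustration.

```python
import math


def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss on one (anchor, positive, negative) triple.

    Assumption: Euclidean distance and a hypothetical margin of 1.0.
    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # Imbalanced toy labels: predicting only the majority class gets
    # 80% accuracy but an F1 of zero for the rare positive class.
    y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
    y_pred = [0] * 10
    print(f1_score(y_true, y_pred))
    print(triplet_loss((0.0, 0.0), (3.0, 4.0), (0.0, 1.0)))
```

The toy labels illustrate why F1 rather than accuracy is the usual choice for an imbalanced classification problem such as the one the abstract describes.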
Appears in Collections: EEE Theses
Files in This Item:
File: Dissertation_Final_version_MaShuting_amend-2-upload.pdf (Restricted Access)
Size: 4.86 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.