Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/140587
Title: | Automatic topic detection of news |
Authors: | Liu, Fengyuan |
Keywords: | Engineering::Electrical and electronic engineering |
Issue Date: | 2020 |
Publisher: | Nanyang Technological University |
Project: | A1119-191 |
Abstract: | This project explores Natural Language Processing and how to apply it to automatic topic detection, namely the categorization and topic generation of news articles. It focuses mainly on unsupervised learning methods, which reduce the amount of manual work and fulfill the “automatic” component of the project [1]. Choosing the “right” information to read on the internet is a growing problem, especially for news, given the vast amount of it available online. One current solution is to filter or categorize news into different sections and topics. However, manual categorization is slow and prone to error because personal opinion is involved. Hence, the project explores news topic detection using machine learning. The first half of the project explores topic modeling [2] and how to categorize news text using machine learning. The method chosen is Latent Dirichlet Allocation [3], trained on the “20 Newsgroup” dataset, which contains 20,000 news documents across 20 different fields [4]. The second half of the project takes the categorized results and refines the categories further by generating new topic titles to choose from. The method used is Word2vec, pre-trained on the “Text8” corpus and fine-tuned on the “20 Newsgroup” dataset. The project also experiments with different approaches and hyperparameters to further analyze the results of both techniques. |
URI: | https://hdl.handle.net/10356/140587 |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | EEE Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Final report LiuFengyuan U1621042F (4).pdf (Restricted Access) | | 2.28 MB | Adobe PDF |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.