Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/61087
Title: | Automatic summarizer for web documents |
Authors: | Chia, Pei Qi |
Keywords: | DRNTU::Engineering |
Issue Date: | 2014 |
Abstract: | As the world globalizes, the internet is being used around the world. As a result, the volume of text documents on the web is growing exponentially. It is not practical to read through all of the text available online just to find and sift out what you need. Using unsupervised clustering algorithms, the author created an automatic summarizer that condenses long documents into short summaries. This thesis discusses the natural language processing techniques and data mining concepts used within the software, with primary focus on lemmatization, which allows words of similar meaning to be gathered together, as well as the clustering algorithms Hierarchical Agglomerative Clustering and K-means. The methodology uses a top-down, incremental approach to design and build a reliable and functional summarizer. This thesis also explains the functionalities of the summarizer, with various tests implemented for greater confidence. The summarizer is then observed and evaluated on its flexibility with different text inputs and on the logical coherence of its output summaries. The thesis concludes by suggesting increased use of natural language processing to aid computers in 'understanding' text information, and the possibility of using a soft clustering approach. All in all, the objective of the project is met, and the thesis gives the reader the knowledge necessary to develop a summarizer using the clustering process described. |
URI: | http://hdl.handle.net/10356/61087 |
Schools: | School of Electrical and Electronic Engineering |
Rights: | Nanyang Technological University |
Fulltext Permission: | restricted |
Fulltext Availability: | With Fulltext |
Appears in Collections: | EEE Student Reports (FYP/IA/PA/PI) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FYP_FinalReport_ChiaPeiQi_U1123181G.pdf Restricted Access | Main Article for automatic summarizing | 3.31 MB | Adobe PDF | View/Open |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.