Title: Automatic summarization of web documents
Authors: Jodihardja, Marcellus Reinaldo
Keywords: DRNTU::Engineering
Issue Date: 2013
Abstract: Rapid progress in R&D and technology has produced an information overload. Although this means that a wide range of information is available on any given topic, it has become more difficult to retrieve everything needed in a limited time. The objective of this project is to create an auto-summarization program that can produce a good summary of a set of documents in a matter of seconds, so that the information needed is captured in a dense, compact document. Latent Semantic Analysis was chosen as the fundamental concept of the program: TF-IDF (Term Frequency – Inverse Document Frequency) is used to assign an importance value to each term, and Singular Value Decomposition is used to select the sentences that best represent the information in a document. Some modifications were applied to the algorithm to increase efficiency and reduce the time complexity of the program. In addition, a “meta” summarization method was implemented, which creates a summary from the summaries generated for several input documents. The project successfully implemented all of the required algorithms and produces a good summary from a set of input documents.
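The pipeline named in the abstract (TF-IDF weighting, Singular Value Decomposition for sentence selection, and a "meta" pass over per-document summaries) can be illustrated with a minimal Python sketch. This is not the report's implementation, which is not reproduced on this page. The function names, the naive sentence splitter, the smoothed IDF formula, the `num_topics` cut-off, and the scoring rule (length of each sentence vector in the truncated topic space) are all assumptions chosen for illustration.

```python
import re
import numpy as np

def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    # Lowercase alphanumeric tokens; no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", sentence.lower())

def tfidf_matrix(sentences):
    """Build a terms x sentences TF-IDF matrix, treating each sentence as a document."""
    tokens = [tokenize(s) for s in sentences]
    vocab = sorted({t for toks in tokens for t in toks})
    index = {t: i for i, t in enumerate(vocab)}
    n = len(sentences)
    A = np.zeros((len(vocab), n))
    for j, toks in enumerate(tokens):
        for t in toks:
            A[index[t], j] += 1.0               # raw term frequency
    df = np.count_nonzero(A, axis=1)            # sentences containing each term
    idf = np.log((1 + n) / (1 + df)) + 1.0      # smoothed IDF (one common variant)
    return A * idf[:, None]

def lsa_summarize(text, num_sentences=2, num_topics=2):
    """Pick the sentences with the largest weight in the top latent topics."""
    sentences = split_sentences(text)
    if len(sentences) <= num_sentences:
        return sentences
    A = tfidf_matrix(sentences)
    if A.shape[0] == 0:                         # no usable terms at all
        return sentences[:num_sentences]
    _, S, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(num_topics, len(S))
    # Sentence score: norm of its singular-value-weighted coordinates in k topics.
    scores = np.sqrt(((S[:k, None] * Vt[:k]) ** 2).sum(axis=0))
    chosen = sorted(np.argsort(-scores)[:num_sentences])  # keep document order
    return [sentences[i] for i in chosen]

def meta_summarize(documents, per_doc=2, final=2):
    """Summarize each document, then summarize the concatenated summaries."""
    partial = [" ".join(lsa_summarize(d, per_doc)) for d in documents]
    return lsa_summarize(" ".join(partial), final)
```

Calling `meta_summarize([doc_a, doc_b], per_doc=2, final=2)` on two plain-text strings returns at most two sentences drawn from the per-document summaries. Truncating to `num_topics` matters here: with all singular values retained, the score reduces to the plain TF-IDF column norm and the decomposition adds nothing.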
Schools: School of Electrical and Electronic Engineering 
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Student Reports (FYP/IA/PA/PI)

Files in This Item:
Restricted Access, 3.56 MB, Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.