Title: Query-based text extraction algorithm for web pages.
Authors: New, Chin Ker.
Keywords: DRNTU::Library and information science::Libraries::Automation
DRNTU::Library and information science::Libraries::Technologies
Issue Date: 2000
Abstract: The objective of this research is to develop a query-based text extraction algorithm that automatically generates an abstract from a Web document. The algorithm was derived from a study of a sample of 60 Web pages, chosen from 5 different subject areas and retrieved using the AltaVista search engine. The algorithm is based on sentence weight (obtained through a simple calculation), cue words, the location of the sentence, and the application of canned abstracts. To test the new algorithm, a total of 50 Web pages (from 10 different subject areas) were retrieved from the Internet using the AltaVista search engine. The abstracts of these Web pages were then generated by hand by simulating the new algorithm.
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: WKWSCI Theses

Files in This Item:
Description: Restricted Access
Size: 16.27 MB
Format: Adobe PDF

Page view(s): 50

Updated on Jan 24, 2021




Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.