Title: Extracting integrate and search healthcare knowledge from the web (III)
Authors: Lim, Lionel Guan Chuan.
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Issue Date: 2013
Abstract: There is a growing trend of users posting and editing questions on websites known as Community Question Answering (CQA) sites. CQA sites benefit web users because they accumulate valuable knowledge from contributors around the world. However, this raises the problem of how to extract only the information that is relevant to a given user. This project aims to consolidate healthcare information and allow web users to extract the information that is useful to them. To do so, web crawlers written in Java retrieve the URL, category, question, and answer from the health category of CQA sites. The crawled question-answer pairs are then saved in XML format. Lucene, a Java information retrieval (IR) library, is used for fast indexing of the XML documents. Another goal is to design a centralised search engine that can retrieve relevant healthcare information from CQA data. As this project continues the work of Senior Lee Qian Hui, I am tasked with applying information retrieval models and crawling data from more CQA sites that resemble WikiAnswers, which Senior Lee's earlier implementation already covered.
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections:SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: SCE 11-0066.pdf (Restricted Access)
Description: Extracting integrate and Search healthcare knowledge from the Web (III)
Size: 1.37 MB
Format: Adobe PDF

Page view(s): 50 (checked on Oct 24, 2020)
Download(s): 50 (checked on Oct 24, 2020)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.