Full metadata record
DC Field | Value | Language
dc.contributor.author | Lim, Lionel Guan Chuan. | -
dc.description.abstract | There is a growing trend of users posting and editing questions on websites known as Community Question Answering (CQA) sites. CQA sites benefit web users because they accumulate valuable knowledge from contributors around the world. However, as beneficial as CQA sites may be, it is difficult to extract only the information that is relevant to a given user. This project aims to consolidate healthcare information and allow web users to extract the information that is useful to them. To do so, web crawlers written in Java retrieve the URL, category, question and answer from the health category of CQA sites. The crawled question-answer pairs are then saved in XML format. Lucene, a Java information retrieval (IR) library, is used for fast indexing of the XML documents. Another goal is to design a centralised search engine that can retrieve relevant healthcare information from CQA data. As this project is a continuation of Senior Lee Qian Hui’s work, I am tasked with applying information retrieval models and crawling data from more CQA sites that resemble WikiAnswers, which Senior Lee previously implemented. | en_US
dc.format.extent | 43 p. | en_US
dc.rights | Nanyang Technological University | -
dc.subject | DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval | en_US
dc.subject | DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing | en_US
dc.title | Extracting integrate and search healthcare knowledge from the web (III) | en_US
dc.type | Final Year Project (FYP) | en_US
dc.contributor.school | School of Computer Engineering | en_US
dc.description.degree | Bachelor of Engineering (Computer Science) | en_US
dc.contributor.supervisor2 | Gao Cong | en_US
item.fulltext | With Fulltext | -
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)
Files in This Item:
File | Description | Size | Format
SCE 11-0066.pdf (Restricted Access) | Extracting integrate and Search healthcare knowledge from the Web (III) | 1.37 MB | Adobe PDF


Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.