Title: Question classification
Authors: Teh, Li Li.
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications
Issue Date: 2012
Abstract: Web users increasingly ask questions and get answers through web portals. Portals that let users post and answer questions are commonly known as Community Question Answering (CQA) services. These services also allow users to search previously asked question-answer pairs or browse them by category. However, the categories may not be precise enough to be searched effectively, and recent studies have shown that more specific subcategories make searching more efficient. This project focuses on CQA for the topic of cancer, which is subcategorised into six health stages: (1) Stage 1: when healthy; (2) Stage 2: when one thinks one might be ill; (3) Stage 3: before getting a medical test or checkup; (4) Stage 4: when diagnosed or self-diagnosed as ill; (5) Stage 5: before a treatment, surgery, or taking certain medications; (6) Stage 6: when receiving or taking treatments, medications, or exercise routines. The aim is to explore the effectiveness of CQA with more specific subcategories by using text classification to organise the questions asked into the six health stages. A web crawler was developed to extract thousands of cancer-related questions from a CQA portal and store them in XML format. Two classification techniques were used in the experiment, namely Naive Bayes and Decision Stump. In this experiment, Decision Stump performed better than Naive Bayes, with an overall accuracy of 57.384% compared to 51.8987% for Naive Bayes.
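The comparison described in the abstract can be illustrated with a minimal sketch, which is not the project's implementation. It assumes a bag-of-words representation, multinomial Naive Bayes with add-one smoothing, and a decision stump that splits on the presence of a single word chosen by training accuracy (other stump implementations may choose the split by entropy instead). The questions in `TRAIN`, the class names, and the `accuracy` helper are hypothetical and invented for illustration; they are not taken from the project's corpus, and results on this toy data say nothing about the accuracies reported above.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (question text, health-stage label 1..6).
TRAIN = [
    ("what foods help prevent cancer", 1),
    ("how can i prevent cancer with exercise", 1),
    ("is this lump a symptom of cancer", 2),
    ("could this symptom mean cancer", 2),
    ("what are the side effects of chemotherapy", 6),
    ("how to cope with chemotherapy side effects", 6),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the source does not specify one)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, data):
        self.class_counts = Counter(label for _, label in data)
        self.word_counts = defaultdict(Counter)
        for text, label in data:
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        self.total = len(data)
        return self

    def predict(self, text):
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n / self.total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in tokenize(text):
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class DecisionStump:
    """One-level decision tree: a single test on whether one word is present."""

    def fit(self, data):
        docs = [(set(tokenize(text)), label) for text, label in data]
        vocab = sorted(set().union(*(words for words, _ in docs)))
        best_correct = -1
        for word in vocab:
            present = Counter(lab for words, lab in docs if word in words)
            absent = Counter(lab for words, lab in docs if word not in words)
            # Training examples classified correctly if each branch
            # predicts its majority label.
            correct = max(present.values()) + max(absent.values(), default=0)
            if correct > best_correct:
                best_correct = correct
                self.word = word
                self.yes = present.most_common(1)[0][0]
                self.no = absent.most_common(1)[0][0] if absent else self.yes
        return self

    def predict(self, text):
        return self.yes if self.word in tokenize(text) else self.no


def accuracy(model, data):
    """Fraction of (text, label) pairs the model labels correctly."""
    return sum(model.predict(text) == label for text, label in data) / len(data)


if __name__ == "__main__":
    for model in (NaiveBayes().fit(TRAIN), DecisionStump().fit(TRAIN)):
        print(type(model).__name__, accuracy(model, TRAIN))
```

A stump has only two branches, so it can output at most two distinct labels; with six stages, its accuracy therefore depends heavily on how the questions are distributed across the stages.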
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Student Reports (FYP/IA/PA/PI)

Files in This Item:
Description: Restricted Access
Size: 1.94 MB
Format: Adobe PDF

Page view(s): 50 (checked on Oct 23, 2020)
Download(s): 50 (checked on Oct 23, 2020)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.