Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/62935
Title: Knowledge discovery from forum data
Authors: Li, Jun
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
Issue Date: 2015
Abstract: Advances in information retrieval and data mining techniques have provided increasingly useful mechanisms for retrieving the most relevant information from documents, as well as for discovering knowledge from them. The knowledge embedded in online forums, a knowledge-rich data source, has yet to be fully utilized because of the limited search functionality provided by most existing forum platforms. This project provides a prototype solution to improve the search functions of online forums. More specifically, a multithreaded Crawler and a Parser have been implemented to download and parse the posts published in a local forum in HTML format. A Topic Modeler built on the MALLET package is used to generate the high-level topics of the forum data. An Indexer and a Searcher are then developed based on Lucene to support searching over the forum data. A web search interface that supports sophisticated search requests and search result facet visualization is developed for users to discover knowledge in online forums. As a result, the solution provided by this project allows users to search for relevant information using simple (e.g. single-keyword) as well as sophisticated queries. It also shows users a high-level view of the search results in an aggregated, multi-facet visualized form. Furthermore, it enables users to understand the high-level topics of the search results through topic modeling. This search interface helps users find relevant information more effectively and efficiently. The study identifies a few limitations that were not tackled due to the project scope and time constraints, and makes recommendations for addressing them as future work.
URI: http://hdl.handle.net/10356/62935
Schools: Wee Kim Wee School of Communication and Information 
Rights: Nanyang Technological University
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: WKWSCI Theses

Files in This Item:
File: TsciG1301348F.pdf (Restricted Access)
Size: 11.6 MB
Format: Adobe PDF

Page view(s): 525 (updated on Jun 24, 2024)

Download(s): 23 (updated on Jun 24, 2024)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.