Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/54823
Title: A framework for mining opinions from user generated content
Authors: Amit Kumar Saini
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Issue Date: 2013
Abstract: In this thesis we present a scalable framework for mining features and opinions from online reviews. Large-scale opinion mining requires scalable components for data storage, along with unsupervised learning solutions for extracting features and opinions, with the ultimate goal of generating meaningful summaries. Our focus is a highly scalable framework: a system that can scale both horizontally and vertically for deployment on large-scale distributed systems. Hence, we present an architecture designed by carefully examining every component used in the system, including the database for storing reviews. We have compared various choices and chosen state-of-the-art open source technologies that use a distributed multi-node architecture. As a result, millions of reviews can be stored and indexed. We built and tested the system using travel reviews, but it can be used on any domain with minimal changes. We have implemented a dynamic feature extraction engine that utilizes unsupervised learning to associate extracted features and opinions, starting with only one domain seed feature. For example, the seed word 'hotel' is all that is needed to extract a list of related hotel feature words like 'room' and 'service'. Next, we extract the opinions expressed on the dynamically extracted features and perform sentence-level sentiment analysis. To present the results to the end user in an intuitive manner, we created a web interface and experimented with new visualization techniques. Experiments were conducted to evaluate the system and the proposed methods. From the analysis of the results, we discuss the drawbacks of our current approach and future directions of the research. Finally, a fully functioning prototype has been created to demonstrate the end-to-end system.
URI: http://hdl.handle.net/10356/54823
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SCSE Theses

Files in This Item:
File: final-report.pdf
Description: Final SCE Theses
Size: 2.87 MB
Format: Adobe PDF
Access: Restricted Access

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.