Full metadata record
DC Field | Value | Language
dc.contributor.author | Banerjee, Snehasish | en
dc.contributor.author | Chua, Alton Y. K. | en
dc.contributor.author | Jung-Jae Kim | en
dc.identifier.citation | Banerjee, S., Chua, A. Y. K., & Jung-Jae Kim. (2015). Distinguishing between authentic and fictitious user-generated hotel reviews. 2015 6th International Conference on Computing, Communication and Networking Technologies (ICCCNT), 1-7. | en
dc.description.abstract | The objective of this paper is to distinguish between authentic and fictitious user-generated hotel reviews. To achieve this objective, it adopts a two-step approach. The first seeks to classify authentic and fictitious reviews by leveraging on their possible textual differences. The second step attempts to identify the textual traits that are unique to authentic and fictitious reviews. For the purpose of this paper, a ground truth dataset of 1,800 reviews, uniformly divided between authentic and fictitious, was created. With respect to the first step, authentic and fictitious reviews were classified by using four forms of textual differences: understandability, level of details, writing style, and cognition indicators. Classification was performed using voting by average probability among logistic regression, C4.5, Support Vector Machine, JRip, and Random Forest classifiers. Using five-fold cross-validation, the proposed approach was found to outperform two existing baselines. Furthermore, with respect to the second step, the textual traits unique to authentic and fictitious reviews were identified using Information Gain, and Chi-squared feature selection techniques. A sequential forward feature selection approach was further adopted to identify the top five features that aid the classification of authentic and fictitious reviews. These include the use of nouns, articles, function words, punctuations, and in particular, exclamation points in reviews. The implications of the results are discussed. | en
dc.format.extent | 7 p. | en
dc.rights | © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: []. | en
dc.subject | Classification algorithms | en
dc.subject | Data mining | en
dc.subject | Machine learning | en
dc.subject | Text analysis | en
dc.title | Distinguishing between authentic and fictitious user-generated hotel reviews | en
dc.type | Conference Paper | en
dc.contributor.school | Wee Kim Wee School of Communication and Information | en
dc.contributor.conference | 2015 6th International Conference on Computing, Communication and Networking Technologies (ICCCNT) | en
dc.description.version | Accepted version | en
item.fulltext | With Fulltext | -
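
Illustrative note: the abstract describes combining five classifiers by "voting by average probability". The sketch below shows that general idea only and is not the authors' implementation. The function name average_probability_vote and the probability values are hypothetical, chosen so that the averaged vote differs from a simple majority vote.

```python
from typing import Dict, List


def average_probability_vote(predictions: List[Dict[str, float]]) -> str:
    """Return the label with the highest mean probability across classifiers.

    Each element of `predictions` is one classifier's probability
    distribution over the same set of labels.
    """
    if not predictions:
        raise ValueError("need at least one classifier's probabilities")
    labels = predictions[0].keys()
    mean = {
        label: sum(p[label] for p in predictions) / len(predictions)
        for label in labels
    }
    return max(mean, key=mean.get)


# Hypothetical outputs of five classifiers for a single review.
# Three lean slightly "authentic"; two are strongly "fictitious".
example = [
    {"authentic": 0.55, "fictitious": 0.45},
    {"authentic": 0.55, "fictitious": 0.45},
    {"authentic": 0.55, "fictitious": 0.45},
    {"authentic": 0.10, "fictitious": 0.90},
    {"authentic": 0.10, "fictitious": 0.90},
]

print(average_probability_vote(example))  # prints "fictitious"
```

A majority vote over the same five outputs would pick "authentic" (three votes to two), whereas averaging the probabilities lets the two confident classifiers outweigh the three marginal ones.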
Appears in Collections: WKWSCI Conference Papers
Files in This Item:
File | Description | Size | Format
Banerjee.pdf | | 435.05 kB | Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.