Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/99292
Title: An evaluation of classification models for question topic categorization
Authors: Qu, Bo; Cong, Gao; Li, Cuiping; Sun, Aixin; Chen, Hong
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems
Issue Date: 2012
Source: Qu, B., Cong, G., Li, C., Sun, A., & Chen, H. (2012). An evaluation of classification models for question topic categorization. Journal of the American Society for Information Science and Technology, 63(5), 889-903.
Series/Report no.: Journal of the American Society for Information Science and Technology
Abstract: We study the problem of question topic classification using a very large real-world Community Question Answering (CQA) dataset from Yahoo! Answers. The dataset comprises 3.9 million questions organized into more than 1,000 categories in a hierarchy. To the best of our knowledge, this is the first systematic evaluation of the performance of different classification methods on question topic classification as well as on short texts. Specifically, we empirically evaluate the following in classifying questions into CQA categories: (a) the usefulness of n-gram features and bag-of-words features; (b) the performance of three standard classification algorithms (naive Bayes, maximum entropy, and support vector machines); (c) the performance of state-of-the-art hierarchical classification algorithms; (d) the effect of training data size on performance; and (e) the effectiveness of the different components of CQA data, including subject, content, asker, and the best answer. The experimental results show which aspects are important for question topic classification in terms of both effectiveness and efficiency. We believe that the experimental findings from this study will be useful in real-world classification problems.
URI: https://hdl.handle.net/10356/99292
http://hdl.handle.net/10220/17203
ISSN: 1532-2882
DOI: 10.1002/asi.22611
Fulltext Permission: none
Fulltext Availability: No Fulltext
Appears in Collections: SCSE Journal Articles
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.