Discourse parsing of sociology dissertation abstracts using decision tree induction
Author
Ou, Shiyan
Khoo, Christopher S. G.
Heng, Hui Ying
Goh, Dion Hoe-Lian
Date of Issue
2003
Conference Name
14th ASIS SIG/CR Classification Research Workshop
School
Wee Kim Wee School of Communication and Information
Version
Published version
Abstract
In this study, we investigated the use of decision tree induction to
parse the macro-level discourse structure of sociology dissertation abstracts.
We treated discourse parsing as a sentence categorization task. The attributes
used in constructing the decision tree models were stemmed words that
occurred in at least 35 sentences (out of 3694 sentences in 300 sample
abstracts). Sentence location information was also used. The model obtained
an accuracy rate of 71.3% when applied to a test sample of 100 abstracts.
Another model that made use of information regarding the presence of 31
indicator words in neighboring sentences was also developed. Although this
model did not obtain better results, a comparison of the two models suggests
that an improvement in the classification of sentences in the problem statement
and research method sections is possible by combining the models.
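
The attribute construction described in the abstract can be illustrated with a minimal sketch. It assumes binary word-presence attributes plus a relative sentence position. The abstract does not say which stemmer was used or how sentence location was encoded, so `toy_stem`, the function names, and the `(i + 1) / total` encoding are all hypothetical. The toy data uses a threshold of 2 sentences in place of the study's 35. The resulting rows would then be passed to a decision tree learner, which is not shown here.

```python
import re
from collections import Counter


def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: strips a few common suffixes.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(sentence):
    # Set of stemmed lowercase words, so each word counts once per sentence.
    return {toy_stem(w) for w in re.findall(r"[a-z]+", sentence.lower())}


def build_vocabulary(abstracts, min_sentences):
    # Keep stems that occur in at least `min_sentences` sentences overall.
    counts = Counter()
    for sentences in abstracts:
        for sentence in sentences:
            counts.update(tokenize(sentence))
    return sorted(w for w, n in counts.items() if n >= min_sentences)


def sentence_features(abstracts, vocabulary):
    # One row per sentence: word-presence flags, then relative position
    # of the sentence within its abstract (assumed encoding).
    rows = []
    for sentences in abstracts:
        total = len(sentences)
        for i, sentence in enumerate(sentences):
            stems = tokenize(sentence)
            row = [1 if w in stems else 0 for w in vocabulary]
            row.append((i + 1) / total)
            rows.append(row)
    return rows


# Invented toy abstracts, each a list of sentences.
abstracts = [
    [
        "This study examines family structure.",
        "Interviews were conducted with parents.",
        "Results show strong effects.",
    ],
    [
        "This study explores migration.",
        "Surveys were conducted in two cities.",
    ],
]

vocabulary = build_vocabulary(abstracts, min_sentences=2)
features = sentence_features(abstracts, vocabulary)
```

Each row in `features` corresponds to one sentence. A category label per sentence (for example, problem statement or research method) would be needed as the training target.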
Subject
DRNTU::Library and information science
Type
Conference Paper
Rights
© 2009 The Author(s) (ASIS SIG/CR Classification Research Workshop). This paper was published in Proceedings of the 14th ASIS SIG/CR Classification Research Workshop and is made available as an electronic reprint (preprint) with permission of The Author(s) (ASIS SIG/CR Classification Research Workshop). The published version is available at: [http://journals.lib.washington.edu/index.php/acro/article/view/14114]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law.