|
Title:
|
High precision treebanking - blazing useful trees using POS information.
|
|
Author:
|
Tanaka, Takaaki.; Bond, Francis.; Oepen, Stephan.; Fujita, Sanae.
|
|
Copyright year:
|
2005 |
|
Abstract:
|
In this paper we present a quantitative
and qualitative analysis of annotation in
the Hinoki treebank of Japanese, and investigate
a method of speeding annotation
by using part-of-speech tags. The Hinoki
treebank is a Redwoods-style treebank of
Japanese dictionary definition sentences.
5,000 sentences are annotated by three different
annotators and the agreement evaluated.
An average agreement of 65.4% was
found using strict agreement, and 83.5%
using labeled precision. Exploiting POS
tags allowed the annotators to choose the
best parse with 19.5% fewer decisions. |
|
Subject:
|
DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics. |
|
Type:
|
Conference Paper |
|
Conference name:
|
Proceedings of the 43rd Annual Meeting of the ACL |
|
School:
|
School of Humanities and Social Sciences |
|
Rights:
|
© 2005 ACL. This paper was published in Proceedings of the 43rd Annual Meeting of the ACL and is made available as an electronic reprint (preprint) with permission of Association for Computational Linguistics. The paper can be found at: [DOI: http://dx.doi.org/10.3115/1219840.1219881]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law. |
|
Version:
|
Published version |