Semantic annotation of a Japanese speech corpus.
Date of Issue: 2000
Workshop on Semantic Annotation and Intelligent Content (2000 : Luxembourg)
College of Humanities, Arts, and Social Sciences
This paper describes the semantic annotations we are performing on the CallHome Japanese corpus of spontaneous, unscripted telephone conversations (LDC, 1996). Our annotations include (i) semantic classes for all nouns and verbs; (ii) verb senses for all main verbs; and (iii) relations between main verbs and their complements in the same utterance. Our semantic tagset is taken from NTT's Goi-Taikei semantic lexicon and ontology (Ikehara et al., 1997). A pilot study demonstrates that verb sense tagging can be performed efficiently by native Japanese speakers using computer-generated HTML forms, and that good inter-annotator reliability can be obtained under the right conditions.
© 2000 ACL. This is the author-created version of a work that has been peer reviewed and accepted for publication in the COLING Workshop on Semantic Annotation and Intelligent Content, Association for Computational Linguistics. It incorporates referees' comments, but changes resulting from the publishing process, such as copyediting and structural formatting, may not be reflected in this document.