|
Title:
|
A reexamination of MRD-based word sense disambiguation.
|
|
Author:
|
Baldwin, Timothy; Kim, Su Nam; Bond, Francis; Fujita, Sanae; Martinez, David; Tanaka, Takaaki
|
|
Copyright year:
|
2010 |
|
Abstract:
|
This paper reconsiders the task of MRD-based word sense disambiguation, extending the basic Lesk algorithm to investigate the impact of different tokenisation schemes and methods of definition extension on WSD performance. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 dictionary task, we demonstrate that sense-sensitive definition extension over hyponyms, hypernyms and synonyms, combined with definition extension and word tokenisation, leads to WSD accuracy above both unsupervised and supervised baselines. In doing so, we demonstrate the utility of ontology induction and establish new opportunities for the development of baseline unsupervised WSD methods. |
|
Subject:
|
DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics. |
|
Type:
|
Journal Article |
|
Series/Journal Title:
|
ACM transactions on Asian language information processing |
|
School:
|
School of Humanities and Social Sciences |
|
Rights:
|
© 2010 Association for Computing Machinery. This is the author created version of a work that has been peer reviewed and accepted for publication by ACM Transactions on Asian Language Information Processing, Association for Computing Machinery. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [DOI: http://dx.doi.org/10.1145/1731035.1731039]. |
|
Version:
|
Accepted version |