Enhancing an English/Korean dictionary
Date of Issue: 2003
Papillon Workshop (2003)
School of Humanities and Social Sciences
In this paper, we introduce a machine-tractable Korean/English lexicon built from engdic, an open-source English-Korean dictionary intended for human use. Its formatting is sometimes inconsistent, and some information is missing or duplicated, so it is not ready for machine use. We regularize the format and improve the content, which makes the dictionary easier to use bidirectionally. Our main purpose is to develop and document clear syntactic and semantic features useful for NLP applications such as machine translation. The original lexicon contains about 98,000 English lemmas and about 210,000 English-Korean pairs. Each entry consists of three parts: the English lemma form, part-of-speech codes, and a Korean translation or explanation. We transformed this into a more structured format consisting of eight fields.
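As a rough illustration of the three-part entry and of bidirectional use, the following minimal Python sketch parses one entry and inverts a list of entries into a Korean-to-English index. It is not the authors' implementation: the tab-separated line layout, the comma and semicolon separators, and the names `Entry`, `parse_line`, and `invert` are all assumptions made for the example, since the abstract does not specify engdic's raw format or the eight target fields.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    """One three-part entry: lemma, part-of-speech codes, Korean glosses."""
    lemma: str
    pos_codes: List[str]
    korean: List[str]


def parse_line(line: str) -> Entry:
    # Assumed layout (hypothetical): lemma <TAB> pos,pos <TAB> gloss; gloss
    lemma, pos, korean = line.rstrip("\n").split("\t", 2)
    return Entry(
        lemma=lemma.strip(),
        pos_codes=[p.strip() for p in pos.split(",") if p.strip()],
        korean=[k.strip() for k in korean.split(";") if k.strip()],
    )


def invert(entries: List[Entry]) -> Dict[str, List[str]]:
    # Build a Korean -> English index so the lexicon can be read both ways.
    index: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        for gloss in entry.korean:
            if entry.lemma not in index[gloss]:  # skip duplicated pairs
                index[gloss].append(entry.lemma)
    return dict(index)
```

Skipping repeated pairs during inversion is one simple way to handle the duplicated information mentioned above; a real conversion would also need to deal with glosses that are explanations rather than translations.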
© 2003 Proceedings of Papillon 2003 Workshop (CDROM).