Abstract:
In this paper, we introduce a machine-tractable Malay-English lexicon currently
being developed at NTT MSC, Malaysia. The lexicon is designed to satisfy
three criteria: first, to develop and document detailed syntactic features
useful for both analysis and generation; second, to use a well-developed
semantic ontology, in our case semantic classes drawn from the 2,710 classes used
in the machine-tractable Japanese-English Goi-Taikei ontology; third, to achieve
as wide coverage as possible. The lexicon currently contains around 91,000
Malay-English pairs. Each entry consists of nine major fields, which include
such information as numeral classifiers associated with common nouns, and
meta-codes to show honorific use, register and origin. In addition, English and
Chinese translations and comments are provided for future use in machine
translation systems and also as an aid for non-Malay speakers. A version of the
dictionary, which does not show all fields, is available on-line.